IPEX-LLM
IPEX-LLM is a PyTorch library for running LLMs on Intel CPUs and GPUs (e.g., a local PC with an iGPU, or a discrete GPU such as Arc, Flex, or Max) with very low latency.
IPEX-LLM on Intel GPU
This example goes over how to use LangChain to interact with ipex-llm
for text generation on Intel GPU.
Note
It is recommended that only Windows users with an Intel Arc A-Series GPU (excluding the Intel Arc A300-Series and Pro A60) run the Jupyter notebook directly for the section "IPEX-LLM on Intel GPU". In other cases (e.g., Linux users, Intel iGPUs), it is recommended to run the code as a Python script in a terminal for the best experience.
Install Prerequisites
To benefit from IPEX-LLM on Intel GPUs, there are several prerequisite steps for tools installation and environment preparation.
If you are a Windows user, visit the Install IPEX-LLM on Windows with Intel GPU Guide, and follow Install Prerequisites to update GPU driver (optional) and install Conda.
If you are a Linux user, visit the Install IPEX-LLM on Linux with Intel GPU guide, and follow Install Prerequisites to install the GPU driver, Intel® oneAPI Base Toolkit 2024.0, and Conda.
Setup
After the prerequisite installation, you should have a conda environment with all prerequisites installed. Start the Jupyter service in this conda environment, then install LangChain in the notebook:
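The activation and Jupyter startup steps can be sketched as follows in a terminal (the environment name `llm` is an assumption; use whatever name you chose when following the install guide):

```shell
# Activate the conda environment created during prerequisite installation
# (the name `llm` is assumed here -- substitute your own environment name)
conda activate llm

# Start the Jupyter service from within the activated environment
jupyter notebook
```

The `%pip` commands below are then run inside the notebook, so the packages land in this same environment.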
%pip install -qU langchain langchain-community
Install IPEX-LLM for running LLMs locally on Intel GPUs:
%pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
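With both packages installed, a minimal text-generation flow with LangChain's `IpexLLM` wrapper looks roughly like the sketch below. The model id and prompt are illustrative placeholders, loading will download weights on first run and requires an Intel GPU; consult the IPEX-LLM GPU guide for the exact device-placement options for your hardware.

```python
from langchain_core.prompts import PromptTemplate
from langchain_community.llms import IpexLLM

# Load a Hugging Face model with IPEX-LLM low-bit optimizations.
# The model id and kwargs below are assumptions for illustration;
# substitute any model you have access to.
llm = IpexLLM.from_model_id(
    model_id="lmsys/vicuna-7b-v1.5",
    model_kwargs={"temperature": 0, "max_length": 64, "trust_remote_code": True},
)

# Compose a simple prompt -> LLM chain and run it.
prompt = PromptTemplate.from_template("USER: {question}\nASSISTANT:")
chain = prompt | llm
print(chain.invoke({"question": "What is AI?"}))
```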
Note
You can also use
https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
as the extra-index-url.
Runtime Configuration
For optimal performance, it is recommended to set several environment variables based on your device:
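As a sketch (variable names taken from the IPEX-LLM install guides; confirm the exact set recommended for your device), a Linux session targeting an Intel Arc A-Series GPU might export:

```shell
# Example runtime configuration for an Intel Arc A-Series GPU on Linux,
# per the IPEX-LLM install guides -- other devices use different settings.
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```

Set these in the terminal before launching the Python script or Jupyter service, so the LLM runtime inherits them.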