Running Ollama inside an LXC Container
We want to run our own local LLM on an AMD GPU inside an LXC container:
- Install the AMD drivers.
- Set up GPU passthrough to the container.
- Install and run Ollama.
Ollama is a tool that lifts the burden of maintaining LLM model versions, dependencies, and everything else required to query a large language model for answers.
Install AMD Drivers
On Arch Linux the GPU drivers can be installed following the steps outlined in the Arch Wiki:
$ pacman -S mesa xf86-video-amdgpu
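To confirm that the card is actually bound to the amdgpu kernel driver (a quick sanity check; the device lines will match your own hardware), ask lspci to print the driver in use:
$ lspci -k -d ::03xx
The entry for the card should report "Kernel driver in use: amdgpu".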
Set Up GPU Passthrough to the Container
First, create a container with LXC. The drivers installed inside the container should match the drivers installed on the host, so it is easiest to use the same OS for the container as for the host machine.
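With an LXD/Incus setup this can look as follows (a sketch assuming the images: remote is configured; the container name ollama is arbitrary and used throughout the examples below):
$ lxc launch images:archlinux ollama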
Next we need to determine the device numbers required for the cgroup rules for the graphics devices. First, look for the PCI identifier:
$ lspci -d ::03xx
08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c1)
64:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt [Radeon 680M] (rev d4)
For the machine at hand the dedicated card at 08:00.0 must be selected (the 64:00.0 entry is the integrated Radeon 680M).
Next, select the correct card based on the information found in /dev/dri/by-path:
$ ls -l /dev/dri/by-path/
lrwxrwxrwx 1 root root 8 Nov 14 19:00 pci-0000:08:00.0-card -> ../card0
lrwxrwxrwx 1 root root 13 Nov 14 19:00 pci-0000:08:00.0-render -> ../renderD129
lrwxrwxrwx 1 root root 8 Nov 14 18:26 pci-0000:64:00.0-card -> ../card1
lrwxrwxrwx 1 root root 13 Nov 14 18:26 pci-0000:64:00.0-render -> ../renderD128
We can now obtain the major and minor device numbers required for accessing the GPU:
$ ls -l /dev/kfd /dev/dri/{card0,renderD129}
crw-rw----+ 1 root video 226, 0 Nov 14 21:05 /dev/dri/card0
crw-rw-rw- 1 root render 226, 129 Nov 14 19:17 /dev/dri/renderD129
crw-rw-rw- 1 root render 511, 0 Nov 14 19:17 /dev/kfd
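Note the major:minor pairs in this listing: 226:0 for card0, 226:129 for renderD129, and 511:0 for /dev/kfd. These are exactly the numbers the cgroup rules below refer to. The major number of /dev/kfd is allocated dynamically at boot, so it may differ on your machine; it can be cross-checked against /proc/devices:
$ grep kfd /proc/devices
511 kfd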
The devices can now be forwarded to our LXC container. Edit the container's configuration and add the following lines:
config:
  ...
  raw.lxc: |
    # Default unified cgroup configuration
    #
    # CGroup allowlist
    lxc.cgroup2.devices.deny = a
    ## Allow any mknod (but not reading/writing the node)
    lxc.cgroup2.devices.allow = c *:* m
    lxc.cgroup2.devices.allow = b *:* m
    ## Allow specific devices
    ### /dev/null
    lxc.cgroup2.devices.allow = c 1:3 rwm
    ### /dev/zero
    lxc.cgroup2.devices.allow = c 1:5 rwm
    ### /dev/full
    lxc.cgroup2.devices.allow = c 1:7 rwm
    ### /dev/tty
    lxc.cgroup2.devices.allow = c 5:0 rwm
    ### /dev/console
    lxc.cgroup2.devices.allow = c 5:1 rwm
    ### /dev/ptmx
    lxc.cgroup2.devices.allow = c 5:2 rwm
    ### /dev/random
    lxc.cgroup2.devices.allow = c 1:8 rwm
    ### /dev/urandom
    lxc.cgroup2.devices.allow = c 1:9 rwm
    ### /dev/pts/*
    lxc.cgroup2.devices.allow = c 136:* rwm
    ### fuse
    lxc.cgroup2.devices.allow = c 10:229 rwm
    ### gpu passthrough (major:minor numbers from the ls -l output above)
    lxc.cgroup2.devices.allow = c 226:0 rwm
    lxc.cgroup2.devices.allow = c 226:129 rwm
    lxc.cgroup2.devices.allow = c 511:0 rwm
    lxc.mount.entry = /dev/dri/card0 dev/dri/card0 none bind,optional,create=file
    lxc.mount.entry = /dev/dri/renderD129 dev/dri/renderD129 none bind,optional,create=file
    lxc.mount.entry = /dev/kfd dev/kfd none bind,optional,create=file
  ...
Restart the container for the configuration changes to take effect.
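With the LXD/Incus client and the example container name from above, editing the configuration and restarting looks like this:
$ lxc config edit ollama
$ lxc restart ollama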
Inside the container you need to install the correct drivers as well:
$ pacman -S mesa xf86-video-amdgpu
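Before continuing, it is worth verifying from inside the container that the forwarded device nodes actually showed up:
$ ls -l /dev/dri /dev/kfd
All three nodes (card0, renderD129, and kfd) should be listed.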
Install and Run Ollama
Install Ollama with support for an AMD GPU:
$ pacman -S ollama-rocm
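As an aside, the Arch package ships a systemd unit (ollama.service in the Arch packaging), so the server can also be managed as a service:
$ systemctl enable --now ollama
Here, however, we run the server manually in the foreground so its log output is easy to inspect.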
Before running any other commands, start the Ollama server:
$ ollama serve
If the installation has worked so far, log lines similar to the following can be expected:
time=2024-11-14T19:50:08.859Z level=INFO source=amd_linux.go:383 msg="amdgpu is supported" gpu=GPU-c2c9f0c7cc583ddd gpu_type=gfx1030
time=2024-11-14T19:50:08.860Z level=INFO source=types.go:123 msg="inference compute" id=GPU-c2c9f0c7cc583ddd library=rocm variant="" compute=gfx1030 driver=0.0 name=1002:73bf total="16.0 GiB" available="16.0 GiB"
You can now proceed to use Ollama. For more usage information, see LLMs daheim mit Ollama.
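For a quick end-to-end test, pull and query a model, then check that it is scheduled on the GPU (the model name is just an example; any model from the Ollama library works):
$ ollama run llama3.2 "Why is the sky blue?"
$ ollama ps
The PROCESSOR column of ollama ps should report the model as running on the GPU.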