Claas' Blog

Running Ollama inside an LXC Container

CLAAS VONDERSCHEN - 2024-11-17

We want to run our own local LLM on an AMD GPU inside an LXC container:

  1. Install the AMD drivers.
  2. Set up GPU passthrough to the container.
  3. Install and run Ollama.

Ollama is a tool that lifts the burden of maintaining LLM model versions, dependencies, and everything else required to query a large language model for answers.

Install AMD Drivers

On Arch Linux the GPU drivers can be installed following the steps outlined in the Arch Wiki:

$ pacman -S mesa xf86-video-amdgpu

Set Up GPU Passthrough to the Container

First, create a container with LXC. The drivers installed inside the container should match the drivers installed on the host, so it is easiest to use the same OS for the container as for the host machine.
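
For example, with LXD, which also matches the YAML configuration format used below, an Arch Linux container could be created roughly like this (the container name ollama and the availability of an archlinux image on the configured remote are illustrative assumptions):

$ lxc launch images:archlinux ollama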

Next we need to know the device major and minor numbers required for the cgroup rules for the graphics devices. First, look for the PCI identifier:

$ lspci -d ::03xx
08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c1)
64:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt [Radeon 680M] (rev d4)

For the machine at hand the 08:00.0 device must be selected: it is the discrete Navi 21 card, while 64:00.0 is the integrated Radeon 680M.

Next, identify the corresponding device nodes via the symlinks in /dev/dri/by-path:

$ ls -l /dev/dri/by-path/
lrwxrwxrwx 1 root root  8 Nov 14 19:00 pci-0000:08:00.0-card -> ../card0
lrwxrwxrwx 1 root root 13 Nov 14 19:00 pci-0000:08:00.0-render -> ../renderD129
lrwxrwxrwx 1 root root  8 Nov 14 18:26 pci-0000:64:00.0-card -> ../card1
lrwxrwxrwx 1 root root 13 Nov 14 18:26 pci-0000:64:00.0-render -> ../renderD128

We can now obtain the device major and minor numbers required for the cgroup rules that grant access to the GPU:

$ ls -l /dev/kfd /dev/dri/{card0,renderD129}
crw-rw----+ 1 root video  226,   0 Nov 14 21:05 /dev/dri/card0
crw-rw-rw-  1 root render 226, 129 Nov 14 19:17 /dev/dri/renderD129
crw-rw-rw-  1 root render 511,   0 Nov 14 19:17 /dev/kfd

The devices can now be forwarded to our LXC container. Edit the container's configuration and add the following lines:

config:
  ...
  raw.lxc: |
    # Default unified cgroup configuration
    #
    # CGroup allowlist
    lxc.cgroup2.devices.deny = a
    ## Allow any mknod (but not reading/writing the node)
    lxc.cgroup2.devices.allow = c *:* m
    lxc.cgroup2.devices.allow = b *:* m
    ## Allow specific devices
    ### /dev/null
    lxc.cgroup2.devices.allow = c 1:3 rwm
    ### /dev/zero
    lxc.cgroup2.devices.allow = c 1:5 rwm
    ### /dev/full
    lxc.cgroup2.devices.allow = c 1:7 rwm
    ### /dev/tty
    lxc.cgroup2.devices.allow = c 5:0 rwm
    ### /dev/console
    lxc.cgroup2.devices.allow = c 5:1 rwm
    ### /dev/ptmx
    lxc.cgroup2.devices.allow = c 5:2 rwm
    ### /dev/random
    lxc.cgroup2.devices.allow = c 1:8 rwm
    ### /dev/urandom
    lxc.cgroup2.devices.allow = c 1:9 rwm
    ### /dev/pts/*
    lxc.cgroup2.devices.allow = c 136:* rwm
    ### fuse
    lxc.cgroup2.devices.allow = c 10:229 rwm
    ### gpu passthrough (major:minor numbers taken from the ls -l output
    ### above; they can differ between systems)
    ### /dev/dri/card0
    lxc.cgroup2.devices.allow = c 226:0 rwm
    ### /dev/dri/renderD129
    lxc.cgroup2.devices.allow = c 226:129 rwm
    ### /dev/kfd
    lxc.cgroup2.devices.allow = c 511:0 rwm
    lxc.mount.entry = /dev/dri/card0 dev/dri/card0 none bind,optional,create=file
    lxc.mount.entry = /dev/dri/renderD129 dev/dri/renderD129 none bind,optional,create=file
    lxc.mount.entry = /dev/kfd dev/kfd none bind,optional,create=file
  ...
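
With LXD, for example, the configuration can be edited in place (again assuming the container is named ollama):

$ lxc config edit ollama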

Restart the container for the configuration changes to take effect.
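
With LXD:

$ lxc restart ollama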

Inside the container, you need to install the matching drivers as well:

$ pacman -S mesa xf86-video-amdgpu
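
To verify that the passthrough worked, the forwarded device nodes should now be visible inside the container:

$ ls -l /dev/kfd /dev/dri/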

Install and Run Ollama

Install Ollama with support for an AMD GPU:

$ pacman -S ollama-rocm

Before running any other commands, start the Ollama server:

$ ollama serve

If the installation worked, log lines similar to the following can be expected:

time=2024-11-14T19:50:08.859Z level=INFO source=amd_linux.go:383 msg="amdgpu is supported" gpu=GPU-c2c9f0c7cc583ddd gpu_type=gfx1030
time=2024-11-14T19:50:08.860Z level=INFO source=types.go:123 msg="inference compute" id=GPU-c2c9f0c7cc583ddd library=rocm variant="" compute=gfx1030 driver=0.0 name=1002:73bf total="16.0 GiB" available="16.0 GiB"
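
Instead of keeping ollama serve running in a foreground terminal, the server can also be managed with systemd if the package ships a service unit, as the Arch packages do:

$ systemctl enable --now ollama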

You can now proceed to use Ollama; for more usage information see LLMs daheim mit Ollama.
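
As a quick smoke test, a model can be pulled and queried in a second terminal while the server is running (llama3.2 is just an example model name):

$ ollama run llama3.2 "Why is the sky blue?"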
