Using node-llama-cpp in Docker
When running node-llama-cpp inside a Docker or Podman container, you will most likely want to use it together with a GPU for fast inference.
For that, you'll have to:
- Configure support for your GPU on the host machine
- Build an image with the necessary GPU libraries
- Enable GPU support when running the container
Configuring the Host Machine
Metal: Using Metal from inside a Docker container is not supported.
CUDA: You need to install the NVIDIA Container Toolkit on the host machine to use NVIDIA GPUs; see the installation sketch after this list.
Vulkan: You need to install the relevant GPU drivers on the host machine, and configure Docker or Podman to use them.
No GPU (CPU only): No special configuration is needed.
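For example, on a Debian-based host, installing the NVIDIA Container Toolkit and wiring it up to Docker looks roughly like this (a sketch based on NVIDIA's official instructions; check their documentation for the current steps and for other distributions):

```shell
# Add NVIDIA's package repository for the container toolkit (Debian/Ubuntu)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the toolkit and register the nvidia runtime with Docker
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```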
Building an Image
WARNING
Do not attempt to use alpine as the base image as it doesn't work well with many GPU drivers.
The potential image size savings of using alpine images are not worth the hassle, especially considering that the model files you use will likely be much larger than the image itself anyway.
A Dockerfile with CUDA support:

```Dockerfile
FROM node:22
# Replace `x86_64` with `sbsa` for ARM64
ENV NVARCH=x86_64
ENV INSTALL_CUDA_VERSION=12.5
SHELL ["/bin/bash", "-c"]
RUN apt-get update && \
    apt-get install -y --no-install-recommends gnupg2 curl ca-certificates && \
    curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/${NVARCH}/3bf863cc.pub | apt-key add - && \
    echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/${NVARCH} /" > /etc/apt/sources.list.d/cuda.list && \
    apt-get purge --autoremove -y curl && \
    rm -rf /var/lib/apt/lists/*
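# Install the CUDA runtime libraries, along with the tools node-llama-cpp may need to build llama.cpp from source (git, cmake, clang)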
RUN apt-get update && apt-get install -y --no-install-recommends \
    "cuda-cudart-${INSTALL_CUDA_VERSION//./-}" \
    "cuda-compat-${INSTALL_CUDA_VERSION//./-}" \
    "cuda-libraries-${INSTALL_CUDA_VERSION//./-}" \
    "libnpp-${INSTALL_CUDA_VERSION//./-}" \
    "cuda-nvtx-${INSTALL_CUDA_VERSION//./-}" \
    "libcusparse-${INSTALL_CUDA_VERSION//./-}" \
    "libcublas-${INSTALL_CUDA_VERSION//./-}" \
    git cmake clang libgomp1 \
    && rm -rf /var/lib/apt/lists/*
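# Hold the cuBLAS package so apt upgrades don't install a version that doesn't match the installed CUDA version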
RUN apt-mark hold "libcublas-${INSTALL_CUDA_VERSION//./-}"
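# Make the NVIDIA driver libraries that the container runtime mounts in visible to the dynamic linker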
RUN echo "/usr/local/nvidia/lib" >> /etc/ld.so.conf.d/nvidia.conf \
&& echo "/usr/local/nvidia/lib64" >> /etc/ld.so.conf.d/nvidia.conf
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=all
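# Copy the app into the image and install its dependencies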
RUN mkdir -p /opt/app
WORKDIR /opt/app
COPY . /opt/app
RUN npm ci
CMD npm start
```

A Dockerfile with Vulkan support:

```Dockerfile
FROM node:22
SHELL ["/bin/bash", "-c"]
RUN apt-get update && \
    apt-get install -y --no-install-recommends mesa-vulkan-drivers libegl1 git cmake clang libgomp1 && \
    rm -rf /var/lib/apt/lists/*
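# Expose NVIDIA GPUs to the container runtime, so they can also be used through Vulkan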
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=all
RUN mkdir -p /opt/app
WORKDIR /opt/app
COPY . /opt/app
RUN npm ci
CMD npm start
```

A Dockerfile with no GPU support (CPU only):

```Dockerfile
FROM node:22
SHELL ["/bin/bash", "-c"]
RUN apt-get update && \
    apt-get install -y --no-install-recommends git cmake clang libgomp1 && \
    rm -rf /var/lib/apt/lists/*
RUN mkdir -p /opt/app
WORKDIR /opt/app
COPY . /opt/app
RUN npm ci
CMD npm start
```

Running the Container
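First, build your image; the tag my-image:tag below is just a placeholder name that the examples in this section assume:

```shell
docker build -t my-image:tag .
```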
To run the container with GPU support, use the following:
With Docker:

```shell
docker run --rm -it --gpus=all my-image:tag
```

With Podman:

```shell
podman run --rm -it --gpus=all my-image:tag
```

With Docker Compose, add a GPU reservation to the service:

```yaml
services:
  my-service:
    image: my-image:tag
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
              count: all
```

When using the CLI, you can test the GPU support by running this command:
With Docker:

```shell
docker run --rm -it --gpus=all my-image:tag npx -y node-llama-cpp inspect gpu
```

With Podman:

```shell
podman run --rm -it --gpus=all my-image:tag npx -y node-llama-cpp inspect gpu
```

Troubleshooting
NVIDIA GPU Is Not Recognized by the Vulkan Driver Inside the Container
With Docker, make sure your daemon configuration (typically `/etc/docker/daemon.json`) has an nvidia runtime:
```json
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
```

With Podman, generate a CDI specification for your NVIDIA GPUs, and check that they are listed:

```shell
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
nvidia-ctk cdi list
```

And then run the container with the nvidia runtime:
With Docker:

```shell
docker run --rm -it --runtime=nvidia --gpus=all my-image:tag
```

With Podman:

```shell
podman run --rm -it --device nvidia.com/gpu=all --security-opt=label=disable --gpus=all my-image:tag
```

Getting a `system has unsupported display driver / cuda driver combination` Error
Ensure that the `INSTALL_CUDA_VERSION` in the Dockerfile matches or is older than the CUDA version installed on the host machine; for example, if the host has CUDA 12.4, use `12.4` or older in the Dockerfile.

You can check the installed CUDA version using:

```shell
nvidia-smi --version
```
