Easy to use
Zero-config by default. Works in Node.js, Bun, and Electron. Bootstrap a project with a single command.
Learn more
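For instance, a new project can typically be scaffolded with the package's npm initializer:
npm create node-llama-cpp@latest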
Run AI models locally on your machine
Node.js bindings for llama.cpp, and much more
v3.0 is here! Experience the ease of running models on your machine
npx -y node-llama-cpp chat
To chat with models using a UI, try the example Electron app
Check out your hardware capabilities
npx -y node-llama-cpp inspect gpu
Everything you need to use large language models in your project
Integrate node-llama-cpp in your codebase and prompt models
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(
    fileURLToPath(import.meta.url)
);

// Load the llama.cpp bindings, then a local GGUF model
const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "my-model.gguf")
});

// Create a context and start a chat session on one of its sequences
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const q1 = "Hi there, how are you?";
console.log("User: " + q1);

const a1 = await session.prompt(q1);
console.log("AI: " + a1);
Get an embedding for a given text
import {fileURLToPath} from "url";
import path from "path";
import {getLlama} from "node-llama-cpp";

const __dirname = path.dirname(
    fileURLToPath(import.meta.url)
);

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "my-model.gguf")
});

// Create a dedicated context for generating embeddings
const context = await model.createEmbeddingContext();

const text = "Hello world";
console.log("Text:", text);

const embedding = await context.getEmbeddingFor(text);
console.log("Embedding vector:", embedding.vector);
Force a model response to follow your JSON schema
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(
    fileURLToPath(import.meta.url)
);

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "my-model.gguf")
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

// Build a grammar from a JSON schema;
// it constrains the model's output to JSON that matches the schema
const grammar = await llama.createGrammarForJsonSchema({
    type: "object",
    properties: {
        positiveWordsInUserMessage: {
            type: "array",
            items: {
                type: "string"
            }
        },
        userMessagePositivityScoreFromOneToTen: {
            enum: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
        },
        nameOfUser: {
            oneOf: [{
                type: "null"
            }, {
                type: "string"
            }]
        }
    }
});

const prompt = "Hi there! I'm John. Nice to meet you!";

const res = await session.prompt(prompt, {
    grammar
});

// Parse the response back into a typed object matching the schema
const parsedRes = grammar.parse(res);

console.log("User name:", parsedRes.nameOfUser);
console.log(
    "Positive words in user message:",
    parsedRes.positiveWordsInUserMessage
);
console.log(
    "User message positivity score:",
    parsedRes.userMessagePositivityScoreFromOneToTen
);
Let a model call functions to retrieve data or perform actions
import {fileURLToPath} from "url";
import path from "path";
import {
    getLlama,
    LlamaChatSession,
    defineChatSessionFunction
} from "node-llama-cpp";

const __dirname = path.dirname(
    fileURLToPath(import.meta.url)
);

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "my-model.gguf")
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const fruitPrices: Record<string, string> = {
    "apple": "$6",
    "banana": "$4"
};

// Functions the model can call while generating a response
const functions = {
    getFruitPrice: defineChatSessionFunction({
        description: "Get the price of a fruit",
        params: {
            type: "object",
            properties: {
                name: {
                    type: "string"
                }
            }
        },
        async handler(params) {
            const name = params.name.toLowerCase();
            if (Object.keys(fruitPrices).includes(name))
                return {
                    name: name,
                    price: fruitPrices[name]
                };

            return `Unrecognized fruit "${params.name}"`;
        }
    })
};

const q1 = "Is an apple more expensive than a banana?";
console.log("User: " + q1);

const a1 = await session.prompt(q1, {functions});
console.log("AI: " + a1);