
# `inspect estimate` command

Estimate the compatibility of a model with the current hardware

## Usage

```shell
npx --no node-llama-cpp inspect estimate [modelPath]
```
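For example, a minimal invocation against a local GGUF file (the model path below is a placeholder):

```shell
# Estimate how well a local GGUF model fits the current hardware
# (./models/model.gguf is a hypothetical path)
npx --no node-llama-cpp inspect estimate ./models/model.gguf
```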

## Options

### Required

| Option | Description |
| ------ | ----------- |
| `-m [string]`, `--modelPath [string]`, `--model [string]`, `--path [string]`, `--url [string]`, `--uri [string]` | The path or URI of the GGUF file to use. If a URI is provided, the metadata will be read from the remote file without downloading the entire file. (string) (required) |
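Since the model can be given as a URI, here is a sketch of estimating a remote model; only the GGUF metadata is read from the remote file, so nothing is downloaded in full (the URL below is hypothetical):

```shell
# Read metadata from a remote GGUF file without downloading the entire file
# (the URL is a placeholder)
npx --no node-llama-cpp inspect estimate --url https://example.com/models/model.gguf
```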

### Optional

| Option | Description |
| ------ | ----------- |
| `-H [string]`, `--header [string]` | Headers to use when reading a model file from a URL, in the format `key: value`. You can pass this option multiple times to add multiple headers. (string[]) |
| `--gpu [string]` | Compute layer implementation type to use for llama.cpp (default: Uses the latest local build, and falls back to `"auto"`) (string) (choices: `"auto"`, `"metal"`, `"cuda"`, `"vulkan"`, `false`) |
| `--gpuLayers <number>`, `--gl <number>` | Number of layers to store in VRAM. Set to `max` to use all the layers the model has (default: Automatically determined based on the available VRAM) (number) |
| `-c <number>`, `--contextSize <number>` | Context size to use for the model context. Set to `max` or `train` to use the training context size. Note that the training context size is not necessarily what you should use for inference, and a large context size uses a lot of memory (default: Automatically determined based on the available VRAM) (number) |
| `-e`, `--embedding` | Whether to estimate for creating an embedding context (default: `false`) (boolean) |
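Putting several of these options together, a hedged example that pins the GPU backend, offloads all layers, uses the training context size, and estimates for an embedding context (the values are illustrative, not recommendations):

```shell
# Estimate with the Metal backend, all model layers in VRAM,
# the training context size, and an embedding context
npx --no node-llama-cpp inspect estimate ./models/model.gguf \
  --gpu metal --gpuLayers max --contextSize train --embedding
```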

## Other

| Option | Description |
| ------ | ----------- |
| `-h`, `--help` | Show help |
| `-v`, `--version` | Show version number |