inspect measure
command
Measure the VRAM consumption of a GGUF model file across combinations of GPU layer counts and context sizes.
Usage
```shell
npx --no node-llama-cpp inspect measure [modelPath]
```
Options
Option | Description |
---|---|
`-m [string]`, `--modelPath [string]`, `--model [string]`, `--path [string]`, `--url [string]`, `--uri [string]` | Model file to use for the measurements. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models (string) |
`-H [string]`, `--header [string]` | Headers to use when downloading a model from a URL, in the format `key: value`. You can pass this option multiple times to add multiple headers (string[]) |
`--gpu [string]` | Compute layer implementation type to use for llama.cpp (default: uses the latest local build, and falls back to "auto") (string) |
`--minLayers <number>`, `--mnl <number>` | Minimum number of layers to offload to the GPU (default: `1`) (number) |
`--maxLayers <number>`, `--mxl <number>` | Maximum number of layers to offload to the GPU (default: all layers) (number) |
`--minContextSize <number>`, `--mncs <number>` | Minimum context size (default: `512`) (number) |
`--maxContextSize <number>`, `--mxcs <number>` | Maximum context size (default: train context size) (number) |
`--flashAttention`, `--fa` | Enable flash attention for the context (default: `false`) (boolean) |
`-n <number>`, `--measures <number>` | Number of context size measures to take for each GPU layer count (default: `10`) (number) |
`--printHeaderBeforeEachLayer`, `--ph` | Print a header before each layer's measures (default: `true`) (boolean) |
`--evaluateText [string]`, `--evaluate [string]`, `--et [string]` | Text to evaluate with the model (string) |
`--repeatEvaluateText <number>`, `--repeatEvaluate <number>`, `--ret <number>` | Number of times to repeat the evaluation text before sending it for evaluation, in order to make it longer (default: `1`) (number) |
`-h`, `--help` | Show help |
`-v`, `--version` | Show version number |
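
As an illustration of how the options above combine, here is a minimal sketch that narrows the measured range so a run finishes faster. The model path, the download URL, and the `HF_TOKEN` variable are hypothetical placeholders, and the `Authorization` header is only an assumption about what a download host might require. The `run` helper is not part of node-llama-cpp; it exists only so the commands are printed by default instead of executed.

```shell
# Illustration-only helper: prints the command unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Hypothetical local model path; replace it with your own GGUF file.
MODEL_PATH="./models/example-model.gguf"

# Measure only 8 to 16 offloaded layers and context sizes from 512 to 4096,
# taking 5 context size measures per layer count, with flash attention enabled.
run npx --no node-llama-cpp inspect measure "$MODEL_PATH" \
  --minLayers 8 --maxLayers 16 \
  --minContextSize 512 --maxContextSize 4096 \
  --measures 5 --flashAttention

# Hypothetical remote model; the header follows the documented "key: value" format.
MODEL_URL="https://example.com/models/example-model.gguf"
run npx --no node-llama-cpp inspect measure --url "$MODEL_URL" \
  --header "Authorization: Bearer ${HF_TOKEN:-<token>}" \
  --measures 3
```

Running the script as is only prints the two commands. Setting `RUN=1` executes them, which requires a real model file or URL. Note that the printed form does not preserve shell quoting, so copy the commands from the script rather than from its output.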