Class: LlamaModel
Defined in: evaluator/LlamaModel/LlamaModel.ts:209
Properties
tokenizer
readonly tokenizer: Tokenizer;Defined in: evaluator/LlamaModel/LlamaModel.ts:238
onDispose
readonly onDispose: EventRelay<void>;Defined in: evaluator/LlamaModel/LlamaModel.ts:239
Accessors
disposed
Get Signature
get disposed(): boolean;Defined in: evaluator/LlamaModel/LlamaModel.ts:361
Returns
boolean
llama
Get Signature
get llama(): Llama;Defined in: evaluator/LlamaModel/LlamaModel.ts:365
Returns
Llama
tokens
Get Signature
get tokens(): LlamaModelTokens;Defined in: evaluator/LlamaModel/LlamaModel.ts:369
Returns
LlamaModelTokens
filename
Get Signature
get filename(): string | undefined;Defined in: evaluator/LlamaModel/LlamaModel.ts:373
Returns
string | undefined
fileInfo
Get Signature
get fileInfo(): GgufFileInfo;Defined in: evaluator/LlamaModel/LlamaModel.ts:377
Returns
GgufFileInfo
fileInsights
Get Signature
get fileInsights(): GgufInsights;Defined in: evaluator/LlamaModel/LlamaModel.ts:381
Returns
GgufInsights
gpuLayers
Get Signature
get gpuLayers(): number;Defined in: evaluator/LlamaModel/LlamaModel.ts:389
Number of layers offloaded to the GPU. If GPU support is disabled, this will always be 0.
Returns
number
useMmap
Get Signature
get useMmap(): boolean;Defined in: evaluator/LlamaModel/LlamaModel.ts:399
Whether the model is loaded using mmap (memory-mapped file) or not.
When Direct I/O is used (by setting the useDirectIo option to true), it overrides mmap, so this value may be out of sync with whether mmap was actually used to load this model instance.
Returns
boolean
size
Get Signature
get size(): number;Defined in: evaluator/LlamaModel/LlamaModel.ts:408
Total model size in memory in bytes.
When using mmap, actual memory usage may be higher than this value due to llama.cpp's performance optimizations.
Returns
number
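The gpuLayers, useMmap and size accessors can be combined into a one-line load summary. The following is a minimal sketch: `ModelLoadInfo`, `formatBytes` and `describeModelLoad` are hypothetical names introduced here for illustration and are not part of the library. The interface only mirrors the three getters documented above, so a LlamaModel instance should satisfy it structurally.

```typescript
// Hypothetical structural stand-in for the three accessors documented above.
interface ModelLoadInfo {
    readonly gpuLayers: number;
    readonly useMmap: boolean;
    readonly size: number; // total model size in memory, in bytes
}

// Formats a byte count using binary units (1 KiB = 1024 bytes).
function formatBytes(bytes: number): string {
    const units = ["B", "KiB", "MiB", "GiB", "TiB"];
    let value = bytes;
    let unitIndex = 0;
    while (value >= 1024 && unitIndex < units.length - 1) {
        value /= 1024;
        unitIndex++;
    }
    return `${value.toFixed(2)} ${units[unitIndex]}`;
}

// Hypothetical helper: summarizes how a model was loaded.
function describeModelLoad(model: ModelLoadInfo): string {
    // gpuLayers is 0 when GPU support is disabled.
    const placement = model.gpuLayers > 0
        ? `${model.gpuLayers} layers on GPU`
        : "CPU only";
    const mmap = model.useMmap ? "mmap" : "no mmap";
    return `${formatBytes(model.size)}, ${placement}, ${mmap}`;
}
```

Since actual memory usage may exceed size when mmap is used, the summary is best treated as a lower bound in that case.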
flashAttentionSupported
Get Signature
get flashAttentionSupported(): boolean;Defined in: evaluator/LlamaModel/LlamaModel.ts:414
Returns
boolean
defaultContextFlashAttention
Get Signature
get defaultContextFlashAttention(): boolean | "auto";Defined in: evaluator/LlamaModel/LlamaModel.ts:418
Returns
boolean | "auto"
defaultContextSwaFullCache
Get Signature
get defaultContextSwaFullCache(): boolean;Defined in: evaluator/LlamaModel/LlamaModel.ts:422
Returns
boolean
defaultContextKvCacheKeyType
Get Signature
get defaultContextKvCacheKeyType(): GgmlType;Defined in: evaluator/LlamaModel/LlamaModel.ts:426
Returns
GgmlType
defaultContextKvCacheValueType
Get Signature
get defaultContextKvCacheValueType(): GgmlType;Defined in: evaluator/LlamaModel/LlamaModel.ts:430
Returns
GgmlType
trainContextSize
Get Signature
get trainContextSize(): number;Defined in: evaluator/LlamaModel/LlamaModel.ts:719
The context size the model was trained on.
Returns
number
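One possible use of trainContextSize is to cap a requested context size so it never exceeds what the model was trained on. This is a sketch only: `TrainedModel` and `pickContextSize` are hypothetical names, and whether exceeding the training size is acceptable depends on the model and its configuration.

```typescript
// Hypothetical structural stand-in exposing only the accessor used here.
interface TrainedModel {
    readonly trainContextSize: number;
}

// Hypothetical helper: returns the desired size, capped at the training size.
function pickContextSize(model: TrainedModel, desired: number): number {
    if (!Number.isInteger(desired) || desired <= 0)
        throw new RangeError("desired context size must be a positive integer");

    return Math.min(desired, model.trainContextSize);
}
```

The result could then be supplied when creating a context; consult LlamaContextOptions for the exact option to set.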
embeddingVectorSize
Get Signature
get embeddingVectorSize(): number;Defined in: evaluator/LlamaModel/LlamaModel.ts:729
The size of an embedding vector the model can produce.
Returns
number
vocabularyType
Get Signature
get vocabularyType(): LlamaVocabularyType;Defined in: evaluator/LlamaModel/LlamaModel.ts:738
Returns
LlamaVocabularyType
Methods
dispose()
dispose(): Promise<void>;Defined in: evaluator/LlamaModel/LlamaModel.ts:347
Returns
Promise<void>
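Because dispose() is asynchronous, a try/finally wrapper is a convenient way to make sure a model is released even when the work using it throws. The sketch below is an assumption-laden illustration: `DisposableModel` and `withModel` are hypothetical names, and the disposed check is there because this reference does not say whether calling dispose() twice is safe.

```typescript
// Hypothetical structural stand-in for the disposed accessor and dispose() method.
interface DisposableModel {
    readonly disposed: boolean;
    dispose(): Promise<void>;
}

// Hypothetical helper: runs `use` and disposes the model afterwards,
// whether `use` resolves or rejects.
async function withModel<M extends DisposableModel, R>(
    model: M,
    use: (model: M) => Promise<R>
): Promise<R> {
    try {
        return await use(model);
    } finally {
        // Skip the call if something inside `use` already disposed the model.
        if (!model.disposed)
            await model.dispose();
    }
}
```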
tokenize()
Call Signature
tokenize(
text: string,
specialTokens?: boolean,
options?: "trimLeadingSpace"): Token[];Defined in: evaluator/LlamaModel/LlamaModel.ts:444
Transform text into tokens that can be fed to the model.
Parameters
| Parameter | Type | Description |
|---|---|---|
| text | string | the text to tokenize |
| specialTokens? | boolean | if set to true, text that corresponds to special tokens will be tokenized to those tokens. For example, <s> will be tokenized to the BOS token if specialTokens is set to true, otherwise it will be tokenized to tokens that correspond to the plaintext <s> string. |
| options? | "trimLeadingSpace" | additional options for tokenization. If set to "trimLeadingSpace", a leading space will be trimmed from the tokenized output if the output has an additional space at the beginning. |
Returns
Token[]
Call Signature
tokenize(text: BuiltinSpecialTokenValue, specialTokens: "builtin"): Token[];Defined in: evaluator/LlamaModel/LlamaModel.ts:445
Transform text into tokens that can be fed to the model.
Parameters
| Parameter | Type | Description |
|---|---|---|
| text | BuiltinSpecialTokenValue | the text to tokenize |
| specialTokens | "builtin" | if set to true, text that corresponds to special tokens will be tokenized to those tokens. For example, <s> will be tokenized to the BOS token if specialTokens is set to true, otherwise it will be tokenized to tokens that correspond to the plaintext <s> string. |
Returns
Token[]
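A common use of tokenize() is to measure a prompt before sending it to the model. The sketch below checks whether a text leaves room for a reply within the training context size. `TokenizingModel`, `fitsInContext` and `reservedForOutput` are hypothetical names, and `Token` is modeled as a plain number here as an assumption; the library's own Token type may differ. Passing false for specialTokens treats the input as plain text, so a literal <s> in user input is not turned into the BOS token.

```typescript
// Assumption: tokens are modeled as plain numbers for this sketch.
type Token = number;

// Hypothetical structural stand-in mirroring the first tokenize() signature above.
interface TokenizingModel {
    tokenize(text: string, specialTokens?: boolean, options?: "trimLeadingSpace"): Token[];
    readonly trainContextSize: number;
}

// Hypothetical helper: true when the prompt plus a reserved output budget
// fits within the context size the model was trained on.
function fitsInContext(
    model: TokenizingModel,
    text: string,
    reservedForOutput: number = 256
): boolean {
    // specialTokens = false: text resembling special tokens stays plain text.
    const promptTokens = model.tokenize(text, false);
    return promptTokens.length + reservedForOutput <= model.trainContextSize;
}
```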
detokenize()
detokenize(
tokens: readonly Token[],
specialTokens?: boolean,
lastTokens?: readonly Token[]): string;Defined in: evaluator/LlamaModel/LlamaModel.ts:559
Transform tokens into text.
Parameters
| Parameter | Type | Default value | Description |
|---|---|---|---|
| tokens | readonly Token[] | undefined | the tokens to detokenize. |
| specialTokens? | boolean | false | if set to true, special tokens will be detokenized to their corresponding token text representation. Recommended for debugging purposes only. Note: there may be additional spaces around special tokens that were not present in the original text; this is not a bug, it is how the tokenizer is supposed to work. |
| lastTokens? | readonly Token[] | undefined | the last few tokens that preceded the tokens to detokenize. If provided, the last few tokens will be used to determine whether a space has to be added before the current tokens or not, and apply other detokenizer-specific heuristics to provide the correct text continuation to the existing tokens. Using it may have no effect with some models, but it is still recommended. |
Returns
string
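When tokens arrive in chunks (for example while streaming), the lastTokens parameter lets each call see what came before. The following sketch keeps a running history and passes a bounded tail of it on every call. `DetokenizingModel`, `detokenizeChunks` and `tailSize` are hypothetical names, `Token` is modeled as a plain number as an assumption, and the tail length of 4 is an arbitrary illustrative choice rather than a recommended value.

```typescript
// Assumption: tokens are modeled as plain numbers for this sketch.
type Token = number;

// Hypothetical structural stand-in mirroring the detokenize() signature above.
interface DetokenizingModel {
    detokenize(
        tokens: readonly Token[],
        specialTokens?: boolean,
        lastTokens?: readonly Token[]
    ): string;
}

// Hypothetical helper: detokenizes each chunk, supplying the preceding
// tokens so the detokenizer can decide on leading spaces and similar details.
function detokenizeChunks(
    model: DetokenizingModel,
    chunks: readonly (readonly Token[])[],
    tailSize: number = 4
): string[] {
    const history: Token[] = [];
    const pieces: string[] = [];

    for (const chunk of chunks) {
        // slice(-0) would return the whole array, so handle tailSize <= 0 explicitly.
        const lastTokens = tailSize > 0 ? history.slice(-tailSize) : [];
        pieces.push(model.detokenize(chunk, false, lastTokens));
        history.push(...chunk);
    }

    return pieces;
}
```

Joining the returned pieces in order yields the full text of the stream.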
getTokenAttributes()
getTokenAttributes(token: Token): TokenAttributes;Defined in: evaluator/LlamaModel/LlamaModel.ts:580
Parameters
| Parameter | Type |
|---|---|
| token | Token |
Returns
TokenAttributes
isSpecialToken()
isSpecialToken(token: Token | undefined): boolean;Defined in: evaluator/LlamaModel/LlamaModel.ts:591
Check whether the given token is a special token (a control-type token or a token with no normal text representation)
Parameters
| Parameter | Type |
|---|---|
| token | Token \| undefined |
Returns
boolean
iterateAllTokens()
iterateAllTokens(): Generator<Token, void, unknown>;Defined in: evaluator/LlamaModel/LlamaModel.ts:606
Returns
Generator<Token, void, unknown>
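iterateAllTokens() pairs naturally with isSpecialToken() to inspect a vocabulary. The sketch below collects the special tokens, assuming iterateAllTokens() yields every token in the model's vocabulary (its name suggests this, but this reference does not state it). `VocabularyModel` and `collectSpecialTokens` are hypothetical names, and `Token` is modeled as a plain number as an assumption.

```typescript
// Assumption: tokens are modeled as plain numbers for this sketch.
type Token = number;

// Hypothetical structural stand-in mirroring the two methods documented above.
interface VocabularyModel {
    iterateAllTokens(): Generator<Token, void, unknown>;
    isSpecialToken(token: Token | undefined): boolean;
}

// Hypothetical helper: returns every token the model reports as special.
function collectSpecialTokens(model: VocabularyModel): Token[] {
    const special: Token[] = [];

    // The generator is consumed lazily, one token at a time.
    for (const token of model.iterateAllTokens()) {
        if (model.isSpecialToken(token))
            special.push(token);
    }

    return special;
}
```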
isEogToken()
isEogToken(token: Token | undefined): boolean;Defined in: evaluator/LlamaModel/LlamaModel.ts:619
Check whether the given token is an EOG (End Of Generation) token, like EOS or EOT.
Parameters
| Parameter | Type |
|---|---|
| token | Token \| undefined |
Returns
boolean
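isEogToken() can be used to cut a token sequence at the point where generation ended. This is a minimal sketch: `EogAwareModel` and `takeUntilEog` are hypothetical names, and `Token` is modeled as a plain number as an assumption.

```typescript
// Assumption: tokens are modeled as plain numbers for this sketch.
type Token = number;

// Hypothetical structural stand-in mirroring the isEogToken() signature above.
interface EogAwareModel {
    isEogToken(token: Token | undefined): boolean;
}

// Hypothetical helper: returns the tokens before the first EOG token
// (such as EOS or EOT), or a copy of all tokens if none is found.
function takeUntilEog(model: EogAwareModel, tokens: readonly Token[]): Token[] {
    const end = tokens.findIndex((token) => model.isEogToken(token));
    return end === -1 ? [...tokens] : tokens.slice(0, end);
}
```

The EOG token itself is excluded from the result, which is usually what is wanted before detokenizing output for display.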
createContext()
createContext(options?: LlamaContextOptions): Promise<LlamaContext>;Defined in: evaluator/LlamaModel/LlamaModel.ts:626
Parameters
| Parameter | Type |
|---|---|
| options | LlamaContextOptions |
Returns
Promise<LlamaContext>
createEmbeddingContext()
createEmbeddingContext(options?: LlamaEmbeddingContextOptions): Promise<LlamaEmbeddingContext>;Defined in: evaluator/LlamaModel/LlamaModel.ts:643
Parameters
| Parameter | Type |
|---|---|
| options | LlamaEmbeddingContextOptions |
Returns
Promise<LlamaEmbeddingContext>
See
Using Embedding tutorial
createRankingContext()
createRankingContext(options?: LlamaRankingContextOptions): Promise<LlamaRankingContext>;Defined in: evaluator/LlamaModel/LlamaModel.ts:653
Parameters
| Parameter | Type |
|---|---|
| options | LlamaRankingContextOptions |
Returns
Promise<LlamaRankingContext>
See
Reranking Documents tutorial
getWarnings()
getWarnings(): string[];Defined in: evaluator/LlamaModel/LlamaModel.ts:665
Get warnings about the model file that would affect its usage.
These warnings include all the warnings generated by GgufInsights, but are more comprehensive.
Returns
string[]
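Since getWarnings() returns plain strings, surfacing them right after loading a model is straightforward. The sketch below formats them for logging. `WarningSource` and `formatWarnings` are hypothetical names introduced for illustration, and the output format is an arbitrary choice.

```typescript
// Hypothetical structural stand-in mirroring the getWarnings() signature above.
interface WarningSource {
    getWarnings(): string[];
}

// Hypothetical helper: prefixes each warning with a label and a running number.
function formatWarnings(model: WarningSource, label: string): string[] {
    return model
        .getWarnings()
        .map((warning, index) => `[${label}] warning ${index + 1}: ${warning}`);
}
```

An empty result means the model file produced no warnings, so callers can skip logging in that case.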