Type Alias: LlamaRankingContextOptions

type LlamaRankingContextOptions = {
  contextSize?:
    | "auto"
    | number
    | {
        min?: number;
        max?: number;
      };
  batchSize?: number;
  threads?: number;
  createSignal?: AbortSignal;
  template?:
    | `${string}{{query}}${string}{{document}}${string}`
    | `${string}{{document}}${string}{{query}}${string}`;
  ignoreMemorySafetyChecks?: boolean;
};

Defined in: evaluator/LlamaRankingContext.ts:10

Properties

contextSize?

optional contextSize:
  | "auto"
  | number
  | {
      min?: number;
      max?: number;
    };

Defined in: evaluator/LlamaRankingContext.ts:23

The number of tokens the model can see at once.

"auto" - adapt to the current VRAM state and attempt to set the context size as high as possible up to the size the model was trained on.
number - set the context size to a specific number of tokens. If there's not enough VRAM, an error will be thrown. Use with caution.
{min?: number, max?: number} - adapt to the current VRAM state and attempt to set the context size as high as possible up to the size the model was trained on, but at least min and at most max.

Defaults to "auto".
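The three forms of contextSize described above can be illustrated with a small standalone sketch. It does not call the library, and it is not the library's actual resolution logic: resolveContextSize, trainContextSize, and maxFittingContextSize are hypothetical names, and the amount of context that fits in VRAM is simply passed in as a number. Throwing when min cannot be satisfied is also an assumption.

```typescript
// Hypothetical mirror of the `contextSize` option type documented above.
type ContextSizeOption = "auto" | number | {min?: number; max?: number};

// Illustrative only: resolves a context size from the option, the size the model
// was trained on, and an assumed "largest size that fits in VRAM" figure.
function resolveContextSize(
    option: ContextSizeOption | undefined,
    trainContextSize: number,
    maxFittingContextSize: number
): number {
    const requested = option ?? "auto"; // the documented default

    if (typeof requested === "number") {
        // A fixed size is used as-is; not enough VRAM results in an error.
        if (requested > maxFittingContextSize)
            throw new Error(`Not enough VRAM for a context size of ${requested}`);

        return requested;
    }

    const range: {min?: number; max?: number} = requested === "auto" ? {} : requested;

    // As high as possible, up to the trained size and the optional `max`.
    const upperBound = Math.min(trainContextSize, range.max ?? Infinity);
    const resolved = Math.min(upperBound, maxFittingContextSize);

    // Assumption: an unsatisfiable `min` is treated as an error.
    if (range.min != null && resolved < range.min)
        throw new Error(`Cannot satisfy a minimum context size of ${range.min}`);

    return resolved;
}
```

With a model trained on 4096 tokens and room for 2048 tokens in VRAM, this sketch resolves "auto" to 2048 and {max: 1024} to 1024, while a fixed value of 8192 throws.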

batchSize?

optional batchSize: number;

Defined in: evaluator/LlamaRankingContext.ts:29

The batch size to use for prompt processing.

threads?

optional threads: number;

Defined in: evaluator/LlamaRankingContext.ts:35

The number of threads to use to evaluate tokens. Set to 0 to use the maximum number of threads supported by the current machine's hardware.

createSignal?

optional createSignal: AbortSignal;

Defined in: evaluator/LlamaRankingContext.ts:38

An abort signal that can be used to abort the context creation.
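One common use of createSignal is to give up on context creation after a timeout. The sketch below is a generic wrapper rather than part of the library: createWithTimeout and its create parameter are hypothetical names, and create stands in for whatever function accepts these options and returns a promise.

```typescript
// Hypothetical helper: calls `create` with an AbortSignal that is aborted
// if the creation has not settled within `timeoutMs` milliseconds.
async function createWithTimeout<T>(
    create: (options: {createSignal?: AbortSignal}) => Promise<T>,
    timeoutMs: number
): Promise<T> {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);

    try {
        // The signal is passed through the `createSignal` option documented above.
        return await create({createSignal: controller.signal});
    } finally {
        // Stop the timer once the creation has settled either way.
        clearTimeout(timer);
    }
}
```

Assuming a loaded model object that exposes a function accepting these options, the call would look something like createWithTimeout((options) => model.createRankingContext(options), 10_000).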

template?

optional template:
  | `${string}{{query}}${string}{{document}}${string}`
  | `${string}{{document}}${string}{{query}}${string}`;

Defined in: evaluator/LlamaRankingContext.ts:54

The template to use for the ranking evaluation. If not provided, the model's template will be used by default.

The template is tokenized with special tokens enabled, but the provided query and document are not.

{{query}} is replaced with the query content.

{{document}} is replaced with the document content.

It's recommended to not set this option unless you know what you're doing.

Defaults to the model's template.
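The placeholder substitution described above can be sketched on plain strings. This only illustrates how {{query}} and {{document}} are filled in; the library works on tokens (the template is tokenized with special tokens, the inserted content is not), which a string-based sketch cannot show. RankingTemplate and renderRankingTemplate are hypothetical names used for illustration.

```typescript
// A template must contain both placeholders, in either order.
type RankingTemplate =
    | `${string}{{query}}${string}{{document}}${string}`
    | `${string}{{document}}${string}{{query}}${string}`;

// Replaces both placeholders in a single pass, so placeholder-like text inside
// the query or the document is left untouched.
function renderRankingTemplate(
    template: RankingTemplate,
    query: string,
    documentText: string
): string {
    return template.replace(
        /\{\{(query|document)\}\}/g,
        (_match, name: string) => (name === "query" ? query : documentText)
    );
}

const rendered = renderRankingTemplate(
    "Query: {{query}}\nDocument: {{document}}",
    "What is the capital of France?",
    "Paris is the capital of France."
);
```

Because the type requires both placeholders, a string such as "Query: {{query}}" is rejected at compile time under this sketch.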

ignoreMemorySafetyChecks?

optional ignoreMemorySafetyChecks: boolean;

Defined in: evaluator/LlamaRankingContext.ts:62

Ignore insufficient memory errors and continue with the context creation. Can cause the process to crash if there's not enough VRAM for the new context.

Defaults to false.
