
Type alias: LLamaChatPromptOptions

ts
type LLamaChatPromptOptions = {
  grammar?: LlamaGrammar;
  maxTokens?: number;
  onToken?: (tokens: Token[]) => void;
  repeatPenalty?: false | LlamaChatSessionRepeatPenalty;
  signal?: AbortSignal;
  temperature?: number;
  topK?: number;
  topP?: number;
  trimWhitespaceSuffix?: boolean;
};
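All of these properties are optional. A minimal usage sketch, assuming the v2-style LlamaModel/LlamaContext/LlamaChatSession setup from the library's getting-started guide (the model path is a placeholder):

ts
import {LlamaModel, LlamaContext, LlamaChatSession} from "node-llama-cpp";

const model = new LlamaModel({modelPath: "path/to/model.gguf"});
const context = new LlamaContext({model});
const session = new LlamaChatSession({context});

const answer = await session.prompt("Summarize the benefits of streaming APIs in one sentence.", {
    maxTokens: 128,                       // stop after at most 128 generated tokens
    temperature: 0.8,                     // see the temperature section below
    signal: AbortSignal.timeout(30_000),  // abort generation after 30 seconds (Node 17.3+)
    trimWhitespaceSuffix: true
});
console.log(answer);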

Type declaration

grammar

ts
grammar?: LlamaGrammar;

maxTokens

ts
maxTokens?: number;

onToken

ts
onToken?: (tokens) => void;

Parameters

Parameter    Type
tokens       Token[]
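A sketch of streaming output with onToken, reusing the session from above and assuming context.decode() for turning the Token[] chunk back into text, as used in the library's chat examples:

ts
const answer = await session.prompt("Tell me a short story.", {
    onToken(tokens) {
        // tokens is a chunk of Token values; decode it back to text and stream it as it arrives
        process.stdout.write(context.decode(tokens));
    }
});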

repeatPenalty

ts
repeatPenalty?: false | LlamaChatSessionRepeatPenalty;

signal

ts
signal?: AbortSignal;

temperature

ts
temperature?: number;

Temperature is a hyperparameter that controls the randomness of the generated text. It affects the probability distribution of the model's output tokens. A higher temperature (e.g., 1.5) makes the output more random and creative, while a lower temperature (e.g., 0.5) makes the output more focused, deterministic, and conservative. The suggested temperature is 0.8, which provides a balance between randomness and determinism. At the extreme, a temperature of 0 will always pick the most likely next token, leading to identical outputs in each run.

Set to 0 to disable; it is disabled by default.
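For example (a sketch, reusing the session from above): with the default temperature of 0, repeated runs of the same prompt return the same text, while a higher value varies the wording:

ts
// Deterministic: temperature defaults to 0, so the most likely token is always picked
const deterministic = await session.prompt("Define entropy in one sentence.");

// Creative: with temperature 0.8 the same prompt can produce different wording on each run
const creative = await session.prompt("Define entropy in one sentence.", {temperature: 0.8});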

topK

ts
topK?: number;

Limits the model to consider only the K most likely next tokens for sampling at each step of sequence generation. An integer number between 1 and the size of the vocabulary. Set to 0 to disable (which uses the full vocabulary).

Only relevant when temperature is set to a value greater than 0.

topP

ts
topP?: number;

Dynamically selects the smallest set of tokens whose cumulative probability exceeds the threshold P, and samples the next token only from this set. A float number between 0 and 1. Set to 1 to disable.

Only relevant when temperature is set to a value greater than 0.
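A sketch combining the sampling options (the values are illustrative, not recommendations); topK and topP only take effect here because temperature is above 0:

ts
const answer = await session.prompt("Suggest three names for a note-taking app.", {
    temperature: 0.8,
    topK: 40,            // consider only the 40 most likely next tokens
    topP: 0.9,           // ...further limited to the smallest set covering 90% cumulative probability
    repeatPenalty: false // disable repetition penalization entirely
});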

trimWhitespaceSuffix

ts
trimWhitespaceSuffix?: boolean;

Trim whitespace from the end of the generated text. Disabled by default.

Source

llamaEvaluator/LlamaChatSession.ts:31