
# Class: LlamaContextSequence

## Properties

### onDispose

```ts
readonly onDispose: EventRelay<void>;
```

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:815
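
A minimal sketch of reacting to the sequence being disposed. It assumes an existing `sequence` and that `EventRelay` (re-exported from the `lifecycle-utils` package) exposes a `createListener` method returning a disposable handle; check the `EventRelay` reference if that assumption does not hold.

```ts
// Assumption: EventRelay exposes createListener(), returning a handle with dispose()
const disposeListener = sequence.onDispose.createListener(() => {
    console.log("The context sequence was disposed");
});

// Later, if the listener is no longer needed:
disposeListener.dispose();
```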

## Accessors

### disposed

```ts
get disposed(): boolean
```

#### Returns

`boolean`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:862


### context

```ts
get context(): LlamaContext
```

#### Returns

`LlamaContext`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:866


### model

```ts
get model(): LlamaModel
```

#### Returns

`LlamaModel`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:870


### nextTokenIndex

```ts
get nextTokenIndex(): number
```

#### Returns

`number`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:874


### contextTokens

```ts
get contextTokens(): Token[]
```

#### Returns

`Token[]`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:878


### tokenMeter

```ts
get tokenMeter(): TokenMeter
```

#### Returns

`TokenMeter`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:882


### isLoadedToMemory

```ts
get isLoadedToMemory(): boolean
```

#### Returns

`boolean`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:886
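
A minimal sketch of creating a sequence and reading these accessors. The model path and file name are placeholders; the rest follows the standard `getLlama` / `loadModel` / `createContext` / `getSequence` flow of `node-llama-cpp`.

```ts
import {fileURLToPath} from "url";
import path from "path";
import {getLlama} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "my-model.gguf") // placeholder path
});
const context = await model.createContext();
const sequence = context.getSequence();

console.log(sequence.disposed);             // false
console.log(sequence.nextTokenIndex);       // 0 - nothing evaluated yet
console.log(sequence.contextTokens.length); // 0 - no tokens in the sequence yet
console.log(sequence.model === model);      // true
console.log(sequence.context === context);  // true
console.log(sequence.isLoadedToMemory);     // whether the context is currently loaded into memory
console.log(sequence.tokenMeter);           // TokenMeter instance tracking this sequence's token usage
```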

## Methods

### dispose()

```ts
dispose(): void
```

#### Returns

`void`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:846


### compareContextTokens()

```ts
compareContextTokens(tokens: Token[]): {
  firstDifferentIndex: number;
}
```

#### Parameters

| Parameter | Type |
| --- | --- |
| `tokens` | `Token[]` |

#### Returns

```ts
{
  firstDifferentIndex: number;
}
```

##### firstDifferentIndex

```ts
firstDifferentIndex: number;
```

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:890
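
A sketch of using the result: the method compares the given tokens against the tokens currently in the sequence and reports the index of the first mismatch (assuming `model` and `sequence` from the accessor example above).

```ts
const candidateTokens = model.tokenize("Hello there, how are you?");
const {firstDifferentIndex} = sequence.compareContextTokens(candidateTokens);

// Tokens before firstDifferentIndex already match the sequence's context tokens;
// everything from that index onward would have to be evaluated anew.
console.log(firstDifferentIndex);
```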


### clearHistory()

```ts
clearHistory(): Promise<void>
```

Clear the history of the sequence. If `prependBos` was enabled, the BOS token will be prepended to the sequence again.

#### Returns

`Promise<void>`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:911
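
A minimal sketch of resetting a sequence so it can be reused for an unrelated prompt (assuming an existing `sequence`):

```ts
await sequence.clearHistory();

// The previous tokens are gone; at most a BOS token remains if prependBos was enabled
console.log(sequence.contextTokens.length);
```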


### eraseContextTokenRanges()

```ts
eraseContextTokenRanges(ranges: ContextTokensDeleteRange[]): Promise<void>
```

Erase context tokens in the provided ranges to free up space for new tokens to be generated. The start of each range is inclusive, and the end of each range is exclusive. For example, the range `{start: 0, end: 1}` will remove only the token at index 0.

#### Parameters

| Parameter | Type |
| --- | --- |
| `ranges` | `ContextTokensDeleteRange[]` |

#### Returns

`Promise<void>`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:922
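
A sketch of freeing space in an existing `sequence`, using the inclusive-start / exclusive-end convention described above:

```ts
// Removes tokens at indexes 0-9 and 50-59 ({start: 50, end: 60} excludes index 60)
await sequence.eraseContextTokenRanges([
    {start: 0, end: 10},
    {start: 50, end: 60}
]);
```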


### evaluate()

```ts
evaluate(tokens: Token[], options: {
  temperature: number;
  minP: number;
  topK: number;
  topP: number;
  seed: number;
  grammarEvaluationState: LlamaGrammarEvaluationState | () => undefined | LlamaGrammarEvaluationState;
  repeatPenalty: LlamaContextSequenceRepeatPenalty;
  tokenBias: TokenBias | () => TokenBias;
  evaluationPriority: EvaluationPriority;
  contextShift: ContextShiftOptions;
  yieldEogToken: boolean;
 }): AsyncGenerator<Token, void | Token, any>
```

#### Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `tokens` | `Token[]` | - |
| `options` | `object` | - |
| `options.temperature?` | `number` | - |
| `options.minP?` | `number` | - |
| `options.topK?` | `number` | - |
| `options.topP?` | `number` | - |
| `options.seed?` | `number` | Used to control the randomness of the generated text. Change the seed to get different results. Defaults to the current epoch time. Only relevant when using `temperature`. |
| `options.grammarEvaluationState?` | `LlamaGrammarEvaluationState \| () => undefined \| LlamaGrammarEvaluationState` | - |
| `options.repeatPenalty?` | `LlamaContextSequenceRepeatPenalty` | - |
| `options.tokenBias?` | `TokenBias \| () => TokenBias` | Adjust the probability of tokens being generated. Can be used to bias the model toward tokens you want it to favor, or away from tokens you want it to avoid. |
| `options.evaluationPriority?` | `EvaluationPriority` | When more tokens are queued for the next batch than the configured `batchSize`, the tokens of each sequence are evaluated according to the strategy chosen for the context. By default, the `"maximumParallelism"` strategy is used, which tries to evaluate as many sequences in parallel as possible; when it has to choose which sequences get more of their tokens evaluated, it prioritizes the sequences with the highest evaluation priority. A custom strategy can prioritize sequences differently, but in general, the higher the evaluation priority, the more tokens are likely to be evaluated for that sequence in the next queued batch. |
| `options.contextShift?` | `ContextShiftOptions` | Override the sequence context shift options for this evaluation. |
| `options.yieldEogToken?` | `boolean` | Yield an EOG (End Of Generation) token (like EOS and EOT) when it's generated. When `false`, the generation stops when an EOG token is generated and the token is not yielded. Defaults to `false`. |

#### Returns

`AsyncGenerator<Token, void | Token, any>`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:996
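
A minimal sketch of generating text with `evaluate`, assuming `model` and `sequence` from the accessor example above; the sampling values are illustrative, not recommendations:

```ts
import type {Token} from "node-llama-cpp";

const promptTokens = model.tokenize("The quick brown fox");
const maxTokens = 32;
const generatedTokens: Token[] = [];

const evaluationIterator = sequence.evaluate(promptTokens, {
    temperature: 0.8,
    topK: 40,
    topP: 0.9
});

for await (const token of evaluationIterator) {
    generatedTokens.push(token);

    // Breaking out of the loop stops the generation for this sequence
    if (generatedTokens.length >= maxTokens)
        break;
}

console.log(model.detokenize(generatedTokens));
```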


### evaluateWithoutGeneratingNewTokens()

```ts
evaluateWithoutGeneratingNewTokens(tokens: Token[], options?: {
  evaluationPriority: 5;
  contextShift: {};
 }): Promise<void>
```

Evaluate the provided tokens into the context sequence without generating new tokens.

#### Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `tokens` | `Token[]` | - |
| `options?` | `object` | - |
| `options.evaluationPriority?` | `EvaluationPriority` | When more tokens are queued for the next batch than the configured `batchSize`, the tokens of each sequence are evaluated according to the strategy chosen for the context. By default, the `"maximumParallelism"` strategy is used, which tries to evaluate as many sequences in parallel as possible; when it has to choose which sequences get more of their tokens evaluated, it prioritizes the sequences with the highest evaluation priority. A custom strategy can prioritize sequences differently, but in general, the higher the evaluation priority, the more tokens are likely to be evaluated for that sequence in the next queued batch. |
| `options.contextShift?` | `ContextShiftOptions` | Override the sequence context shift options for this evaluation. |

#### Returns

`Promise<void>`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:1087
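
A sketch of pre-filling a sequence with a shared prompt prefix (for example, a system prompt) without generating anything, so that a later `evaluate` call can continue from it; `model` and `sequence` are assumed to exist as in the accessor example above:

```ts
const prefixTokens = model.tokenize("You are a helpful assistant.\n");
await sequence.evaluateWithoutGeneratingNewTokens(prefixTokens);

console.log(sequence.nextTokenIndex); // advanced by the number of tokens evaluated
```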