Class: LlamaContextSequence
Defined in: evaluator/LlamaContext/LlamaContext.ts:911
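A context sequence is not constructed directly; it is obtained from a LlamaContext via context.getSequence(). A minimal setup sketch (the model path is a placeholder), which the examples further down this page assume:

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "my-model.gguf") // placeholder path
});
const context = await model.createContext();

// The sequence holds its own token state within the context
const sequence = context.getSequence();
```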
Properties
onDispose
readonly onDispose: EventRelay<void>;
Defined in: evaluator/LlamaContext/LlamaContext.ts:934
Accessors
disposed
Get Signature
get disposed(): boolean
Defined in: evaluator/LlamaContext/LlamaContext.ts:986
Returns
boolean
context
Get Signature
get context(): LlamaContext
Defined in: evaluator/LlamaContext/LlamaContext.ts:990
Returns
LlamaContext
model
Get Signature
get model(): LlamaModel
Defined in: evaluator/LlamaContext/LlamaContext.ts:994
Returns
LlamaModel
contextSize
Get Signature
get contextSize(): number
Defined in: evaluator/LlamaContext/LlamaContext.ts:999
The maximum number of tokens that the sequence state can hold
Returns
number
nextTokenIndex
Get Signature
get nextTokenIndex(): number
Defined in: evaluator/LlamaContext/LlamaContext.ts:1004
The index where the next evaluated token will be placed in the context
Returns
number
contextTokens
Get Signature
get contextTokens(): Token[]
Defined in: evaluator/LlamaContext/LlamaContext.ts:1009
The current context state tokens
Returns
Token[]
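These accessors can be combined to inspect the sequence state. For example (assuming `sequence` from the setup sketch above):

```typescript
// How much room is left before a context shift would be needed
const remainingTokens = sequence.contextSize - sequence.nextTokenIndex;

console.log("tokens currently in the state:", sequence.contextTokens.length);
console.log("room left for new tokens:", remainingTokens);
console.log("state as text:", sequence.model.detokenize(sequence.contextTokens));
```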
tokenMeter
Get Signature
get tokenMeter(): TokenMeter
Defined in: evaluator/LlamaContext/LlamaContext.ts:1016
Returns
TokenMeter
tokenPredictor
Get Signature
get tokenPredictor(): undefined | TokenPredictor
Defined in: evaluator/LlamaContext/LlamaContext.ts:1023
The token predictor used when creating this sequence.
Returns
undefined | TokenPredictor
tokenPredictions
Get Signature
get tokenPredictions(): {
used: number;
unused: number;
validated: number;
refuted: number;
}
Defined in: evaluator/LlamaContext/LlamaContext.ts:1036
Statistics of token predictions using the sequence's tokenPredictor.
The statistics change only when token prediction is used in this sequence.
validated + refuted = total number of evaluated predictions.
Prefer using validated and refuted to evaluate the effectiveness of token prediction.
Returns
{
used: number;
unused: number;
validated: number;
refuted: number;
}
used
used: number;
Number of token predictions that were actually used (tokens that were validated and then consumed)
unused
unused: number;
Number of token predictions that were not used (tokens that were validated and were not consumed)
validated
validated: number;
Number of token predictions that were validated successfully
refuted
refuted: number;
Number of token predictions that were refuted
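For example, reading the statistics after some generation has happened (assuming the sequence was created with a token predictor):

```typescript
const {used, unused, validated, refuted} = sequence.tokenPredictions;
const totalEvaluatedPredictions = validated + refuted; // as noted above

if (totalEvaluatedPredictions > 0)
    console.log("prediction accuracy:", (validated / totalEvaluatedPredictions * 100).toFixed(1) + "%");

console.log({used, unused, validated, refuted});
```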
isLoadedToMemory
Get Signature
get isLoadedToMemory(): boolean
Defined in: evaluator/LlamaContext/LlamaContext.ts:1057
Returns
boolean
Methods
dispose()
dispose(): void
Defined in: evaluator/LlamaContext/LlamaContext.ts:970
Returns
void
compareContextTokens()
compareContextTokens(tokens: Token[]): {
firstDifferentIndex: number;
}
Defined in: evaluator/LlamaContext/LlamaContext.ts:1061
Parameters
Parameter | Type |
---|---|
tokens | Token [] |
Returns
{
firstDifferentIndex: number;
}
firstDifferentIndex
firstDifferentIndex: number;
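For example, checking how far a new token array agrees with the current state (a sketch; `newTokens` is assumed to be a `Token[]`):

```typescript
const {firstDifferentIndex} = sequence.compareContextTokens(newTokens);

// Tokens before this index are already in the context state as-is
console.log("state matches the given tokens up to index", firstDifferentIndex);
```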
adaptStateToTokens()
adaptStateToTokens(tokens: Token[], allowShift: boolean): Promise<void>
Defined in: evaluator/LlamaContext/LlamaContext.ts:1088
Erase parts of the context state to align it with the given tokens.
If the given tokens do not align with the current context state, the context state will be erased to align with the given tokens.
To find the first different token index between the context state and the given tokens, access the nextTokenIndex property.
If allowShift is true (the default), shifting tokens may happen to align the context state with the given tokens, which incurs token evaluation of the shifted tokens.
Parameters
Parameter | Type | Default value |
---|---|---|
tokens | Token [] | undefined |
allowShift | boolean | true |
Returns
Promise<void>
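A sketch of reusing the existing state for a new prompt (assuming `sequence` and `model` from the setup sketch above):

```typescript
const newPromptTokens = model.tokenize("Summarize the following document:\n...");

// Erase (and possibly shift) the parts of the state that don't align with the new prompt
await sequence.adaptStateToTokens(newPromptTokens, true);

// Only the tokens that aren't already in the state still need to be evaluated
const tokensToEvaluate = newPromptTokens.slice(sequence.nextTokenIndex);
console.log("tokens left to evaluate:", tokensToEvaluate.length);
```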
clearHistory()
clearHistory(): Promise<void>
Defined in: evaluator/LlamaContext/LlamaContext.ts:1140
Clear the history of the sequence. If prependBos was enabled, the BOS token will be prepended to the sequence again.
Returns
Promise<void>
eraseContextTokenRanges()
eraseContextTokenRanges(ranges: ContextTokensDeleteRange[]): Promise<void>
Defined in: evaluator/LlamaContext/LlamaContext.ts:1151
Erase context tokens in the provided ranges to free up space for new tokens to be generated. The start of each range is inclusive, and the end of each range is exclusive. For example, the range {start: 0, end: 1} will remove the token at index 0 only.
Parameters
Parameter | Type |
---|---|
ranges | ContextTokensDeleteRange [] |
Returns
Promise<void>
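For example, freeing the oldest tokens while keeping the first token in place (a sketch; useful when index 0 holds a BOS token):

```typescript
// `end` is exclusive, so this erases the tokens at indexes 1 through 64
await sequence.eraseContextTokenRanges([{start: 1, end: 65}]);
```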
evaluate()
evaluate(tokens: Token[], options: SequenceEvaluateOptions): AsyncGenerator<Token, void, void | Token | Token[]>
Defined in: evaluator/LlamaContext/LlamaContext.ts:1275
Evaluate the provided tokens into the context sequence, and continue generating new tokens on iterator iterations.
This method uses the token predictor (when provided) to generate new tokens faster.
Parameters
Parameter | Type |
---|---|
tokens | Token [] |
options | SequenceEvaluateOptions |
Returns
AsyncGenerator<Token, void, void | Token | Token[]>
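A minimal generation loop (a sketch; assumes `sequence` and `model` from the setup above, and that `temperature` is a supported `SequenceEvaluateOptions` option):

```typescript
import type {Token} from "node-llama-cpp";

const promptTokens = model.tokenize("The quick brown fox");
const generatedTokens: Token[] = [];

for await (const token of sequence.evaluate(promptTokens, {temperature: 0.8})) {
    generatedTokens.push(token);

    if (generatedTokens.length >= 32)
        break; // breaking out of the loop stops the generation
}

console.log(model.detokenize(generatedTokens));
```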
evaluateWithMetadata()
evaluateWithMetadata<Metadata>(
tokens: Token[],
metadata: Metadata,
options: SequenceEvaluateOptions): AsyncGenerator<SequenceEvaluateOutput<Metadata>, void, void | Token | Token[]>
Defined in: evaluator/LlamaContext/LlamaContext.ts:1297
Like `.evaluate(...)`, but with additional metadata for each generated token.
Configure the additional metadata options to choose which metadata to include.
Type Parameters
Type Parameter |
---|
Metadata extends SequenceEvaluateMetadataOptions |
Parameters
Parameter | Type |
---|---|
tokens | Token [] |
metadata | Metadata |
options | SequenceEvaluateOptions |
Returns
AsyncGenerator<SequenceEvaluateOutput<Metadata>, void, void | Token | Token[]>
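A sketch of the same loop with metadata, assuming `confidence` is one of the supported `SequenceEvaluateMetadataOptions`:

```typescript
const promptTokens = model.tokenize("The quick brown fox");

for await (const {token, confidence} of sequence.evaluateWithMetadata(promptTokens, {confidence: true})) {
    console.log("token:", model.detokenize([token]), "confidence:", confidence);

    break; // stop after the first generated token
}
```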
evaluateWithoutGeneratingNewTokens()
evaluateWithoutGeneratingNewTokens(tokens: Token[], options: {
evaluationPriority: EvaluationPriority;
contextShift: ContextShiftOptions;
}): Promise<void>
Defined in: evaluator/LlamaContext/LlamaContext.ts:1363
Evaluate the provided tokens into the context sequence without generating new tokens.
Parameters
Parameter | Type | Description |
---|---|---|
tokens | Token [] | - |
options | { evaluationPriority : EvaluationPriority ; contextShift : ContextShiftOptions ; } | - |
options.evaluationPriority ? | EvaluationPriority | When a lot of tokens are queued for the next batch, more than the configured batchSize , the tokens for each sequence will be evaluated based on the strategy chosen for the context. By default, the "maximumParallelism" strategy is used, which will try to evaluate as many sequences in parallel as possible, but at some point, it'll have to choose which sequences to evaluate more tokens of, so it'll prioritize the sequences with the highest evaluation priority. Also, a custom strategy can be used to prioritize the sequences differently, but generally, the higher the evaluation priority is, the more likely and more tokens will be evaluated for that sequence in the next queued batch. |
options.contextShift ? | ContextShiftOptions | Override the sequence context shift options for this evaluation |
Returns
Promise<void>
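For example, prefilling the state with a system prompt so that a later evaluate call only needs to generate the completion (a sketch; no option overrides are passed):

```typescript
const systemPromptTokens = model.tokenize("You are a helpful assistant.\n");

await sequence.evaluateWithoutGeneratingNewTokens(systemPromptTokens, {});

console.log("tokens in the state after prefill:", sequence.nextTokenIndex);
```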
controlledEvaluate()
controlledEvaluate(input: ControlledEvaluateInputItem[], options?: {
evaluationPriority: EvaluationPriority;
contextShift: ContextShiftOptions;
onTokenResult: void;
}): Promise<(undefined | ControlledEvaluateIndexOutput)[]>
Defined in: evaluator/LlamaContext/LlamaContext.ts:1445
Evaluate the provided tokens into the context sequence with custom options for each token.
This method allows for more precise control of the generation process.
A next token will be generated for a given token only if any of the generateNext
options for it are used.
To generate more tokens after this method finishes, use it again with token(s) you selected to add to the context from the previous evaluation.
This method doesn't use the token predictor (when provided) since it cannot predict which tokens are actually needed. Use the evaluate
method when you need to use token prediction.
Parameters
Parameter | Type | Description |
---|---|---|
input | ControlledEvaluateInputItem [] | - |
options ? | { evaluationPriority : EvaluationPriority ; contextShift : ContextShiftOptions ; onTokenResult : void ; } | - |
options.evaluationPriority ? | EvaluationPriority | When a lot of tokens are queued for the next batch, more than the configured batchSize , the tokens for each sequence will be evaluated based on the strategy chosen for the context. By default, the "maximumParallelism" strategy is used, which will try to evaluate as many sequences in parallel as possible, but at some point, it'll have to choose which sequences to evaluate more tokens of, so it'll prioritize the sequences with the highest evaluation priority. Also, a custom strategy can be used to prioritize the sequences differently, but generally, the higher the evaluation priority is, the more likely and more tokens will be evaluated for that sequence in the next queued batch. |
options.contextShift ? | ContextShiftOptions | Override the sequence context shift options for this evaluation |
options.onTokenResult ? | - | - |
Returns
Promise<(undefined | ControlledEvaluateIndexOutput)[]>
An array where for each token in the input array, there can be an output item at the same index in the output array. For indexes that have no output, there won't be any value at the corresponding index in the output array.
It's recommended to iterate from 0 up to the length of the input array to check the results in the output array.
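A sketch of a controlled evaluation; the `[token, options]` input-item shape is an assumption inferred from the `generateNext` options mentioned above and may differ from the actual `ControlledEvaluateInputItem` type:

```typescript
const inputTokens = model.tokenize("The quick brown fox jumps over the lazy");
const lastToken = inputTokens.pop()!;

const results = await sequence.controlledEvaluate([
    ...inputTokens,
    // assumed shape: only the last token requests a next-token result
    [lastToken, {generateNext: {token: true}}]
]);

// Iterate from 0 up to the input length; most indexes will have no output
for (let i = 0; i < inputTokens.length + 1; i++) {
    const output = results[i];

    if (output != null)
        console.log("output for input index", i, output);
}
```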