# Class: LlamaContextSequence

## Properties

### onDispose

```ts
readonly onDispose: EventRelay<void>;
```

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:815
## Accessors

### disposed

```ts
get disposed(): boolean
```

#### Returns

`boolean`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:862
### context

```ts
get context(): LlamaContext
```

#### Returns

`LlamaContext`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:866
### model

```ts
get model(): LlamaModel
```

#### Returns

`LlamaModel`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:870
### nextTokenIndex

```ts
get nextTokenIndex(): number
```

#### Returns

`number`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:874
### contextTokens

```ts
get contextTokens(): Token[]
```

#### Returns

`Token[]`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:878
### tokenMeter

```ts
get tokenMeter(): TokenMeter
```

#### Returns

`TokenMeter`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:882
### isLoadedToMemory

```ts
get isLoadedToMemory(): boolean
```

#### Returns

`boolean`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:886
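A `LlamaContextSequence` is typically obtained from a `LlamaContext` rather than constructed directly. The sketch below is a minimal, non-authoritative example (the model path is a placeholder, and it assumes the usual `getLlama()` / `loadModel()` / `createContext()` setup flow) showing how the accessors above reflect the state of a freshly obtained sequence:

```ts
import {getLlama} from "node-llama-cpp";

// Assumed setup; the model path below is a placeholder
const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"});
const context = await model.createContext();

// A sequence is obtained from the context rather than constructed directly
const sequence = context.getSequence();

console.log(sequence.disposed);            // false - the sequence is usable
console.log(sequence.nextTokenIndex);      // 0 - no tokens evaluated yet
console.log(sequence.contextTokens);       // [] - the sequence state is empty
console.log(sequence.model === model);     // true - the model the sequence belongs to
console.log(sequence.context === context); // true - the owning context
```

The later examples on this page assume `sequence` and `model` were created as shown here.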
## Methods

### dispose()

```ts
dispose(): void
```

#### Returns

`void`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:846
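A minimal sketch (assuming `sequence` from the earlier setup example; the comment about reusing the slot is an assumption, not taken from this page):

```ts
// Release the sequence when it's no longer needed, presumably
// freeing its slot in the context for reuse.
sequence.dispose();

console.log(sequence.disposed); // true
```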
### compareContextTokens()

```ts
compareContextTokens(tokens: Token[]): {
    firstDifferentIndex: number;
}
```

#### Parameters

| Parameter | Type |
| --- | --- |
| `tokens` | `Token[]` |

#### Returns

```ts
{
    firstDifferentIndex: number;
}
```

##### firstDifferentIndex

```ts
firstDifferentIndex: number;
```

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:890
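A plausible use of this method (a sketch based on the return value's name, not on documentation from this page) is to find how much of a new prompt already matches the tokens loaded in the sequence, so that only the differing suffix needs to be re-evaluated:

```ts
// Assumes `sequence` and `model` from the earlier setup sketch
const newPromptTokens = model.tokenize("Hello there, how are you?");

// Find where the new prompt starts to differ from the tokens
// currently loaded in the sequence state
const {firstDifferentIndex} = sequence.compareContextTokens(newPromptTokens);

// Drop everything from the first differing token onward so the
// matching prefix can be reused (see eraseContextTokenRanges below)
if (firstDifferentIndex < sequence.nextTokenIndex)
    await sequence.eraseContextTokenRanges([{
        start: firstDifferentIndex,
        end: sequence.nextTokenIndex
    }]);
```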
### clearHistory()

```ts
clearHistory(): Promise<void>
```

Clear the history of the sequence. If `prependBos` was enabled, the BOS token will be prepended to the sequence again.

#### Returns

`Promise<void>`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:911
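For example, to reuse the same sequence for an unrelated prompt (a sketch, assuming `sequence` and `model` from the earlier setup example):

```ts
// Wipe the sequence state so it can be reused for an unrelated prompt.
// If prependBos was enabled, the BOS token is prepended again automatically.
await sequence.clearHistory();

// The sequence can now be fed a fresh prompt from a clean state
await sequence.evaluateWithoutGeneratingNewTokens(model.tokenize("A fresh prompt"));
```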
### eraseContextTokenRanges()

```ts
eraseContextTokenRanges(ranges: ContextTokensDeleteRange[]): Promise<void>
```

Erase context tokens in the provided ranges to free up space for new tokens to be generated. The start of each range is inclusive, and the end of each range is exclusive. For example, the range `{start: 0, end: 1}` will remove only the token at index `0`.

#### Parameters

| Parameter | Type |
| --- | --- |
| `ranges` | `ContextTokensDeleteRange[]` |

#### Returns

`Promise<void>`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:922
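A minimal sketch of discarding the oldest part of the sequence state to make room for new tokens (assuming `sequence` from the earlier setup example; whether dropping the leading tokens is appropriate depends on your use case, e.g. you may want to keep a leading BOS token):

```ts
// Free up room by discarding up to the oldest 64 tokens of the sequence
// state. Each range is {start (inclusive), end (exclusive)}.
const tokensToErase = Math.min(64, sequence.nextTokenIndex);

if (tokensToErase > 0)
    await sequence.eraseContextTokenRanges([{start: 0, end: tokensToErase}]);
```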
### evaluate()

```ts
evaluate(tokens: Token[], options: {
    temperature: number;
    minP: number;
    topK: number;
    topP: number;
    seed: number;
    grammarEvaluationState: LlamaGrammarEvaluationState | (() => undefined | LlamaGrammarEvaluationState);
    repeatPenalty: LlamaContextSequenceRepeatPenalty;
    tokenBias: TokenBias | (() => TokenBias);
    evaluationPriority: EvaluationPriority;
    contextShift: ContextShiftOptions;
    yieldEogToken: boolean;
}): AsyncGenerator<Token, void | Token, any>
```

#### Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `tokens` | `Token[]` | - |
| `options` | `object` | - |
| `options.temperature?` | `number` | - |
| `options.minP?` | `number` | - |
| `options.topK?` | `number` | - |
| `options.topP?` | `number` | - |
| `options.seed?` | `number` | Used to control the randomness of the generated text. Change the seed to get different results. Defaults to the current epoch time. Only relevant when using `temperature`. |
| `options.grammarEvaluationState?` | `LlamaGrammarEvaluationState \| (() => undefined \| LlamaGrammarEvaluationState)` | - |
| `options.repeatPenalty?` | `LlamaContextSequenceRepeatPenalty` | - |
| `options.tokenBias?` | `TokenBias \| (() => TokenBias)` | Adjust the probability of tokens being generated. Can be used to bias the model toward tokens you want it to lean towards, or away from tokens you want it to avoid. |
| `options.evaluationPriority?` | `EvaluationPriority` | When more tokens are queued for the next batch than the configured `batchSize`, the tokens of each sequence are evaluated based on the strategy chosen for the context. By default, the `"maximumParallelism"` strategy is used, which tries to evaluate as many sequences in parallel as possible; at some point it has to choose which sequences to evaluate more tokens of, so it prioritizes the sequences with the highest evaluation priority. A custom strategy can also be used to prioritize sequences differently, but generally, the higher the evaluation priority, the more likely it is that more tokens will be evaluated for that sequence in the next queued batch. |
| `options.contextShift?` | `ContextShiftOptions` | Override the sequence context shift options for this evaluation. |
| `options.yieldEogToken?` | `boolean` | Yield an EOG (End Of Generation) token (like EOS and EOT) when it is generated. When `false`, the generation stops when an EOG token is generated and the token is not yielded. Defaults to `false`. |

#### Returns

`AsyncGenerator<Token, void | Token, any>`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:996
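The generator yields one token at a time; iterating it drives the generation, and breaking out of the loop stops it. A minimal sketch (assuming `sequence` and `model` from the earlier setup example; the sampling values and the 32-token cap are illustrative, not recommendations):

```ts
import type {Token} from "node-llama-cpp";

// Assumes `sequence` and `model` from the earlier setup sketch
const promptTokens = model.tokenize("The quick brown fox");
const generatedTokens: Token[] = [];
const maxTokens = 32;

for await (const token of sequence.evaluate(promptTokens, {temperature: 0.8, topK: 40, topP: 0.9})) {
    generatedTokens.push(token);

    // Stop once enough tokens were generated; breaking out of the loop
    // ends the evaluation. Generation also ends when an EOG token is
    // produced, since yieldEogToken defaults to false.
    if (generatedTokens.length >= maxTokens)
        break;
}

console.log(model.detokenize(generatedTokens));
```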
### evaluateWithoutGeneratingNewTokens()

```ts
evaluateWithoutGeneratingNewTokens(tokens: Token[], options?: {
    evaluationPriority: 5;
    contextShift: {};
}): Promise<void>
```

Evaluate the provided tokens into the context sequence without generating new tokens.

#### Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `tokens` | `Token[]` | - |
| `options?` | `object` | - |
| `options.evaluationPriority?` | `EvaluationPriority` | When more tokens are queued for the next batch than the configured `batchSize`, the tokens of each sequence are evaluated based on the strategy chosen for the context. By default, the `"maximumParallelism"` strategy is used, which tries to evaluate as many sequences in parallel as possible; at some point it has to choose which sequences to evaluate more tokens of, so it prioritizes the sequences with the highest evaluation priority. A custom strategy can also be used to prioritize sequences differently, but generally, the higher the evaluation priority, the more likely it is that more tokens will be evaluated for that sequence in the next queued batch. |
| `options.contextShift?` | `ContextShiftOptions` | Override the sequence context shift options for this evaluation. |

#### Returns

`Promise<void>`
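This is useful for loading tokens into the sequence state ahead of time (for example, ingesting a long prompt) without sampling anything. A sketch under the same assumptions as the earlier examples (`sequence` and `model` from the setup sketch; the prompt text is arbitrary):

```ts
// Assumes `sequence` and `model` from the earlier setup sketch
const systemPromptTokens = model.tokenize("You are a helpful assistant.\n");

// Push the tokens through the model to populate the sequence state,
// without sampling any new tokens
await sequence.evaluateWithoutGeneratingNewTokens(systemPromptTokens);

console.log(sequence.nextTokenIndex); // number of tokens now loaded in the sequence
```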