Class: DraftSequenceTokenPredictor
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:20
Predicts the next tokens by evaluating the current state of the target sequence on a draft sequence from a smaller and faster draft model.
See
Using Token Predictors: Draft Model Token Predictor
Extends
TokenPredictor
Constructors
Constructor
new DraftSequenceTokenPredictor(draftSequence: LlamaContextSequence, options: {
    minTokens?: number;
    maxTokens?: number;
    evaluateOptions?: Pick<SequenceEvaluateOptions,
        | "contextShift"
        | "evaluationPriority"
        | "temperature"
        | "minP"
        | "topK"
        | "topP"
        | "seed"
        | "repeatPenalty"
        | "tokenBias">;
    minConfidence?: number;
}): DraftSequenceTokenPredictor;
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:41
Parameters
| Parameter | Type | Description |
|---|---|---|
| draftSequence | LlamaContextSequence | - |
| options | { minTokens?: number; maxTokens?: number; evaluateOptions?: Pick<SequenceEvaluateOptions, \| "contextShift" \| "evaluationPriority" \| "temperature" \| "minP" \| "topK" \| "topP" \| "seed" \| "repeatPenalty" \| "tokenBias">; minConfidence?: number; } | - |
| options.minTokens? | number | The minimum number of tokens to draft. Defaults to 0. |
| options.maxTokens? | number | The maximum number of tokens to draft. Defaults to 16. |
| options.evaluateOptions? | Pick<SequenceEvaluateOptions, \| "contextShift" \| "evaluationPriority" \| "temperature" \| "minP" \| "topK" \| "topP" \| "seed" \| "repeatPenalty" \| "tokenBias"> | Evaluation options default to the values of the target sequence; you can override any of them for the draft prediction here. |
| options.minConfidence? | number | The minimum token confidence (the probability the model assigns to a generated token, between 0 and 1) required to treat a drafted token as a prediction. When a generated token's confidence is lower than this value, the prediction process stops until the current predictions are exhausted (either by a token that was not predicted being pushed, or by all the generated predictions being consumed). Set to 0 to disable. Defaults to 0.6. |
Returns
DraftSequenceTokenPredictor
Overrides
TokenPredictor.constructor
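For illustration, a minimal setup sketch in the spirit of the "Using Token Predictors" guide; the model paths and option values below are placeholders, not recommendations:

```ts
import {getLlama, LlamaChatSession, DraftSequenceTokenPredictor} from "node-llama-cpp";

const llama = await getLlama();

// the small, fast model used for drafting (placeholder path)
const draftModel = await llama.loadModel({modelPath: "path/to/draft-model.gguf"});
const draftContext = await draftModel.createContext();
const draftSequence = draftContext.getSequence();

// the main (target) model (placeholder path)
const model = await llama.loadModel({modelPath: "path/to/model.gguf"});
const context = await model.createContext();
const sequence = context.getSequence({
    tokenPredictor: new DraftSequenceTokenPredictor(draftSequence, {
        minTokens: 1, // placeholder values for illustration
        maxTokens: 16,
        minConfidence: 0.6
    })
});

const session = new LlamaChatSession({contextSequence: sequence});
console.log(await session.prompt("Summarize speculative decoding in one sentence."));
```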
Accessors
draftSequence
Get Signature
get draftSequence(): LlamaContextSequence;
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:88
Returns
LlamaContextSequence
minTokens
Get Signature
get minTokens(): number;
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:92
Returns
number
maxTokens
Get Signature
get maxTokens(): number;
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:96
Returns
number
minConfidence
Get Signature
get minConfidence(): undefined | number;
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:100
Returns
undefined | number
Methods
updateInputTokens()
updateInputTokens(tokens: Token[]): void;
Defined in: evaluator/LlamaContext/TokenPredictor.ts:57
Called with the input tokens before the generation starts when using LlamaChatSession, LlamaChat, and LlamaCompletion.
Parameters
| Parameter | Type |
|---|---|
| tokens | Token[] |
Returns
void
Inherited from
TokenPredictor.updateInputTokens
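When using LlamaChatSession, LlamaChat, or LlamaCompletion this call happens automatically; a hedged sketch of driving it manually, assuming the `model` and a `predictor` from the example above:

```ts
// tokenize the input and hand it to the predictor before generation starts
const inputTokens = model.tokenize("Write a haiku about llamas");
predictor.updateInputTokens(inputTokens);
```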
reset()
reset(__namedParameters: {
    targetSequence: LlamaContextSequence;
    stateTokens: Token[];
    evaluateOptions: Readonly<SequenceEvaluateOptions>;
}): Promise<void>;
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:104
Resets the state of the predictor.
Called before the generation starts.
Parameters
| Parameter | Type |
|---|---|
| __namedParameters | { targetSequence: LlamaContextSequence; stateTokens: Token[]; evaluateOptions: Readonly<SequenceEvaluateOptions>; } |
| __namedParameters.targetSequence | LlamaContextSequence |
| __namedParameters.stateTokens | Token[] |
| __namedParameters.evaluateOptions | Readonly<SequenceEvaluateOptions> |
Returns
Promise<void>
Overrides
TokenPredictor.reset
pushTokens()
pushTokens(tokens: Token[]): void;
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:156
Parameters
| Parameter | Type |
|---|---|
| tokens | Token[] |
Returns
void
Overrides
TokenPredictor.pushTokens
predictTokens()
predictTokens(): Token[] | Promise<Token[]>;
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:192
Predicts the next tokens based on the current state.
If the generation should wait until the minimum number of predictions is ready, this method should return a promise that resolves when they are ready.
A background prediction process can be started when this function is called, so that the next predictions will be ready when this function is called again.
Returns
Token[] | Promise<Token[]>
Overrides
TokenPredictor.predictTokens
stop()
stop(untilPredictionsExhausted: boolean): void;
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:221
Stops the prediction process when it runs in the background.
Parameters
| Parameter | Type | Default value | Description |
|---|---|---|---|
| untilPredictionsExhausted | boolean | false | If true, the prediction process should not resume until the current predictions are exhausted. |
Returns
void
Overrides
TokenPredictor.stop
dispose()
dispose(): void;
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:235
Returns
void
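To tie the methods together, a hedged sketch of the reset/predict/push/stop cycle that LlamaChat drives internally, assuming the `predictor`, `sequence`, and `inputTokens` from the examples above; it only illustrates how the methods interact:

```ts
// reset the predictor to the target sequence's current state
await predictor.reset({
    targetSequence: sequence,
    stateTokens: inputTokens,
    evaluateOptions: {temperature: 0} // placeholder evaluate options
});

// may resolve immediately with ready predictions, or wait until
// at least `minTokens` draft tokens are available
const predicted = await predictor.predictTokens();

// report the tokens the target model actually accepted (predicted or not)
// so the draft sequence stays in sync
predictor.pushTokens(predicted.slice(0, 1));

// pause background drafting; pass `true` to keep it paused until the
// current predictions are exhausted
predictor.stop();
predictor.dispose();
```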