Class: DraftSequenceTokenPredictor
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:20
Predicts the next tokens by evaluating the current state of the target sequence on a draft sequence from a smaller and faster draft model.
See
Using Token Predictors: Draft Model Token Predictor
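For orientation, here is a minimal usage sketch, following the flow from the guide linked above. The model paths are placeholders, and the exact setup details (context options, etc.) are omitted:

```ts
import {getLlama, DraftSequenceTokenPredictor} from "node-llama-cpp";

const llama = await getLlama();

// A small, fast model produces the draft tokens;
// the larger target model verifies them
const draftModel = await llama.loadModel({modelPath: "path/to/draft-model.gguf"});
const targetModel = await llama.loadModel({modelPath: "path/to/target-model.gguf"});

const draftContext = await draftModel.createContext();
const targetContext = await targetModel.createContext();

const sequence = targetContext.getSequence({
    tokenPredictor: new DraftSequenceTokenPredictor(draftContext.getSequence())
});
```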
Extends
TokenPredictor
Constructors
new DraftSequenceTokenPredictor()
new DraftSequenceTokenPredictor(draftSequence: LlamaContextSequence, options: {
    minTokens?: number;
    maxTokens?: number;
    evaluateOptions?: Pick<SequenceEvaluateOptions,
        | "contextShift"
        | "evaluationPriority"
        | "temperature"
        | "minP"
        | "topK"
        | "topP"
        | "seed"
        | "repeatPenalty"
        | "tokenBias">;
    minConfidence?: number;
}): DraftSequenceTokenPredictor
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:41
Parameters
Parameter | Type | Description |
---|---|---|
draftSequence | LlamaContextSequence | - |
options | { minTokens?: number; maxTokens?: number; evaluateOptions?: Pick<SequenceEvaluateOptions, "contextShift" \| "evaluationPriority" \| "temperature" \| "minP" \| "topK" \| "topP" \| "seed" \| "repeatPenalty" \| "tokenBias">; minConfidence?: number; } | - |
options.minTokens? | number | The minimum number of tokens to draft. Defaults to 0. |
options.maxTokens? | number | The maximum number of tokens to draft. Defaults to 16. |
options.evaluateOptions? | Pick<SequenceEvaluateOptions, "contextShift" \| "evaluationPriority" \| "temperature" \| "minP" \| "topK" \| "topP" \| "seed" \| "repeatPenalty" \| "tokenBias"> | Evaluate options default to the values of the target sequence; you can override any of them for the prediction here. |
options.minConfidence? | number | The minimum token confidence (the probability the model assigns to a generated token) for it to be used as a prediction. When a generated token's confidence is lower than this value, the prediction process pauses until the current predictions are exhausted (either a token that was not predicted is pushed, or all generated predictions are consumed). A number between 0 and 1. Set to 0 to disable. Defaults to 0.6. |
Returns
DraftSequenceTokenPredictor
Overrides
TokenPredictor.constructor
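As a hedged illustration of the options above (the values are arbitrary examples, and draftSequence is assumed to be an existing LlamaContextSequence obtained from the draft model's context):

```ts
const predictor = new DraftSequenceTokenPredictor(draftSequence, {
    minTokens: 2,       // wait until at least 2 tokens are drafted
    maxTokens: 16,      // never draft more than 16 tokens ahead
    evaluateOptions: {
        temperature: 0, // greedy drafting, overriding the target sequence's setting
        seed: 1234
    },
    minConfidence: 0.6  // pause drafting when token confidence drops below 0.6
});
```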
Accessors
draftSequence
Get Signature
get draftSequence(): LlamaContextSequence
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:88
Returns
LlamaContextSequence
minTokens
Get Signature
get minTokens(): number
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:92
Returns
number
maxTokens
Get Signature
get maxTokens(): number
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:96
Returns
number
minConfidence
Get Signature
get minConfidence(): undefined | number
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:100
Returns
undefined | number
Methods
updateInputTokens()
updateInputTokens(tokens: Token[]): void
Defined in: evaluator/LlamaContext/TokenPredictor.ts:57
Called with the input tokens before the generation starts when using LlamaChatSession, LlamaChat, and LlamaCompletion.
Parameters
Parameter | Type |
---|---|
tokens | Token[] |
Returns
void
Inherited from
TokenPredictor.updateInputTokens
reset()
reset(__namedParameters: {
targetSequence: LlamaContextSequence;
stateTokens: Token[];
evaluateOptions: Readonly<SequenceEvaluateOptions>;
}): Promise<void>
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:104
Resets the state of the predictor.
Called before the generation starts.
Parameters
Parameter | Type |
---|---|
__namedParameters | { targetSequence: LlamaContextSequence; stateTokens: Token[]; evaluateOptions: Readonly<SequenceEvaluateOptions>; } |
__namedParameters.targetSequence | LlamaContextSequence |
__namedParameters.stateTokens | Token[] |
__namedParameters.evaluateOptions | Readonly<SequenceEvaluateOptions> |
Returns
Promise<void>
Overrides
TokenPredictor.reset
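This method is normally invoked internally by the library before generation begins; the sketch below only illustrates the expected arguments, assuming sequence is the target LlamaContextSequence and that its contextTokens getter reflects the current state:

```ts
await predictor.reset({
    targetSequence: sequence,
    stateTokens: sequence.contextTokens.slice(),
    evaluateOptions: {temperature: 0.8}
});
```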
pushTokens()
pushTokens(tokens: Token[]): void
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:156
Parameters
Parameter | Type |
---|---|
tokens | Token[] |
Returns
void
Overrides
TokenPredictor.pushTokens
predictTokens()
predictTokens(): Token[] | Promise<Token[]>
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:192
Predicts the next tokens based on the current state.
If the generation should wait until the minimum number of predictions is ready, this method should return a promise that resolves when the minimum predictions are ready.
A background prediction process can be started when this function is called, so that the next predictions will be ready when this function is called again.
Returns
Token[] | Promise<Token[]>
Overrides
TokenPredictor.predictTokens
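A hedged sketch of the consumption contract described above. This is normally driven internally by the target sequence; nextToken here is a hypothetical stand-in for a token actually generated by the target model:

```ts
// Resolves once the minimum number of predictions is ready;
// a background drafting process may keep running after it resolves.
const predictions = await predictor.predictTokens();

// Report the token the target model actually generated, so the predictor
// can keep its remaining draft (on a match) or draft anew (on a mismatch).
predictor.pushTokens([nextToken]);
```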
stop()
stop(untilPredictionsExhausted: boolean): void
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:221
Stops the prediction process when it runs in the background.
Parameters
Parameter | Type | Default value | Description |
---|---|---|---|
untilPredictionsExhausted | boolean | false | If true, the prediction process should not resume until the current predictions are exhausted. |
Returns
void
Overrides
TokenPredictor.stop
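For instance, a caller could pause background drafting while still letting the already-drafted tokens be consumed:

```ts
predictor.stop(true); // don't resume until the current predictions are exhausted
```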
dispose()
dispose(): void
Defined in: evaluator/LlamaContext/tokenPredictors/DraftSequenceTokenPredictor.ts:235
Returns
void