Class: LlamaContext
Properties
onDispose
readonly onDispose: EventRelay<void>;
Defined in
evaluator/LlamaContext/LlamaContext.ts:60
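A minimal subscription sketch, assuming the EventRelay API from the lifecycle-utils package (with its createListener method) and a hypothetical model path:

```ts
import {getLlama} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // hypothetical path
const context = await model.createContext();

// the callback fires once the context gets disposed
context.onDispose.createListener(() => {
    console.log("LlamaContext was disposed");
});

await context.dispose();
```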
Accessors
disposed
get disposed(): boolean
Returns
boolean
Defined in
evaluator/LlamaContext/LlamaContext.ts:170
model
get model(): LlamaModel
Returns
LlamaModel
Defined in
evaluator/LlamaContext/LlamaContext.ts:174
contextSize
get contextSize(): number
Returns
number
Defined in
evaluator/LlamaContext/LlamaContext.ts:178
batchSize
get batchSize(): number
Returns
number
Defined in
evaluator/LlamaContext/LlamaContext.ts:182
flashAttention
get flashAttention(): boolean
Returns
boolean
Defined in
evaluator/LlamaContext/LlamaContext.ts:186
stateSize
get stateSize(): number
The actual size of the state in memory, in bytes. This value is provided by llama.cpp
and doesn't include all the memory overhead of the context.
Returns
number
Defined in
evaluator/LlamaContext/LlamaContext.ts:194
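A quick sketch, reusing the context from the example above:

```ts
// stateSize is reported by llama.cpp in bytes; convert to MiB for readability
console.log(`Context state size: ${(context.stateSize / 1024 ** 2).toFixed(2)} MiB`);
```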
currentThreads
get currentThreads(): number
The number of threads currently used to evaluate tokens.
Returns
number
Defined in
evaluator/LlamaContext/LlamaContext.ts:201
idealThreads
get idealThreads(): number
The preferred number of threads to use for token evaluation.
The actual number of threads used may be lower when other evaluations are running in parallel.
Returns
number
Defined in
evaluator/LlamaContext/LlamaContext.ts:212
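A small sketch comparing the two values, reusing the context from above:

```ts
// currentThreads may be lower than idealThreads while other
// evaluations are running in parallel
console.log(`Evaluating with ${context.currentThreads} of ${context.idealThreads} preferred threads`);
```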
totalSequences
get totalSequences(): number
Returns
number
Defined in
evaluator/LlamaContext/LlamaContext.ts:225
sequencesLeft
get sequencesLeft(): number
Returns
number
Defined in
evaluator/LlamaContext/LlamaContext.ts:229
Methods
dispose()
dispose(): Promise<void>
Returns
Promise<void>
Defined in
evaluator/LlamaContext/LlamaContext.ts:156
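A disposal sketch, with the same assumed setup as above; the try/finally pattern is one option for ensuring cleanup:

```ts
const context = await model.createContext();

try {
    // ... evaluate tokens using a sequence from this context ...
} finally {
    await context.dispose(); // frees the native context; context.disposed becomes true
}
```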
getAllocatedContextSize()
getAllocatedContextSize(): number
Returns
number
Defined in
evaluator/LlamaContext/LlamaContext.ts:216
getSequence()
getSequence(options: {
contextShift?: ContextShiftOptions;
}): LlamaContextSequence
Before calling this method, check sequencesLeft
to make sure there are sequences available; when no sequences are left, this method throws an error (see the sketch below).
Parameters
Parameter | Type |
---|---|
options | object |
options.contextShift? | ContextShiftOptions |
Returns
LlamaContextSequence
Defined in
evaluator/LlamaContext/LlamaContext.ts:237
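A sketch of the sequencesLeft check described above (the evaluation itself is elided):

```ts
if (context.sequencesLeft > 0) {
    const sequence = context.getSequence();
    // ... evaluate tokens on this sequence ...
} else {
    // no free sequences: dispose an existing sequence before requesting another one
}
```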
dispatchPendingBatch()
dispatchPendingBatch(): void
Returns
void
Defined in
evaluator/LlamaContext/LlamaContext.ts:269
printTimings()
printTimings(): Promise<void>
Prints the timings of token evaluation since the last print for this context.
Requires the performanceTracking
option to be enabled.
Note: it prints at the
LlamaLogLevel.info
level, so if you set the log level of your Llama
instance higher than that, it won't print anything.
Returns
Promise<void>
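A setup sketch for this method, assuming getLlama accepts a logLevel option and createContext accepts the performanceTracking option mentioned above; the model path is hypothetical:

```ts
import {getLlama, LlamaLogLevel} from "node-llama-cpp";

// the log level must be info or lower for the timings to be printed
const llama = await getLlama({logLevel: LlamaLogLevel.info});
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // hypothetical path
const context = await model.createContext({performanceTracking: true});

// ... evaluate some tokens ...

await context.printTimings();
```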