
# Class: LlamaContext

## Properties

### onDispose

```ts
readonly onDispose: EventRelay<void>;
```

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:60

## Accessors

### disposed

```ts
get disposed(): boolean
```

#### Returns

`boolean`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:170


### model

```ts
get model(): LlamaModel
```

#### Returns

`LlamaModel`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:174


### contextSize

```ts
get contextSize(): number
```

#### Returns

`number`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:178


### batchSize

```ts
get batchSize(): number
```

#### Returns

`number`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:182


### flashAttention

```ts
get flashAttention(): boolean
```

#### Returns

`boolean`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:186


### stateSize

```ts
get stateSize(): number
```

The actual size of the state in memory, in bytes. This value is provided by `llama.cpp` and doesn't include all the memory overhead of the context.

#### Returns

`number`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:194


### currentThreads

```ts
get currentThreads(): number
```

The number of threads currently used to evaluate tokens.

#### Returns

`number`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:201


### idealThreads

```ts
get idealThreads(): number
```

The number of threads that are preferred to be used to evaluate tokens.

The actual number of threads used may be lower when other evaluations are running in parallel.

#### Returns

`number`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:212
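The relationship between the two accessors above can be sketched with a small helper. `threadUtilization` is hypothetical and not part of the API; it assumes only the `currentThreads` and `idealThreads` accessors documented here:

```typescript
// Hypothetical helper (not part of node-llama-cpp): the fraction of the
// preferred thread budget currently in use, based only on the
// currentThreads and idealThreads accessors documented above.
function threadUtilization(ctx: {currentThreads: number, idealThreads: number}): number {
    if (ctx.idealThreads === 0)
        return 0;

    // can be below 1 when other evaluations are running in parallel
    return ctx.currentThreads / ctx.idealThreads;
}
```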


### totalSequences

```ts
get totalSequences(): number
```

#### Returns

`number`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:225


### sequencesLeft

```ts
get sequencesLeft(): number
```

#### Returns

`number`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:229

## Methods

### dispose()

```ts
dispose(): Promise<void>
```

#### Returns

`Promise<void>`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:156
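Since `dispose()` is asynchronous and the `disposed` accessor reports the current state, a guard like the following can avoid a redundant call. This is a sketch, not part of the API; it assumes only the two members documented above:

```typescript
// Hypothetical guard (not part of node-llama-cpp): dispose a context once,
// skipping the call if the disposed accessor reports it already happened.
async function disposeIfNeeded(
    ctx: {readonly disposed: boolean, dispose(): Promise<void>}
): Promise<boolean> {
    if (ctx.disposed)
        return false; // nothing to do

    await ctx.dispose();
    return true; // this call performed the disposal
}
```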


### getAllocatedContextSize()

```ts
getAllocatedContextSize(): number
```

#### Returns

`number`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:216


### getSequence()

```ts
getSequence(options: {
  contextShift: ContextShiftOptions;
}): LlamaContextSequence
```

Before calling this method, check `sequencesLeft` to verify that sequences are still available. When there are no sequences left, this method throws an error.

#### Parameters

| Parameter | Type |
| ------ | ------ |
| `options` | `object` |
| `options.contextShift?` | `ContextShiftOptions` |

#### Returns

`LlamaContextSequence`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:237
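The check described above can be wrapped in a small guard. `tryGetSequence` is hypothetical and not part of the API; it assumes only the `sequencesLeft` accessor and `getSequence()` method documented on this page:

```typescript
// Hypothetical guard (not part of node-llama-cpp): return a sequence if one
// is available, or null instead of letting getSequence() throw when
// sequencesLeft is 0.
function tryGetSequence<T>(ctx: {sequencesLeft: number, getSequence(): T}): T | null {
    if (ctx.sequencesLeft === 0)
        return null;

    return ctx.getSequence();
}
```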


### dispatchPendingBatch()

```ts
dispatchPendingBatch(): void
```

#### Returns

`void`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:269


### printTimings()

```ts
printTimings(): Promise<void>
```

Print the timings of token evaluation since the last print for this context.

Requires the `performanceTracking` option to be enabled.

Note: it prints on the `LlamaLogLevel.info` level, so if you set the log level of your `Llama` instance higher than that, it won't print anything.

#### Returns

`Promise<void>`

#### Defined in

evaluator/LlamaContext/LlamaContext.ts:505