Class: Llama

Defined in: llama/src/llama.ts:27

The Llama WASM library provides a simplified wrapper around the llama.cpp library.

See llama.cpp for more details.

```ts
import { Llama, WebBlob } from "@hpcc-js/wasm-llama";

const llama = await Llama.load();
const model = "https://huggingface.co/CompendiumLabs/bge-base-en-v1.5-gguf/resolve/main/bge-base-en-v1.5-q4_k_m.gguf";
const webBlob: Blob = await WebBlob.create(new URL(model));
const data: ArrayBuffer = await webBlob.arrayBuffer();

const embeddings = llama.embedding("Hello and Welcome!", new Uint8Array(data));
```

Methods

load()

static load(): Promise<Llama>

Defined in: llama/src/llama.ts:41

Compiles and instantiates the raw wasm.

INFO

In general, WebAssembly compilation is disallowed on the main thread when the buffer is larger than 4KB, which forces load to be asynchronous.

Returns

Promise<Llama>

A promise to an instance of the Llama class.


unload()

static unload(): void

Defined in: llama/src/llama.ts:50

Unloads the compiled WASM instance.

Returns

void


version()

version(): string

Defined in: llama/src/llama.ts:57

Returns

string

The llama.cpp version.


embedding()

embedding(text, model, format): number[][]

Defined in: llama/src/llama.ts:69

Calculates the vector representation of the input text.

Parameters

text

string

The input text.

model

Uint8Array

The model to use for the embedding.

format

string = "array"

The output format of the embedding (defaults to "array").

Returns

number[][]

The embedding of the text using the model.
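The returned number[][] contains one plain numeric vector per input. A common next step is to compare two such vectors with cosine similarity; the following is a minimal, self-contained sketch (the sample vectors are made up for illustration, not real model output):

```typescript
// Cosine similarity between two embedding vectors, as returned by embedding().
// Values range from -1 (opposite) to 1 (identical direction).
function cosineSimilarity(a: number[], b: number[]): number {
    if (a.length !== b.length) throw new Error("Vector lengths must match");
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// With real output, you might compare two rows of an embedding() result, e.g.:
//   const [first, second] = llama.embedding(text, modelBytes);
//   cosineSimilarity(first, second);
const similarity = cosineSimilarity([1, 0, 1], [0.9, 0.1, 1.1]);
console.log(similarity);
```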

Released under the Apache-2.0 License.