erllama_model_llama (erllama v0.2.0)

Real llama.cpp backend for erllama_model.

Owns a model_ref and a context_ref from erllama_nif. The gen_statem routes its decode and KV operations through this module, which forwards them to the NIF.

Config (passed through erllama_model:start_link/2):

  model_path   :: file:name() | binary()  (required)
  model_opts   :: map()  (forwarded to erllama_nif:load_model/2)
  context_opts :: map()  (forwarded to erllama_nif:new_context/2)

model_opts and context_opts flow through to the NIF unchanged. See erllama_nif:load_model/2 and erllama_nif:new_context/2 for the full set of recognised keys, including the llama.cpp option passthroughs split_mode, main_gpu, tensor_split, flash_attn, type_k, and type_v.
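
As a rough illustration of the Config shape described above, the map below sets the three documented keys. The path and all option values are hypothetical. Placing main_gpu under model_opts and flash_attn under context_opts is an assumption based on how llama.cpp itself splits model and context parameters, and the accepted value types are also assumed. Check erllama_nif:load_model/2 and erllama_nif:new_context/2 for the authoritative keys and types.

```erlang
%% Sketch of a backend Config map (evaluate in an Erlang shell).
%% Only the three top-level keys come from the documentation above;
%% every value here is a placeholder.
Config = #{
    %% Required. The path is hypothetical.
    model_path => "/path/to/model.gguf",
    %% Forwarded unchanged to erllama_nif:load_model/2.
    %% Assumption: main_gpu is a model-level option taking an integer.
    model_opts => #{main_gpu => 0},
    %% Forwarded unchanged to erllama_nif:new_context/2.
    %% Assumption: flash_attn is a context-level option taking a boolean.
    context_opts => #{flash_attn => true}
}.
%% Config is then passed through erllama_model:start_link/2; the other
%% argument of that call is not described in this section.
```

Because both option maps reach the NIF unchanged, an unrecognised key is handled by the NIF rather than by this module.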

Summary

Functions

apply_adapters/2

apply_chat_template/2

clear_sampler/1

configure_sampler/2

decode_one/2

detokenize/2

embed/2

extra_metadata/1

init(Config)

kv_pack/2

kv_pack/3

kv_unpack/2

kv_unpack/3

load_adapter/2

prefill/2

sampler_free(SamplerRef)

sampler_new/2

seq_clear/1

seq_rm/2

seq_rm_last/2

seq_rm_last/3

set_grammar/2

step/2

terminate/1

tokenize/2

unload_adapter/2

verify/4