erllama_inflight (erllama v0.1.0)

Tracks in-flight streaming inference requests.

Each request admitted by erllama:infer/4 gets a unique reference() and an entry in a public ETS table that maps the reference back to the erllama_model gen_statem serving it. erllama:cancel/1 looks up the reference here to find which model owns the request, then casts a {cancel, Ref} event to that model.
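The cancel path described above could be sketched roughly as follows. This is not the actual erllama:cancel/1 implementation: the module name cancel_sketch and the return values are assumptions made for illustration. Only erllama_inflight:lookup/1 and the {cancel, Ref} event shape come from this page.

```erlang
-module(cancel_sketch).
-export([cancel/1]).

%% Hypothetical stand-in for erllama:cancel/1.
%% Resolves the owning model through the in-flight table, then sends it
%% an asynchronous cancel event. The return type is an assumption.
-spec cancel(reference()) -> ok | {error, not_found}.
cancel(Ref) when is_reference(Ref) ->
    case erllama_inflight:lookup(Ref) of
        {ok, ModelPid} ->
            %% gen_statem:cast/2 does not wait for a reply and returns ok
            %% even if the model process has just exited.
            gen_statem:cast(ModelPid, {cancel, Ref});
        {error, not_found} ->
            %% The request already finished or was never admitted.
            {error, not_found}
    end.
```

Because the event is a cast, the caller learns only that the request was found, not that the model has actually stopped generating.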

The table is a named, public ETS table, so any process can look up entries directly without serializing through the owning process. The owning gen_server (this module) exists only to keep the table alive across releases and to clean up entries when a model dies unexpectedly.
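A minimal sketch of this pattern is shown below, under stated assumptions: the module and table name inflight_sketch are hypothetical, and process monitors are one plausible way to detect a dead model, not necessarily what erllama_inflight does internally.

```erlang
-module(inflight_sketch).
-behaviour(gen_server).

%% register/2 and unregister/1 share names with auto-imported BIFs.
-compile({no_auto_import, [register/2, unregister/1]}).

-export([start_link/0, register/2, unregister/1, lookup/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(TAB, inflight_sketch).  % hypothetical table name

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Reads and writes hit the table directly from the calling process;
%% only the monitor setup goes through the owner.
register(Ref, ModelPid) when is_reference(Ref), is_pid(ModelPid) ->
    true = ets:insert(?TAB, {Ref, ModelPid}),
    gen_server:cast(?MODULE, {monitor, ModelPid}).

unregister(Ref) ->
    true = ets:delete(?TAB, Ref),
    ok.

lookup(Ref) ->
    case ets:lookup(?TAB, Ref) of
        [{Ref, Pid}] -> {ok, Pid};
        [] -> {error, not_found}
    end.

%% The owner creates the table, so the table lives as long as the owner.
init([]) ->
    ?TAB = ets:new(?TAB, [named_table, public, set, {read_concurrency, true}]),
    {ok, #{}}.  % state: ModelPid => monitor reference

handle_call(_Request, _From, Monitors) ->
    {reply, {error, unsupported}, Monitors}.

%% Monitor each model at most once, however many requests it serves.
handle_cast({monitor, Pid}, Monitors) when is_map_key(Pid, Monitors) ->
    {noreply, Monitors};
handle_cast({monitor, Pid}, Monitors) ->
    {noreply, Monitors#{Pid => erlang:monitor(process, Pid)}};
handle_cast(_Msg, Monitors) ->
    {noreply, Monitors}.

%% A model died: drop every entry that still points at it.
handle_info({'DOWN', _MRef, process, Pid, _Reason}, Monitors) ->
    true = ets:match_delete(?TAB, {'_', Pid}),
    {noreply, maps:remove(Pid, Monitors)};
handle_info(_Info, Monitors) ->
    {noreply, Monitors}.
```

In this sketch an entry is inserted before the monitor is set up. If the model exits in between, erlang:monitor/2 on the dead pid still delivers a 'DOWN' message, so the entry is removed either way.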

Summary

Functions

all()

-spec all() -> [{reference(), pid()}].

handle_call/3

handle_cast/2

handle_info/2

init/1

lookup(Ref)

-spec lookup(reference()) -> {ok, pid()} | {error, not_found}.

register(Ref, ModelPid)

-spec register(reference(), pid()) -> ok.

start_link()

terminate/2

unregister(Ref)

-spec unregister(reference()) -> ok.