AgentSea.Voice.STT.Whisper
(agentsea_bumblebee v0.1.0)
In-process speech-to-text via Whisper (Bumblebee + Nx): a local
AgentSea.Voice.STT implementation that calls no external transcription API.
Build a serving once (it loads the model), then pass it per call:
serving = AgentSea.Voice.STT.Whisper.serving("openai/whisper-tiny")
{:ok, %{text: text}} = AgentSea.Voice.STT.Whisper.transcribe(audio, serving: serving)

The serving accepts what Bumblebee's Whisper serving accepts (raw PCM samples
as an Nx.Tensor, or a file path when ffmpeg is available). The serving call
is injectable (:run) so the transcription plumbing is testable without a
model. Add :exla for real throughput (see AgentSea.Embedder.Bumblebee).
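
The injectable serving call can be illustrated with a small stand-alone sketch. This is not AgentSea's code: the module name `SketchSTT` is hypothetical, and the two-argument shape of the `:run` function (mirroring `Nx.Serving.run/2`) is an assumption, as is the normalisation step. The only part taken from Bumblebee is the `%{chunks: [...]}` output shape of its Whisper serving.

```elixir
defmodule SketchSTT do
  # Hypothetical stand-in for the transcription plumbing, for illustration only.
  # ASSUMPTION: :run is a 2-arity function (serving, audio) -> serving output.
  def transcribe(audio, opts) do
    serving = Keyword.fetch!(opts, :serving)
    run = Keyword.get(opts, :run, &default_run/2)

    case run.(serving, audio) do
      # Bumblebee's Whisper serving returns a list of text chunks.
      %{chunks: chunks} ->
        text = chunks |> Enum.map_join(& &1.text) |> String.trim()
        {:ok, %{text: text}}

      other ->
        {:error, {:unexpected_output, other}}
    end
  end

  # Resolved at runtime, so this sketch compiles without Nx installed.
  defp default_run(serving, audio), do: apply(Nx.Serving, :run, [serving, audio])
end

# In a test, a fake :run replaces the model entirely.
fake_run = fn :fake_serving, _audio ->
  %{chunks: [%{text: " hello"}, %{text: " world"}]}
end

{:ok, %{text: text}} =
  SketchSTT.transcribe(<<0, 0>>, serving: :fake_serving, run: fake_run)

IO.puts(text)
```

With this shape, a test never loads model weights: the fake function stands in for the serving and returns a canned result, so only the surrounding plumbing is exercised.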
Summary
Functions
Build an Nx.Serving for a Whisper model. Downloads the model on first use.
Serving options from app config (e.g. an EXLA compiler), overridable per call.
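
For orientation, here is a sketch of building a Whisper serving with Bumblebee directly, which is presumably close to what the serving builder wraps. It uses only Bumblebee's and Nx's public functions. The dependency versions are assumptions, and how AgentSea reads serving options from app config is not shown, because the config key does not appear in this page.

```elixir
# ASSUMPTION: these version requirements are illustrative; use whatever your project pins.
Mix.install([{:bumblebee, "~> 0.6"}, {:exla, "~> 0.9"}])

repo = {:hf, "openai/whisper-tiny"}

# Each load_* call downloads from Hugging Face on first use, then hits the local cache.
{:ok, model_info} = Bumblebee.load_model(repo)
{:ok, featurizer} = Bumblebee.load_featurizer(repo)
{:ok, tokenizer} = Bumblebee.load_tokenizer(repo)
{:ok, generation_config} = Bumblebee.load_generation_config(repo)

# defn_options is where an EXLA compiler is passed to the serving.
serving =
  Bumblebee.Audio.speech_to_text_whisper(
    model_info,
    featurizer,
    tokenizer,
    generation_config,
    defn_options: [compiler: EXLA]
  )

# Raw PCM input: one second of silence as a float tensor (16 kHz sampling rate).
audio = Nx.broadcast(0.0, {16_000})

# Returns a map with a :chunks list. With ffmpeg installed, {:file, path} also works as input.
result = Nx.Serving.run(serving, audio)
IO.inspect(result)
```

Without the `:exla` dependency and the `defn_options` entry, the same serving runs on Nx's default evaluator, which is much slower for a model of this size.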