Shared type definitions for the HuggingFace Inference API.
All structured response shapes are documented here as Elixir types.
These are primarily for @spec annotations and Dialyzer analysis.
Because provider JSON responses are decoded with string keys, the runtime values are
%{String.t() => term()} maps. The types here describe the expected
shape, and the field name notes in each typedoc list the keys to expect.
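As a minimal sketch of what string keys mean in practice, the hypothetical helper below (the module and function names are illustrative and not part of this library) matches on an ASR result. A map with atom keys does not match, because the decoded JSON uses string keys.

```elixir
defmodule StringKeyExample do
  # Hypothetical helper, not part of the library: reads the transcription
  # text from an ASR result whose keys are strings, as decoded from JSON.
  @spec transcription(map()) :: {:ok, String.t()} | :error
  def transcription(%{"text" => text}) when is_binary(text), do: {:ok, text}
  def transcription(_other), do: :error
end

# String keys match the runtime shape.
StringKeyExample.transcription(%{"text" => "hello world"})
# => {:ok, "hello world"}

# Atom keys do not match and fall through to the catch-all clause.
StringKeyExample.transcription(%{text: "hello world"})
# => :error
```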
Types
@type asr_output() :: map()
ASR transcription result: %{"text" => string}
@type audio_output() :: map()
Audio-to-audio output element with label, content-type, and base64 blob
@type auth_method() :: :hf_token | :provider_key | :credentials_include | :none
Authentication method resolved from the access token
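One plausible way such a value could be resolved is sketched below. The rules here are assumptions for illustration only: that Hub tokens start with "hf_", that any other non-empty token is a provider key, and that a boolean flag selects cookie-based credentials. The module name, function name, and the flag are hypothetical; the library's actual resolution logic may differ.

```elixir
defmodule AuthMethodSketch do
  @type auth_method :: :hf_token | :provider_key | :credentials_include | :none

  # Assumption: Hub tokens carry the "hf_" prefix; anything else that is a
  # non-empty string is treated as a provider-specific key.
  @spec resolve(String.t() | nil, boolean()) :: auth_method()
  def resolve(token, include_credentials \\ false)
  def resolve("hf_" <> _rest, _include), do: :hf_token
  def resolve(token, _include) when is_binary(token) and token != "", do: :provider_key
  # No usable token: fall back to the (hypothetical) credentials flag.
  def resolve(_token, true), do: :credentials_include
  def resolve(_token, false), do: :none
end

AuthMethodSketch.resolve("hf_abc123")
# => :hf_token
AuthMethodSketch.resolve(nil)
# => :none
```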
@type binary_output() :: binary()
Raw binary content (image, audio, or video bytes)
@type bounding_box() :: map()
Object detection bounding box: xmin, ymin, xmax, ymax
@type chat_completion_chunk() :: map()
A streaming chat completion delta chunk
@type chat_completion_output() :: map()
OpenAI-compatible chat completion response.
Keys (at runtime): "id", "object", "created", "model", "choices", "usage".
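A minimal sketch of reading the first assistant reply from such a response follows. It assumes the usual OpenAI layout inside "choices" (a "message" map with a "content" string), which is not spelled out in this typedoc; the module name and the sample values are hypothetical.

```elixir
defmodule ChatOutputSketch do
  # Assumption: each choice holds a "message" map with a "content" string,
  # as in the OpenAI chat completion format.
  @spec first_content(map()) :: {:ok, String.t()} | :error
  def first_content(%{"choices" => [%{"message" => %{"content" => content}} | _]})
      when is_binary(content),
      do: {:ok, content}

  def first_content(_other), do: :error
end

# Hypothetical response, shaped like decoded JSON with string keys.
response = %{
  "id" => "chatcmpl-example",
  "object" => "chat.completion",
  "created" => 1_700_000_000,
  "model" => "meta-llama/Llama-3.1-8B-Instruct",
  "choices" => [
    %{"index" => 0, "message" => %{"role" => "assistant", "content" => "Hello!"}}
  ],
  "usage" => %{"prompt_tokens" => 5, "completion_tokens" => 2, "total_tokens" => 7}
}

ChatOutputSketch.first_content(response)
# => {:ok, "Hello!"}
```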
@type chat_message() :: map()
A message in a chat conversation.
Runtime keys are strings: "role", "content", etc.
@type detection_result() :: map()
Object detection result: label, score, and bounding box
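The sketch below shows one way to work with detection results: filtering by score and computing a box area. The string keys "score", "xmin", "ymin", "xmax", and "ymax" follow the field names listed in these typedocs; the "box" key used in the sample data and the module name are assumptions for illustration.

```elixir
defmodule DetectionSketch do
  # Area of a bounding box given as a string-keyed map.
  @spec area(map()) :: number()
  def area(%{"xmin" => xmin, "ymin" => ymin, "xmax" => xmax, "ymax" => ymax}),
    do: (xmax - xmin) * (ymax - ymin)

  # Keep only detections whose score reaches the threshold.
  @spec confident([map()], float()) :: [map()]
  def confident(detections, threshold \\ 0.5),
    do: Enum.filter(detections, &(&1["score"] >= threshold))
end

# Hypothetical sample data; the "box" key is an assumption.
detections = [
  %{"label" => "cat", "score" => 0.92,
    "box" => %{"xmin" => 10, "ymin" => 20, "xmax" => 110, "ymax" => 70}},
  %{"label" => "dog", "score" => 0.31,
    "box" => %{"xmin" => 0, "ymin" => 0, "xmax" => 5, "ymax" => 5}}
]

[cat] = DetectionSketch.confident(detections)
DetectionSketch.area(cat["box"])
# => 5000
```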
Dense embedding vector (flat or nested list of floats)
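Since the result may be either a flat or a nested list, callers may want to normalise it before use. The sketch below is a hypothetical helper (not part of this library) that wraps a flat vector so that both shapes become a list of vectors. It only handles one level of nesting.

```elixir
defmodule EmbeddingSketch do
  # Normalise an embedding result to a list of vectors.
  # A nested list is returned unchanged; a flat list is wrapped once.
  @spec to_vectors([float()] | [[float()]]) :: [[float()]]
  def to_vectors([first | _] = nested) when is_list(first), do: nested
  def to_vectors(flat) when is_list(flat), do: [flat]
end

EmbeddingSketch.to_vectors([0.1, 0.2, 0.3])
# => [[0.1, 0.2, 0.3]]
EmbeddingSketch.to_vectors([[0.1, 0.2], [0.3, 0.4]])
# => [[0.1, 0.2], [0.3, 0.4]]
```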
@type fill_mask_prediction() :: map()
Fill-mask prediction with score, sequence, token id, and token string
@type image_segment() :: map()
Image segmentation result: label, base64 mask, optional score
@type image_to_text_output() :: map()
Image-to-text / captioning result: %{"generated_text" => string}
@type label_score() :: map()
Label + confidence score pair: %{"label" => string, "score" => float}
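Classification tasks typically return a list of these pairs. As a small sketch, the hypothetical helper below picks the entry with the highest score; the module name and sample values are illustrative only.

```elixir
defmodule LabelScoreSketch do
  # Return the label/score map with the highest "score", or nil for an empty list.
  @spec top([map()]) :: map() | nil
  def top([]), do: nil
  def top(results), do: Enum.max_by(results, & &1["score"])
end

results = [
  %{"label" => "NEGATIVE", "score" => 0.02},
  %{"label" => "POSITIVE", "score" => 0.98}
]

LabelScoreSketch.top(results)
# => %{"label" => "POSITIVE", "score" => 0.98}
```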
@type model_id() :: String.t()
A model ID on the HF Hub, e.g. "meta-llama/Llama-3.1-8B-Instruct"
@type output_type() :: :blob | :url | :data_url | :json
Image / video output format
@type provider() :: String.t()
A provider identifier, e.g. "groq", "together", "hf-inference"
A single entry from the HF Hub's inferenceProviderMapping
@type qa_result() :: map()
Extractive QA result: answer, score, start/end character offsets
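The start and end offsets can be used to recover the answer span from the original context. The sketch below assumes the runtime keys are "answer", "score", "start", and "end", and that "end" is exclusive; the module name and sample data are hypothetical. Note that String.slice/3 counts graphemes, so offsets computed by a provider in other units may not line up for non-ASCII text.

```elixir
defmodule QaSketch do
  # Slice the answer span out of the context using the result's offsets.
  # `stop` is used as the variable name because `end` is reserved in Elixir.
  @spec span(String.t(), map()) :: String.t()
  def span(context, %{"start" => start, "end" => stop})
      when is_integer(start) and is_integer(stop) and stop >= start,
      do: String.slice(context, start, stop - start)
end

context = "My name is Clara and I live in Berkeley."
result = %{"answer" => "Clara", "score" => 0.93, "start" => 11, "end" => 16}

QaSketch.span(context, result)
# => "Clara"
```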
@type summarization_output() :: map()
Summarisation result: %{"summary_text" => string}
@type table_qa_result() :: map()
Table QA result: answer, aggregator type, matched cells, coordinates
@type task() :: String.t()
A task identifier, e.g. "conversational", "text-to-image"
@type text_generation_output() :: map()
Non-streaming text generation result: %{"generated_text" => string}
@type token_classification_entity() :: map()
A named entity span from token classification (NER)
@type tool() :: map()
A tool definition for function calling
@type tool_call() :: map()
A function tool call returned in an assistant message
@type translation_output() :: map()
Translation result: %{"translation_text" => string}
@type usage_stats() :: map()
Token usage statistics returned by the provider
@type visual_qa_result() :: map()
Visual QA result: answer string and confidence score