Dsxir.LM.Sycophant (dsxir v0.4.0)

Sycophant-backed implementation of the Dsxir.LM behaviour.

Only compiled when the optional :sycophant dependency is present.

Config shape:

[model: "openai:gpt-4o-mini", api_key: nil | binary,
 base_url: nil | binary, temperature: float, max_tokens: integer,
 top_p: float, num_retries: integer]

Unknown config keys pass through to Sycophant; Sycophant validates them against the resolved wire protocol's param schema. Dsxir's own settings keys (:lm, :adapter, :cache, :metadata, :hints, ...) are stripped first, so a polluted config never makes the wire protocol warn about them.

Per-call opts override per-config opts via Keyword.merge/2. api_key and base_url are lifted into credentials: %{...} for Sycophant. The :headers config key is reserved for future use and intentionally ignored.
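The option handling described above can be sketched in plain Elixir. This is an illustration only, not the module's actual code: `ConfigSketch` and `build/2` are hypothetical names, the list of Dsxir settings keys contains only those named in the text, and dropping `nil` credential values is an assumption.

```elixir
defmodule ConfigSketch do
  # Hypothetical helper, not part of Dsxir's API.
  # Only the settings keys named in the text; the real list is longer.
  @dsxir_keys [:lm, :adapter, :cache, :metadata, :hints]

  def build(config, call_opts) do
    merged =
      config
      |> Keyword.drop(@dsxir_keys)
      # Per-call opts win over per-config opts.
      |> Keyword.merge(call_opts)
      # :headers is reserved and ignored.
      |> Keyword.delete(:headers)

    {creds, rest} = Keyword.split(merged, [:api_key, :base_url])

    # Assumption: nil credentials are omitted rather than forwarded.
    credentials =
      creds
      |> Enum.reject(fn {_key, value} -> is_nil(value) end)
      |> Map.new()

    Keyword.put(rest, :credentials, credentials)
  end
end
```

With a config of `[model: "openai:gpt-4o-mini", api_key: "sk-test", temperature: 0.2]` and per-call opts of `[temperature: 0.7]`, this sketch keeps `temperature: 0.7` and moves the key into `credentials: %{api_key: "sk-test"}`.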

Streaming

The :stream opt is forwarded to Sycophant.generate_text/3 unchanged. Sycophant invokes the 1-arity callback with %Sycophant.StreamChunk{} values (:text_delta, :tool_call_delta, :reasoning_delta, :usage, :failed, :incomplete, :cancelled, :done) as the response streams in. This module's Dsxir.LM callback still returns the final assembled {:ok, text, usage} tuple, so Dsxir.Predictor.Predict can build its %Dsxir.Prediction{} normally.
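A 1-arity stream callback might look like the following sketch, which collects text deltas and ignores every other chunk type. The field names `:type` and `:data` are assumptions about the chunk's shape, and plain maps stand in for %Sycophant.StreamChunk{} so the example runs without the dependency. `StreamSketch` is a hypothetical module.

```elixir
defmodule StreamSketch do
  # Assumption: each chunk exposes :type and :data. Plain maps stand in
  # for %Sycophant.StreamChunk{} so the sketch runs on its own.
  def collector(agent) do
    fn
      %{type: :text_delta, data: text} -> Agent.update(agent, &[text | &1])
      _other -> :ok
    end
  end

  # Returns the text collected so far, in arrival order.
  def collected(agent) do
    agent
    |> Agent.get(& &1)
    |> Enum.reverse()
    |> IO.iodata_to_binary()
  end
end

{:ok, agent} = Agent.start_link(fn -> [] end)
on_chunk = StreamSketch.collector(agent)

# Simulated chunks; in real use Sycophant would invoke the callback.
Enum.each(
  [
    %{type: :text_delta, data: "Hel"},
    %{type: :text_delta, data: "lo"},
    %{type: :done, data: nil}
  ],
  on_chunk
)
```

In real use, a function like `on_chunk` would be passed as the `:stream` opt instead of being driven by hand.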

Usage extraction

On a successful response, the %Sycophant.Usage{} struct is mapped into a %Dsxir.Cost{} with calls: 1. When Sycophant reports nil usage, Dsxir.LM.empty_usage/0 (a zero-valued Dsxir.Cost) is returned.

| Dsxir.Cost field | Sycophant.Usage field |
| --- | --- |
| :input_tokens | :input_tokens |
| :output_tokens | :output_tokens |
| :cache_read_tokens | :cache_read_input_tokens |
| :cache_write_tokens | :cache_creation_input_tokens |
| :reasoning_tokens | :reasoning_tokens |
| :input_cost | :input_cost |
| :output_cost | :output_cost |
| :cache_read_cost | :cache_read_cost |
| :cache_write_cost | :cache_write_cost |
| :reasoning_cost | :reasoning_cost |
| :total_cost | :total_cost |
| :currency | :pricing.currency |
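The field mapping can be expressed as a small translation function. The sketch below uses plain maps in place of the %Sycophant.Usage{} and %Dsxir.Cost{} structs so that it runs standalone; `UsageSketch` is a hypothetical name, and the exact contents of the zero-valued result (including `calls: 0` and a `nil` currency) are assumptions, since the text only says it is zero-valued.

```elixir
defmodule UsageSketch do
  # {cost_field, usage_field} pairs, taken from the mapping table.
  @field_map [
    input_tokens: :input_tokens,
    output_tokens: :output_tokens,
    cache_read_tokens: :cache_read_input_tokens,
    cache_write_tokens: :cache_creation_input_tokens,
    reasoning_tokens: :reasoning_tokens,
    input_cost: :input_cost,
    output_cost: :output_cost,
    cache_read_cost: :cache_read_cost,
    cache_write_cost: :cache_write_cost,
    reasoning_cost: :reasoning_cost,
    total_cost: :total_cost
  ]

  # Assumption: the empty usage has zeroed fields, calls: 0, no currency.
  def to_cost(nil) do
    @field_map
    |> Map.new(fn {cost_key, _usage_key} -> {cost_key, 0} end)
    |> Map.merge(%{currency: nil, calls: 0})
  end

  def to_cost(usage) when is_map(usage) do
    @field_map
    |> Map.new(fn {cost_key, usage_key} -> {cost_key, Map.get(usage, usage_key)} end)
    # The currency lives one level down, under :pricing.
    |> Map.put(:currency, get_in(usage, [:pricing, :currency]))
    |> Map.put(:calls, 1)
  end
end
```

Only the two cache token fields are renamed; the rest copy across under the same key, and the currency is read from the nested pricing data.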

Error translation

Provider errors are translated into typed Dsxir.Errors.LM.* structs. A %Sycophant.Error.Provider.BadRequest{status: 400} is classified against a small set of regexes (case-insensitive) on its :body to detect upstream context-window-exceeded responses:

  • ~r/context length/i
  • ~r/maximum context/i
  • ~r/prompt is too long/i
  • ~r/too many tokens/i
  • ~r/exceeds.*token/i
  • ~r/request is too large/i

When any of those match the body, the error becomes a Dsxir.Errors.LM.ContextWindow (with prompt_tokens/limit extracted when the body carries them). All other BadRequest bodies stay as Dsxir.Errors.LM.RequestFailed with status: 400.
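The classification step can be approximated as follows. This is a minimal sketch assuming the body is a binary; `ContextWindowSketch` and the returned atoms are hypothetical stand-ins for the real Dsxir.Errors.LM.* structs, and the extraction of prompt_tokens/limit is left out because the source does not describe its format.

```elixir
defmodule ContextWindowSketch do
  # The six case-insensitive patterns listed above.
  defp patterns do
    [
      ~r/context length/i,
      ~r/maximum context/i,
      ~r/prompt is too long/i,
      ~r/too many tokens/i,
      ~r/exceeds.*token/i,
      ~r/request is too large/i
    ]
  end

  # Classifies the body of a 400 response.
  # Returns :context_window when any pattern matches, else :request_failed.
  def classify(body) when is_binary(body) do
    if Enum.any?(patterns(), &Regex.match?(&1, body)) do
      :context_window
    else
      :request_failed
    end
  end
end
```

For example, a body such as "This model's maximum context length is 8192 tokens" matches the first two patterns and is classified as a context-window error, while "invalid value for temperature" matches none and stays a plain request failure.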