Modules
Entry point for the inference layer.
Inference adapter for the Anthropic Messages API.
Inference adapter for Ollama.
Inference adapter for OpenAI-compatible APIs.
GitHub Copilot OAuth device code authentication.
Ensures a model is loaded with the correct configuration on local providers (LM Studio, Ollama, vLLM) before inference begins.
Single entry point for all inference calls.
Canonical request struct for inference calls.
Declares model capabilities up front so adapters serialize correctly on the first attempt: no runtime detection, no retry-on-error branching.
Fetches and caches model capabilities from models.dev.
Resolves a ModelProfile for a given provider and model.
Probes inference providers to determine availability.
Behaviour for LLM inference providers.
Canonical response struct from inference calls.
Profile-driven post-processing of inference responses.
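The provider behaviour and the canonical request/response structs above can be sketched roughly as follows. This is a hedged illustration only: the module names (`Inference.Request`, `Inference.Response`, `Inference.Provider`, `Inference.EchoProvider`), field names, and callback signature are assumptions, not the project's actual API.

```elixir
defmodule Inference.Request do
  # Canonical request struct for inference calls (field names assumed).
  defstruct [:model, :messages, :profile]
end

defmodule Inference.Response do
  # Canonical response struct from inference calls (field names assumed).
  defstruct [:content, :usage]
end

defmodule Inference.Provider do
  # Behaviour every adapter (Anthropic, Ollama, OpenAI-compatible) implements.
  @callback complete(request :: %Inference.Request{}) ::
              {:ok, %Inference.Response{}} | {:error, term()}
end

defmodule Inference.EchoProvider do
  # Toy adapter, used here only to show the shape of the contract.
  @behaviour Inference.Provider

  @impl true
  def complete(%Inference.Request{messages: messages}) do
    {:ok, %Inference.Response{content: List.last(messages), usage: %{}}}
  end
end
```

With this shape, the single entry point can dispatch on any module implementing `Inference.Provider` without knowing which backend it is talking to.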