The outcome of running one eval: the eval, the captured transcript, and the
list of %SkillKit.Eval.Check{} structs scored against it.
An eval passes only when every check passes. failure_message/1 renders the
failing checks and transcript into the message ExUnit shows when a generated
eval test fails.
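The pass rule above can be sketched with stand-in structs. This is a minimal illustration, not SkillKit's implementation: the `Sketch.*` modules, the `passed?/1` name, and the `name` and `passed` fields are all assumptions made for the example, since this page does not show the fields of `SkillKit.Eval.Check`.

```elixir
# Stand-in structs for illustration only. The field names `name` and `passed`
# are assumptions, not SkillKit's actual Check fields.
defmodule Sketch.Check do
  defstruct [:name, passed: false]
end

defmodule Sketch.Result do
  defstruct checks: []

  # An eval passes only when every check passes.
  def passed?(%__MODULE__{checks: checks}), do: Enum.all?(checks, & &1.passed)

  # The failing checks, in their original order.
  def failures(%__MODULE__{checks: checks}), do: Enum.reject(checks, & &1.passed)
end

result = %Sketch.Result{
  checks: [
    %Sketch.Check{name: "used_search_tool", passed: true},
    %Sketch.Check{name: "judge", passed: false}
  ]
}

IO.inspect(Sketch.Result.passed?(result))
# prints: false
IO.inspect(Enum.map(Sketch.Result.failures(result), & &1.name))
# prints: ["judge"]
```

One consequence of defining "passes" with `Enum.all?/2` is that a result with no checks counts as passing in this sketch. Whether the real module behaves the same way is not stated here.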
Types
@type t() :: %SkillKit.Eval.Result{
  cached: boolean(),
  checks: [SkillKit.Eval.Check.t()],
  eval: SkillKit.Eval.t(),
  transcript: SkillKit.Eval.Transcript.t()
}
Functions
A human-readable explanation of why the eval failed, for ExUnit output.
Renders the failing checks (including the judge's verdict and reasoning) plus the captured transcript — the prompt sent, the tools the agent called, and its response — so a failure shows both what was judged and why.
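A rough sketch of how such a message might be assembled is shown here. Everything in it is hypothetical: the `Sketch.FailureMessage` module, the plain maps standing in for the check and transcript structs, the field names (`reason`, `prompt`, `tool_calls`, `response`), and the output layout. The actual format produced by failure_message/1 may differ.

```elixir
defmodule Sketch.FailureMessage do
  # Plain maps stand in for the Check and Transcript structs; all keys are
  # assumptions made for this example.
  def render(checks, transcript) do
    # Only failing checks appear in the message.
    failed =
      checks
      |> Enum.reject(& &1.passed)
      |> Enum.map_join("\n", fn check -> "  * #{check.name}: #{check.reason}" end)

    """
    Failed checks:
    #{failed}

    Prompt: #{transcript.prompt}
    Tools called: #{Enum.join(transcript.tool_calls, ", ")}
    Response: #{transcript.response}
    """
  end
end

checks = [
  %{name: "used_search_tool", passed: true, reason: nil},
  %{name: "judge", passed: false, reason: "response ignored the search results"}
]

transcript = %{
  prompt: "Find the release date",
  tool_calls: ["search"],
  response: "I am not sure."
}

message = Sketch.FailureMessage.render(checks, transcript)
IO.puts(message)
```

Putting the failing checks first and the transcript second mirrors the description above: the reader sees what was judged, then the evidence behind the judgment.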
@spec failures(t()) :: [SkillKit.Eval.Check.t()]
The checks that failed.
True when every check passed.
Non-fatal warning notes from checks that passed (e.g. those from the judge).
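As an illustration of collecting such notes, the sketch below filters passing checks that carry a note. The `Sketch.Warnings` module and the `warning` key are assumptions for the example. This page does not say how a check actually stores its warning.

```elixir
defmodule Sketch.Warnings do
  # Collect notes only from checks that passed and actually carry a note.
  # Maps stand in for check structs; the `warning` key is hypothetical.
  def warnings(checks) do
    for %{passed: true, warning: note} <- checks, is_binary(note), do: note
  end
end

checks = [
  %{name: "judge", passed: true, warning: "answer is correct but verbose"},
  %{name: "format", passed: true, warning: nil},
  %{name: "used_search_tool", passed: false, warning: "not reported here"}
]

IO.inspect(Sketch.Warnings.warnings(checks))
# prints: ["answer is correct but verbose"]
```

Notes on failing checks are left out in this sketch, on the assumption that a failing check is already reported in full by the failure message.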