AgentSea.Evaluate.Metric behaviour
(agentsea_evaluate v0.1.0)
A scoring metric. Given an example (the :output under test, plus optional
:input and :expected), it returns a score in [0, 1] and a pass/fail verdict.
Built-in metrics: ExactMatch, Contains, and LLMJudge (provider-backed).
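
To make the shape of a metric concrete, here is a minimal sketch of a Contains-style check written as a plain module. It is an illustration only: the module name `ContainsSketch`, the function `score/1`, the `%{score: ..., pass: ...}` return map, and the 1.0 pass threshold are all assumptions, not the callbacks or return types this behaviour actually defines.

```elixir
defmodule ContainsSketch do
  # Hypothetical stand-in for a metric; NOT the AgentSea.Evaluate.Metric
  # callback API. The names and the return shape are assumptions.

  # Assumed threshold: the example passes only on a perfect score.
  @threshold 1.0

  # Scores 1.0 when :expected occurs inside :output, otherwise 0.0,
  # and derives the pass/fail verdict from the threshold.
  def score(%{output: output, expected: expected})
      when is_binary(output) and is_binary(expected) do
    score = if String.contains?(output, expected), do: 1.0, else: 0.0
    %{score: score, pass: score >= @threshold}
  end
end

# Usage: the example map carries the :output under test and the :expected text.
ContainsSketch.score(%{output: "The capital is Paris.", expected: "Paris"})
```

A binary metric like this one only ever produces 0.0 or 1.0; a graded metric (an LLM judge, for instance) could return any value in [0, 1] and compare it against a lower threshold.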