# agentsea_evaluate v0.1.0 - Table of Contents

> AgentSea evaluate: concurrent evaluation metrics, including an LLM-as-judge metric.

## Modules

- [AgentSea.Evaluate](AgentSea.Evaluate.md): Run scoring metrics over a dataset, concurrently, and aggregate the results.
- [AgentSea.Evaluate.Metric](AgentSea.Evaluate.Metric.md): A scoring metric. Given an example (the `:output` under test, plus optional `:input`/`:expected`), it returns a score in `[0, 1]` and a pass/fail. Built-in metrics: `ExactMatch`, `Contains`, and `LLMJudge` (provider-backed).
- [AgentSea.Evaluate.Metric.Contains](AgentSea.Evaluate.Metric.Contains.md): Scores 1.0 when the output contains the expected value (case-insensitive substring).
- [AgentSea.Evaluate.Metric.ExactMatch](AgentSea.Evaluate.Metric.ExactMatch.md): Scores 1.0 when the output equals the expected value (trimmed, case-insensitive).
- [AgentSea.Evaluate.Metric.LLMJudge](AgentSea.Evaluate.Metric.LLMJudge.md): Uses an LLM to score an output against a rubric ("LLM-as-judge"). Runs over any `AgentSea.Provider` (so it can go through the gateway).
