agent_eval
Agents for running the AgentEval pipeline.
AgentEval is a process for evaluating an LLM-based system's performance on a given task.
Given a task to evaluate and a few example runs, the critic and subcritic agents generate criteria for judging a system's solution. Once the criteria have been created, the quantifier agent can evaluate subsequent task solutions against them.
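The two phases described above can be sketched as follows. This is a minimal illustration, not the library's API: the `Criterion` dataclass and the `critic`, `subcritic`, and `quantifier` functions are hypothetical stand-ins, and toy string rules replace the LLM calls so the sketch runs offline.

```python
from dataclasses import dataclass, field


@dataclass
class Criterion:
    """One evaluation dimension (hypothetical structure, not the library's class)."""
    name: str
    description: str
    accepted_values: list
    sub_criteria: list = field(default_factory=list)


def critic(task, example_runs):
    """Stand-in for the critic agent: propose top-level criteria for the task."""
    # A real critic would prompt an LLM with the task and the example runs.
    return [
        Criterion("accuracy", "Is the final answer correct?", ["incorrect", "correct"]),
        Criterion("clarity", "Is the reasoning easy to follow?", ["poor", "fair", "good"]),
    ]


def subcritic(task, criteria):
    """Stand-in for the subcritic agent: break criteria into finer sub-criteria."""
    for criterion in criteria:
        if criterion.name == "clarity":
            criterion.sub_criteria.append(
                Criterion("step_labels", "Are the steps shown explicitly?", ["no", "yes"])
            )
    return criteria


def quantifier(task, criteria, solution):
    """Stand-in for the quantifier agent: rate one solution on every criterion."""
    # Toy rules replace the LLM judgement (assumption made for this sketch only).
    ratings = {}
    for criterion in criteria:
        if criterion.name == "accuracy":
            ratings[criterion.name] = "correct" if "4" in solution else "incorrect"
        else:
            ratings[criterion.name] = "good" if "because" in solution else "fair"
    return ratings


task = "Compute 2 + 2 and explain the result."
example_runs = ["2 + 2 = 4 because adding two pairs gives four.", "The answer is 5."]

# Phase 1: build the criteria once from the task and a few example runs.
criteria = subcritic(task, critic(task, example_runs))

# Phase 2: reuse the same criteria to score subsequent solutions.
for solution in ["It is 4 because 2 + 2 = 4.", "Probably 22."]:
    print(quantifier(task, criteria, solution))
```

The point of the split is that criteria generation happens once per task, while quantification can be repeated for every new solution without regenerating the criteria.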
See our for usage examples and general explanations.