Skip to content
Agentic AI for serious engineers
Pattern

Eval Loop

Build the rubric and the gold dataset before you build the agent.

When to use

Always. There is no agent project where this is the wrong pattern. The rubric defines what "good" means; the gold dataset is the test bench; the agent is iterated against both.

When not to use

When the rubric isn't built yet. (That's the signal to stop and build the rubric first.)

Used in