Lesson 052: Eval harness hardening
Focus
Assume an auditor replays your claims; narrate checkpoints aloud. The token "Eval harness hardening:52" keeps neighbouring lessons distinguishable.
Key ideas
- Thread: Eval harness hardening · drill v2 · spin 370211.
- Habit: attach a trace_id to every completion you would paste into an ops dashboard.
- Guardrail: add one RACI bullet for prompt or index changes before tomorrow's standup.
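The trace_id habit above can be sketched as a small wrapper. This is a minimal illustration, not a prescribed implementation: the names TracedCompletion and to_dashboard_row are hypothetical, and the use of a random UUID as the trace_id is an assumption (your tracing system may issue its own IDs).

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TracedCompletion:
    """A completion paired with the trace_id it was produced under (hypothetical type)."""
    text: str
    route: str
    # Assumption: a random UUID stands in for whatever ID your tracing system issues.
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def to_dashboard_row(completion: TracedCompletion) -> dict:
    """Build an export row, refusing any completion that lacks a trace_id."""
    if not completion.trace_id:
        raise ValueError("refusing to export a completion without a trace_id")
    return {
        "trace_id": completion.trace_id,
        "route": completion.route,
        "text": completion.text,
    }
```

Making the export function the only path to the dashboard means a missing trace_id fails loudly at paste time rather than surfacing later during an audit.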
Deep dive notebook
Synthetic drill artefacts
Eval harness snippet
case_id: LO-9001
route: support_rag_v2
must_include_patterns:
- '\[chunk_'
forbid_patterns:
- "guaranteed SLA"
judge_profile: tempered_2
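One way a harness might apply the case above is sketched here. It assumes the patterns are regular expressions matched anywhere in the completion text, which the snippet does not state explicitly. EvalCase and check_completion are hypothetical names, and judge_profile is left out because its semantics are not defined in this lesson.

```python
import re
from dataclasses import dataclass


@dataclass
class EvalCase:
    """Hypothetical in-memory form of the case shown above."""
    case_id: str
    route: str
    must_include_patterns: list[str]
    forbid_patterns: list[str]


def check_completion(case: EvalCase, completion: str) -> list[str]:
    """Return a list of failure messages; an empty list means the case passed."""
    failures = []
    for pattern in case.must_include_patterns:
        if not re.search(pattern, completion):
            failures.append(f"{case.case_id}: missing required pattern {pattern!r}")
    for pattern in case.forbid_patterns:
        if re.search(pattern, completion):
            failures.append(f"{case.case_id}: matched forbidden pattern {pattern!r}")
    return failures


case = EvalCase(
    case_id="LO-9001",
    route="support_rag_v2",
    must_include_patterns=[r"\[chunk_"],   # a citation marker such as [chunk_12]
    forbid_patterns=["guaranteed SLA"],
)
```

Returning every failure instead of stopping at the first gives an auditor the full picture for a case in a single replay.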
Practice
Attach rollback steps if cost-per-request crosses your guardrail. (52 Bump 6.)
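A minimal sketch of the cost-per-request check that would trigger those rollback steps. The function name needs_rollback and the dollar figures in the usage note are illustrative assumptions, not values from this lesson.

```python
def needs_rollback(total_cost_usd: float, request_count: int, guardrail_usd: float) -> bool:
    """Return True when average cost per request exceeds the guardrail.

    With no requests there is no average to compare, so nothing trips.
    """
    if request_count <= 0:
        return False
    return total_cost_usd / request_count > guardrail_usd
```

For example, with a hypothetical guardrail of 0.01 USD per request, needs_rollback(12.0, 1000, 0.01) is True because the average is 0.012 USD, while needs_rollback(5.0, 1000, 0.01) is False.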