Lesson 052: Eval harness hardening
Focus
Anchor this drill to one production LLM workflow—even hypothetical. Token Eval harness hardening:52 keeps neighbouring lessons differentiable.
Key ideas
- Thread: Eval harness hardening · drill v2 · spin
50225. - Habit: attach a trace_id to every completion you would paste into an ops dashboard.
- Guardrail: add one RACI bullet for prompt or index changes before tomorrow's standup.
Deep dive notebook
Synthetic drill artefacts
Logging contract
| Field | Retention |
|-------|-----------|
| trace_id | 17d |
| prompt_hash | policy-defined |
| completion_excerpt | 24h hot |
Mask tier `pii-v1` before persistence.
Practice
Practice Simulate degraded retrieval once; capture user-facing fallback copy. — 52 Bump 25.