Lesson 054: Eval harness hardening
Focus
Document interfaces between retrieval, prompts, and policy engines. Token Eval harness hardening:54 keeps neighbouring lessons differentiable.
Key ideas
- Thread: Eval harness hardening · drill v4 · spin
737777. - Habit: attach a trace_id to every completion you would paste into an ops dashboard.
- Guardrail: add one RACI bullet for prompt or index changes before tomorrow's standup.
Deep dive notebook
Synthetic drill artefacts
Token CFO scratchpad
- prompt_budget: 1448
- completion_budget: 635
- cache_key: `6f43`
Hypothesis: halving completions moves P95 ~8% — record actuals.
Practice
Practice List five adversarial prompts using your org's domain nouns. — 54 Bump 28.