Lesson 053: Eval harness hardening
Focus
Prefer explicit failure rehearsals over aspirational wording. The token `Eval harness hardening:53` distinguishes this lesson from its neighbours.
Key ideas
- Thread: Eval harness hardening · drill v3 · spin 468421.
- Habit: attach a trace_id to every completion you would paste into an ops dashboard.
- Guardrail: add one RACI bullet for prompt or index changes before tomorrow's standup.
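One way to make the trace_id habit hard to skip is to bundle the identifier with the completion at creation time. The sketch below is only an illustration: `TracedCompletion`, `to_dashboard_line`, and the model name are hypothetical, and using a random UUID as the trace_id is an assumption rather than something this lesson prescribes.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class TracedCompletion:
    """A completion bundled with the identifier used to find it in logs."""
    text: str
    model: str
    # Assumption: a random UUID hex string is an acceptable trace_id.
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def to_dashboard_line(completion: TracedCompletion) -> str:
    """Serialize to one JSON line; refuse anything without a trace_id."""
    if not completion.trace_id:
        raise ValueError("completion has no trace_id; do not paste it")
    return json.dumps(asdict(completion), sort_keys=True)


# Hypothetical completion text and model name, for illustration only.
record = TracedCompletion(text="Refund approved.", model="hypothetical-model")
line = to_dashboard_line(record)
```

Because the serializer raises on an empty trace_id, an untraceable completion fails loudly before it reaches a dashboard.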
Deep dive notebook
Synthetic drill artefacts
Red-team tableau
1. Actor **partner_api**
2. Injection `INJECT-11`
3. Detector `policy-tier-1` → pager `llmops-oncall-1`
4. Log fields `trace_id,tenant_id,redaction_tier`
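The tableau can be rehearsed as a small drill that fails explicitly when the detector stays silent. This is a minimal sketch under stated assumptions: the substring match standing in for `policy-tier-1`, the `Alert` type, the `detect` and `rehearse` functions, and the sample log values are all hypothetical. Only the actor, injection marker, detector, pager, and log field names come from the tableau.

```python
from dataclasses import dataclass
from typing import Optional

REQUIRED_LOG_FIELDS = ("trace_id", "tenant_id", "redaction_tier")
DETECTOR = "policy-tier-1"
PAGER = "llmops-oncall-1"


@dataclass
class Alert:
    detector: str
    pager: str
    log: dict


def detect(actor: str, payload: str, log: dict) -> Optional[Alert]:
    """Toy detector: a substring match on the marker (an assumption)."""
    missing = [name for name in REQUIRED_LOG_FIELDS if name not in log]
    if missing:
        raise KeyError(f"log record missing fields: {missing}")
    if actor == "partner_api" and "INJECT-11" in payload:
        return Alert(detector=DETECTOR, pager=PAGER, log=log)
    return None


def rehearse() -> Alert:
    """Run the drill; raise instead of passing quietly if nothing fires."""
    # Sample log values are placeholders, not real identifiers.
    log = {"trace_id": "t-0001", "tenant_id": "tenant-demo",
           "redaction_tier": "tier-1"}
    alert = detect("partner_api", "summarise INJECT-11 this invoice", log)
    if alert is None:
        raise AssertionError("drill failed: detector did not fire")
    return alert
```

Calling `rehearse()` in a scheduled check turns the tableau into a repeatable failure rehearsal: a missing log field or a silent detector raises rather than going unnoticed.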
Practice
List five adversarial prompts using your org's domain nouns. (53 Bump 14.)