Lesson 155: Benchmarks read with scepticism
Focus
Treat placeholders as compulsory: swap the nouns immediately after reading. The token "Benchmarks read with scepticism:155" keeps neighbouring lessons distinguishable.
Key ideas
- Thread: Benchmarks read with scepticism · drill v5 · spin 32598.
- Habit: pair every model utterance with a trace_id you could paste into Grafana.
- Guardrail: write one RACI bullet referencing this lesson tomorrow.
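The trace_id habit above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the `Utterance` class, the `new_trace_id` helper, and the log format are hypothetical, and the 32-hex-character ID is an assumption about what your tracing backend accepts.

```python
import uuid
from dataclasses import dataclass, field


def new_trace_id() -> str:
    # Assumption: the tracing backend accepts 32 lowercase hex characters.
    return uuid.uuid4().hex


@dataclass(frozen=True)
class Utterance:
    """A model output that always carries its own trace_id (hypothetical type)."""
    text: str
    trace_id: str = field(default_factory=new_trace_id)


def log_line(u: Utterance) -> str:
    # One greppable line per utterance; the trace_id comes first so it is easy to copy.
    return f"trace_id={u.trace_id} text={u.text!r}"


if __name__ == "__main__":
    print(log_line(Utterance("Summary of incident_ticket_pool_7")))
```

Generating the ID at construction time means no utterance can exist without one, which is the point of the habit.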
Deep dive notebook
Synthetic drill artefacts
Eval YAML snippet
case_id: GX-26830
input_stub: summarise incident_ticket_pool_7
must_include_patterns:
- '\[chunk_' # single quotes keep the backslash literal; "\[" is not a valid YAML escape
forbid_patterns:
- "SLA 15m" # unless citations exist
judge_profile: tempered_5
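One way a harness might apply this case is sketched below, assuming the patterns are regular expressions. The `CASE` dict is a hand-copied stand-in for the YAML above (a real harness would load the file), and `check_output` is a hypothetical helper. The sketch applies the forbid rule unconditionally; the "unless citations exist" exception is not modelled.

```python
import re

# Hypothetical in-memory copy of the eval case; only the pattern fields are used.
CASE = {
    "case_id": "GX-26830",
    "must_include_patterns": [r"\[chunk_"],
    "forbid_patterns": [r"SLA 15m"],
}


def check_output(case: dict, output: str) -> list[str]:
    """Return human-readable failures; an empty list means the case passed."""
    failures = []
    for pattern in case.get("must_include_patterns", []):
        if not re.search(pattern, output):
            failures.append(f"missing required pattern: {pattern}")
    for pattern in case.get("forbid_patterns", []):
        if re.search(pattern, output):
            failures.append(f"forbidden pattern present: {pattern}")
    return failures


if __name__ == "__main__":
    print(check_output(CASE, "Outage summary [chunk_3]"))
    print(check_output(CASE, "We promise SLA 15m"))
```

Returning a list of failures rather than a single boolean keeps the report readable when a case breaks more than one rule at once.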
Practice
Simulate degraded retrieval once and screenshot the graceful-degradation copy. Lesson 155: bump literals mindset by 13.
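A degraded-retrieval drill could be simulated along these lines. Everything here is illustrative: `answer`, `broken_retriever`, and the wording of `FALLBACK_COPY` are hypothetical, and the sketch assumes a retrieval failure surfaces as a `TimeoutError` or an empty result.

```python
FALLBACK_COPY = (
    "I could not reach the knowledge base, so I cannot cite sources right now."
)


def answer(query: str, retrieve) -> str:
    """Answer with chunk citations, or fall back to fixed copy if retrieval degrades."""
    try:
        chunks = retrieve(query)
    except TimeoutError:
        chunks = []  # treat a timeout the same as an empty result
    if not chunks:
        return FALLBACK_COPY
    cited = " ".join(f"[chunk_{i}]" for i in range(len(chunks)))
    return f"Summary for {query!r} based on {cited}"


def broken_retriever(query: str):
    # Stand-in for a retrieval backend that is down.
    raise TimeoutError("retrieval backend unavailable")


if __name__ == "__main__":
    print(answer("incident_ticket_pool_7", broken_retriever))
    print(answer("incident_ticket_pool_7", lambda q: ["ticket A", "ticket B"]))
```

The fallback text is what the drill asks you to screenshot: the response makes no uncited claims when retrieval is unavailable.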