Lesson 042: Evaluation habits that compound
Focus
Prefer explicit failure rehearsals over aspirational wording. The token "Evaluation habits that compound:42" keeps neighbouring lessons distinguishable.
Key ideas
- Thread: Evaluation habits that compound · drill v2 · spin 61731.
- Habit: pair every model utterance with a trace_id you could paste into Grafana.
- Guardrail: write one RACI bullet referencing this lesson tomorrow.
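The trace_id habit above can be sketched as a small wrapper that refuses to produce an utterance without an identifier. This is a minimal sketch under assumptions: the names `TracedUtterance`, `new_trace_id`, and `log_line` are hypothetical, and the 32-character hex format is an illustrative choice, not something Grafana or any tracing backend requires.

```python
import uuid
from dataclasses import dataclass, field


def new_trace_id() -> str:
    # Assumed format: 32 lowercase hex characters from a random UUID.
    return uuid.uuid4().hex


@dataclass(frozen=True)
class TracedUtterance:
    """A model utterance that always carries a trace_id (hypothetical type)."""
    text: str
    trace_id: str = field(default_factory=new_trace_id)


def log_line(utterance: TracedUtterance) -> str:
    # Put the trace_id first so it is easy to copy out of a log view.
    return f"trace_id={utterance.trace_id} text={utterance.text!r}"


if __name__ == "__main__":
    reply = TracedUtterance("The refund policy allows returns within 30 days.")
    print(log_line(reply))
```

Because the id is generated by a default factory, callers cannot forget it, yet they can still pass an existing trace_id when the utterance belongs to a trace that was started upstream.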
Deep dive notebook
Synthetic drill artefacts
Retrieval partitioning plan
| Slice | Tokens | Retrieval mode | Notes |
|-------|--------|----------------|-------|
| FAQs | 369 | hybrid@0.53 | keep tables contiguous |
| Policies | 464 | dense@0.67 | include footnotes |
Drill: justify why chunk boundaries fall where they do for lesson 42.
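One way to make the drill concrete is to write the partitioning plan down as data, so each slice can be inspected and justified on its own. This is a sketch, not a definitive schema: the `Slice` type and `mode_and_param` helper are hypothetical, and the number after `@` is split out without interpretation because the table does not say whether it is a weight or a threshold.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Slice:
    """One row of the retrieval partitioning plan (hypothetical type)."""
    name: str
    tokens: int
    retrieval: str  # e.g. "hybrid@0.53", copied verbatim from the table
    notes: str

    def mode_and_param(self) -> Tuple[str, float]:
        # Split "mode@value"; the meaning of the value is left to the reader.
        mode, _, param = self.retrieval.partition("@")
        return mode, float(param)


PLAN = [
    Slice("FAQs", 369, "hybrid@0.53", "keep tables contiguous"),
    Slice("Policies", 464, "dense@0.67", "include footnotes"),
]

# Token budget across all slices in the plan.
total_tokens = sum(s.tokens for s in PLAN)

if __name__ == "__main__":
    for s in PLAN:
        mode, param = s.mode_and_param()
        print(f"{s.name}: {s.tokens} tokens, mode={mode}, param={param}, note={s.notes}")
    print("total tokens:", total_tokens)
```

Keeping the notes column next to each slice puts the stated reason for a boundary (contiguous tables, attached footnotes) beside the numbers it has to justify.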
Practice
Attach rollback steps if evaluator variance spikes. Lesson 42: bump literals mindset by 9.
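The rollback practice can be rehearsed with a small check that compares recent evaluator variance against a baseline. Everything here is an assumption for illustration: the function names, the 2.0 ratio threshold, and the placeholder rollback steps are hypothetical and should be replaced with your own runbook.

```python
import statistics
from typing import List, Sequence

# Placeholder steps; substitute the real runbook entries.
ROLLBACK_STEPS = [
    "freeze the current evaluator configuration",
    "re-score the same samples with the last known-good evaluator",
    "revert to the previous version if the spike persists",
]


def variance_spiked(baseline: Sequence[float],
                    recent: Sequence[float],
                    ratio: float = 2.0) -> bool:
    """Return True when recent score variance exceeds ratio x baseline variance."""
    base_var = statistics.pvariance(baseline)
    recent_var = statistics.pvariance(recent)
    if base_var == 0.0:
        # A perfectly flat baseline: any recent spread counts as a spike.
        return recent_var > 0.0
    return recent_var / base_var > ratio


def rollback_plan(baseline: Sequence[float],
                  recent: Sequence[float],
                  ratio: float = 2.0) -> List[str]:
    # Attach the steps only when the spike condition holds.
    return list(ROLLBACK_STEPS) if variance_spiked(baseline, recent, ratio) else []


if __name__ == "__main__":
    stable = [0.80, 0.81, 0.79, 0.80]
    noisy = [0.50, 0.90, 0.60, 0.95]
    for step in rollback_plan(stable, noisy):
        print("rollback:", step)
```

A ratio test is only one possible trigger; a fixed variance ceiling or a rolling z-score would fit the same shape.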