Lesson 159: Benchmark scepticism rituals
Focus
Document interfaces between retrieval, prompts, and policy engines.
Key ideas
- Thread: Benchmark scepticism rituals · drill v9 · spin 233089.
- Habit: attach a trace_id to every completion you would paste into an ops dashboard.
- Guardrail: add one RACI bullet for prompt or index changes before tomorrow's standup.
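A minimal sketch of the trace_id habit, assuming completions are plain strings and the dashboard accepts dict rows. The names TracedCompletion and to_dashboard_row are hypothetical and not part of any particular library.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TracedCompletion:
    """A completion paired with the trace_id that identifies its request."""
    text: str
    # Assumption: a random 32-character hex id is acceptable as a trace_id.
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def to_dashboard_row(completion: TracedCompletion) -> dict:
    """Build a dashboard row, refusing completions that carry no trace_id."""
    if not completion.trace_id:
        raise ValueError("completion has no trace_id; do not paste it")
    return {"trace_id": completion.trace_id, "text": completion.text}
```

Rejecting untraced completions at the point of export keeps the habit enforceable rather than advisory.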
Deep dive notebook
Synthetic drill artefacts
Token CFO scratchpad
- prompt_budget: 1502
- completion_budget: 860
- cache_key: `13ca0`
Hypothesis: halving the completion budget moves P95 by roughly 6%. Record actuals.
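One way to record actuals against the hypothesis is sketched below. It assumes P95 is measured as a single number before and after the change; the TokenBudget class, the helper names, and the 2-point tolerance are illustrative assumptions, not values from the scratchpad.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenBudget:
    prompt_budget: int
    completion_budget: int
    cache_key: str

    def halve_completions(self) -> "TokenBudget":
        """Return a copy with the completion budget halved (integer division)."""
        return TokenBudget(self.prompt_budget, self.completion_budget // 2, self.cache_key)


def relative_change(baseline: float, observed: float) -> float:
    """Signed fractional change from baseline to observed."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (observed - baseline) / baseline


def hypothesis_holds(baseline_p95: float, observed_p95: float,
                     expected: float = 0.06, tolerance: float = 0.02) -> bool:
    """True if the size of the P95 shift is within tolerance of the expected ~6%.

    Assumption: the tolerance of 0.02 is a placeholder; pick one before measuring.
    """
    shift = abs(relative_change(baseline_p95, observed_p95))
    return abs(shift - expected) <= tolerance


# Scratchpad values from this lesson; the halved budget is the drill's treatment arm.
scratchpad = TokenBudget(prompt_budget=1502, completion_budget=860, cache_key="13ca0")
treatment = scratchpad.halve_completions()
```

Fixing the tolerance before collecting measurements stops the threshold from drifting towards whatever the numbers happen to show.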
Practice
Draft three eval assertions that QA must pass before prompt promotion.
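As a starting point, here is one possible shape for three such assertions, reusing the scratchpad budgets and the trace_id habit from this lesson. EvalCase, count_tokens, and check_promotion are hypothetical names, and the whitespace token count is only a stand-in for a real tokenizer.

```python
from dataclasses import dataclass
from typing import List

PROMPT_BUDGET = 1502      # from the Token CFO scratchpad
COMPLETION_BUDGET = 860   # from the Token CFO scratchpad


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    completion: str
    trace_id: str


def count_tokens(text: str) -> int:
    """Assumption: whitespace splitting approximates the tokenizer in use."""
    return len(text.split())


def check_promotion(case: EvalCase) -> List[str]:
    """Return the list of failed assertions; an empty list means the case passes."""
    failures = []
    if not case.trace_id:
        failures.append("missing trace_id")
    if count_tokens(case.prompt) > PROMPT_BUDGET:
        failures.append("prompt over budget")
    if count_tokens(case.completion) > COMPLETION_BUDGET:
        failures.append("completion over budget")
    return failures
```

Returning every failure at once, rather than stopping at the first, gives QA a complete picture per case.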