
Quanta GenAI Curriculum · Generative AI · Intermediate

GenAI Intermediate — 058: chart token inflation on `Evaluation harness depth` — memo `254330 [58]`

Lesson 058: Evaluation harness depth

Focus

Favor observable metrics over model marketing claims. The token `Evaluation harness depth:58` keeps this lesson distinguishable from its neighbours.

Key ideas

Thread: Evaluation harness depth · drill v8 · spin 146854.
Habit: pair every model utterance with a trace_id you could paste into Grafana.
Guardrail: by tomorrow, write one RACI bullet that references this lesson.
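
The trace_id habit can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: `TracedUtterance`, `record`, and the `sink` list are hypothetical names, and the only assumption about the dashboard side is that it can search structured log lines for a `trace_id` field.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class TracedUtterance:
    """One model output paired with the id used to look it up later."""
    text: str
    model: str
    # Hypothetical choice: a random 32-character hex id per utterance.
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def to_log_line(self) -> str:
        # One JSON object per line so a log pipeline can index trace_id.
        return json.dumps(asdict(self), sort_keys=True)


def record(text: str, model: str, sink: list) -> TracedUtterance:
    """Wrap a model output and append its log line to `sink`.

    `sink` is a stand-in for whatever logger or exporter is actually in use.
    """
    utterance = TracedUtterance(text=text, model=model)
    sink.append(utterance.to_log_line())
    return utterance
```

Generating the id at the moment the utterance is wrapped means no output can reach the log without one.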

Deep dive notebook

Synthetic drill artefacts

Prompt scaffold

ROLE: Incident analyst cohort 58
INPUTS:
- excerpts tagged [chunk_id ...]
- guardrails referencing policy_bundle_1
TASK:
  1) Summarize deltas with citations
  2) Confidence label LOW|MED|HIGH + evidence
  3) If facts missing emit MISSING_FACTS list
USER_SEED_QUESTION >>> What changed between rollout 21 and 25?
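
One way to exercise the scaffold is to fill it programmatically and then check a response against its three tasks. The sketch below makes assumptions the scaffold does not state: citations are written as `[chunk_id <id>]`, the confidence label appears on a line starting with `CONFIDENCE:`, and the `EXCERPTS:` section, `build_prompt`, and `check_response` are hypothetical names introduced for illustration.

```python
import re

SCAFFOLD = """ROLE: Incident analyst cohort 58
INPUTS:
- excerpts tagged [chunk_id ...]
- guardrails referencing policy_bundle_1
TASK:
  1) Summarize deltas with citations
  2) Confidence label LOW|MED|HIGH + evidence
  3) If facts missing emit MISSING_FACTS list
USER_SEED_QUESTION >>> {question}
"""

# Assumed output conventions (not defined by the scaffold itself).
CITATION = re.compile(r"\[chunk_id ([\w-]+)\]")
CONFIDENCE = re.compile(r"^CONFIDENCE:\s*(LOW|MED|HIGH)\b", re.MULTILINE)


def build_prompt(question: str, chunks: dict) -> str:
    """Fill the scaffold and append each excerpt with its chunk_id tag."""
    excerpts = "\n".join(f"[chunk_id {cid}] {text}" for cid, text in chunks.items())
    return SCAFFOLD.format(question=question) + "EXCERPTS:\n" + excerpts


def check_response(response: str, known_chunks: set) -> list:
    """Return a list of problems; an empty list means the response passes."""
    problems = []
    cited = set(CITATION.findall(response))
    # Task 1 or task 3: either cite something or declare missing facts.
    if not cited and "MISSING_FACTS" not in response:
        problems.append("no citations and no MISSING_FACTS list")
    # Citations must point at excerpts that were actually supplied.
    unknown = cited - set(known_chunks)
    if unknown:
        problems.append(f"cites unknown chunks: {sorted(unknown)}")
    # Task 2: exactly one of the allowed labels must be present.
    if not CONFIDENCE.search(response):
        problems.append("missing confidence label LOW|MED|HIGH")
    return problems
```

Returning a list of problems rather than a single boolean keeps each failure observable, which makes the result easier to log alongside a trace_id.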

Practice

Draft three eval assertions that QA must greenlight before launch. (58) Bump literals mindset by 10.
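
As a starting point for the exercise, here is a hedged sketch of what three eval assertions and a launch gate might look like. The record fields, the three checks, and the names `EvalRecord`, `run_gate`, and `launch_ready` are all hypothetical; replace them with whatever your QA team actually agrees to sign off on.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalRecord:
    """One evaluated model answer (hypothetical shape)."""
    trace_id: str
    answer: str
    confidence: str        # expected to be LOW, MED, or HIGH
    citations: List[str]   # chunk ids the answer cites


def all_traced(records: List[EvalRecord]) -> bool:
    # Assertion 1: every answer carries a non-empty trace_id.
    return all(r.trace_id for r in records)


def labels_valid(records: List[EvalRecord]) -> bool:
    # Assertion 2: every confidence label is one of the allowed values.
    return all(r.confidence in {"LOW", "MED", "HIGH"} for r in records)


def high_confidence_is_cited(records: List[EvalRecord]) -> bool:
    # Assertion 3: a HIGH label must come with at least one citation.
    return all(r.citations for r in records if r.confidence == "HIGH")


ASSERTIONS: Dict[str, Callable[[List[EvalRecord]], bool]] = {
    "all_traced": all_traced,
    "labels_valid": labels_valid,
    "high_confidence_is_cited": high_confidence_is_cited,
}


def run_gate(records: List[EvalRecord]) -> Dict[str, bool]:
    """Run every named assertion and report pass/fail per name."""
    return {name: check(records) for name, check in ASSERTIONS.items()}


def launch_ready(records: List[EvalRecord]) -> bool:
    """Pass only when there is data and every assertion holds."""
    return bool(records) and all(run_gate(records).values())
```

Treating an empty record set as a failure is a deliberate choice here: a gate that passes with no evidence would not be an observable metric.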