Agent QA Gates

A field-tested validation system for AI agent output. Born from production failures, not theory.

Quick Start

Before any agent delivers output, run the Pre-Ship Checklist:

Accurate? — every number/date/metric has a source. Unsourced → prefix "estimated"
Complete? — no missing pieces, no "I'll do that next"
Actionable? — ends with clear next step or decision point
Fits the channel? — check character limits for your delivery surface
No leaks? — no internal context, private data, or secrets
Not a duplicate? — verify no recent identical send
Would the human be embarrassed? — if yes, don't ship
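
Some checklist items are mechanical enough to automate. The sketch below covers three of them (completeness, channel length, duplicates), assuming a hypothetical `Draft` record and made-up channel limits. It is an illustration of the idea, not the skill's actual implementation; the remaining items still need judgment or other tooling.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical limits; replace with the real limits of your delivery surfaces.
CHANNEL_LIMITS = {"sms": 160, "chat": 2000}

# Phrases that signal deferred work; illustrative, not exhaustive.
DEFERRAL_PHRASES = ("i'll do that next",)


@dataclass
class Draft:
    text: str
    channel: str
    recent_sends: list[str] = field(default_factory=list)


def pre_ship_failures(draft: Draft) -> list[str]:
    """Return the checklist items this draft fails (mechanical checks only)."""
    failures = []
    lowered = draft.text.lower()
    # Complete? Deferred-work phrases mean the output is not finished.
    if any(phrase in lowered for phrase in DEFERRAL_PHRASES):
        failures.append("complete")
    # Fits the channel? Channels without a known limit are not checked.
    limit = CHANNEL_LIMITS.get(draft.channel)
    if limit is not None and len(draft.text) > limit:
        failures.append("fits_channel")
    # Not a duplicate? Exact-match comparison against recent sends.
    if draft.text in draft.recent_sends:
        failures.append("not_duplicate")
    return failures
```

An empty list means the draft passed these three checks only; it says nothing about accuracy, leaks, or whether the human would be embarrassed.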

The gates form four tiers, ascending by risk level:

Gate	Scope	Key Checks
Gate 0	Internal (files, config, memory)	Mechanism changed (not just text), no placeholders, file exists
Gate 1	Human-facing (briefings, summaries)	Key info in first 2 lines, ≤3-line paragraphs, channel length limits
Gate 2	External (email, public content, client materials)	No internal context leaked, recipient-appropriate tone, dedup check
Gate 3	Code & technical	Builds clean, no secrets in code, error handling, tests pass

See references/gates-detail.md for full gate checklists.
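
One way to wire the tiers into code is to route each output to a gate by its kind, then run that gate's mechanical checks. The sketch below is a minimal example under assumptions: the output kinds are hypothetical names derived from the Scope column, and `long_paragraphs` implements only the Gate 1 "≤3-line paragraphs" check.

```python
from __future__ import annotations

from enum import IntEnum


class Gate(IntEnum):
    INTERNAL = 0
    HUMAN_FACING = 1
    EXTERNAL = 2
    CODE = 3


# Hypothetical output kinds, derived from the examples in the Scope column.
GATE_FOR_KIND = {
    "file": Gate.INTERNAL,
    "config": Gate.INTERNAL,
    "memory": Gate.INTERNAL,
    "briefing": Gate.HUMAN_FACING,
    "summary": Gate.HUMAN_FACING,
    "email": Gate.EXTERNAL,
    "public_content": Gate.EXTERNAL,
    "client_material": Gate.EXTERNAL,
    "code": Gate.CODE,
}


def gate_for(kind: str) -> Gate:
    """Look up the gate for an output kind; unknown kinds are rejected."""
    try:
        return GATE_FOR_KIND[kind]
    except KeyError:
        raise ValueError(f"no gate defined for output kind {kind!r}") from None


def long_paragraphs(text: str, max_lines: int = 3) -> list[int]:
    """Return 1-based indexes of paragraphs with more than max_lines lines."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return [
        index
        for index, paragraph in enumerate(paragraphs, start=1)
        if len(paragraph.strip().splitlines()) > max_lines
    ]
```

Rejecting unknown kinds is a design choice made for this sketch: it forces a new output type to be assigned a gate explicitly instead of slipping through a default.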

Not all failures are equal:

🔴 BLOCK — cannot ship (secrets, privacy, hallucinated data, wrong recipient)
🟡 FIX — fix before shipping, <2 min (formatting, too long, missing citation)
🟢 NOTE — log and ship (style preference, minor optimization)
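
The three severities reduce to a simple rule: the worst finding decides the outcome. A minimal sketch, assuming findings arrive as a list of severities; the return values `"block"`, `"fix_first"`, and `"ship"` are hypothetical labels chosen for this example.

```python
from __future__ import annotations

from enum import Enum


class Severity(Enum):
    BLOCK = "block"  # cannot ship
    FIX = "fix"      # fix before shipping
    NOTE = "note"    # log and ship


def ship_decision(findings: list[Severity]) -> str:
    """Return the action dictated by the most severe finding."""
    if Severity.BLOCK in findings:
        return "block"
    if Severity.FIX in findings:
        return "fix_first"
    # Only NOTE findings (or none at all): log them and ship.
    return "ship"
```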

Recurring failure modes need dedicated gates. The most common ones are guarded by these rules:

Binary output: alert text ONLY or status-OK ONLY. Never mixed.
Every data point is verified by a tool call in the current session. No hallucinated metrics.
No stale data from previous cycles or pre-compaction sessions.
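
These rules can be checked mechanically. The sketch below assumes a hypothetical `STATUS_OK` token for the status-OK case and a hypothetical `DataPoint` record that stores the session of the tool call that produced each value. Neither name comes from the skill itself.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Hypothetical token for the status-OK case.
STATUS_OK = "STATUS_OK"


def is_binary_output(text: str) -> bool:
    """True if the output is exactly the OK token, or alert text without it."""
    stripped = text.strip()
    if stripped == STATUS_OK:
        return True
    # Alert text must be non-empty and must not mix in the OK token.
    return bool(stripped) and STATUS_OK not in stripped


@dataclass(frozen=True)
class DataPoint:
    name: str
    value: float
    session_id: Optional[str]  # session of the producing tool call; None if unsourced


def unverified(points: list[DataPoint], current_session: str) -> list[str]:
    """Return names of data points not produced by a tool call in this session."""
    return [p.name for p in points if p.session_id != current_session]
```

Comparing session IDs catches both unsourced values and values carried over from an earlier cycle, since either case fails the equality check.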

Gates should evolve based on real failures, not imagination:

Gates that sound good but never catch anything → kill them
Per-agent checklists that duplicate general gates → merge or reference
"ADHD-friendly" or "high-quality" as gate items → not testable, replace with mechanical checks
Aspirational gates nobody runs → either automate or cut
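
Pruning decisions are easier to defend with counts. The sketch below assumes each gate logs how often it ran and how often it caught something; `GateStats`, the `min_runs` threshold of 20, and the recommendation labels are all hypothetical choices for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GateStats:
    name: str
    runs: int     # how many times the gate was actually executed
    catches: int  # how many of those runs flagged a real problem


def review_gates(stats: list[GateStats], min_runs: int = 20) -> dict[str, str]:
    """Recommend keep, kill, or automate_or_cut for each gate."""
    recommendations = {}
    for gate in stats:
        if gate.runs == 0:
            # Aspirational gate nobody runs.
            recommendations[gate.name] = "automate_or_cut"
        elif gate.runs >= min_runs and gate.catches == 0:
            # Ran often enough to judge, never caught anything.
            recommendations[gate.name] = "kill"
        else:
            recommendations[gate.name] = "keep"
    return recommendations
```

The `min_runs` floor keeps a newly added gate from being killed before it has had a fair chance to catch something.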

This skill provides the pattern. Adapt it to your own agents and delivery surfaces.

For the full reference implementation, see references/gates-detail.md. For automation scripts, see scripts/qa-check.sh.