Claude-Code-Game-Studios/.claude/skills/regression-suite/SKILL.md
Donchitos 984023ddac Release v1.0.0 — concept-prototype/vertical-slice split, workflow restructure, polish (#50)
* Add /vertical-slice skill, prototype overhaul, and workflow integration

- Add /vertical-slice skill for pre-production validation (Phase 4 gate)
- Overhaul /prototype skill with two-mode design: concept prototype (Phase 1)
  vs vertical slice (Phase 4), with clearer differentiation and higher standards for VS
- Update prototyper agent to own both prototype and vertical-slice workflows
- Add prototype-report.md and vertical-slice-report.md output templates
- Update WORKFLOW-GUIDE, quick-start, skills-reference, agent-coordination-map,
  and skill-flow-diagrams to fully integrate both skills into the 7-phase pipeline
- Remove orphaned empty quick-prototype/ directory

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* sync v1 counts + polish

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add entity inventory flow, relax vertical-slice gate, improve UX authoring prompts

- /asset-spec: new Phase 0b entity & screen inventory when no argument and no
  existing inventory — reads GDDs/art-bible, proposes categorized list, writes
  design/assets/entity-inventory.md collaboratively
- /asset-spec: entity/character target falls back to inline user description
  when no source doc exists, rather than failing
- /gate-check: vertical slice changed from blocking to CONCERNS-only when
  absent; built-but-broken slice still fails; adds entity inventory as gate artifact
- /ux-design: convert inline approval prompts to AskUserQuestion for structured
  option capture at key authoring decision points
- workflow-catalog.yaml: entity-inventory step added to pre-production; UX spec
  min_count raised to 3; vertical-slice and prototype marked required: false with
  updated descriptions
- .gitignore: exclude marrow/ eval tooling directory

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add missing AskUserQuestion widgets to 7 skills

Audit found 11 decision points across 7 skills where structured option
prompts were missing — using plain text, auto-selection, or no gate at all.

Skills patched:
- create-epics: per-epic approval + producer CONCERNS verdict
- sprint-plan: producer CONCERNS verdict with scope/timeline options
- milestone-review: AT RISK / OFF TRACK producer verdicts require acknowledgement
- retrospective: existing-retro handling converted from plain text [A]/[B]
- quick-design: classification confirmation + draft approve/revise/redirect
- tech-debt add mode: category (6 options) + effort (S/M/L/XL) structured capture
- regression-suite: no-arg mode selection instead of silent auto-detect
- hotfix: severity confirmation gate before workflow begins

Also added AskUserQuestion to allowed-tools headers for retrospective,
quick-design, tech-debt, regression-suite, and hotfix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Prep v1 stable: fix WORKFLOW-GUIDE counts, stale agent names, and skill model fields

- WORKFLOW-GUIDE.md: correct agent count (48→49), skill count (66/68→73),
  add 6 missing skills to Appendix B, fix Creative category count (2→4),
  replace 3 non-existent agent names with correct ue-*/unity-* specialists,
  add missing godot-csharp/gdextension specialists to hierarchy,
  fix production/stories/ paths → production/epics/
- coordination-rules.md: replace "not yet used" with opt-in env var note
- quick-start.md: rename duplicate "Validate the concept" label → "Prototype the mechanic"
- skill-flow-diagrams.md: remove duplicate legacy UX pipeline section
- All 62 skills missing model: field now have explicit model: sonnet

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: comprehensive skill audit — consistency, UX, and flow gaps

Two-pass audit fixing ~35 bugs across 41 files.

Pre-production flow:
- Brainstorm next-steps split into Path A (design-first) and Path B
  (prototype-first) — eliminates "prototype after architecture" confusion
- /architecture-review added to pre-production flow in brainstorm and
  create-architecture handoffs
- gate-check traceability check corrected to requirements-traceability.md
- dev-story TR registry error now points to /architecture-review (not /create-epics)
- start now writes production/stage.txt on first onboarding

AskUserQuestion gaps filled:
- balance-check, code-review, hotfix, day-one-patch, consistency-check
  all gain closing widgets and/or missing allowed-tools declarations
- hotfix git branch creation now requires user confirmation
- sprint-plan review-mode setup moved to Phase 0 (before gates run)
- team-combat gains architecture→implementation approval gate
- design-review APPROVED path consolidated from 3 widgets to 1 multiSelect

All 9 team-* skills:
- Phase 0 review-mode resolution added (solo/lean/full now respected)
- team-audio output path fixed (design/gdd/ → design/audio/)
- team-level final doc compilation delegated to level-designer subagent
- team-narrative localization-lead added to composition list
- team-qa sprint path fixed (flat files, not directories)
- team-release NO-GO override captures written justification
- team-live-ops Cancel verdict now explicitly BLOCKED

Other fixes:
- Art bible path standardized to design/art/art-bible.md (3 wrong refs)
- AD-PHASE-GATE added to lean-mode skip list in director-gates.md
- design-system duplicate 5d heading fixed; skeleton decline path added;
  mandatory agent spawns now respect review mode
- story-readiness acceptance criteria thresholds now type-aware
- create-stories gains multi-ADR and no-ADR handling guidance
- consistency-check creates docs/consistency-failures.md on first run
- retrospective frontmatter bash injection replaced with explicit Bash call
- smoke-check ls -t gains PowerShell fallback
- Conventional Commits format documented in coding-standards.md
- gate-check: ADR acceptance gate, QA plan check, chain-of-verification
  tool-action requirement all added

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: expose --review flag in argument-hints for all team-* skills

All 9 team-* skills already implement Phase 0 review-mode resolution
internally (full/lean/solo), but none advertised [--review full|lean|solo]
in their argument-hint. Users had no way to discover the per-run override.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add SECURITY.md with coordinated disclosure policy

Defines scope, reporting process (GitHub private vulnerability reporting),
contributor security guidelines for hooks/skills/agents, and 90-day
coordinated disclosure timeline.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add CONTRIBUTING.md with framework contribution guidelines

Covers what PRs are welcome, skill/hook/agent technical requirements,
the collaborative principle, testing expectations, commit format,
and platform compatibility requirements.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add v1.0.0-beta → v1.0 upgrade section to UPGRADING.md

Documents the 17 commits since the beta tag: new /vertical-slice gate,
entity inventory flow in /map-systems, AskUserQuestion widgets across
7 skills, --review flag exposure on team-* skills, bug fixes
(#21, #36, #42, #43, #45), and the new CONTRIBUTING.md and SECURITY.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 20:15:08 +10:00


---
name: regression-suite
description: "Map test coverage to GDD critical paths, identify fixed bugs without regression tests, flag coverage drift from new features, and maintain tests/regression-suite.md. Run after implementing a bug fix or before a release gate."
argument-hint: "[update | audit | report]"
user-invocable: true
allowed-tools: Read, Glob, Grep, Write, Edit, AskUserQuestion
model: sonnet
---
# Regression Suite
This skill ensures that every bug fix is backed by a test that would have
caught the original bug — and that the regression suite stays current as the
game evolves. It also detects when new features have been added without
corresponding regression coverage.
A regression suite is not a new test category — it is a **curated list of
tests already in `tests/`** that collectively cover the game's critical paths
and known failure points. This skill maintains that list.
**Output:** `tests/regression-suite.md`
**When to run:**
- After fixing a bug (confirm a regression test was written, or identify the gap)
- Before a release gate (`/gate-check polish` requires that the regression suite exists)
- As part of sprint close to detect coverage drift
---
## 1. Parse Arguments
**Modes:**
- `/regression-suite update`: scan new bug fixes this sprint and check
  for regression test presence; add new tests to the suite manifest
- `/regression-suite audit`: full audit of all GDD critical paths vs.
  existing test coverage; flag paths with no regression test
- `/regression-suite report`: read-only status report (no writes); suitable
  for sprint reviews
- No argument: if a sprint is clearly active (sprint plan exists with in-progress stories), run `update`. If ambiguous or no active sprint is detected, use `AskUserQuestion`:
  - Prompt: "No subcommand specified. Which mode do you want to run?"
  - Options:
    - `[A] update: scan new bug fixes this sprint and add missing regression tests`
    - `[B] audit: full audit of all GDD critical paths vs. existing test coverage`
    - `[C] report: read-only status report (no writes)`
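
The mode selection above can be sketched as a small decision function. This is only an illustration: `resolve_mode`, the `"ask"` sentinel, and the handling of an unrecognized subcommand are assumptions, not part of the skill definition.

```python
from typing import Optional

VALID_MODES = ("update", "audit", "report")


def resolve_mode(argument: Optional[str], sprint_clearly_active: bool) -> str:
    """Return the mode to run, or "ask" when the user should be prompted."""
    if argument:
        mode = argument.strip().lower()
        if mode in VALID_MODES:
            return mode
        # Assumption: an unrecognized subcommand falls back to the prompt.
        return "ask"
    # No argument: run `update` only when a sprint is clearly active.
    return "update" if sprint_clearly_active else "ask"
```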
---
## 2. Load Context
### Step 2a — Load existing regression suite
Read `tests/regression-suite.md` if it exists. Extract:
- Total registered regression tests
- Last updated date
- Any tests flagged as `STALE` or `QUARANTINED`
If it does not exist: note "No regression suite found — will create one."
### Step 2b — Load test inventory
Glob all test files:
```
tests/unit/**/*_test.*
tests/integration/**/*_test.*
tests/regression/**/*
```
For each file, note the system (from directory path) and file name.
Do not read test file contents unless needed for name-to-test mapping.
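
As a rough illustration of the inventory step, the sketch below applies the three glob patterns and derives the system name from the directory under `tests/<kind>/`. The function names and the `"(unsorted)"` label for files that sit directly under `tests/<kind>/` are hypothetical.

```python
from pathlib import Path

PATTERNS = (
    "tests/unit/**/*_test.*",
    "tests/integration/**/*_test.*",
    "tests/regression/**/*",
)


def system_of(relative_path: Path) -> str:
    """Derive the system from a path like tests/unit/<system>/<file>."""
    parts = relative_path.parts
    # Assumption: files directly under tests/<kind>/ have no system directory.
    return parts[2] if len(parts) > 3 else "(unsorted)"


def build_inventory(root: Path) -> list[tuple[str, str]]:
    """Return sorted (system, relative path) pairs without reading file contents."""
    found = set()
    for pattern in PATTERNS:
        for path in root.glob(pattern):
            if path.is_file():  # the regression pattern also matches directories
                found.add(path.relative_to(root))
    return sorted((system_of(p), p.as_posix()) for p in found)
```

Only paths are inspected here, which matches the instruction to avoid reading test contents unless a name-to-test mapping requires it.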
### Step 2c — Load GDD critical paths
For `audit` mode: read `design/gdd/systems-index.md` to get all systems.
For each MVP-tier system, read its GDD and extract:
- Acceptance Criteria (these define the critical paths)
- Formulas section (formulas must have regression tests)
- Edge Cases section (known edge cases should have regression tests)
For `update` mode: skip full GDD scan. Instead read the current sprint plan
and story files to find stories with Status: Complete this sprint.
### Step 2d — Load closed bugs
Glob `production/qa/bugs/*.md` and filter for bugs with a `Status: Closed`
or `Status: Fixed` field. Note:
- Which story or system the bug was in
- Whether a regression test was mentioned in the fix description
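
A minimal sketch of the closed-bug filter follows. It assumes each bug file carries a single-line `Status:` field, optionally wrapped in Markdown emphasis; the exact bug file layout is not specified here, so the regex and helper names are assumptions.

```python
import re

# Assumption: the field looks like "Status: Closed" or "**Status**: Fixed".
STATUS_RE = re.compile(
    r"^[^\w\n]*Status[^\w\n]*:[^\w\n]*(\w+)", re.IGNORECASE | re.MULTILINE
)


def is_closed(bug_text: str) -> bool:
    """True when the Status field is Closed or Fixed."""
    match = STATUS_RE.search(bug_text)
    return bool(match) and match.group(1).lower() in {"closed", "fixed"}


def mentions_regression_test(bug_text: str) -> bool:
    """True when the fix description mentions a regression test."""
    return re.search(r"regression[ _-]?test", bug_text, re.IGNORECASE) is not None
```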
---
## 3. Map Coverage — Critical Paths
For `audit` mode only:
For each GDD acceptance criterion, determine whether a test exists:
1. Grep `tests/unit/[system]/` and `tests/integration/[system]/` for file names
and function names related to the criterion's key noun/verb
2. Assign coverage:
| Status | Meaning |
|--------|---------|
| **COVERED** | A test file exists that targets this criterion's logic |
| **PARTIAL** | A test exists but doesn't cover all cases (e.g. happy path only) |
| **MISSING** | No test found for this critical path |
| **EXEMPT** | Visual/Feel or UI criterion — not automatable by design |
3. Elevate MISSING items that correspond to formulas or state machines to
**HIGH PRIORITY** gaps; these are the most likely regression sources.
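
The status assignment and elevation rule can be sketched as below. The `Criterion` fields, including the `kind` tag and the `covers_all_cases` flag, are hypothetical inputs standing in for the judgment made while grepping; they are not defined by the skill.

```python
from dataclasses import dataclass
from enum import Enum


class Coverage(Enum):
    COVERED = "COVERED"
    PARTIAL = "PARTIAL"
    MISSING = "MISSING"
    EXEMPT = "EXEMPT"


@dataclass
class Criterion:
    system: str
    text: str
    kind: str  # hypothetical tag: "formula", "state-machine", "logic", "visual", "ui"
    matching_tests: int = 0
    covers_all_cases: bool = False


def classify(criterion: Criterion) -> Coverage:
    """Map a criterion to one of the four statuses in the table above."""
    if criterion.kind in {"visual", "ui"}:
        return Coverage.EXEMPT  # not automatable by design
    if criterion.matching_tests == 0:
        return Coverage.MISSING
    return Coverage.COVERED if criterion.covers_all_cases else Coverage.PARTIAL


def is_high_priority_gap(criterion: Criterion) -> bool:
    """MISSING formulas and state machines are elevated to HIGH PRIORITY."""
    return (
        classify(criterion) is Coverage.MISSING
        and criterion.kind in {"formula", "state-machine"}
    )
```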
---
## 4. Map Coverage — Fixed Bugs
For each closed bug:
1. Extract the system slug from the bug's metadata
2. Grep `tests/unit/[system]/` and `tests/integration/[system]/` for a test
that references the bug ID or the specific failure scenario
3. Assign:
- **HAS REGRESSION TEST** — a test was found that would catch this bug
- **MISSING REGRESSION TEST** — bug was fixed but no test guards against recurrence
For MISSING REGRESSION TEST items:
- Flag them as regression gaps
- Suggest the test file path: `tests/unit/[system]/[bug-slug]_regression_test.[ext]`
- Note: "Without this test, this bug can silently return in a future sprint."
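
The bug lookup and the suggested path can be sketched as follows. This sketch matches on the bug ID only; searching for the specific failure scenario needs judgment that a simple pattern cannot capture. The helper names and the slug rule are assumptions.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def suggested_test_path(system: str, bug_title: str, ext: str) -> str:
    """Build tests/unit/[system]/[bug-slug]_regression_test.[ext]."""
    return f"tests/unit/{system}/{slugify(bug_title)}_regression_test.{ext}"


def has_regression_test(bug_id: str, test_sources: dict[str, str]) -> bool:
    """True if any test path or body references the exact bug ID.

    `test_sources` maps a test file path to its text (hypothetical input).
    """
    # Reject matches embedded in a longer ID, e.g. BUG-04 inside BUG-042.
    pattern = re.compile(
        rf"(?<![A-Za-z0-9]){re.escape(bug_id)}(?![A-Za-z0-9])", re.IGNORECASE
    )
    return any(
        pattern.search(path) or pattern.search(body)
        for path, body in test_sources.items()
    )
```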
---
## 5. Detect Coverage Drift
Coverage drift occurs when the game grows but the regression suite doesn't.
Check for drift indicators:
- Stories completed this sprint with no corresponding test files in `tests/`
- New systems added to `systems-index.md` since the last regression-suite update
- GDD sections added or revised since the regression suite was last updated
(Grep the GDD files for modification hints if any are available; otherwise ask the user)
- `tests/regression-suite.md` last-updated date vs. current date — if gap >
2 sprints, flag as likely stale
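
The staleness check in the last indicator reduces to date arithmetic. The sketch below assumes a 14-day sprint, which is not stated anywhere in this skill; pass the project's real sprint length instead.

```python
from datetime import date


def sprints_since(last_updated: date, today: date, sprint_length_days: int = 14) -> float:
    """Elapsed time since the last manifest update, measured in sprints."""
    # Assumption: sprints have a fixed length; 14 days is only a placeholder.
    return (today - last_updated).days / sprint_length_days


def is_likely_stale(last_updated: date, today: date, sprint_length_days: int = 14) -> bool:
    """Flag the manifest when the gap exceeds two sprints."""
    return sprints_since(last_updated, today, sprint_length_days) > 2
```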
---
## 6. Generate Report and Suite Manifest
### Report format (in conversation)
```
## Regression Suite Status
**Mode**: [update | audit | report]
**Existing registered tests**: [N]
**Test files scanned**: [N]
### Critical Path Coverage (audit mode only)
| System | Total ACs | Covered | Partial | Missing | Exempt |
|--------|-----------|---------|---------|---------|--------|
| [name] | [N] | [N] | [N] | [N] | [N] |
**Coverage rate (non-exempt)**: [N]%
### Bug Regression Coverage
| Bug ID | System | Severity | Has Regression Test? |
|--------|--------|----------|----------------------|
| BUG-NNN | [system] | S[N] | YES / NO ⚠ |
**Bugs without regression tests**: [N]
### Coverage Drift Indicators
[List new systems or stories with no test coverage, or "None detected."]
### Recommended New Regression Tests
| Priority | System | Suggested Test File | Covers |
|----------|--------|---------------------|--------|
| HIGH | [system] | `tests/unit/[system]/[slug]_regression_test.[ext]` | BUG-NNN / AC-[N] |
| MEDIUM | [system] | `tests/unit/[system]/[slug]_test.[ext]` | [criterion] |
```
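
One way to compute the "Coverage rate (non-exempt)" figure in the report is sketched below. Whether PARTIAL criteria count toward the rate is not specified; this sketch assumes they do not.

```python
from typing import Optional


def coverage_rate(total_acs: int, covered: int, exempt: int) -> Optional[float]:
    """Percentage of non-exempt acceptance criteria that are COVERED."""
    non_exempt = total_acs - exempt
    if non_exempt <= 0:
        return None  # nothing automatable to measure
    # Assumption: PARTIAL criteria are not counted as covered.
    return round(100 * covered / non_exempt, 1)
```

For example, a system with 12 acceptance criteria, 6 covered and 2 exempt, has 10 non-exempt criteria and a rate of 60.0.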
### Suite manifest format (`tests/regression-suite.md`)
The manifest is a curated index — not the tests themselves, but a registry
of which tests should always pass before a release:
```markdown
# Regression Suite Manifest
> Last Updated: [date]
> Total registered tests: [N]
> Coverage: [N]% of GDD critical paths
## How to run
[Engine-specific command to run all regression tests]
## Registered Regression Tests
### [System Name]
| Test File | Test Function (if known) | Covers | Added |
|-----------|--------------------------|--------|-------|
| `tests/unit/[system]/[file]_test.[ext]` | `test_[scenario]` | AC-N / BUG-NNN | [date] |
## Known Gaps
Tests that should exist but don't yet:
| Priority | System | Suggested Path | Covers | Reason Not Yet Written |
|----------|--------|----------------|--------|------------------------|
| HIGH | [system] | `tests/unit/[system]/[path]` | BUG-NNN | Bug fixed without test |
## Quarantined Tests
Tests that are flaky or disabled (do not run in CI):
| Test File | Function | Reason | Quarantined Since |
|-----------|----------|--------|-------------------|
| (none) | | | |
```
---
## 7. Write Output
Ask: "May I write/update `tests/regression-suite.md` with the current
regression suite manifest?"
For `update` mode: append new entries; never remove existing entries
(use `Edit` with targeted insertions).
For `audit` mode: rewrite the full manifest with updated coverage data.
For `report` mode: do not write anything, and skip the approval prompt above.
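
The append-only rule for `update` mode can be sketched as a merge that never drops or reorders existing rows. Treating each manifest table row as a plain string is a simplification, and `merge_entries` is a hypothetical helper.

```python
def merge_entries(existing: list[str], new: list[str]) -> list[str]:
    """Append rows that are not already present; keep existing rows untouched."""
    seen = set(existing)
    merged = list(existing)
    for row in new:
        if row not in seen:
            merged.append(row)
            seen.add(row)
    return merged
```

Because existing rows are copied first and never filtered, a removal can only happen through a separate, explicitly approved edit.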
After writing (if approved):
- For each HIGH priority gap: "Consider creating the missing regression test
before the next sprint. Run `/test-helpers` to scaffold the test file."
- If bug regression gaps > 0: "These bugs can silently return without regression
tests. The next sprint should include a story to write the missing tests."
- If coverage drift detected: "Regression suite may be drifting. Consider
running `/regression-suite audit` at the next sprint boundary."
Verdict: **COMPLETE** — regression suite updated. (If user declined write: Verdict: **BLOCKED**.)
---
## Collaborative Protocol
- **Never remove existing regression tests from the manifest** without
explicit user approval; removing a test that was deliberately written is a
regression risk in itself
- **Gaps are advisory, not blocking**: surface them clearly but do not prevent
other work from proceeding (except at the release gate, where the regression suite is required)
- **Quarantine is not deletion**: tests with intermittent failures should be
quarantined (noted in the manifest) but not removed; fix them with
`/test-flakiness`
- **Ask before writing**: always confirm before creating or updating the manifest