Files
Donchitos 984023ddac Release v1.0.0 — concept-prototype/vertical-slice split, workflow restructure, polish (#50)
* Add /vertical-slice skill, prototype overhaul, and workflow integration

- Add /vertical-slice skill for pre-production validation (Phase 4 gate)
- Overhaul /prototype skill with two-mode design: concept prototype (Phase 1)
  vs vertical slice (Phase 4), with clearer differentiation and higher standards for VS
- Update prototyper agent to own both prototype and vertical-slice workflows
- Add prototype-report.md and vertical-slice-report.md output templates
- Update WORKFLOW-GUIDE, quick-start, skills-reference, agent-coordination-map,
  and skill-flow-diagrams to fully integrate both skills into the 7-phase pipeline
- Remove orphaned empty quick-prototype/ directory

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* sync v1 counts + polish

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add entity inventory flow, relax vertical-slice gate, improve UX authoring prompts

- /asset-spec: new Phase 0b entity & screen inventory when no argument and no
  existing inventory — reads GDDs/art-bible, proposes categorized list, writes
  design/assets/entity-inventory.md collaboratively
- /asset-spec: entity/character target falls back to inline user description
  when no source doc exists, rather than failing
- /gate-check: vertical slice changed from blocking to CONCERNS-only when
  absent; built-but-broken slice still fails; adds entity inventory as gate artifact
- /ux-design: convert inline approval prompts to AskUserQuestion for structured
  option capture at key authoring decision points
- workflow-catalog.yaml: entity-inventory step added to pre-production; UX spec
  min_count raised to 3; vertical-slice and prototype marked required: false with
  updated descriptions
- .gitignore: exclude marrow/ eval tooling directory

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add missing AskUserQuestion widgets to 7 skills

Audit found 11 decision points across 7 skills where structured option
prompts were missing — using plain text, auto-selection, or no gate at all.

Skills patched:
- create-epics: per-epic approval + producer CONCERNS verdict
- sprint-plan: producer CONCERNS verdict with scope/timeline options
- milestone-review: AT RISK / OFF TRACK producer verdicts require acknowledgement
- retrospective: existing-retro handling converted from plain text [A]/[B]
- quick-design: classification confirmation + draft approve/revise/redirect
- tech-debt add mode: category (6 options) + effort (S/M/L/XL) structured capture
- regression-suite: no-arg mode selection instead of silent auto-detect
- hotfix: severity confirmation gate before workflow begins

Also added AskUserQuestion to allowed-tools headers for retrospective,
quick-design, tech-debt, regression-suite, and hotfix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Prep v1 stable: fix WORKFLOW-GUIDE counts, stale agent names, and skill model fields

- WORKFLOW-GUIDE.md: correct agent count (48→49), skill count (66/68→73),
  add 6 missing skills to Appendix B, fix Creative category count (2→4),
  replace 3 non-existent agent names with correct ue-*/unity-* specialists,
  add missing godot-csharp/gdextension specialists to hierarchy,
  fix production/stories/ paths → production/epics/
- coordination-rules.md: replace "not yet used" with opt-in env var note
- quick-start.md: rename duplicate "Validate the concept" label → "Prototype the mechanic"
- skill-flow-diagrams.md: remove duplicate legacy UX pipeline section
- All 62 skills missing model: field now have explicit model: sonnet

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: comprehensive skill audit — consistency, UX, and flow gaps

Two-pass audit fixing ~35 bugs across 41 files.

Pre-production flow:
- Brainstorm next-steps split into Path A (design-first) and Path B
  (prototype-first) — eliminates "prototype after architecture" confusion
- /architecture-review added to pre-production flow in brainstorm and
  create-architecture handoffs
- gate-check traceability check corrected to requirements-traceability.md
- dev-story TR registry error now points to /architecture-review (not /create-epics)
- start now writes production/stage.txt on first onboarding

AskUserQuestion gaps filled:
- balance-check, code-review, hotfix, day-one-patch, consistency-check
  all gain closing widgets and/or missing allowed-tools declarations
- hotfix git branch creation now requires user confirmation
- sprint-plan review-mode setup moved to Phase 0 (before gates run)
- team-combat gains architecture→implementation approval gate
- design-review APPROVED path consolidated from 3 widgets to 1 multiSelect

All 9 team-* skills:
- Phase 0 review-mode resolution added (solo/lean/full now respected)
- team-audio output path fixed (design/gdd/ → design/audio/)
- team-level final doc compilation delegated to level-designer subagent
- team-narrative localization-lead added to composition list
- team-qa sprint path fixed (flat files, not directories)
- team-release NO-GO override captures written justification
- team-live-ops Cancel verdict now explicitly BLOCKED

Other fixes:
- Art bible path standardized to design/art/art-bible.md (3 wrong refs)
- AD-PHASE-GATE added to lean-mode skip list in director-gates.md
- design-system duplicate 5d heading fixed; skeleton decline path added;
  mandatory agent spawns now respect review mode
- story-readiness acceptance criteria thresholds now type-aware
- create-stories gains multi-ADR and no-ADR handling guidance
- consistency-check creates docs/consistency-failures.md on first run
- retrospective frontmatter bash injection replaced with explicit Bash call
- smoke-check ls -t gains PowerShell fallback
- Conventional Commits format documented in coding-standards.md
- gate-check: ADR acceptance gate, QA plan check, chain-of-verification
  tool-action requirement all added

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: expose --review flag in argument-hints for all team-* skills

All 9 team-* skills already implement Phase 0 review-mode resolution
internally (full/lean/solo), but none advertised [--review full|lean|solo]
in their argument-hint. Users had no way to discover the per-run override.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add SECURITY.md with coordinated disclosure policy

Defines scope, reporting process (GitHub private vulnerability reporting),
contributor security guidelines for hooks/skills/agents, and 90-day
coordinated disclosure timeline.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add CONTRIBUTING.md with framework contribution guidelines

Covers what PRs are welcome, skill/hook/agent technical requirements,
the collaborative principle, testing expectations, commit format,
and platform compatibility requirements.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add v1.0.0-beta → v1.0 upgrade section to UPGRADING.md

Documents the 17 commits since the beta tag: new /vertical-slice gate,
entity inventory flow in /map-systems, AskUserQuestion widgets across
7 skills, --review flag exposure on team-* skills, bug fixes
(#21, #36, #42, #43, #45), and the new CONTRIBUTING.md and SECURITY.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 20:15:08 +10:00

423 lines
16 KiB
Markdown

---
name: smoke-check
description: "Run the critical path smoke test gate before QA hand-off. Executes the automated test suite, verifies core functionality, and produces a PASS/FAIL report. Run after a sprint's stories are implemented and before manual QA begins. A failed smoke check means the build is not ready for QA."
argument-hint: "[sprint | quick | --platform pc|console|mobile|all]"
user-invocable: true
allowed-tools: Read, Glob, Grep, Bash, Write, AskUserQuestion
model: sonnet
---
# Smoke Check
This skill is the gate between "implementation done" and "ready for QA
hand-off". It runs the automated test suite, checks for test coverage gaps,
batch-verifies critical paths with the developer, and produces a PASS/FAIL
report.
The rule is simple: **a build that fails smoke check does not go to QA.**
Handing a broken build to QA wastes their time and demoralises the team.
**Output:** `production/qa/smoke-[date].md`
---
## Parse Arguments
Arguments can be combined: `/smoke-check sprint --platform console`
**Base mode** (first argument, default: `sprint`):
- `sprint` — full smoke check against the current sprint's stories
- `quick` — skip coverage scan (Phase 3) and Batch 3; use for rapid re-checks
**Platform flag** (`--platform`, default: none):
- `--platform pc` — add PC-specific checks (keyboard, mouse, windowed mode)
- `--platform console` — add console-specific checks (gamepad, TV safe zones,
platform certification requirements)
- `--platform mobile` — add mobile-specific checks (touch, portrait/landscape,
battery/thermal behaviour)
- `--platform all` — add all platform variants; output per-platform verdict table
If `--platform` is provided, Phase 4 adds platform-specific batches and
Phase 5 outputs a per-platform verdict table in addition to the overall verdict.
---
## Phase 1: Detect Test Setup
Before running anything, understand the environment:
1. **Test framework check**: verify `tests/` directory exists.
If it does not: "No test directory found at `tests/`. Run `/test-setup`
to scaffold the testing infrastructure, or create the directory manually
if tests live elsewhere." Then stop.
2. **CI check**: check whether `.github/workflows/` contains a workflow file
referencing tests. Note in the report whether CI is configured.
3. **Engine detection**: read `.claude/docs/technical-preferences.md` and
extract the `Engine:` value. Store this for test command selection in
Phase 2.
4. **Smoke test list**: check whether `production/qa/smoke-tests.md` or
`tests/smoke/` exists. If a smoke test list is found, load it for use in
Phase 4. If neither exists, smoke tests will be drawn from the current QA
plan (Phase 4 fallback).
5. **QA plan check**: glob `production/qa/qa-plan-*.md` and take the most
recently modified file. If found, note the path — it will be used in
Phase 3 and Phase 4. If not found, note: "No QA plan found. Run
`/qa-plan sprint` before smoke-checking for best results."
Report findings before proceeding: "Environment: [engine]. Test directory:
[found / not found]. CI configured: [yes / no]. QA plan: [path / not found]."
---
## Phase 2: Run Automated Tests
Attempt to run the test suite via Bash. Select the command based on the engine
detected in Phase 1:
**Godot 4:**
```bash
godot --headless --script tests/gdunit4_runner.gd 2>&1
```
If the GDUnit4 runner script does not exist at that path, try:
```bash
godot --headless -s addons/gdunit4/GdUnitRunner.gd 2>&1
```
If neither path exists, note: "GDUnit4 runner not found — confirm the runner
path for your test framework."
**Unity:**
Unity tests require the editor and cannot be run headlessly via shell in most
environments. Check for recent test result artifacts:
```bash
# List most recent test results (bash) — on Windows PowerShell use the fallback below
ls -t test-results/ 2>/dev/null | head -5 \
|| powershell -Command "Get-ChildItem test-results/ -ErrorAction SilentlyContinue | Sort-Object LastWriteTime -Descending | Select-Object -First 5 -ExpandProperty Name"
```
If test result files exist (XML or JSON), read the most recent one and parse
PASS/FAIL counts. If no artifacts exist: "Unity tests must be run from the
editor or CI pipeline. Please confirm test status manually before proceeding."
**Unreal Engine:**
```bash
# List most recent Unreal automation logs (bash) — on Windows PowerShell use the fallback below
ls -t Saved/Logs/ 2>/dev/null | grep -i "test\|automation" | head -5 \
|| powershell -Command "Get-ChildItem Saved/Logs/ -ErrorAction SilentlyContinue | Where-Object { $_.Name -match 'test|automation' } | Sort-Object LastWriteTime -Descending | Select-Object -First 5 -ExpandProperty Name"
```
If no matching log found: "UE automation tests must be run via the Session
Frontend or CI pipeline. Please confirm test status manually."
**Unknown engine / not configured:**
"Engine not configured in `.claude/docs/technical-preferences.md`. Run
`/setup-engine` to specify the engine, then re-run `/smoke-check`."
**If the test runner is not available in this environment** (engine binary not
on PATH, runner script not found, etc.), report clearly:
"Automated tests could not be executed — engine binary not found on PATH.
Status will be recorded as NOT RUN. Confirm test results from your local IDE
or CI pipeline. Unconfirmed NOT RUN is treated as PASS WITH WARNINGS, not
FAIL — the developer must manually confirm results."
Do not treat NOT RUN as an automatic FAIL. Record it as a warning. The
developer's manual confirmation in Phase 4 can resolve it.
Parse runner output and extract:
- Total tests run
- Passing count
- Failing count
- Names of any failing tests (up to 10; if more, note the count)
- Any crash or error output from the runner itself
---
## Phase 3: Check Test Coverage
Draw the story list from, in priority order:
1. The QA plan found in Phase 1 (its Test Summary table lists expected test
file paths per story)
2. The current sprint plan from `production/sprints/` (most recently modified
file)
3. If the `quick` argument was passed, skip this phase entirely and note:
"Coverage scan skipped — run `/smoke-check sprint` for full coverage
analysis."
For each story in scope:
1. Extract the system slug from the story's file path
(e.g., `production/epics/combat/story-001.md``combat`)
2. Glob `tests/unit/[system]/` and `tests/integration/[system]/` for files
whose name contains the story slug or a closely related term
3. Check the story file itself for a `Test file:` header field or a
"Test Evidence" section
Assign a coverage status to each story:
| Status | Meaning |
|--------|---------|
| **COVERED** | A test file was found matching this story's system and scope |
| **MANUAL** | Story type is Visual/Feel or UI; a test evidence document was found |
| **MISSING** | Logic or Integration story with no matching test file |
| **EXPECTED** | Config/Data story — no test file required; spot-check is sufficient |
| **UNKNOWN** | Story file missing or unreadable |
MISSING entries are advisory gaps. They do not cause a FAIL verdict but must
appear prominently in the report and must be resolved before `/story-done` can
fully close those stories.
---
## Phase 4: Run Manual Smoke Checks
Draw the smoke test checklist from, in priority order:
1. The QA plan's "Smoke Test Scope" section (if QA plan was found in Phase 1)
2. `production/qa/smoke-tests.md` (if it exists)
3. `tests/smoke/` directory contents (if it exists)
4. The standard fallback list below (used only when none of the above exist)
Tailor batches 2 and 3 to the actual systems identified from the sprint or QA
plan. Replace bracketed placeholders with real mechanic names from the current
sprint's stories.
Use `AskUserQuestion` to batch-verify. Keep to at most 3 calls.
**Batch 1 — Core stability (always run):**
```
question: "Core stability — select any items that FAILED (leave all unselected if everything passed):"
multiSelect: true
options:
- "Game does not launch or crashes before reaching the main menu"
- "New game / session fails to start"
- "Main menu does not respond to inputs"
- "Crash or hang observed during basic navigation"
```
For any selected item, ask the user to briefly describe what failed before generating the report.
**Batch 2 — Sprint changes and regression (always run):**
```
question: "Sprint changes and regression — select any items that FAILED (leave all unselected if everything passed):"
multiSelect: true
options:
- "[Primary mechanic this sprint] — FAILED"
- "[Second notable change this sprint, if any] — FAILED"
- "Regression in a previous sprint's feature — FAILED"
- "Other unexpected breakage observed — FAILED"
```
For any selected item, ask the user to briefly describe what broke before generating the report.
**Batch 3 — Data integrity and performance (run unless `quick` argument):**
```
question: "Data integrity and performance — select any items that FAILED or were skipped (leave all unselected if everything passed):"
multiSelect: true
options:
- "Save / load — FAILED (data loss or corruption observed)"
- "Save / load — N/A (save system not yet implemented)"
- "Frame rate drops or hitches observed — FAILED"
- "Performance not checked this session"
```
For any FAILED item selected, ask the user to describe what broke before generating the report.
Record each response verbatim for the Phase 5 report.
**Platform Batches** *(run only if `--platform` argument was provided)*:
**PC platform** (`--platform pc` or `--platform all`):
```
question: "PC Platform — select any items that FAILED (leave all unselected if everything passed):"
multiSelect: true
options:
- "Keyboard controls — FAILED (describe issue after)"
- "Mouse input or cursor visibility — FAILED (describe issue after)"
- "Windowed / fullscreen mode — FAILED (describe issue after)"
- "Resolution change — FAILED (describe issue after)"
```
For any selected item, ask the user to briefly describe what failed before generating the report.
**Console platform** (`--platform console` or `--platform all`):
```
question: "Console Platform — select any items that FAILED (leave all unselected if everything passed):"
multiSelect: true
options:
- "Gamepad input — FAILED (describe issue after)"
- "UI outside TV safe zone / text clipped — FAILED (describe what is clipped after)"
- "Keyboard/mouse fallback shown to gamepad user — FAILED (describe after)"
- "Cold start (no prior save) — FAILED (describe issue after)"
```
For any selected item, ask the user to briefly describe what failed before generating the report.
**Mobile platform** (`--platform mobile` or `--platform all`):
```
question: "Mobile Platform — select any items that FAILED (leave all unselected if everything passed):"
multiSelect: true
options:
- "Touch controls — FAILED (describe issue after)"
- "Orientation change (portrait ↔ landscape) — FAILED (describe what breaks after)"
- "Background / foreground transition (home button) — FAILED (describe issue after)"
- "Performance / thermal throttling on target device — FAILED (describe after)"
```
For any selected item, ask the user to briefly describe what failed before generating the report.
---
## Phase 5: Generate Report
Assemble the full smoke check report:
````markdown
## Smoke Check Report
**Date**: [date]
**Sprint**: [sprint name / number, or "Not identified"]
**Engine**: [engine]
**QA Plan**: [path, or "Not found — run /qa-plan first"]
**Argument**: [sprint | quick | blank]
---
### Automated Tests
**Status**: [PASS ([N] tests, [N] passing) | FAIL ([N] failures) |
NOT RUN ([reason])]
[If FAIL, list failing tests:]
- `[test name]` — [brief failure description from runner output]
[If NOT RUN:]
"Manual confirmation required: did tests pass in your local IDE or CI? This
will determine whether the automated test row contributes to a FAIL verdict."
---
### Test Coverage
| Story | Type | Test File | Coverage Status |
|-------|------|-----------|----------------|
| [title] | Logic | `tests/unit/[system]/[slug]_test.[ext]` | COVERED |
| [title] | Visual/Feel | `tests/evidence/[slug]-screenshots.md` | MANUAL |
| [title] | Logic | — | MISSING ⚠ |
| [title] | Config/Data | — | EXPECTED |
**Summary**: [N] covered, [N] manual, [N] missing, [N] expected.
---
### Manual Smoke Checks
- [x] Game launches without crash — PASS
- [x] New game starts — PASS
- [x] [Core mechanic] — PASS
- [ ] [Other check] — FAIL: [user's description]
- [x] Save / load — PASS
- [-] Performance — not checked this session
---
### Missing Test Evidence
Stories that must have test evidence before they can be marked COMPLETE via
`/story-done`:
- **[story title]** (`[path]`) — Logic story has no test file.
Expected location: `tests/unit/[system]/[story-slug]_test.[ext]`
[If none:] "All Logic and Integration stories have test coverage."
---
### Platform-Specific Results *(only if `--platform` was provided)*
| Platform | Checks Run | Passed | Failed | Platform Verdict |
|----------|-----------|--------|--------|-----------------|
| PC | [N] | [N] | [N] | PASS / FAIL |
| Console | [N] | [N] | [N] | PASS / FAIL |
| Mobile | [N] | [N] | [N] | PASS / FAIL |
**Platform notes**: [any platform-specific observations not captured in pass/fail]
Any platform with one or more FAIL checks contributes to the overall FAIL verdict.
---
### Verdict: [PASS | PASS WITH WARNINGS | FAIL]
[Verdict rules — first matching rule wins:]
**FAIL** if ANY of:
- Automated test suite ran and reported one or more test failures
- Any Batch 1 (core stability) check returned FAIL
- Any Batch 2 (primary sprint mechanic or regression check) returned FAIL
**PASS WITH WARNINGS** if ALL of:
- Automated tests PASS or NOT RUN (developer has not yet confirmed)
- All Batch 1 and Batch 2 smoke checks PASS
- One or more Logic/Integration stories have MISSING test evidence
**PASS** if ALL of:
- Automated tests PASS
- All smoke checks in all batches PASS or N/A
- No MISSING test evidence entries
````
---
## Phase 6: Write and Gate
Present the full report in conversation, then ask:
"May I write this smoke check report to `production/qa/smoke-[date].md`?"
Write only after approval.
After writing, deliver the gate verdict:
**If verdict is FAIL:**
"The smoke check failed. Do not hand off to QA until these failures are
resolved:
[List each failing automated test or smoke check with a one-line description]
Fix the failures and run `/smoke-check` again to re-gate before QA hand-off."
**If verdict is PASS WITH WARNINGS:**
"Smoke check passed with warnings. The build is ready for manual QA.
Advisory items to resolve before running `/story-done` on affected stories:
[list MISSING test evidence entries]
QA hand-off: share `production/qa/qa-plan-[sprint].md` with the qa-tester
agent to begin manual verification."
**If verdict is PASS:**
"Smoke check passed cleanly. The build is ready for manual QA.
QA hand-off: share `production/qa/qa-plan-[sprint].md` with the qa-tester
agent to begin manual verification."
---
## Collaborative Protocol
- **Never treat NOT RUN as automatic FAIL** — record it as NOT RUN and let
the developer confirm status manually. Unconfirmed NOT RUN contributes to
PASS WITH WARNINGS, not FAIL.
- **Never auto-fix failures** — report them and state what must be resolved.
Do not attempt to edit source code or test files.
- **PASS WITH WARNINGS does not block QA hand-off** — it records advisory
gaps for `/story-done` to follow up on.
- **`quick` argument** skips Phase 3 (coverage scan) and Phase 4 Batch 3.
Use it for rapid re-checks after fixing a specific failure.
- Use `AskUserQuestion` for all manual smoke check verification.
- **Never write the report without asking** — Phase 6 requires explicit
approval before any file is created.