Add comprehensive QA and testing framework (52→56 skills)

Introduces a full shift-left QA pipeline with Story Type classification as
the backbone of the Definition of Done.

New skills:
- /test-setup: scaffold test framework + CI/CD per engine (Godot/Unity/Unreal)
- /qa-plan: generate sprint test plan classifying stories by type
- /smoke-check: critical path gate (PASS/PASS WITH WARNINGS/FAIL) before QA hand-off
- /team-qa: orchestrate qa-lead + qa-tester through full QA cycle

Story Type classification (Logic, Integration, Visual/Feel, UI, Config/Data):
- Logic and Integration: BLOCKING DoD gate; unit/integration test required
- Visual/Feel and UI: ADVISORY; screenshot + sign-off evidence required
- Config/Data: ADVISORY; smoke check pass sufficient

Updated skills: story-done (test evidence gate), story-readiness (Story Type
check), gate-check (test framework at Technical Setup, test evidence at
Polish/Release), create-epics-stories (Type field + Test Evidence section)

Updated agents: qa-lead (shift-left philosophy + evidence table), qa-tester
(automated test patterns for Godot/Unity/Unreal)

New templates: test-evidence.md (manual sign-off record), test-plan.md
(sprint-oriented QA plan replacing generic feature template)

Updated coding-standards.md: Testing Standards section with DoD table, test
rules, what NOT to automate, and engine-specific CI/CD commands

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-27 04:51:46 +00:00 · 2026-03-16 13:48:32 +11:00
parent a2f8ed93ff
commit 168ac96c3a
13 changed files with 1704 additions and 87 deletions
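
The Story Type gating described in the commit message can be sketched as a small lookup. This is a minimal illustration, not code from the repository: `Gate`, `STORY_TYPE_RULES`, and `evaluate_gate` are hypothetical names, and the behaviour of an ADVISORY gate (warn but allow the story to close) is an assumption, since the commit message only names the gate levels.

```python
from enum import Enum
from typing import Optional, Tuple


class Gate(Enum):
    BLOCKING = "BLOCKING"
    ADVISORY = "ADVISORY"


# Story Type -> (gate level, required evidence), as listed in the commit message.
# The dictionary name and layout are illustrative, not taken from the repository.
STORY_TYPE_RULES = {
    "Logic": (Gate.BLOCKING, "unit test"),
    "Integration": (Gate.BLOCKING, "integration test"),
    "Visual/Feel": (Gate.ADVISORY, "screenshot + sign-off evidence"),
    "UI": (Gate.ADVISORY, "screenshot + sign-off evidence"),
    "Config/Data": (Gate.ADVISORY, "smoke check pass"),
}


def evaluate_gate(story_type: str, has_evidence: bool) -> Tuple[bool, Optional[str]]:
    """Return (may_close, message) for a story of the given type."""
    gate, evidence = STORY_TYPE_RULES[story_type]
    if has_evidence:
        return True, None
    message = f"{story_type} story is missing {evidence}"
    # Assumption: BLOCKING refuses to close the story, ADVISORY only warns.
    return gate is Gate.ADVISORY, message
```

Under these assumptions, `evaluate_gate("Logic", False)` refuses to close the story, while `evaluate_gate("UI", False)` allows it and returns a warning message for the reviewer.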
--- a/.claude/docs/templates/test-evidence.md
+++ b/.claude/docs/templates/test-evidence.md
@@ -0,0 +1,86 @@
+# Test Evidence: [Story Title]
+
+> **Story**: `[path to story file]`
+> **Story Type**: [Visual/Feel | UI]
+> **Date**: [date]
+> **Tester**: [who performed the test]
+> **Build / Commit**: [version or git hash]
+
+---
+
+## What Was Tested
+
+[One paragraph describing the feature or behaviour that was validated. Include
+the acceptance criteria numbers from the story that this evidence covers.]
+
+**Acceptance criteria covered**: [AC-1, AC-2, AC-3]
+
+---
+
+## Acceptance Criteria Results
+
+| # | Criterion (from story) | Result | Notes |
+|---|----------------------|--------|-------|
+| AC-1 | [exact criterion text] | PASS / FAIL | [any observations] |
+| AC-2 | [exact criterion text] | PASS / FAIL | |
+| AC-3 | [exact criterion text] | PASS / FAIL | |
+
+---
+
+## Screenshots / Video
+
+List all captured evidence below. Store files in the same directory as this
+document or in `production/qa/evidence/[story-slug]/`.
+
+| # | Filename | What It Shows | Acceptance Criterion |
+|---|----------|--------------|----------------------|
+| 1 | `[filename.png]` | [brief description of what is visible] | AC-1 |
+| 2 | `[filename.png]` | | AC-2 |
+
+*If video: note the timestamp and what it demonstrates.*
+
+---
+
+## Test Conditions
+
+- **Game state at start**: [e.g., "fresh save, player at level 1, no items"]
+- **Platform / hardware**: [e.g., "Windows 11, GTX 1080, 1080p"]
+- **Framerate during test**: [e.g., "stable 60fps" or "~45fps — within budget"]
+- **Any special setup required**: [e.g., "dev menu used to trigger specific state"]
+
+---
+
+## Observations
+
+[Anything noteworthy that didn't cause a FAIL but should be recorded. Examples:
+minor visual jitter, frame dip under load, behaviour that technically passes
+but felt slightly off. These become candidates for polish work.]
+
+- [Observation 1]
+- [Observation 2]
+
+If nothing notable: *No significant observations.*
+
+---
+
+## Sign-Off
+
+All three sign-offs are required before the story can be marked COMPLETE via
+`/story-done`. Visual/Feel stories require the designer or art-lead sign-off.
+UI stories require the UX lead or designer sign-off.
+
+| Role | Name | Date | Signature |
+|------|------|------|-----------|
+| Developer (implemented) | | | [ ] Approved |
+| Designer / Art Lead / UX Lead | | | [ ] Approved |
+| QA Lead | | | [ ] Approved |
+
+**Any sign-off can be marked "Deferred — [reason]"** if the person is
+unavailable. Deferred sign-offs must be resolved before the story advances
+past the sprint review.
+
+---
+
+*Template: `.claude/docs/templates/test-evidence.md`*
+*Used for: Visual/Feel and UI story type evidence records*
+*Location: `production/qa/evidence/[story-slug]-evidence.md`*
--- a/.claude/docs/templates/test-plan.md
+++ b/.claude/docs/templates/test-plan.md
@@ -1,97 +1,144 @@
-# Test Plan: [Feature/System Name]
+# QA Plan: [Sprint/Feature Name]

-## Overview
+> **Date**: [date]
+> **Generated by**: /qa-plan
+> **Scope**: [N stories across N systems]
+> **Engine**: [engine name and version]
+> **Sprint file**: [path to sprint plan]

- **Feature**: [Name]
- **Design Doc**: [Link to design document]
- **Implementation**: [Link to code or PR]
- **Author**: [QA owner]
- **Date**: [Date]
- **Priority**: [Critical / High / Medium / Low]
+---

-## Scope
+## Story Coverage Summary

-### In Scope
+| Story | Type | Automated Test Required | Manual Verification Required |
+|-------|------|------------------------|------------------------------|
+| [story title] | Logic | Unit test — `tests/unit/[system]/` | None |
+| [story title] | Integration | Integration test — `tests/integration/[system]/` | Smoke check |
+| [story title] | Visual/Feel | None (not automatable) | Screenshot + lead sign-off |
+| [story title] | UI | None (not automatable) | Manual step-through |
+| [story title] | Config/Data | Data validation (optional) | Spot-check in-game values |

- [What is being tested]
+**Totals**: [N] Logic, [N] Integration, [N] Visual/Feel, [N] UI, [N] Config/Data

-### Out of Scope
+---

- [What is explicitly NOT being tested and why]
+## Automated Tests Required

-### Dependencies
+### [Story Title] — Logic

- [Other systems that must be working for these tests to be valid]
+**Test file path**: `tests/unit/[system]/[story-slug]_test.[ext]`

-## Test Environment
+**What to test**:
+- [Formula or rule from GDD Formulas section — e.g., "damage = base * multiplier where multiplier ∈ [0.5, 3.0]"]
+- [Each named state transition]
+- [Each side effect that should / should not occur]

- **Build**: [Minimum build version]
- **Platform**: [Target platforms]
- **Preconditions**: [Required game state, save files, etc.]
+**Edge cases to cover**:
+- Zero / minimum input values
+- Maximum / boundary input values
+- Invalid or null input
+- [GDD-specified edge cases]

-## Test Cases
+**Estimated test count**: ~[N] unit tests

-### Functional Tests -- Happy Path
+---

-| ID | Test Case | Steps | Expected Result | Status |
-|----|-----------|-------|----------------|--------|
-| TC-001 | [Description] | 1. [Step] 2. [Step] | [Expected] | [ ] |
-| TC-002 | [Description] | 1. [Step] 2. [Step] | [Expected] | [ ] |
+### [Story Title] — Integration

-### Functional Tests -- Edge Cases
+**Test file path**: `tests/integration/[system]/[story-slug]_test.[ext]`

-| ID | Test Case | Steps | Expected Result | Status |
-|----|-----------|-------|----------------|--------|
-| TC-010 | [Boundary value] | 1. [Step] | [Expected] | [ ] |
-| TC-011 | [Zero/null input] | 1. [Step] | [Expected] | [ ] |
-| TC-012 | [Maximum values] | 1. [Step] | [Expected] | [ ] |
+**What to test**:
+- [Cross-system interaction — e.g., "applying buff updates CharacterStats and triggers UI refresh"]
+- [Round-trip — e.g., "save → load restores all fields"]

-### Negative Tests
+---

-| ID | Test Case | Steps | Expected Result | Status |
-|----|-----------|-------|----------------|--------|
-| TC-020 | [Invalid input] | 1. [Step] | [Graceful handling] | [ ] |
-| TC-021 | [Interrupted action] | 1. [Step] | [No corruption] | [ ] |
+## Manual QA Checklist

-### Integration Tests
+### [Story Title] — Visual/Feel

-| ID | Test Case | Systems Involved | Steps | Expected Result | Status |
-|----|-----------|-----------------|-------|----------------|--------|
-| TC-030 | [Cross-system interaction] | [System A, System B] | 1. [Step] | [Expected] | [ ] |
+**Verification method**: Screenshot + [designer / art-lead] sign-off
+**Evidence file**: `production/qa/evidence/[story-slug]-evidence.md`
+**Who must sign off**: [designer / lead-programmer / art-lead]

-### Performance Tests
+- [ ] [Specific observable condition — e.g., "hit flash appears on frame of impact, not the frame after"]
+- [ ] [Another falsifiable condition]

-| ID | Test Case | Metric | Budget | Steps | Status |
-|----|-----------|--------|--------|-------|--------|
-| TC-040 | [Load time] | Seconds | [X]s | 1. [Step] | [ ] |
-| TC-041 | [Frame rate] | FPS | [X] | 1. [Step] | [ ] |
-| TC-042 | [Memory usage] | MB | [X]MB | 1. [Step] | [ ] |
+### [Story Title] — UI

-### Regression Tests
+**Verification method**: Manual step-through
+**Evidence file**: `production/qa/evidence/[story-slug]-evidence.md`

-| ID | Related Bug | Test Case | Steps | Expected Result | Status |
-|----|------------|-----------|-------|----------------|--------|
-| TC-050 | BUG-[XXXX] | [Verify fix holds] | 1. [Step] | [Expected] | [ ] |
+- [ ] [Every acceptance criterion translated into a manual check item]

-## Test Results Summary
+---

-| Category | Total | Passed | Failed | Blocked | Skipped |
-|----------|-------|--------|--------|---------|---------|
-| Happy Path | | | | | |
-| Edge Cases | | | | | |
-| Negative | | | | | |
-| Integration | | | | | |
-| Performance | | | | | |
-| Regression | | | | | |
-| **Total** | | | | | |
+## Smoke Test Scope
+
+Critical paths to verify before QA hand-off (run via `/smoke-check`):
+
+1. Game launches to main menu without crash
+2. New game / session can be started
+3. [Primary mechanic introduced or changed this sprint]
+4. [System with regression risk from this sprint's changes]
+5. Save / load cycle completes without data loss (if save system exists)
+6. Performance is within budget on target hardware
+
+---
+
+## Playtest Requirements
+
+| Story | Playtest Goal | Min Sessions | Target Player Type |
+|-------|--------------|--------------|-------------------|
+| [story] | [What question must be answered?] | [N] | [new player / experienced / etc.] |
+
+Sign-off requirement: Playtest notes → `production/session-logs/playtest-[sprint]-[story-slug].md`
+
+If no playtest sessions required: *No playtest sessions required for this sprint.*
+
+---
+
+## Definition of Done — This Sprint
+
+A story is DONE when ALL of the following are true:
+
+- [ ] All acceptance criteria verified — automated test result OR documented manual evidence
+- [ ] Test file exists for all Logic and Integration stories and passes
+- [ ] Manual evidence document exists for all Visual/Feel and UI stories
+- [ ] Smoke check passes (run `/smoke-check sprint` before QA hand-off)
+- [ ] No regressions introduced — previous sprint's features still pass
+- [ ] Code reviewed (via `/code-review` or documented peer review)
+- [ ] Story file updated to `Status: Complete` via `/story-done`
+
+**Stories requiring playtest sign-off before close**: [list, or "None"]
+
+---
+
+## Test Results
+
+*Fill in after testing is complete.*
+
+| Story | Automated | Manual | Result | Notes |
+|-------|-----------|--------|--------|-------|
+| [title] | PASS | — | PASS | |
+| [title] | — | PASS | PASS | |
+| [title] | FAIL | — | BLOCKED | [describe failure] |
+
+---

 ## Bugs Found

-| Bug ID | Severity | Test Case | Description | Status |
-|--------|----------|-----------|-------------|--------|
+| ID | Story | Severity | Description | Status |
+|----|-------|----------|-------------|--------|
+| BUG-001 | | S[1-4] | | Open |
+
+---

 ## Sign-Off

- **QA Tester**: [Name] -- [Date]
- **QA Lead**: [Name] -- [Date]
- **Feature Owner**: [Name] -- [Date]
+- **QA Tester**: [name] — [date]
+- **QA Lead**: [name] — [date]
+- **Sprint Owner**: [name] — [date]
+
+*Template: `.claude/docs/templates/test-plan.md`*
+*Generated by: `/qa-plan` — do not edit this line*
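
The "What to test" and "Edge cases to cover" entries for a Logic story in the test-plan.md template can be illustrated with the template's own example formula, `damage = base * multiplier` with the multiplier in [0.5, 3.0]. The sketch below uses Python's standard `unittest` module rather than an engine-specific test framework. The function `compute_damage` is hypothetical, and rejecting out-of-range or missing input with `ValueError` is an assumption; a real GDD might specify clamping instead.

```python
import unittest

MIN_MULTIPLIER = 0.5
MAX_MULTIPLIER = 3.0


def compute_damage(base, multiplier):
    """Hypothetical implementation of the template's example formula."""
    if base is None or multiplier is None:
        raise ValueError("base and multiplier are required")
    if base < 0:
        raise ValueError("base must be non-negative")
    # Assumption: out-of-range multipliers are rejected rather than clamped.
    if not (MIN_MULTIPLIER <= multiplier <= MAX_MULTIPLIER):
        raise ValueError("multiplier must be within [0.5, 3.0]")
    return base * multiplier


class ComputeDamageTest(unittest.TestCase):
    def test_zero_base_deals_no_damage(self):
        # Zero / minimum input values
        self.assertEqual(compute_damage(0, 1.0), 0)

    def test_multiplier_boundaries(self):
        # Maximum / boundary input values
        self.assertEqual(compute_damage(10, MIN_MULTIPLIER), 5.0)
        self.assertEqual(compute_damage(10, MAX_MULTIPLIER), 30.0)

    def test_multiplier_out_of_range_is_rejected(self):
        for bad in (0.49, 3.01):
            with self.assertRaises(ValueError):
                compute_damage(10, bad)

    def test_null_input_is_rejected(self):
        # Invalid or null input
        with self.assertRaises(ValueError):
            compute_damage(None, 1.0)
        with self.assertRaises(ValueError):
            compute_damage(10, None)
```

Saved as a test module, this can be run with `python -m unittest`. Each test method corresponds to one bullet under "Edge cases to cover", which keeps the estimated test count in the plan easy to reconcile with the test file.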