Add comprehensive QA and testing framework (52→56 skills)

Introduces a full shift-left QA pipeline with Story Type classification
as the backbone of the Definition of Done:

New skills:
- /test-setup: scaffold test framework + CI/CD per engine (Godot/Unity/Unreal)
- /qa-plan: generate sprint test plan classifying stories by type
- /smoke-check: critical path gate (PASS/PASS WITH WARNINGS/FAIL) before QA hand-off
- /team-qa: orchestrate qa-lead + qa-tester through full QA cycle

Story Type classification (Logic, Integration, Visual/Feel, UI, Config/Data):
- Logic and Integration: BLOCKING DoD gate — unit/integration test required
- Visual/Feel and UI: ADVISORY — screenshot + sign-off evidence required
- Config/Data: ADVISORY — smoke check pass sufficient
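
As a rough illustration of the classification above, the gate level per Story Type could be written as a simple lookup. This is only a sketch: the names `Gate`, `DOD_RULES`, and `dod_rule` are hypothetical and are not part of the skills added in this commit.

```python
from enum import Enum


class Gate(Enum):
    BLOCKING = "BLOCKING"
    ADVISORY = "ADVISORY"


# Hypothetical lookup mirroring the list above: Story Type -> (gate, evidence).
DOD_RULES = {
    "Logic": (Gate.BLOCKING, "unit/integration test"),
    "Integration": (Gate.BLOCKING, "unit/integration test"),
    "Visual/Feel": (Gate.ADVISORY, "screenshot + sign-off evidence"),
    "UI": (Gate.ADVISORY, "screenshot + sign-off evidence"),
    "Config/Data": (Gate.ADVISORY, "smoke check pass"),
}


def dod_rule(story_type: str) -> tuple[Gate, str]:
    """Return the (gate, required evidence) pair for a Story Type."""
    try:
        return DOD_RULES[story_type]
    except KeyError:
        # Unknown types are rejected rather than silently treated as advisory.
        raise ValueError(f"unknown Story Type: {story_type!r}") from None
```

Rejecting unknown types is a design choice of this sketch, so that a misspelled Type field cannot bypass a blocking gate.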

Updated skills:
- story-done: test evidence gate
- story-readiness: Story Type check
- gate-check: test framework at Technical Setup, test evidence at Polish/Release
- create-epics-stories: Type field + Test Evidence section

Updated agents: qa-lead (shift-left philosophy + evidence table),
qa-tester (automated test patterns for Godot/Unity/Unreal)

New templates: test-evidence.md (manual sign-off record), test-plan.md
(sprint-oriented QA plan replacing generic feature template)

Updated coding-standards.md: Testing Standards section with DoD table,
test rules, what NOT to automate, and engine-specific CI/CD commands

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Donchitos
2026-03-16 13:48:32 +11:00
parent a2f8ed93ff
commit 168ac96c3a
13 changed files with 1704 additions and 87 deletions

.claude/docs/templates/test-evidence.md

@@ -0,0 +1,86 @@
# Test Evidence: [Story Title]
> **Story**: `[path to story file]`
> **Story Type**: [Visual/Feel | UI]
> **Date**: [date]
> **Tester**: [who performed the test]
> **Build / Commit**: [version or git hash]
---
## What Was Tested
[One paragraph describing the feature or behaviour that was validated. Include
the acceptance criteria numbers from the story that this evidence covers.]
**Acceptance criteria covered**: [AC-1, AC-2, AC-3]
---
## Acceptance Criteria Results
| # | Criterion (from story) | Result | Notes |
|---|----------------------|--------|-------|
| AC-1 | [exact criterion text] | PASS / FAIL | [any observations] |
| AC-2 | [exact criterion text] | PASS / FAIL | |
| AC-3 | [exact criterion text] | PASS / FAIL | |
---
## Screenshots / Video
List all captured evidence below. Store files in the same directory as this
document or in `production/qa/evidence/[story-slug]/`.
| # | Filename | What It Shows | Acceptance Criterion |
|---|----------|--------------|----------------------|
| 1 | `[filename.png]` | [brief description of what is visible] | AC-1 |
| 2 | `[filename.png]` | | AC-2 |
*If video: note the timestamp and what it demonstrates.*
---
## Test Conditions
- **Game state at start**: [e.g., "fresh save, player at level 1, no items"]
- **Platform / hardware**: [e.g., "Windows 11, GTX 1080, 1080p"]
- **Framerate during test**: [e.g., "stable 60fps" or "~45fps — within budget"]
- **Any special setup required**: [e.g., "dev menu used to trigger specific state"]
---
## Observations
[Anything noteworthy that didn't cause a FAIL but should be recorded. Examples:
minor visual jitter, frame dip under load, behaviour that technically passes
but felt slightly off. These become candidates for polish work.]
- [Observation 1]
- [Observation 2]
If nothing notable: *No significant observations.*
---
## Sign-Off
All three sign-offs are required before the story can be marked COMPLETE via
`/story-done`. The second sign-off depends on the Story Type: Visual/Feel
stories require the designer or art lead, and UI stories require the UX lead
or designer.
| Role | Name | Date | Signature |
|------|------|------|-----------|
| Developer (implemented) | | | [ ] Approved |
| Designer / Art Lead / UX Lead | | | [ ] Approved |
| QA Lead | | | [ ] Approved |
**Any sign-off can be marked "Deferred — [reason]"** if the person is
unavailable. Deferred sign-offs must be resolved before the story advances
past the sprint review.
---
*Template: `.claude/docs/templates/test-evidence.md`*
*Used for: Visual/Feel and UI story type evidence records*
*Location: `production/qa/evidence/[story-slug]-evidence.md`*
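
The two evidence locations named in this template (the evidence document and the per-story media directory) could be derived from a story title roughly as follows. This is a minimal sketch: the template does not define how `[story-slug]` is produced, so the `slugify` rule here, along with the function names, is an assumption.

```python
import re
from pathlib import PurePosixPath

EVIDENCE_ROOT = PurePosixPath("production/qa/evidence")


def slugify(title: str) -> str:
    """Assumed slug rule: lowercase, non-alphanumeric runs become one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def evidence_doc_path(title: str) -> PurePosixPath:
    """Path of the evidence record: [story-slug]-evidence.md."""
    return EVIDENCE_ROOT / f"{slugify(title)}-evidence.md"


def evidence_media_dir(title: str) -> PurePosixPath:
    """Directory for screenshots and video: [story-slug]/."""
    return EVIDENCE_ROOT / slugify(title)
```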


@@ -1,97 +1,144 @@
# Test Plan: [Feature/System Name]
# QA Plan: [Sprint/Feature Name]
## Overview
> **Date**: [date]
> **Generated by**: /qa-plan
> **Scope**: [N stories across N systems]
> **Engine**: [engine name and version]
> **Sprint file**: [path to sprint plan]
- **Feature**: [Name]
- **Design Doc**: [Link to design document]
- **Implementation**: [Link to code or PR]
- **Author**: [QA owner]
- **Date**: [Date]
- **Priority**: [Critical / High / Medium / Low]
---
## Scope
## Story Coverage Summary
### In Scope
| Story | Type | Automated Test Required | Manual Verification Required |
|-------|------|------------------------|------------------------------|
| [story title] | Logic | Unit test — `tests/unit/[system]/` | None |
| [story title] | Integration | Integration test — `tests/integration/[system]/` | Smoke check |
| [story title] | Visual/Feel | None (not automatable) | Screenshot + lead sign-off |
| [story title] | UI | None (not automatable) | Manual step-through |
| [story title] | Config/Data | Data validation (optional) | Spot-check in-game values |
- [What is being tested]
**Totals**: [N] Logic, [N] Integration, [N] Visual/Feel, [N] UI, [N] Config/Data
### Out of Scope
---
- [What is explicitly NOT being tested and why]
## Automated Tests Required
### Dependencies
### [Story Title] — Logic
- [Other systems that must be working for these tests to be valid]
**Test file path**: `tests/unit/[system]/[story-slug]_test.[ext]`
## Test Environment
**What to test**:
- [Formula or rule from GDD Formulas section — e.g., "damage = base * multiplier where multiplier ∈ [0.5, 3.0]"]
- [Each named state transition]
- [Each side effect that should / should not occur]
- **Build**: [Minimum build version]
- **Platform**: [Target platforms]
- **Preconditions**: [Required game state, save files, etc.]
**Edge cases to cover**:
- Zero / minimum input values
- Maximum / boundary input values
- Invalid or null input
- [GDD-specified edge cases]
## Test Cases
**Estimated test count**: ~[N] unit tests
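
A Logic story test for the damage example above might look like the following sketch. `compute_damage` is a hypothetical stand-in for the system under test, and clamping out-of-range multipliers is an assumption; a real GDD might instead require such input to be rejected. Python's `unittest` is used for illustration only, since the actual framework depends on the engine.

```python
import unittest

MIN_MULTIPLIER = 0.5
MAX_MULTIPLIER = 3.0


def compute_damage(base: float, multiplier: float) -> float:
    """Hypothetical system under test: damage = base * multiplier.

    Assumption: the multiplier is clamped to [0.5, 3.0].
    """
    clamped = max(MIN_MULTIPLIER, min(MAX_MULTIPLIER, multiplier))
    return base * clamped


class DamageFormulaTest(unittest.TestCase):
    def test_formula(self):
        self.assertEqual(compute_damage(10, 2.0), 20)

    def test_zero_base(self):
        # Zero / minimum input value.
        self.assertEqual(compute_damage(0, 2.0), 0)

    def test_multiplier_boundaries(self):
        # Both ends of the allowed multiplier range.
        self.assertEqual(compute_damage(10, 0.5), 5)
        self.assertEqual(compute_damage(10, 3.0), 30)

    def test_out_of_range_multiplier_is_clamped(self):
        # Invalid input falls back to the nearest bound (assumed behaviour).
        self.assertEqual(compute_damage(10, 0.1), 5)
        self.assertEqual(compute_damage(10, 99), 30)
```

If saved as a module, the tests can be run with `python -m unittest`.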
### Functional Tests -- Happy Path
---
| ID | Test Case | Steps | Expected Result | Status |
|----|-----------|-------|----------------|--------|
| TC-001 | [Description] | 1. [Step] 2. [Step] | [Expected] | [ ] |
| TC-002 | [Description] | 1. [Step] 2. [Step] | [Expected] | [ ] |
### [Story Title] — Integration
### Functional Tests -- Edge Cases
**Test file path**: `tests/integration/[system]/[story-slug]_test.[ext]`
| ID | Test Case | Steps | Expected Result | Status |
|----|-----------|-------|----------------|--------|
| TC-010 | [Boundary value] | 1. [Step] | [Expected] | [ ] |
| TC-011 | [Zero/null input] | 1. [Step] | [Expected] | [ ] |
| TC-012 | [Maximum values] | 1. [Step] | [Expected] | [ ] |
**What to test**:
- [Cross-system interaction — e.g., "applying buff updates CharacterStats and triggers UI refresh"]
- [Round-trip — e.g., "save → load restores all fields"]
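
The round-trip item above could be checked with a test along these lines. The `SaveData` fields and the JSON-based `save`/`load` helpers are hypothetical; a real project would call its own save system instead.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SaveData:
    """Hypothetical save payload; field names are illustrative only."""
    level: int = 1
    health: float = 100.0
    inventory: list[str] = field(default_factory=list)


def save(data: SaveData) -> str:
    """Serialize the payload to a JSON string."""
    return json.dumps(asdict(data))


def load(raw: str) -> SaveData:
    """Rebuild the payload from a JSON string."""
    return SaveData(**json.loads(raw))


def test_save_load_round_trip():
    original = SaveData(level=7, health=42.5, inventory=["sword", "potion"])
    restored = load(save(original))
    # Dataclass equality compares every field, so a dropped field fails here.
    assert restored == original
```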
### Negative Tests
---
| ID | Test Case | Steps | Expected Result | Status |
|----|-----------|-------|----------------|--------|
| TC-020 | [Invalid input] | 1. [Step] | [Graceful handling] | [ ] |
| TC-021 | [Interrupted action] | 1. [Step] | [No corruption] | [ ] |
## Manual QA Checklist
### Integration Tests
### [Story Title] — Visual/Feel
| ID | Test Case | Systems Involved | Steps | Expected Result | Status |
|----|-----------|-----------------|-------|----------------|--------|
| TC-030 | [Cross-system interaction] | [System A, System B] | 1. [Step] | [Expected] | [ ] |
**Verification method**: Screenshot + [designer / art-lead] sign-off
**Evidence file**: `production/qa/evidence/[story-slug]-evidence.md`
**Who must sign off**: [designer / lead-programmer / art-lead]
### Performance Tests
- [ ] [Specific observable condition — e.g., "hit flash appears on frame of impact, not the frame after"]
- [ ] [Another falsifiable condition]
| ID | Test Case | Metric | Budget | Steps | Status |
|----|-----------|--------|--------|-------|--------|
| TC-040 | [Load time] | Seconds | [X]s | 1. [Step] | [ ] |
| TC-041 | [Frame rate] | FPS | [X] | 1. [Step] | [ ] |
| TC-042 | [Memory usage] | MB | [X]MB | 1. [Step] | [ ] |
### [Story Title] — UI
### Regression Tests
**Verification method**: Manual step-through
**Evidence file**: `production/qa/evidence/[story-slug]-evidence.md`
| ID | Related Bug | Test Case | Steps | Expected Result | Status |
|----|------------|-----------|-------|----------------|--------|
| TC-050 | BUG-[XXXX] | [Verify fix holds] | 1. [Step] | [Expected] | [ ] |
- [ ] [Every acceptance criterion translated into a manual check item]
## Test Results Summary
---
| Category | Total | Passed | Failed | Blocked | Skipped |
|----------|-------|--------|--------|---------|---------|
| Happy Path | | | | | |
| Edge Cases | | | | | |
| Negative | | | | | |
| Integration | | | | | |
| Performance | | | | | |
| Regression | | | | | |
| **Total** | | | | | |
## Smoke Test Scope
Critical paths to verify before QA hand-off (run via `/smoke-check`):
1. Game launches to main menu without crash
2. New game / session can be started
3. [Primary mechanic introduced or changed this sprint]
4. [System with regression risk from this sprint's changes]
5. Save / load cycle completes without data loss (if save system exists)
6. Performance is within budget on target hardware
---
## Playtest Requirements
| Story | Playtest Goal | Min Sessions | Target Player Type |
|-------|--------------|--------------|-------------------|
| [story] | [What question must be answered?] | [N] | [new player / experienced / etc.] |
Sign-off requirement: record playtest notes in `production/session-logs/playtest-[sprint]-[story-slug].md`
If no playtest sessions required: *No playtest sessions required for this sprint.*
---
## Definition of Done — This Sprint
A story is DONE when ALL of the following are true:
- [ ] All acceptance criteria verified — automated test result OR documented manual evidence
- [ ] Test file exists for all Logic and Integration stories and passes
- [ ] Manual evidence document exists for all Visual/Feel and UI stories
- [ ] Smoke check passes (run `/smoke-check sprint` before QA hand-off)
- [ ] No regressions introduced — previous sprint's features still pass
- [ ] Code reviewed (via `/code-review` or documented peer review)
- [ ] Story file updated to `Status: Complete` via `/story-done`
**Stories requiring playtest sign-off before close**: [list, or "None"]
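
The checklist above can be read as a per-story predicate. The sketch below lists which items are still unmet; `StoryStatus`, its fields, and `unmet_dod_items` are hypothetical names, not something `/story-done` is known to implement. The final checklist item (setting `Status: Complete`) is treated as the outcome rather than an input.

```python
from dataclasses import dataclass

AUTOMATED_TYPES = {"Logic", "Integration"}
MANUAL_TYPES = {"Visual/Feel", "UI"}


@dataclass
class StoryStatus:
    """Hypothetical snapshot of one story's Definition of Done inputs."""
    story_type: str
    criteria_verified: bool = False
    test_file_passes: bool = False
    evidence_doc_exists: bool = False
    smoke_check_passed: bool = False
    no_regressions: bool = False
    code_reviewed: bool = False


def unmet_dod_items(story: StoryStatus) -> list[str]:
    """Return the checklist items still open; an empty list means DONE."""
    unmet = []
    if not story.criteria_verified:
        unmet.append("acceptance criteria not verified")
    # Test files are only required for Logic and Integration stories.
    if story.story_type in AUTOMATED_TYPES and not story.test_file_passes:
        unmet.append("test file missing or failing")
    # Evidence documents are only required for Visual/Feel and UI stories.
    if story.story_type in MANUAL_TYPES and not story.evidence_doc_exists:
        unmet.append("manual evidence document missing")
    if not story.smoke_check_passed:
        unmet.append("smoke check not passed")
    if not story.no_regressions:
        unmet.append("regressions present")
    if not story.code_reviewed:
        unmet.append("code not reviewed")
    return unmet
```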
---
## Test Results
*Fill in after testing is complete.*
| Story | Automated | Manual | Result | Notes |
|-------|-----------|--------|--------|-------|
| [title] | PASS | — | PASS | |
| [title] | — | PASS | PASS | |
| [title] | FAIL | — | BLOCKED | [describe failure] |
---
## Bugs Found
| Bug ID | Severity | Test Case | Description | Status |
|--------|----------|-----------|-------------|--------|
| ID | Story | Severity | Description | Status |
|----|-------|----------|-------------|--------|
| BUG-001 | | S[1-4] | | Open |
---
## Sign-Off
- **QA Tester**: [Name] -- [Date]
- **QA Lead**: [Name] -- [Date]
- **Feature Owner**: [Name] -- [Date]
- **QA Tester**: [name] [date]
- **QA Lead**: [name] [date]
- **Sprint Owner**: [name] [date]
*Template: `.claude/docs/templates/test-plan.md`*
*Generated by: `/qa-plan` — do not edit this line*