mirror of
https://github.com/Donchitos/Claude-Code-Game-Studios.git
synced 2026-06-27 04:51:46 +00:00
Add /skill-test suite: linter, behavioral specs, and coverage catalog for 52 skills
- New skill: /skill-test (static | spec | audit modes)
  - static: 7-check structural linter per skill file
  - spec: Claude-evaluated behavioral assertions against test specs
  - audit: coverage report across all 52 skills with priority gaps
- New hook: validate-skill-change.sh — advisory reminder to lint after skill edits
- New template: skill-test-spec.md — standard structure for authoring test specs
- New: tests/skills/catalog.yaml — machine-readable coverage index (52 skills)
- New: tests/skills/_fixtures/ — shared fixtures (complete concept, incomplete GDD)
- New: 4 seed test specs for critical gate skills (gate-check, design-review, story-readiness, story-done) — 4 cases each
- Modified: settings.json — validate-skill-change.sh added to PostToolUse hook

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
144
tests/skills/design-review.md
Normal file
@@ -0,0 +1,144 @@
# Skill Test Spec: /design-review

## Skill Summary

`/design-review` reads a game design document (GDD) and evaluates it against
the project's 8-section design standard (Overview, Player Fantasy, Detailed
Rules, Formulas, Edge Cases, Dependencies, Tuning Knobs, Acceptance Criteria).
It checks for internal consistency, implementability, and cross-system
conflicts. It produces a verdict of APPROVED, NEEDS REVISION, or MAJOR
REVISION NEEDED. It is a read-only skill (no file writes) and runs as a
`context: fork` subagent.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings or numbered steps
- [ ] Contains verdict keywords: APPROVED, NEEDS REVISION, MAJOR REVISION NEEDED
- [ ] Does NOT require "May I write" language (read-only skill — `allowed-tools` excludes Write/Edit)
- [ ] Output format is documented (review template shown in skill body)

---

## Test Cases

### Case 1: Happy Path — Complete GDD, all 8 sections present

**Fixture:**
- `design/gdd/light-manipulation.md` exists (use `_fixtures/minimal-game-concept.md`
  as a stand-in — represents a complete document with all required content)
- All 8 required sections are populated with substantive content
- Formulas section contains at least one formula with defined variables
- Acceptance Criteria section contains at least 3 testable criteria

**Input:** `/design-review design/gdd/light-manipulation.md`

**Expected behavior:**
1. Skill reads the target document in full
2. Skill reads CLAUDE.md for project context and standards
3. Skill evaluates all 8 required sections (present/absent check)
4. Skill checks internal consistency (formulas match described behavior)
5. Skill checks implementability (rules are precise enough to code)
6. Skill outputs structured review with section-by-section status
7. Skill outputs APPROVED verdict

**Assertions:**
- [ ] Skill reads the target file before producing any output
- [ ] Output includes a "Completeness" section showing X/8 sections present
- [ ] Output includes an "Internal Consistency" section
- [ ] Output includes an "Implementability" section
- [ ] Output ends with a verdict line: APPROVED / NEEDS REVISION / MAJOR REVISION NEEDED
- [ ] APPROVED verdict is given when all 8 sections are present and consistent
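
Spec mode is described as Claude-evaluated, but the output-shape assertions above could also be pre-checked mechanically. The sketch below is only an illustration under stated assumptions: `final_verdict` and `check_happy_path` are hypothetical names, section titles are matched as plain substrings, and the verdict is taken from the last line that names one (rather than strictly the final line), since Protocol Compliance also expects a recommended next step at the end.

```python
import re

VERDICTS = ("APPROVED", "NEEDS REVISION", "MAJOR REVISION NEEDED")


def final_verdict(output):
    """Return the verdict on the last line that names one, or None if there is none."""
    for line in reversed(output.splitlines()):
        for verdict in VERDICTS:
            if verdict in line:
                return verdict
    return None


def check_happy_path(output):
    """Map each mechanically checkable Case 1 assertion to True or False."""
    # Look for "Completeness" followed by an "X/8" count, possibly on the next line.
    match = re.search(r"Completeness\D*(\d+)\s*/\s*8", output)
    verdict = final_verdict(output)
    return {
        "completeness_x_of_8": match is not None,
        "internal_consistency_section": "Internal Consistency" in output,
        "implementability_section": "Implementability" in output,
        "has_verdict_line": verdict is not None,
        "approved_when_8_of_8": (
            match is not None and match.group(1) == "8" and verdict == "APPROVED"
        ),
    }
```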

---

### Case 2: Failure Path — Incomplete GDD (4/8 sections)

**Fixture:**
- `design/gdd/light-manipulation.md` exists using content from
  `tests/skills/_fixtures/incomplete-gdd.md` (4 of 8 sections populated;
  Formulas, Edge Cases, Tuning Knobs, Acceptance Criteria are missing)

**Input:** `/design-review design/gdd/light-manipulation.md`

**Expected behavior:**
1. Skill reads the document
2. Skill identifies 4 missing sections
3. Skill outputs "Completeness: 4/8 sections present"
4. Skill lists specifically which 4 sections are missing
5. Skill outputs MAJOR REVISION NEEDED verdict (not APPROVED or NEEDS REVISION)

**Assertions:**
- [ ] Output shows "4/8" in the completeness section (not a higher number)
- [ ] Output explicitly names each missing section (Formulas, Edge Cases, Tuning Knobs, Acceptance Criteria)
- [ ] Verdict is MAJOR REVISION NEEDED (not APPROVED or NEEDS REVISION) when ≥3 sections are missing
- [ ] Output does not suggest the document is implementation-ready
- [ ] Skill does not write any files (read-only enforcement)
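
To make the "4/8" expectation concrete, here is a rough sketch of a present/absent count over the eight required sections. It assumes each section appears as a Markdown heading whose text matches the section name exactly (case-insensitive) and it checks presence only, not whether a section is substantively populated; `completeness_report` is a hypothetical helper, not part of the skill.

```python
import re

REQUIRED_SECTIONS = [
    "Overview",
    "Player Fantasy",
    "Detailed Rules",
    "Formulas",
    "Edge Cases",
    "Dependencies",
    "Tuning Knobs",
    "Acceptance Criteria",
]


def completeness_report(gdd_text):
    """Return (summary_line, missing_sections) for a GDD given as Markdown text."""
    # Collect the text of every Markdown heading, lowercased for comparison.
    headings = {
        m.group(1).strip().lower()
        for m in re.finditer(r"^#{1,6}\s+(.+?)\s*$", gdd_text, flags=re.MULTILINE)
    }
    missing = [s for s in REQUIRED_SECTIONS if s.lower() not in headings]
    present = len(REQUIRED_SECTIONS) - len(missing)
    summary = f"Completeness: {present}/{len(REQUIRED_SECTIONS)} sections present"
    return summary, missing
```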

---

### Case 3: Partial Path — 7/8 sections, minor inconsistency

**Fixture:**
- GDD has all sections except Formulas
- The described behavior mentions numeric values but no formulas are defined
- Acceptance Criteria exist but are vague ("feels good" rather than measurable)

**Input:** `/design-review design/gdd/[document].md`

**Expected behavior:**
1. Skill identifies missing Formulas section
2. Skill flags vague acceptance criteria as an implementability issue
3. Skill outputs NEEDS REVISION verdict (not APPROVED, not MAJOR REVISION NEEDED)
4. Skill provides specific remediation notes for each issue

**Assertions:**
- [ ] Verdict is NEEDS REVISION (not APPROVED, not MAJOR REVISION NEEDED) for 7/8 with issues
- [ ] Output identifies the missing Formulas section specifically
- [ ] Output flags the vague acceptance criteria as an implementability gap
- [ ] Each flagged issue has a specific, actionable remediation note
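
Taken together, Cases 1 to 3 imply a verdict rule that can be sketched as below. The spec only pins down three points (8/8 and consistent gives APPROVED, three or more missing sections gives MAJOR REVISION NEEDED, 7/8 with issues gives NEEDS REVISION); treating every other combination as NEEDS REVISION is an assumption of this sketch, and `expected_verdict` is a hypothetical name.

```python
REQUIRED_SECTIONS = [
    "Overview",
    "Player Fantasy",
    "Detailed Rules",
    "Formulas",
    "Edge Cases",
    "Dependencies",
    "Tuning Knobs",
    "Acceptance Criteria",
]


def expected_verdict(present_sections, issue_count=0):
    """Return (verdict, missing_sections) for the given present sections and issue count."""
    missing = [s for s in REQUIRED_SECTIONS if s not in present_sections]
    if len(missing) >= 3:
        # Stated in Case 2: three or more missing sections.
        return "MAJOR REVISION NEEDED", missing
    if missing or issue_count > 0:
        # Stated for 7/8 with issues in Case 3; assumed for other in-between cases.
        return "NEEDS REVISION", missing
    # Stated in Case 1: all 8 sections present and consistent.
    return "APPROVED", missing
```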

---

### Case 4: Edge Case — File not found

**Fixture:**
- The path provided does not exist in the project

**Input:** `/design-review design/gdd/nonexistent.md`

**Expected behavior:**
1. Skill attempts to read the file
2. The read fails because the file does not exist
3. Skill outputs an error message naming the missing file
4. Skill suggests checking the path or listing files in `design/gdd/`
5. Skill does NOT produce a verdict

**Assertions:**
- [ ] Skill outputs a clear error when the file is not found
- [ ] Skill does NOT output APPROVED, NEEDS REVISION, or MAJOR REVISION NEEDED when file is missing
- [ ] Skill suggests a corrective action (check path, list available GDDs)
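
The expected behavior for a missing file can be illustrated with a small guard function. This is a sketch only: the skill itself is prompt-driven, `missing_file_message` is a hypothetical helper, and the wording of the message is an assumption. What it demonstrates is the shape the assertions ask for: the missing file is named, a corrective action is suggested, and no verdict keyword appears.

```python
from pathlib import Path


def missing_file_message(target, gdd_dir="design/gdd"):
    """Return an error message if target does not exist, otherwise None."""
    if Path(target).is_file():
        return None
    directory = Path(gdd_dir)
    # List the Markdown files that do exist so the user can correct the path.
    available = sorted(p.name for p in directory.glob("*.md")) if directory.is_dir() else []
    if available:
        hint = "Available GDDs: " + ", ".join(available)
    else:
        hint = f"No GDD files found in {gdd_dir}/."
    return f"Error: file not found: {target}. Check the path. {hint}"
```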

---

## Protocol Compliance

- [ ] Does NOT use Write or Edit tools (read-only skill)
- [ ] Presents complete findings before any verdict
- [ ] Does not ask for approval before producing output (no writes to approve)
- [ ] Ends with recommended next step (e.g., fix issues and re-run, or proceed to `/map-systems`)

---

## Coverage Notes

- Cross-system consistency checking (phase 3 in the skill's own phase list) is
  not directly tested here because it requires multiple GDD files to compare;
  this is covered by the `/review-all-gdds` spec instead.
- The skill's `context: fork` behavior (running as a subagent) is not tested
  at the spec level; it is a runtime behavior verified manually.
- Performance and edge cases involving very large GDD files are not in scope.