Claude-Code-Game-Studios/CCGS Skill Testing Framework/README.md
Donchitos 984023ddac Release v1.0.0 — concept-prototype/vertical-slice split, workflow restructure, polish (#50)
* Add /vertical-slice skill, prototype overhaul, and workflow integration

- Add /vertical-slice skill for pre-production validation (Phase 4 gate)
- Overhaul /prototype skill with two-mode design: concept prototype (Phase 1)
  vs vertical slice (Phase 4), with clearer differentiation and higher standards for VS
- Update prototyper agent to own both prototype and vertical-slice workflows
- Add prototype-report.md and vertical-slice-report.md output templates
- Update WORKFLOW-GUIDE, quick-start, skills-reference, agent-coordination-map,
  and skill-flow-diagrams to fully integrate both skills into the 7-phase pipeline
- Remove orphaned empty quick-prototype/ directory

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* sync v1 counts + polish

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add entity inventory flow, relax vertical-slice gate, improve UX authoring prompts

- /asset-spec: new Phase 0b entity & screen inventory when no argument and no
  existing inventory — reads GDDs/art-bible, proposes categorized list, writes
  design/assets/entity-inventory.md collaboratively
- /asset-spec: entity/character target falls back to inline user description
  when no source doc exists, rather than failing
- /gate-check: vertical slice changed from blocking to CONCERNS-only when
  absent; built-but-broken slice still fails; adds entity inventory as gate artifact
- /ux-design: convert inline approval prompts to AskUserQuestion for structured
  option capture at key authoring decision points
- workflow-catalog.yaml: entity-inventory step added to pre-production; UX spec
  min_count raised to 3; vertical-slice and prototype marked required: false with
  updated descriptions
- .gitignore: exclude marrow/ eval tooling directory

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add missing AskUserQuestion widgets to 7 skills

Audit found 11 decision points across 7 skills where structured option
prompts were missing — using plain text, auto-selection, or no gate at all.

Skills patched:
- create-epics: per-epic approval + producer CONCERNS verdict
- sprint-plan: producer CONCERNS verdict with scope/timeline options
- milestone-review: AT RISK / OFF TRACK producer verdicts require acknowledgement
- retrospective: existing-retro handling converted from plain text [A]/[B]
- quick-design: classification confirmation + draft approve/revise/redirect
- tech-debt add mode: category (6 options) + effort (S/M/L/XL) structured capture
- regression-suite: no-arg mode selection instead of silent auto-detect
- hotfix: severity confirmation gate before workflow begins

Also added AskUserQuestion to allowed-tools headers for retrospective,
quick-design, tech-debt, regression-suite, and hotfix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Prep v1 stable: fix WORKFLOW-GUIDE counts, stale agent names, and skill model fields

- WORKFLOW-GUIDE.md: correct agent count (48→49), skill count (66/68→73),
  add 6 missing skills to Appendix B, fix Creative category count (2→4),
  replace 3 non-existent agent names with correct ue-*/unity-* specialists,
  add missing godot-csharp/gdextension specialists to hierarchy,
  fix production/stories/ paths → production/epics/
- coordination-rules.md: replace "not yet used" with opt-in env var note
- quick-start.md: rename duplicate "Validate the concept" label → "Prototype the mechanic"
- skill-flow-diagrams.md: remove duplicate legacy UX pipeline section
- All 62 skills missing model: field now have explicit model: sonnet

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: comprehensive skill audit — consistency, UX, and flow gaps

Two-pass audit fixing ~35 bugs across 41 files.

Pre-production flow:
- Brainstorm next-steps split into Path A (design-first) and Path B
  (prototype-first) — eliminates "prototype after architecture" confusion
- /architecture-review added to pre-production flow in brainstorm and
  create-architecture handoffs
- gate-check traceability check corrected to requirements-traceability.md
- dev-story TR registry error now points to /architecture-review (not /create-epics)
- start now writes production/stage.txt on first onboarding

AskUserQuestion gaps filled:
- balance-check, code-review, hotfix, day-one-patch, consistency-check
  all gain closing widgets and/or missing allowed-tools declarations
- hotfix git branch creation now requires user confirmation
- sprint-plan review-mode setup moved to Phase 0 (before gates run)
- team-combat gains architecture→implementation approval gate
- design-review APPROVED path consolidated from 3 widgets to 1 multiSelect

All 9 team-* skills:
- Phase 0 review-mode resolution added (solo/lean/full now respected)
- team-audio output path fixed (design/gdd/ → design/audio/)
- team-level final doc compilation delegated to level-designer subagent
- team-narrative localization-lead added to composition list
- team-qa sprint path fixed (flat files, not directories)
- team-release NO-GO override captures written justification
- team-live-ops Cancel verdict now explicitly BLOCKED

Other fixes:
- Art bible path standardized to design/art/art-bible.md (3 wrong refs)
- AD-PHASE-GATE added to lean-mode skip list in director-gates.md
- design-system duplicate 5d heading fixed; skeleton decline path added;
  mandatory agent spawns now respect review mode
- story-readiness acceptance criteria thresholds now type-aware
- create-stories gains multi-ADR and no-ADR handling guidance
- consistency-check creates docs/consistency-failures.md on first run
- retrospective frontmatter bash injection replaced with explicit Bash call
- smoke-check ls -t gains PowerShell fallback
- Conventional Commits format documented in coding-standards.md
- gate-check: ADR acceptance gate, QA plan check, chain-of-verification
  tool-action requirement all added

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: expose --review flag in argument-hints for all team-* skills

All 9 team-* skills already implement Phase 0 review-mode resolution
internally (full/lean/solo), but none advertised [--review full|lean|solo]
in their argument-hint. Users had no way to discover the per-run override.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add SECURITY.md with coordinated disclosure policy

Defines scope, reporting process (GitHub private vulnerability reporting),
contributor security guidelines for hooks/skills/agents, and 90-day
coordinated disclosure timeline.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add CONTRIBUTING.md with framework contribution guidelines

Covers what PRs are welcome, skill/hook/agent technical requirements,
the collaborative principle, testing expectations, commit format,
and platform compatibility requirements.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add v1.0.0-beta → v1.0 upgrade section to UPGRADING.md

Documents the 17 commits since the beta tag: new /vertical-slice gate,
entity inventory flow in /map-systems, AskUserQuestion widgets across
7 skills, --review flag exposure on team-* skills, bug fixes
(#21, #36, #42, #43, #45), and the new CONTRIBUTING.md and SECURITY.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 20:15:08 +10:00


# CCGS Skill Testing Framework
Quality assurance infrastructure for the **Claude Code Game Studios** framework.
Tests the skills and agents themselves — not any game built with them.
> **This folder is self-contained and optional.**
> Game developers using CCGS don't need it. To remove it entirely:
> `rm -rf "CCGS Skill Testing Framework"` — nothing in `.claude/` depends on it.
---
## What's in here
```
CCGS Skill Testing Framework/
├── README.md ← you are here
├── CLAUDE.md ← tells Claude how to use this framework
├── catalog.yaml ← master registry: all 73 skills + 49 agents, coverage tracking
├── quality-rubric.md ← category-specific pass/fail metrics for /skill-test category
├── skills/ ← behavioral spec files for skills (one per skill)
│ ├── gate/ ← gate category specs
│ ├── review/ ← review category specs
│ ├── authoring/ ← authoring category specs
│ ├── readiness/ ← readiness category specs
│ ├── pipeline/ ← pipeline category specs
│ ├── analysis/ ← analysis category specs
│ ├── team/ ← team category specs
│ ├── sprint/ ← sprint category specs
│ └── utility/ ← utility category specs
├── agents/ ← behavioral spec files for agents (one per agent)
│ ├── directors/ ← creative-director, technical-director, producer, art-director
│ ├── leads/ ← lead-programmer, narrative-director, audio-director, etc.
│ ├── specialists/ ← engine/code/shader/UI specialists
│ ├── godot/ ← Godot-specific specialists
│ ├── unity/ ← Unity-specific specialists
│ ├── unreal/ ← Unreal-specific specialists
│ ├── operations/ ← QA, live-ops, release, localization, etc.
│ └── creative/ ← writer, world-builder, game-designer, etc.
├── templates/ ← spec file templates for writing new specs
│ ├── skill-test-spec.md ← template for skill behavioral specs
│ └── agent-test-spec.md ← template for agent behavioral specs
└── results/ ← test run outputs (written by /skill-test spec, gitignored)
```
---
## How to use it
All testing is driven by two skills already in the framework, `/skill-test` and `/skill-improve`:
### Check structural compliance
```
/skill-test static [skill-name] # Check one skill (7 checks)
/skill-test static all # Check all 73 skills
```
### Run a behavioral spec test
```
/skill-test spec gate-check # Evaluate a skill against its written spec
/skill-test spec design-review
```
### Check against category rubric
```
/skill-test category gate-check # Evaluate one skill against its category metrics
/skill-test category all # Run rubric checks across all categorized skills
```
### See full coverage picture
```
/skill-test audit # Skills + agents: has-spec, last tested, result
```
### Improve a failing skill
```
/skill-improve gate-check # Test → diagnose → propose fix → retest loop
```
---
## Skill categories
| Category | Skills | Key metrics |
|----------|--------|-------------|
| `gate` | gate-check | Review mode read, full/lean/solo director panel, no auto-advance |
| `review` | design-review, architecture-review, review-all-gdds | Read-only, 8-section check, correct verdicts |
| `authoring` | design-system, quick-design, art-bible, create-architecture, … | Section-by-section May-I-write, skeleton-first |
| `readiness` | story-readiness, story-done | Blockers surfaced, director gate in full mode |
| `pipeline` | create-epics, create-stories, dev-story, map-systems, … | Upstream dependency check, handoff path clear |
| `analysis` | consistency-check, balance-check, code-review, tech-debt, … | Read-only report, verdict keyword, no writes |
| `team` | team-combat, team-narrative, team-audio, … | All required agents spawned, blocked surfaced |
| `sprint` | sprint-plan, sprint-status, milestone-review, … | Reads sprint data, status keywords present |
| `utility` | start, adopt, hotfix, localize, setup-engine, … | Passes static checks |
---
## Agent tiers
| Tier | Agents |
|------|--------|
| `directors` | creative-director, technical-director, producer, art-director |
| `leads` | lead-programmer, narrative-director, audio-director, ux-designer, qa-lead, release-manager, localization-lead |
| `specialists` | gameplay-programmer, engine-programmer, ui-programmer, tools-programmer, network-programmer, ai-programmer, level-designer, sound-designer, technical-artist |
| `godot` | godot-specialist, godot-gdscript-specialist, godot-csharp-specialist, godot-shader-specialist, godot-gdextension-specialist |
| `unity` | unity-specialist, unity-ui-specialist, unity-shader-specialist, unity-dots-specialist, unity-addressables-specialist |
| `unreal` | unreal-specialist, ue-gas-specialist, ue-replication-specialist, ue-umg-specialist, ue-blueprint-specialist |
| `operations` | devops-engineer, security-engineer, performance-analyst, analytics-engineer, community-manager |
| `creative` | writer, world-builder, game-designer, economy-designer, systems-designer, prototyper |
---
## Updating the catalog
`catalog.yaml` tracks test coverage for every skill and agent. After running a test:
- `/skill-test spec [name]` will offer to update `last_spec` and `last_spec_result`
- `/skill-test category [name]` will offer to update `last_category` and `last_category_result`
- `last_static` and `last_static_result` are updated manually or via `/skill-improve`
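
To see what is currently recorded before or after a test run, the coverage fields named above can be listed from a shell. This is a minimal sketch and not part of the framework: `show_coverage_fields` is a hypothetical helper, and it assumes `catalog.yaml` stores each field as a plain `key: value` line, which this README does not confirm. Run with no argument from inside the framework folder, it reads `catalog.yaml`; a different path can be passed as the first argument.

```bash
# Hypothetical helper (not shipped with CCGS): print every spec/coverage
# field in the catalog, prefixed with its line number.
# Assumes one "key: value" pair per line; adjust if the real layout differs.
show_coverage_fields() {
  catalog="${1:-catalog.yaml}"

  # Match the field names mentioned in this README:
  #   spec, last_spec, last_category, last_static, and their *_result variants.
  # Leading whitespace and YAML list dashes are allowed before the key.
  grep -nE '^[[:space:]-]*(spec|last_(spec|category|static)(_result)?):' "$catalog"
}
```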
---
## Writing a new spec
1. Find the spec template at `templates/skill-test-spec.md`
2. Copy it to `skills/[category]/[skill-name].md`
3. Update the `spec:` field in `catalog.yaml` to point to the new file
4. Run `/skill-test spec [skill-name]` to validate it
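
Steps 1 and 2 above can be sketched as a small shell function. `new_spec` is a hypothetical helper, not something the framework provides, and it assumes it is run from inside the `CCGS Skill Testing Framework/` folder. Called as `new_spec gate gate-check`, it would copy the template to `skills/gate/gate-check.md`; steps 3 and 4 (editing `catalog.yaml` and running `/skill-test spec`) remain manual.

```bash
# Hypothetical helper (not shipped with CCGS): copy the skill spec template
# into the right category folder. Run from inside the framework folder.
new_spec() {
  category="$1"   # one of the category folders, e.g. gate, review, authoring
  skill="$2"      # the skill name, e.g. gate-check
  dest="skills/$category/$skill.md"

  if [ -z "$category" ] || [ -z "$skill" ]; then
    echo "usage: new_spec <category> <skill-name>" >&2
    return 2
  fi

  # Refuse to overwrite a spec that already exists.
  if [ -e "$dest" ]; then
    echo "spec already exists: $dest" >&2
    return 1
  fi

  mkdir -p "skills/$category" || return 1
  cp templates/skill-test-spec.md "$dest" || return 1
  echo "created $dest; next, point the spec: field for $skill in catalog.yaml at it"
}
```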
---
## Removing this framework
This folder has no hooks into the main project. To remove:
```bash
rm -rf "CCGS Skill Testing Framework"
```
The skills `/skill-test` and `/skill-improve` will still function — they'll simply
report that `catalog.yaml` is missing and suggest running `/skill-test audit` to
initialize it.