mirror of
https://github.com/Donchitos/Claude-Code-Game-Studios.git
synced 2026-06-27 04:51:46 +00:00
Add /skill-test suite: linter, behavioral specs, and coverage catalog for 52 skills
- New skill: /skill-test (static | spec | audit modes)
  - static: 7-check structural linter per skill file
  - spec: Claude-evaluated behavioral assertions against test specs
  - audit: coverage report across all 52 skills with priority gaps
- New hook: validate-skill-change.sh — advisory reminder to lint after skill edits
- New template: skill-test-spec.md — standard structure for authoring test specs
- New: tests/skills/catalog.yaml — machine-readable coverage index (52 skills)
- New: tests/skills/_fixtures/ — shared fixtures (complete concept, incomplete GDD)
- New: 4 seed test specs for critical gate skills (gate-check, design-review, story-readiness, story-done) — 4 cases each
- Modified: settings.json — validate-skill-change.sh added to PostToolUse hook

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
96
.claude/docs/templates/skill-test-spec.md
vendored
Normal file
@@ -0,0 +1,96 @@
# Skill Test Spec: /[skill-name]

## Skill Summary

[One paragraph: what this skill does, when to use it, what it produces. Include
the primary output artifact, the verdict format it uses, and which pipeline stage
it belongs to.]

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings (## Phase N or numbered ## sections)
- [ ] Contains verdict keywords: [list the ones expected, e.g., PASS, FAIL, CONCERNS]
- [ ] Contains "May I write" collaborative protocol language (if skill writes files)
- [ ] Has a next-step handoff at the end

---

## Test Cases

### Case 1: Happy Path — [short description]

**Fixture:** [Describe the assumed project state. Which files exist? What do they
contain? E.g., "game-concept.md exists with all 8 required sections complete.
systems-index.md exists. All MVP GDDs are present and individually reviewed."]

**Input:** `/[skill-name] [args]`

**Expected behavior:**
1. [Phase 1 action — what the skill should read or check]
2. [Phase 2 action — what the skill should evaluate]
3. [Phase N action — what the skill should output]

**Assertions:**
- [ ] Skill reads [specific file] before producing output
- [ ] Output includes verdict keyword [PASS/FAIL/etc.]
- [ ] Output lists [specific content] from the fixture
- [ ] Skill asks for approval before writing any file

---

### Case 2: Failure Path — [short description, e.g., "Missing required artifact"]

**Fixture:** [Describe the failure state. E.g., "game-concept.md is missing.
No files exist in design/gdd/."]

**Input:** `/[skill-name] [args]`

**Expected behavior:**
1. [Phase 1: skill detects missing file]
2. [Phase 2: skill surfaces the gap rather than assuming OK]
3. [Output: FAIL or BLOCKED verdict with specific blocker named]

**Assertions:**
- [ ] Skill does NOT output PASS when the fixture is incomplete
- [ ] Skill names the specific missing artifact
- [ ] Skill suggests a remediation action (e.g., "Run /[other-skill]")
- [ ] Skill does not create files to fill in the gap without asking

---

### Case 3: Edge Case — [short description, e.g., "No argument provided"]

**Fixture:** [State of project files for this case]

**Input:** `/[skill-name]` (no argument)

**Expected behavior:**
1. [What the skill should do when invoked without arguments]

**Assertions:**
- [ ] [assertion]

---

## Protocol Compliance

- [ ] Uses "May I write" before all file writes
- [ ] Presents findings or report before asking for write approval
- [ ] Ends with a recommended next step or follow-up skill
- [ ] Never auto-creates files without explicit user approval
- [ ] Does not skip phases or jump straight to a verdict without checking

---

## Coverage Notes

[Document what is intentionally NOT tested in this spec and why. Examples:
- "Case 3 (all-mode) is not covered because it runs too many checks to evaluate
  in a single spec — test each sub-mode individually."
- "The database integration path is not covered as it requires a live environment."
- "Edge cases involving corrupted YAML files are deferred to a future spec."]
39
.claude/hooks/validate-skill-change.sh
Normal file
@@ -0,0 +1,39 @@
#!/bin/bash
# Claude Code PostToolUse hook: Advises running skill-test after skill file changes
# Fires when any file inside .claude/skills/ is written or edited.
#
# Exit behavior:
#   exit 0 = advisory only (non-blocking)
#
# Input schema (PostToolUse for Write|Edit):
# { "tool_name": "Write", "tool_input": { "file_path": "...", "content": "..." } }

INPUT=$(cat)

# Parse file path -- use jq if available, fall back to grep
if command -v jq >/dev/null 2>&1; then
    FILE_PATH=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty')
else
    FILE_PATH=$(echo "$INPUT" | grep -oE '"file_path"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/"file_path"[[:space:]]*:[[:space:]]*"//;s/"$//')
fi

# Normalize path separators (Windows backslash to forward slash)
FILE_PATH=$(echo "$FILE_PATH" | sed 's|\\|/|g')

# Only act on files inside .claude/skills/
if ! echo "$FILE_PATH" | grep -qE '(^|/)\.claude/skills/'; then
    exit 0
fi

# Extract skill name from path (.claude/skills/[skill-name]/SKILL.md)
SKILL_NAME=$(echo "$FILE_PATH" | grep -oE '\.claude/skills/[^/]+' | sed 's|\.claude/skills/||')

if [ -z "$SKILL_NAME" ]; then
    exit 0
fi

echo "=== Skill Modified: $SKILL_NAME ===" >&2
echo "Run /skill-test static $SKILL_NAME to validate structural compliance." >&2
echo "====================================" >&2

exit 0
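
The hook's path handling can be tried in isolation. The following is a minimal sketch, assuming the same normalization and extraction steps as the script above; the function name `skill_name_from_path` is hypothetical and is not part of this commit.

```bash
#!/bin/bash
# Hypothetical helper (not in the commit) that mirrors the hook's path logic.
skill_name_from_path() {
    local path
    # Normalize Windows backslashes to forward slashes, as the hook does.
    path=$(printf '%s' "$1" | sed 's|\\|/|g')
    # Print the directory name that follows .claude/skills/, or nothing at all.
    printf '%s\n' "$path" \
        | grep -oE '(^|/)\.claude/skills/[^/]+' \
        | sed 's|.*\.claude/skills/||'
}

# Example calls: a POSIX path, a Windows path, and an unrelated file.
skill_name_from_path '.claude/skills/gate-check/SKILL.md'
skill_name_from_path 'C:\repo\.claude\skills\story-done\SKILL.md'
skill_name_from_path 'src/main.gd'
```

A path outside `.claude/skills/` prints nothing, which corresponds to the hook's silent `exit 0` branch.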
@@ -74,6 +74,11 @@
             "type": "command",
             "command": "bash .claude/hooks/validate-assets.sh",
             "timeout": 10
+          },
+          {
+            "type": "command",
+            "command": "bash .claude/hooks/validate-skill-change.sh",
+            "timeout": 5
           }
         ]
       }
290
.claude/skills/skill-test/SKILL.md
Normal file
@@ -0,0 +1,290 @@
---
name: skill-test
description: "Validate skill files for structural compliance and behavioral correctness. Three modes: static (linter), spec (behavioral), audit (coverage report)."
argument-hint: "static [skill-name | all] | spec [skill-name] | audit"
user-invocable: true
allowed-tools: Read, Glob, Grep, Write
context: fork
---

# Skill Test

Validates `.claude/skills/*/SKILL.md` files for structural compliance and
behavioral correctness. No external dependencies — runs entirely within the
existing skill/hook/template architecture.

**Three modes:**

| Mode | Command | Purpose | Token Cost |
|------|---------|---------|------------|
| `static` | `/skill-test static [name\|all]` | Structural linter — 7 compliance checks per skill | Low (~1k/skill) |
| `spec` | `/skill-test spec [name]` | Behavioral verifier — evaluates assertions in test spec | Medium (~5k/skill) |
| `audit` | `/skill-test audit` | Coverage report — which skills have specs, last test dates | Low (~2k total) |

---

## Phase 1: Parse Arguments

Determine mode from the first argument:

- `static [name]` → run 7 structural checks on one skill
- `static all` → run 7 structural checks on all skills (Glob `.claude/skills/*/SKILL.md`)
- `spec [name]` → read skill + test spec, evaluate assertions
- `audit` (or no argument) → read catalog, list all skills, show coverage

If the mode is unrecognized, or a required skill name is missing, output usage and stop.
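
The dispatch rules can be sketched as a shell `case` statement. This is an illustration only: the skill is evaluated by Claude rather than run as a script, the function name `parse_mode` and its output labels are hypothetical, and the sketch assumes that a missing first argument defaults to `audit`, as the mode list states.

```bash
#!/bin/bash
# Hypothetical dispatcher; names and output labels are illustrative only.
parse_mode() {
    local mode="${1:-audit}"   # assumption: no argument means audit
    local target="${2:-}"
    case "$mode" in
        static)
            if [ -z "$target" ]; then
                echo "usage"
            elif [ "$target" = "all" ]; then
                echo "static-all"
            else
                echo "static-one:$target"
            fi
            ;;
        spec)
            # spec needs a skill name to locate the skill and its test spec.
            if [ -z "$target" ]; then echo "usage"; else echo "spec:$target"; fi
            ;;
        audit)
            echo "audit"
            ;;
        *)
            echo "usage"
            ;;
    esac
}

parse_mode static gate-check
parse_mode static all
parse_mode
parse_mode frobnicate
```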

---

## Phase 2A: Static Mode — Structural Linter

For each skill being tested, read its `SKILL.md` fully and run all 7 checks:

### Check 1 — Required Frontmatter Fields
The file must contain all of these in the YAML frontmatter block:
- `name:`
- `description:`
- `argument-hint:`
- `user-invocable:`
- `allowed-tools:`

**FAIL** if any are absent.
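
For readers who want a mechanical approximation of Check 1, here is a minimal sketch. It assumes the frontmatter is the block between the first two `---` lines at the top of the file and that each field sits at the start of a line; the function name `missing_frontmatter_fields` is hypothetical, and the skill itself performs this check by reading the file.

```bash
#!/bin/bash
# Hypothetical helper: prints each required frontmatter field that is absent.
missing_frontmatter_fields() {
    local file="$1" field frontmatter
    # Keep only the lines between the opening and closing '---' delimiters.
    # If line 1 is not '---', the file is treated as having no frontmatter.
    frontmatter=$(awk '
        NR == 1 && $0 !~ /^---[ \t\r]*$/ { exit }
        /^---[ \t\r]*$/ { n++; next }
        n == 1 { print }
        n >= 2 { exit }
    ' "$file")
    for field in name description argument-hint user-invocable allowed-tools; do
        # A field counts as present when some frontmatter line starts with "field:".
        printf '%s\n' "$frontmatter" | grep -q "^${field}:" || printf '%s\n' "$field"
    done
}

# Demo on a throwaway file that omits two of the five fields.
demo=$(mktemp)
printf '%s\n' '---' 'name: demo' 'description: "x"' 'user-invocable: true' '---' '# Body' > "$demo"
missing_frontmatter_fields "$demo"
rm -f "$demo"
```

Empty output corresponds to PASS; any printed field name corresponds to FAIL.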

### Check 2 — Multiple Phases
The skill must have ≥2 phase headings. Look for patterns like:
- `## Phase N` or `## Phase N:`
- `## N.` (numbered top-level sections)
- At least 2 distinct `##` headings if phases aren't explicitly numbered

**FAIL** if fewer than 2 phase-like headings are found.
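
A rough shell approximation of Check 2 is sketched below. It assumes phase headings start at column 0, allows a letter suffix such as `2A`, and falls back to counting every `##` heading when none are numbered. The function name `phase_check` is hypothetical, and headings that appear inside fenced code blocks would be counted too, which a careful implementation would want to exclude.

```bash
#!/bin/bash
# Hypothetical helper: reports PASS or FAIL with the number of phase-like headings.
phase_check() {
    local file="$1" phases
    # Count "## Phase N" / "## Phase N:" and "## N." headings.
    phases=$(grep -cE '^## (Phase [0-9]+[A-Za-z]?|[0-9]+\.)' "$file")
    # Fallback: when no heading is numbered, count all level-2 headings.
    if [ "$phases" -eq 0 ]; then
        phases=$(grep -c '^## ' "$file")
    fi
    if [ "$phases" -ge 2 ]; then
        echo "PASS ($phases phases found)"
    else
        echo "FAIL ($phases phases found)"
    fi
}

# Demo on a throwaway file with two numbered phases.
demo=$(mktemp)
printf '%s\n' '## Phase 1: Parse Arguments' '### Step 1' '## Phase 2A: Static Mode' > "$demo"
phase_check "$demo"
rm -f "$demo"
```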

### Check 3 — Verdict Keywords
The skill must contain at least one of: `PASS`, `FAIL`, `CONCERNS`, `APPROVED`,
`BLOCKED`, `COMPLETE`, `READY`, `COMPLIANT`, `NON-COMPLIANT`

**FAIL** if none are present.
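
Check 3 amounts to a whole-word, case-sensitive search. A minimal sketch, assuming the keywords only count in upper case and as whole words (so `PASSED` does not count as `PASS`); the function name `verdict_keywords` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: lists the distinct verdict keywords found in a file.
verdict_keywords() {
    # -o prints each match on its own line; -w rejects matches inside longer words.
    grep -owE 'NON-COMPLIANT|COMPLIANT|PASS|FAIL|CONCERNS|APPROVED|BLOCKED|COMPLETE|READY' "$1" \
        | sort -u
}

# Demo on a throwaway file; "PASSED" is deliberately not a keyword.
demo=$(mktemp)
printf '%s\n' 'Verdict: PASS or FAIL' 'All checks PASSED' > "$demo"
verdict_keywords "$demo"
rm -f "$demo"
```

An empty result corresponds to FAIL for this check.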

### Check 4 — Collaborative Protocol Language
The skill must contain ask-before-write language. Look for:
- `"May I write"` (canonical form)
- `"before writing"` or `"approval"` near file-write instructions
- `"ask"` + `"write"` in close proximity (within same section)

**WARN** if absent and `allowed-tools` grants neither `Write` nor `Edit` (some read-only skills legitimately skip this).
**FAIL** if `allowed-tools` includes `Write` or `Edit` but no ask-before-write language is found.
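
The WARN versus FAIL decision in Check 4 can be sketched as follows. This is a simplification under stated assumptions: it matches the canonical phrase and the two looser phrases anywhere in the file, and it does not attempt the "ask" + "write" proximity heuristic. The function name `protocol_check` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: prints PASS, WARN, or FAIL for the ask-before-write check.
protocol_check() {
    local file="$1" writes=no asks=no
    # Does the allowed-tools line grant Write or Edit?
    grep -E '^allowed-tools:' "$file" | grep -qwE 'Write|Edit' && writes=yes
    # Canonical phrase plus the looser variants (case-insensitive).
    grep -qiE 'may i write|before writing|approval' "$file" && asks=yes
    if [ "$asks" = yes ]; then
        echo PASS
    elif [ "$writes" = yes ]; then
        echo FAIL   # the skill can write files but never asks first
    else
        echo WARN   # read-only skill without the protocol language
    fi
}

# Demo on a throwaway file: a writing skill with no protocol language.
demo=$(mktemp)
printf '%s\n' '---' 'allowed-tools: Read, Write' '---' '## Phase 1' > "$demo"
protocol_check "$demo"
rm -f "$demo"
```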

### Check 5 — Next-Step Handoff
The skill must end with a recommended next action or follow-up path. Look for:
- A final section mentioning another skill (e.g., `/story-done`, `/gate-check`)
- "Recommended next" or "next step" phrasing
- A "Follow-Up" or "After this" section

**WARN** if absent.

### Check 6 — Fork Context Complexity
If frontmatter contains `context: fork`, the skill should have ≥5 phase headings
(`##` level or numbered Phase N headers). Fork context is for complex multi-phase
skills; simple skills should not use it.

**WARN** if `context: fork` is set but fewer than 5 phases found.

### Check 7 — Argument Hint Plausibility
`argument-hint` must be non-empty. If the skill body mentions multiple modes
(e.g., "Mode A | Mode B"), the hint should reflect them. Cross-reference the
hint against the first phase's "Parse Arguments" section.

**WARN** if hint is `""` or if documented modes don't match hint.

---

### Static Mode Output Format

For a single skill:
```
=== Skill Static Check: /[name] ===

Check 1 — Frontmatter Fields:          PASS
Check 2 — Multiple Phases:             PASS (7 phases found)
Check 3 — Verdict Keywords:            PASS (PASS, FAIL, CONCERNS)
Check 4 — Collaborative Protocol:      PASS ("May I write" found)
Check 5 — Next-Step Handoff:           WARN (no follow-up section found)
Check 6 — Fork Context Complexity:     PASS (8 phases, context: fork set)
Check 7 — Argument Hint:               PASS

Verdict: WARNINGS (1 warning, 0 failures)
Recommended: Add a "Follow-Up Actions" section at the end of the skill.
```

For `static all`, produce a summary table then list any non-compliant skills:
```
=== Skill Static Check: All 52 Skills ===

Skill                  | Result       | Issues
-----------------------|--------------|-------
gate-check             | COMPLIANT    |
design-review          | COMPLIANT    |
story-readiness        | WARNINGS     | Check 5: no handoff
...

Summary: 48 COMPLIANT, 3 WARNINGS, 1 NON-COMPLIANT
Aggregate Verdict: N WARNINGS / N FAILURES
```

---

## Phase 2B: Spec Mode — Behavioral Verifier

### Step 1 — Locate Files

Find skill at `.claude/skills/[name]/SKILL.md`.
Find spec at `tests/skills/[name].md`.

If either is missing:
- Missing skill: "Skill '[name]' not found in `.claude/skills/`."
- Missing spec: "No test spec found for '[name]'. Run `/skill-test audit` to see
  coverage gaps, or create a spec using the template at
  `.claude/docs/templates/skill-test-spec.md`."

### Step 2 — Read Both Files

Read the skill file and test spec file completely.

### Step 3 — Evaluate Assertions

For each **Test Case** in the spec:

1. Read the **Fixture** description (assumed state of project files)
2. Read the **Expected behavior** steps
3. Read each **Assertion** checkbox

For each assertion, evaluate whether the skill's written instructions, if
followed correctly given the fixture state, would satisfy it. This is a
Claude-evaluated reasoning check, not code execution.

Mark each assertion:
- **PASS** — skill instructions clearly satisfy this assertion
- **PARTIAL** — skill instructions partially address it, but with ambiguity
- **FAIL** — skill instructions would NOT satisfy this assertion given the fixture

For **Protocol Compliance** assertions (always present):
- Check whether the skill requires "May I write" before file writes
- Check whether the skill presents findings before requesting approval
- Check whether the skill ends with a recommended next step
- Check whether the skill avoids auto-creating files without approval

### Step 4 — Build Report

```
=== Skill Spec Test: /[name] ===
Date: [date]
Spec: tests/skills/[name].md

Case 1: [Happy Path — name]
  Fixture: [summary]
  Assertions:
    [PASS] [assertion text]
    [FAIL] [assertion text]
           Reason: The skill's Phase 3 says "..." but the fixture state means "..."
  Case Verdict: FAIL

Case 2: [Edge Case — name]
  ...
  Case Verdict: PASS

Protocol Compliance:
  [PASS] Uses "May I write" before file writes
  [PASS] Presents findings before asking approval
  [WARN] No explicit next-step handoff at end

Overall Verdict: FAIL (1 case failed, 1 warning)
```

### Step 5 — Offer to Write Results

"May I write these results to `tests/results/skill-test-spec-[name]-[date].md`
and update `tests/skills/catalog.yaml`?"

If yes:
- Write results file to `tests/results/`
- Update the skill's entry in `tests/skills/catalog.yaml`:
  - `last_spec: [date]`
  - `last_spec_result: PASS|PARTIAL|FAIL`

---

## Phase 2C: Audit Mode — Coverage Report

### Step 1 — Read Catalog

Read `tests/skills/catalog.yaml`. If missing, note that catalog doesn't exist
yet (first-run state).

### Step 2 — Enumerate All Skills

Glob `.claude/skills/*/SKILL.md` to get the complete list of skills.
Extract skill name from each path (directory name).

### Step 3 — Build Coverage Table

For each skill:
- Check if a spec file exists at `tests/skills/[name].md`
- Look up `last_static`, `last_static_result`, `last_spec`, `last_spec_result`
  from catalog (or mark as "never" if not in catalog)
- Assign priority:
  - `critical` — gate-check, design-review, story-readiness, story-done, review-all-gdds, architecture-review
  - `high` — create-epics-stories, create-control-manifest, propagate-design-change
  - `medium` — team-* skills, sprint-plan, sprint-status
  - `low` — all others
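
The priority tiers map naturally onto a `case` statement with glob patterns. The sketch below covers only the skill names listed in this step; the function name `skill_priority` is hypothetical. Because `case` takes the first matching branch, a name that appears under `critical` stays critical even if it is also listed in a lower tier.

```bash
#!/bin/bash
# Hypothetical helper: maps a skill name to its audit priority tier.
skill_priority() {
    case "$1" in
        gate-check|design-review|story-readiness|story-done|review-all-gdds|architecture-review)
            echo critical ;;
        create-epics-stories|create-control-manifest|propagate-design-change)
            echo high ;;
        team-*|sprint-plan|sprint-status)
            echo medium ;;
        *)
            echo low ;;
    esac
}

for skill in gate-check create-epics-stories team-audio brainstorm; do
    printf '%s: %s\n' "$skill" "$(skill_priority "$skill")"
done
```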

### Step 4 — Output Report

```
=== Skill Test Coverage Audit ===
Date: [date]
Total skills: 52
Specs written: 4 (7.7%)
Never tested (static): 48

Coverage Table:
Skill                  | Has Spec | Last Static      | Static Result | Last Spec        | Spec Result | Priority
-----------------------|----------|------------------|---------------|------------------|-------------|----------
gate-check             | YES      | never            | —             | never            | —           | critical
design-review          | YES      | never            | —             | never            | —           | critical
story-readiness        | YES      | never            | —             | never            | —           | critical
story-done             | YES      | never            | —             | never            | —           | critical
architecture-review    | NO       | never            | —             | never            | —           | critical
review-all-gdds        | NO       | never            | —             | never            | —           | critical
...

Top 5 Priority Gaps (no spec, critical/high priority):
1. /architecture-review — critical, no spec
2. /review-all-gdds — critical, no spec
3. /create-epics-stories — high, no spec
4. /propagate-design-change — high, no spec
5. /create-control-manifest — high, no spec

Coverage: 4/52 specs (7.7%)
```
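
The coverage figure in the report is the number of specs divided by the number of skills, rounded to one decimal place. A minimal sketch of that arithmetic, with a hypothetical function name `coverage_line` and a guard for an empty skill list (an assumption, since the report format does not cover that case):

```bash
#!/bin/bash
# Hypothetical helper: formats the final coverage line of the audit report.
coverage_line() {
    local specs="$1" total="$2"
    awk -v s="$specs" -v t="$total" 'BEGIN {
        # Avoid dividing by zero when no skills were found.
        if (t == 0) { print "Coverage: 0/0 specs (n/a)"; exit }
        printf "Coverage: %d/%d specs (%.1f%%)\n", s, t, 100 * s / t
    }'
}

coverage_line 4 52   # prints: Coverage: 4/52 specs (7.7%)
```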

No file writes in audit mode.

Offer: "Would you like to run `/skill-test static all` to check structural
compliance across all skills? Or `/skill-test spec [name]` to run a specific
behavioral test?"

---

## Phase 3: Recommended Next Steps

After any mode completes, offer contextual follow-up:

- After `static [name]`: "Run `/skill-test spec [name]` to validate behavioral
  correctness if a test spec exists."
- After `static all` with failures: "Address NON-COMPLIANT skills first. Run
  `/skill-test static [name]` individually for detailed remediation guidance."
- After `spec [name]` PASS: "Update `tests/skills/catalog.yaml` to record this
  pass date. Consider running `/skill-test audit` to find the next spec gap."
- After `spec [name]` FAIL: "Review the failing assertions and update the skill
  or the test spec to resolve the mismatch."
- After `audit`: "Start with the critical-priority gaps. Use the spec template
  at `.claude/docs/templates/skill-test-spec.md` to create new specs."
51
tests/skills/_fixtures/incomplete-gdd.md
Normal file
@@ -0,0 +1,51 @@
# GDD: Light Manipulation System

## Overview

The light manipulation system allows players to interact with bioluminescent
organisms and ancient light conduits to redirect beams of light. Light beams
illuminate dark areas, power ancient mechanisms, and reveal hidden surfaces.

## Player Fantasy

The player should feel like a puzzle archaeologist — discovering the logic of
an alien but internally consistent technology. The "aha" moment when a complex
light path clicks into place should feel earned and satisfying.

## Detailed Rules

- Players can pick up portable light sources (max 3 carried at once)
- Stationary conduits redirect beams at fixed angles (45°/90°/135°/180°)
- Light beams are blocked by solid terrain and most objects
- Living bioluminescent organisms pulse light on a 3-second cycle
- Ancient mirrors rotate freely and redirect any light beam that touches them
- A beam must reach a receptor to activate a mechanism

## Formulas

[SECTION MISSING — not yet authored]

## Edge Cases

[SECTION MISSING — not yet authored]

## Dependencies

- **Oxygen System**: Light sources consume no oxygen but picking them up takes
  time (opportunity cost with oxygen drain)
- **Cave Navigation**: Illuminated paths reveal branching routes not visible
  in darkness
- Player Inventory System (not yet designed)

## Tuning Knobs

[SECTION MISSING — not yet authored]

## Acceptance Criteria

[SECTION MISSING — not yet authored]

---

*Status: Draft — 4/8 required sections populated*
*Last updated: 2026-03-13*
62
tests/skills/_fixtures/minimal-game-concept.md
Normal file
@@ -0,0 +1,62 @@
# Game Concept: Echoes of the Deep

## Overview

Echoes of the Deep is a single-player atmospheric puzzle-platformer set in
a bioluminescent underwater cave network. Players control a deep-sea diver
exploring ancient ruins while managing oxygen supplies and manipulating light
sources to reveal hidden paths and solve environmental puzzles.

## Player Fantasy

The player should feel like a lone explorer uncovering a lost civilization,
experiencing wonder at beautiful environments, and the satisfying "aha" moment
when a clever puzzle clicks into place. The oxygen mechanic creates gentle
pressure without punishing failure harshly.

## Core Loop

1. **Explore** — navigate branching cave sections using light and movement
2. **Discover** — find oxygen caches, light sources, and ancient mechanisms
3. **Solve** — manipulate light and environment to unlock new areas
4. **Progress** — unlock deeper cave sections with escalating complexity

## Game Pillars

1. **Wonder** — every area should contain something visually or mechanically surprising
2. **Accessibility** — the game should be completable without frustration; oxygen
   manages pacing, not punishment
3. **Environmental Storytelling** — the ruins tell a story without text exposition

## Target Audience

Casual-to-midcore players who enjoy relaxed exploration games (Subnautica,
Journey, ABZÛ) and puzzle games that reward observation over reflexes.
Target age: 16+. Target sessions: 30–90 minutes.

## Unique Selling Points

- Bioluminescent light manipulation as the core puzzle mechanic
- No enemies — tension comes from environment and resource management
- Procedurally decorated (handcrafted levels, procedural detail pass)

## Technical Scope

- **Engine**: Godot 4.6
- **Platform**: PC (Steam), with console ports post-launch
- **Team size**: Solo developer
- **Target completion**: 12-month development cycle
- **Scope**: 4–6 hours main story, 8–12 hours completionist

## Art Direction

Darkly atmospheric with vibrant bioluminescence providing the primary color
palette. Deep blues, purples, and blacks punctuated by greens, teals, and
ambers from living organisms and ancient technology.

## Fun Hypothesis

Players will feel rewarded by the combination of visual beauty and the
satisfying moment of discovering how light manipulation solves each puzzle.
The oxygen system will create just enough pressure to make exploration feel
meaningful without making death feel punishing.
438
tests/skills/catalog.yaml
Normal file
@@ -0,0 +1,438 @@
|
|||||||
|
version: 1
|
||||||
|
last_updated: ""
|
||||||
|
skills:
|
||||||
|
# Critical — gate skills that control phase transitions
|
||||||
|
- name: gate-check
|
||||||
|
spec: tests/skills/gate-check.md
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: critical
|
||||||
|
|
||||||
|
- name: design-review
|
||||||
|
spec: tests/skills/design-review.md
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: critical
|
||||||
|
|
||||||
|
- name: story-readiness
|
||||||
|
spec: tests/skills/story-readiness.md
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: critical
|
||||||
|
|
||||||
|
- name: story-done
|
||||||
|
spec: tests/skills/story-done.md
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: critical
|
||||||
|
|
||||||
|
- name: review-all-gdds
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: critical
|
||||||
|
|
||||||
|
- name: architecture-review
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: critical
|
||||||
|
|
||||||
|
# High — pipeline-critical skills
|
||||||
|
- name: create-epics-stories
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: high
|
||||||
|
|
||||||
|
- name: create-control-manifest
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: high
|
||||||
|
|
||||||
|
- name: propagate-design-change
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: high
|
||||||
|
|
||||||
|
- name: architecture-decision
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: high
|
||||||
|
|
||||||
|
- name: map-systems
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: high
|
||||||
|
|
||||||
|
- name: design-system
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: high
|
||||||
|
|
||||||
|
# Medium — team and sprint management skills
|
||||||
|
- name: sprint-plan
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: medium
|
||||||
|
|
||||||
|
- name: sprint-status
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: medium
|
||||||
|
|
||||||
|
- name: team-ui
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: medium
|
||||||
|
|
||||||
|
- name: team-combat
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: medium
|
||||||
|
|
||||||
|
- name: team-narrative
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: medium
|
||||||
|
|
||||||
|
- name: team-audio
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: medium
|
||||||
|
|
||||||
|
- name: team-level
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: medium
|
||||||
|
|
||||||
|
- name: team-polish
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: medium
|
||||||
|
|
||||||
|
- name: team-release
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: medium
|
||||||
|
|
||||||
|
- name: team-live-ops
|
||||||
|
spec: ""
|
||||||
|
last_static: ""
|
||||||
|
last_static_result: ""
|
||||||
|
last_spec: ""
|
||||||
|
last_spec_result: ""
|
||||||
|
priority: medium
|
||||||
|
|
||||||
|
# Low — analysis, reporting, utility skills
- name: skill-test
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: medium

- name: start
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: help
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: brainstorm
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: project-stage-detect
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: setup-engine
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: quick-design
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: ux-design
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: ux-review
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: code-review
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: balance-check
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: asset-audit
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: reverse-document
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: create-architecture
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: content-audit
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: bug-report
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: hotfix
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: prototype
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: playtest-report
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: perf-profile
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: tech-debt
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: scope-check
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: estimate
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: milestone-review
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: retrospective
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: changelog
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: patch-notes
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: onboard
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: localize
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: launch-checklist
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: release-checklist
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low

- name: adopt
  spec: ""
  last_static: ""
  last_static_result: ""
  last_spec: ""
  last_spec_result: ""
  priority: low
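
The catalog entries above share one flat shape (`name`, `spec`, four `last_*` fields, `priority`), so a coverage gap report can be derived from them mechanically. The following is a minimal Python sketch of that idea, not the `/skill-test audit` implementation. The helper names `parse_catalog` and `missing_specs_by_priority` are hypothetical, the sample values for `gate-check` (including the `critical` priority) are assumed for illustration, and the parser handles only this flat layout rather than general YAML.

```python
from collections import Counter

# Illustrative sample only: the gate-check values are assumed, not copied from the catalog.
SAMPLE_CATALOG = """\
- name: gate-check
  spec: tests/skills/gate-check.md
  priority: critical

- name: team-release
  spec: ""
  priority: medium

# Low
- name: adopt
  spec: ""
  priority: low
"""


def parse_catalog(text):
    """Parse flat `- name:` entries with scalar fields (not a general YAML parser)."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and group comments
        if line.startswith("- "):
            entries.append({})  # a leading dash opens a new skill entry
            line = line[2:]
        key, sep, value = line.partition(":")
        if sep and entries:
            entries[-1][key.strip()] = value.strip().strip('"')
    return entries


def missing_specs_by_priority(entries):
    """Count skills whose `spec` field is empty, grouped by priority."""
    return Counter(e.get("priority", "unknown") for e in entries if not e.get("spec"))


entries = parse_catalog(SAMPLE_CATALOG)
print(missing_specs_by_priority(entries))
```

A real audit would more likely load the file with a YAML library; the hand-rolled parser is used here only to keep the sketch dependency-free.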
144
tests/skills/design-review.md
Normal file
@@ -0,0 +1,144 @@
# Skill Test Spec: /design-review

## Skill Summary

`/design-review` reads a game design document (GDD) and evaluates it against
the project's 8-section design standard (Overview, Player Fantasy, Detailed
Rules, Formulas, Edge Cases, Dependencies, Tuning Knobs, Acceptance Criteria).
It checks for internal consistency, implementability, and cross-system
conflicts. It produces a verdict of APPROVED, NEEDS REVISION, or MAJOR
REVISION NEEDED. It is a read-only skill (no file writes) and runs as a
`context: fork` subagent.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings or numbered steps
- [ ] Contains verdict keywords: APPROVED, NEEDS REVISION, MAJOR REVISION NEEDED
- [ ] Does NOT require "May I write" language (read-only skill — `allowed-tools` excludes Write/Edit)
- [ ] Output format is documented (review template shown in skill body)
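
The first static assertion can be checked without any fixture, since it only inspects the skill file's frontmatter. Below is a minimal Python sketch of such a check. It assumes the skill file opens with a `---` delimited frontmatter block of top-level `key: value` lines; the function name `missing_frontmatter_fields` and the sample skill text are hypothetical and not taken from the `/skill-test` linter itself.

```python
REQUIRED_FIELDS = ("name", "description", "argument-hint", "user-invocable", "allowed-tools")

# Assumed sample: a skill file that forgot `argument-hint`.
SAMPLE_SKILL = """\
---
name: design-review
description: Review a GDD against the design standard
user-invocable: true
allowed-tools: Read, Glob, Grep
---
## Phase 1
"""


def missing_frontmatter_fields(skill_text):
    """Return the required frontmatter fields that the skill file does not define."""
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return list(REQUIRED_FIELDS)  # no frontmatter block at all
    present = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, _ = line.partition(":")
        # Only top-level keys count; indented lines belong to nested values.
        if sep and not line.startswith((" ", "\t")):
            present.add(key.strip())
    return [field for field in REQUIRED_FIELDS if field not in present]


print(missing_frontmatter_fields(SAMPLE_SKILL))
```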

---

## Test Cases

### Case 1: Happy Path — Complete GDD, all 8 sections present

**Fixture:**
- `design/gdd/light-manipulation.md` exists (use `_fixtures/minimal-game-concept.md`
  as a stand-in — represents a complete document with all required content)
- All 8 required sections are populated with substantive content
- Formulas section contains at least one formula with defined variables
- Acceptance Criteria section contains at least 3 testable criteria

**Input:** `/design-review design/gdd/light-manipulation.md`

**Expected behavior:**
1. Skill reads the target document in full
2. Skill reads CLAUDE.md for project context and standards
3. Skill evaluates all 8 required sections (present/absent check)
4. Skill checks internal consistency (formulas match described behavior)
5. Skill checks implementability (rules are precise enough to code)
6. Skill outputs structured review with section-by-section status
7. Skill outputs APPROVED verdict

**Assertions:**
- [ ] Skill reads the target file before producing any output
- [ ] Output includes a "Completeness" section showing X/8 sections present
- [ ] Output includes an "Internal Consistency" section
- [ ] Output includes an "Implementability" section
- [ ] Output ends with a verdict line: APPROVED / NEEDS REVISION / MAJOR REVISION NEEDED
- [ ] APPROVED verdict is given when all 8 sections are present and consistent

---

### Case 2: Failure Path — Incomplete GDD (4/8 sections)

**Fixture:**
- `design/gdd/light-manipulation.md` exists using content from
  `tests/skills/_fixtures/incomplete-gdd.md` (4 of 8 sections populated;
  Formulas, Edge Cases, Tuning Knobs, Acceptance Criteria are missing)

**Input:** `/design-review design/gdd/light-manipulation.md`

**Expected behavior:**
1. Skill reads the document
2. Skill identifies 4 missing sections
3. Skill outputs "Completeness: 4/8 sections present"
4. Skill lists specifically which 4 sections are missing
5. Skill outputs MAJOR REVISION NEEDED verdict (not APPROVED or NEEDS REVISION)

**Assertions:**
- [ ] Output shows "4/8" in the completeness section (not a higher number)
- [ ] Output explicitly names each missing section (Formulas, Edge Cases, Tuning Knobs, Acceptance Criteria)
- [ ] Verdict is MAJOR REVISION NEEDED (not APPROVED or NEEDS REVISION) when ≥3 sections are missing
- [ ] Output does not suggest the document is implementation-ready
- [ ] Skill does not write any files (read-only enforcement)

---

### Case 3: Partial Path — 7/8 sections, minor inconsistency

**Fixture:**
- GDD has all sections except Formulas
- The described behavior mentions numeric values but no formulas are defined
- Acceptance Criteria exist but are vague ("feels good" rather than measurable)

**Input:** `/design-review design/gdd/[document].md`

**Expected behavior:**
1. Skill identifies missing Formulas section
2. Skill flags vague acceptance criteria as an implementability issue
3. Skill outputs NEEDS REVISION verdict (not APPROVED, not MAJOR REVISION NEEDED)
4. Skill provides specific remediation notes for each issue

**Assertions:**
- [ ] Verdict is NEEDS REVISION (not APPROVED, not MAJOR REVISION NEEDED) for 7/8 with issues
- [ ] Output identifies the missing Formulas section specifically
- [ ] Output flags the vague acceptance criteria as an implementability gap
- [ ] Each flagged issue has a specific, actionable remediation note
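
Cases 1 to 3 together imply a mapping from completeness and flagged issues to a verdict. The Python sketch below encodes one reading of that mapping. Only three points come from the cases above: 8/8 with no issues is APPROVED, three or more missing sections is MAJOR REVISION NEEDED, and 7/8 with issues is NEEDS REVISION. Treating every other combination as NEEDS REVISION is an assumption, and the function name `review_verdict` is hypothetical.

```python
REQUIRED_SECTIONS = (
    "Overview", "Player Fantasy", "Detailed Rules", "Formulas",
    "Edge Cases", "Dependencies", "Tuning Knobs", "Acceptance Criteria",
)


def review_verdict(present_sections, issues):
    """Return (verdict, missing_sections) for a reviewed GDD.

    present_sections: names of sections found in the document.
    issues: consistency or implementability problems that were flagged.
    """
    missing = [s for s in REQUIRED_SECTIONS if s not in present_sections]
    if len(missing) >= 3:
        verdict = "MAJOR REVISION NEEDED"
    elif missing or issues:
        # Assumption: any smaller gap or any flagged issue needs a revision pass.
        verdict = "NEEDS REVISION"
    else:
        verdict = "APPROVED"
    return verdict, missing


# Case 3 fixture: everything except Formulas, plus vague acceptance criteria.
all_but_formulas = [s for s in REQUIRED_SECTIONS if s != "Formulas"]
print(review_verdict(all_but_formulas, ["vague acceptance criteria"]))
```

Returning the missing section names alongside the verdict mirrors the assertions that the output must name each missing section rather than only report a count.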

---

### Case 4: Edge Case — File not found

**Fixture:**
- The path provided does not exist in the project

**Input:** `/design-review design/gdd/nonexistent.md`

**Expected behavior:**
1. Skill attempts to read the file
2. File not found
3. Skill outputs an error message naming the missing file
4. Skill suggests checking the path or listing files in `design/gdd/`
5. Skill does NOT produce a verdict

**Assertions:**
- [ ] Skill outputs a clear error when the file is not found
- [ ] Skill does NOT output APPROVED, NEEDS REVISION, or MAJOR REVISION NEEDED when file is missing
- [ ] Skill suggests a corrective action (check path, list available GDDs)

---

## Protocol Compliance

- [ ] Does NOT use Write or Edit tools (read-only skill)
- [ ] Presents complete findings before any verdict
- [ ] Does not ask for approval before producing output (no writes to approve)
- [ ] Ends with recommended next step (e.g., fix issues and re-run, or proceed to `/map-systems`)

---

## Coverage Notes

- Cross-system consistency checking (Case 3 in the skill's own phase list) is
  not directly tested here because it requires multiple GDD files to compare;
  this is covered by the `/review-all-gdds` spec instead.
- The skill's `context: fork` behavior (running as a subagent) is not tested
  at the spec level — this is a runtime behavior verified manually.
- Performance and edge cases involving very large GDD files are not in scope.
144
tests/skills/gate-check.md
Normal file
@@ -0,0 +1,144 @@
# Skill Test Spec: /gate-check

## Skill Summary

`/gate-check` validates whether the project is ready to advance to the next
development phase. It checks for required artifacts, runs quality checks, asks
the user about unverifiable items, and produces a PASS/CONCERNS/FAIL verdict.
On PASS with user confirmation, it writes the new stage name to
`production/stage.txt`. It governs all 6 phase transitions and is the most
critical gate-keeping skill in the pipeline.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings (numbered Phase N or ## sections)
- [ ] Contains verdict keywords: PASS, CONCERNS, FAIL
- [ ] Contains "May I write" collaborative protocol language
- [ ] Has a next-step handoff at the end (Follow-Up Actions section)

---

## Test Cases

### Case 1: Happy Path — All Concept artifacts present, advancing to Systems Design

**Fixture:**
- `design/gdd/game-concept.md` exists, has content including all required sections
- `design/gdd/game-pillars.md` exists (or pillars defined within concept doc)
- No systems index yet (which is correct for this stage)

**Input:** `/gate-check systems-design`

**Expected behavior:**
1. Skill reads `design/gdd/game-concept.md` and verifies it has content
2. Skill checks for game pillars (in concept or separate file)
3. Skill checks quality items (core loop described, target audience identified)
4. Skill outputs structured checklist with all items marked
5. Skill presents PASS/CONCERNS/FAIL verdict
6. If PASS: skill asks "May I update `production/stage.txt` to 'Systems Design'?"

**Assertions:**
- [ ] Skill uses Glob or Read to verify `design/gdd/game-concept.md` exists before marking it checked
- [ ] Output includes a "Required Artifacts" section with check status per item
- [ ] Output includes a "Quality Checks" section with check status per item
- [ ] Output includes a "Verdict" line with one of PASS / CONCERNS / FAIL
- [ ] Skill asks about unverifiable quality items (e.g., "Has this been reviewed?") rather than assuming PASS
- [ ] Skill asks "May I write" before updating `production/stage.txt`
- [ ] Skill does NOT write `production/stage.txt` without explicit user confirmation

---

### Case 2: Failure Path — Missing required artifacts for Concept → Systems Design

**Fixture:**
- `design/gdd/game-concept.md` does NOT exist
- No game pillars document exists
- `design/gdd/` directory is empty or absent

**Input:** `/gate-check systems-design`

**Expected behavior:**
1. Skill attempts to read `design/gdd/game-concept.md` — file not found
2. Skill marks required artifact as missing (not present)
3. Skill outputs FAIL verdict
4. Skill lists blocker: "No game concept document found"
5. Skill suggests remediation: run `/brainstorm` to create one

**Assertions:**
- [ ] Verdict is FAIL (not PASS or CONCERNS) when required artifacts are missing
- [ ] Output explicitly names `design/gdd/game-concept.md` as missing
- [ ] Output includes a "Blockers" section with at least 1 item
- [ ] Output recommends `/brainstorm` as the remediation action
- [ ] Skill does NOT write `production/stage.txt` when verdict is FAIL

---

### Case 3: No Argument — Auto-detect current stage

**Fixture:**
- `production/stage.txt` contains `Concept`
- `design/gdd/game-concept.md` exists with content
- No systems index yet

**Input:** `/gate-check` (no argument)

**Expected behavior:**
1. Skill reads `production/stage.txt` to determine current stage
2. Skill determines the next gate is Concept → Systems Design
3. Skill proceeds with the Systems Design gate checks
4. Output clearly states which transition is being validated

**Assertions:**
- [ ] Skill reads `production/stage.txt` (or uses project-stage-detect heuristics) to determine current stage
- [ ] Output header names both current and target phases (e.g., "Gate Check: Concept → Systems Design")
- [ ] Skill does not ask the user which gate to check if current stage is determinable

---

### Case 4: Edge Case — Manual check items flagged correctly

**Fixture:**
- All required artifacts for Concept → Systems Design are present
- No playtest or review record exists (can't auto-verify quality checks)

**Input:** `/gate-check systems-design`

**Expected behavior:**
1. Skill verifies all artifact files exist
2. Skill encounters quality check: "Game concept reviewed (not MAJOR REVISION NEEDED)"
3. Since no review record exists, skill marks item as MANUAL CHECK NEEDED
4. Skill asks the user: "Has the game concept been reviewed for design quality?"
5. Skill waits for user input before finalizing verdict

**Assertions:**
- [ ] Items that cannot be auto-verified are marked `[?] MANUAL CHECK NEEDED` rather than assumed PASS
- [ ] Skill uses a question to the user for at least one unverifiable quality item
- [ ] Skill does not mark unverifiable items as PASS by default
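
The four cases describe how checklist item states roll up into a gate verdict, including the rule that unverifiable items block the verdict until the user answers. The Python sketch below is one possible encoding under stated assumptions: the item tuple layout, the status names `ok`, `missing`, `concern`, and `manual`, and the function name `gate_verdict` are all hypothetical, and the CONCERNS rule (any non-blocking gap) is inferred from the Coverage Notes rather than specified by a test case.

```python
def gate_verdict(items):
    """Roll checklist items up into a gate verdict.

    items: list of (label, kind, status) tuples, where kind is "artifact" or
    "quality" and status is "ok", "missing", "concern", or "manual".
    Returns (verdict, questions). The verdict is None while manual items still
    need a user answer, so nothing is assumed to pass by default.
    """
    # A missing required artifact blocks the gate outright (Case 2).
    if any(kind == "artifact" and status == "missing" for _, kind, status in items):
        return "FAIL", []
    # Unverifiable items must be asked about before a verdict is final (Case 4).
    questions = [label for label, _, status in items if status == "manual"]
    if questions:
        return None, questions
    # Assumption: any remaining non-ok item is a minor, non-blocking gap.
    if any(status != "ok" for _, _, status in items):
        return "CONCERNS", []
    return "PASS", []


checklist = [
    ("design/gdd/game-concept.md exists", "artifact", "ok"),
    ("Game concept reviewed", "quality", "manual"),
]
print(gate_verdict(checklist))
```

In this sketch the caller would resolve each returned question with the user, update the item status, and call the function again before asking for permission to write `production/stage.txt`.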

---

## Protocol Compliance

- [ ] Uses "May I write" before updating `production/stage.txt`
- [ ] Presents the full checklist report before asking for write approval
- [ ] Ends with a "Follow-Up Actions" section listing next steps per verdict
- [ ] Never advances the stage without explicit user confirmation
- [ ] Never auto-creates `production/stage.txt` if it doesn't exist without asking

---

## Coverage Notes

- The Production → Polish and Polish → Release gates are not covered here
  because they require complex multi-artifact setups (sprint plans, playtest
  data, QA sign-off); these are deferred to dedicated follow-up specs.
- The "CONCERNS" verdict path (minor gaps, not blocking) is not explicitly
  tested here; it falls between Case 1 and Case 2 and follows the same pattern.
- The Vertical Slice validation block (Pre-Production → Production gate) is not
  covered because it requires a playable build context that cannot be expressed
  as a document fixture.
165
tests/skills/story-done.md
Normal file
@@ -0,0 +1,165 @@
# Skill Test Spec: /story-done

## Skill Summary

`/story-done` closes the loop between design and implementation. Run at the
end of implementing a story, it reads the story file and verifies each
acceptance criterion against the implementation. It checks for GDD and ADR
deviations, prompts a code review, updates the story status to `Complete`,
logs any tech debt, and surfaces the next ready story from the sprint. It
produces a COMPLETE / COMPLETE WITH NOTES / BLOCKED verdict and writes to
the story file and optionally to `docs/tech-debt-register.md`.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥5 phase headings (complex skill warranting `context: fork` if applicable)
- [ ] Contains verdict keywords: COMPLETE, COMPLETE WITH NOTES, BLOCKED
- [ ] Contains "May I write" collaborative protocol language (writes to story file and tech-debt register)
- [ ] Has a next-step handoff (surfaces next story from sprint)

---

## Test Cases

### Case 1: Happy Path — All acceptance criteria met, no deviations

**Fixture:**
- Story file at `production/epics/core/story-light-pickup.md` with:
  - 3 acceptance criteria, all implemented as described
  - `TR-ID: TR-light-001` referencing a GDD requirement
  - `ADR: docs/architecture/adr-003-inventory.md` (Accepted)
  - `Status: In Progress`
- Implementation files listed in story exist in `src/`
- GDD requirement text at TR-light-001 matches how the feature was implemented
- ADR guidance was followed (no deviations)

**Input:** `/story-done production/epics/core/story-light-pickup.md`

**Expected behavior:**
1. Skill reads the story file and extracts all key fields
2. Skill reads the GDD requirement fresh from `tr-registry.yaml` (not from story's quoted text)
3. Skill reads the referenced ADR to understand implementation constraints
4. Skill evaluates each acceptance criterion (auto where possible, manual prompt where not)
5. Skill checks for GDD requirement deviations
6. Skill checks for ADR guideline deviations
7. Skill prompts user: "Please provide the code review outcome for this story"
8. Skill presents COMPLETE verdict
9. Skill asks "May I update story Status to Complete and add Completion Notes?"
10. If yes: skill updates the story file
11. Skill surfaces the next `Ready for Dev` story from the sprint

**Assertions:**
- [ ] Skill reads `docs/architecture/tr-registry.yaml` for TR-ID requirement text (not just story)
- [ ] Skill reads the referenced ADR file (not just the story reference)
- [ ] Each acceptance criterion is listed with VERIFIED / DEFERRED / FAILED status
- [ ] Skill prompts the user for code review outcome (does not skip this step)
- [ ] Verdict is COMPLETE when all criteria are verified and no deviations exist
- [ ] Skill asks "May I write" before updating the story file
- [ ] Skill does NOT auto-update story status without user confirmation
- [ ] After completion, skill surfaces the next ready story from `production/sprints/`

---

### Case 2: Deferred Path — Acceptance criterion cannot be verified

**Fixture:**
- Story file has an acceptance criterion: "Player sees correct animation on pickup"
- No automated test for this criterion exists
- Manual verification has not been performed
- All other criteria are met

**Input:** `/story-done production/epics/core/story-light-pickup.md`

**Expected behavior:**
1. Skill processes all acceptance criteria
2. Skill reaches the animation criterion and cannot auto-verify it
3. Skill asks the user: "Acceptance criterion 'Player sees correct animation on
   pickup' cannot be auto-verified. Has this been manually tested?"
4. If user says No: criterion is marked DEFERRED, verdict becomes COMPLETE WITH NOTES
5. Skill records the deferred criterion in completion notes
6. Skill asks "May I write updated story with deferred criterion noted?"

**Assertions:**
- [ ] Skill asks the user about unverifiable criteria rather than assuming PASS
- [ ] Deferred criteria result in COMPLETE WITH NOTES (not COMPLETE or BLOCKED)
- [ ] The deferred criterion is explicitly named in the completion notes
- [ ] Skill still asks "May I write" before updating the story file

---

### Case 3: Blocked Path — GDD deviation detected

**Fixture:**
- Story TR-ID points to requirement: "Player can carry max 3 light sources"
- Implementation in `src/` uses a variable `MAX_CARRIED_LIGHTS = 5`
- This is a deliberate deviation from the GDD

**Input:** `/story-done production/epics/core/story-light-pickup.md`

**Expected behavior:**
1. Skill reads the GDD requirement text (max 3)
2. Skill detects discrepancy between requirement and implementation value (5)
3. Skill flags this as a GDD deviation and asks the user to classify it:
   - INTENTIONAL: document the deviation and reason
   - ERROR: implementation must be fixed before story can be marked Complete
   - OUT OF SCOPE: requirement changed and GDD needs updating
4. If INTENTIONAL: skill records deviation in completion notes, verdict is COMPLETE WITH NOTES
5. If ERROR: verdict is BLOCKED until implementation is corrected

**Assertions:**
- [ ] Skill detects the mismatch between GDD requirement and implementation value
- [ ] Skill asks the user to classify the deviation (not auto-assumes either way)
- [ ] INTENTIONAL deviation → COMPLETE WITH NOTES (not BLOCKED)
- [ ] ERROR deviation → BLOCKED verdict until fixed
- [ ] Detected deviations are recorded in completion notes or tech debt register
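
Cases 1 to 3 pin down most of the verdict logic: criterion statuses (VERIFIED, DEFERRED, FAILED) and user-supplied deviation classes (INTENTIONAL, ERROR, OUT OF SCOPE) combine into COMPLETE, COMPLETE WITH NOTES, or BLOCKED. The Python sketch below shows one way to express that. Two branches are assumptions rather than stated behavior: a FAILED criterion is treated as BLOCKED, and an OUT OF SCOPE deviation is treated like INTENTIONAL (COMPLETE WITH NOTES). The function name `story_verdict` is hypothetical.

```python
def story_verdict(criteria, deviations):
    """Combine criterion statuses and deviation classes into a verdict.

    criteria: dict mapping criterion text to "VERIFIED", "DEFERRED", or "FAILED".
    deviations: list of classes chosen by the user, each one of
    "INTENTIONAL", "ERROR", or "OUT OF SCOPE".
    """
    statuses = set(criteria.values())
    # ERROR deviations block completion (Case 3); FAILED criteria are assumed to as well.
    if "FAILED" in statuses or "ERROR" in deviations:
        return "BLOCKED"
    # Deferred criteria (Case 2) and documented deviations (Case 3) need notes.
    # Assumption: OUT OF SCOPE is handled like INTENTIONAL here.
    if "DEFERRED" in statuses or deviations:
        return "COMPLETE WITH NOTES"
    return "COMPLETE"


criteria = {
    "Light source is added to inventory": "VERIFIED",
    "Player sees correct animation on pickup": "DEFERRED",
}
print(story_verdict(criteria, []))
```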

---

### Case 4: Edge Case — No argument, auto-detect current story

**Fixture:**
- `production/session-state/active.md` contains a reference to
  `production/epics/core/story-oxygen-drain.md` as the active story
- That story file exists with `Status: In Progress`

**Input:** `/story-done` (no argument)

**Expected behavior:**
1. Skill reads `production/session-state/active.md`
2. Skill finds the active story reference
3. Skill reads that story file and proceeds normally
4. Output confirms which story was auto-detected

**Assertions:**
- [ ] Skill reads `production/session-state/active.md` when no argument is given
- [ ] Skill identifies and confirms the auto-detected story before proceeding
- [ ] If no story is found in session state, skill asks the user to provide a path
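
The auto-detection in Case 4 amounts to finding a story path inside the session state file and falling back to a user prompt when none is found. Here is a minimal Python sketch of that lookup. The layout of `active.md` is not specified in this spec, so the sketch assumes the story is referenced by a literal path under `production/epics/`; the pattern, the sample text, and the function name `detect_active_story` are hypothetical.

```python
import re

# Assumption: the active story appears as a literal path under production/epics/.
STORY_PATH = re.compile(r"production/epics/[\w./-]+\.md")


def detect_active_story(session_state_text):
    """Return the first story path referenced in the session state, or None."""
    match = STORY_PATH.search(session_state_text)
    return match.group(0) if match else None


sample_state = "Active story: `production/epics/core/story-oxygen-drain.md` (In Progress)"
story = detect_active_story(sample_state)
if story is None:
    print("No active story found; please provide a story path.")
else:
    print(f"Auto-detected story: {story}")
```

Returning `None` instead of raising keeps the "ask the user to provide a path" fallback in the caller, which matches the last assertion above.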

---

## Protocol Compliance

- [ ] Uses "May I write" before updating the story file
- [ ] Uses "May I write" before adding entries to `docs/tech-debt-register.md`
- [ ] Presents complete findings (criteria check, deviation check) before asking approval
- [ ] Ends by surfacing the next ready story from the sprint plan
- [ ] Does not mark a story Complete if any criterion is FAILED or any deviation is classified ERROR
- [ ] Does not skip the code review prompt

---

## Coverage Notes

- The full 8-phase flow of the skill is exercised across Cases 1-3; not all
  edge cases within each phase are covered.
- Tech debt logging (deferred items written to `docs/tech-debt-register.md`)
  is mentioned in Case 3 but is not the primary assertion focus; dedicated
  coverage is deferred.
- The `sprint-status.yaml` update (Phase 7 in the skill) is implied by Case 1
  but is not the primary assertion; it is assumed to follow the same "May I write" pattern.
- Stories with multiple TR-IDs or multiple ADRs are not explicitly tested.
153
tests/skills/story-readiness.md
Normal file
@@ -0,0 +1,153 @@
# Skill Test Spec: /story-readiness

## Skill Summary

`/story-readiness` validates that a story file is ready for a developer to
pick up and implement. It checks four dimensions: Design (embedded GDD
requirements), Architecture (ADR references and status), Scope (clear
boundaries and DoD), and Definition of Done (testable criteria). It produces
a READY / NEEDS WORK / BLOCKED verdict. It is a read-only skill and runs
before any developer picks up a story.

---

## Static Assertions (Structural)

Verified automatically by `/skill-test static` — no fixture needed.

- [ ] Has required frontmatter fields: `name`, `description`, `argument-hint`, `user-invocable`, `allowed-tools`
- [ ] Has ≥2 phase headings or numbered check sections
- [ ] Contains verdict keywords: READY, NEEDS WORK, BLOCKED
- [ ] Does NOT require "May I write" language (read-only skill)
- [ ] Has a next-step handoff (what to do after verdict)

---

## Test Cases

### Case 1: Happy Path — Fully ready story

**Fixture:**
- Story file exists at `production/epics/core/story-light-pickup.md`
- Story contains:
  - `TR-ID: TR-light-001` (GDD requirement reference)
  - `ADR: docs/architecture/adr-003-inventory.md`
- Referenced ADR exists and has status `Accepted`
- Referenced TR-ID exists in `docs/architecture/tr-registry.yaml`
- Story has `## Acceptance Criteria` with ≥3 testable items
- Story has `## Definition of Done` section
- Story has `Status: Ready for Dev`
- Manifest version in story header matches current `docs/architecture/control-manifest.md`

**Input:** `/story-readiness production/epics/core/story-light-pickup.md`

**Expected behavior:**
1. Skill reads the story file
2. Skill reads the referenced ADR — verifies status is `Accepted`
3. Skill reads `docs/architecture/tr-registry.yaml` — verifies TR-ID exists
4. Skill reads `docs/architecture/control-manifest.md` — verifies manifest version matches
5. Skill evaluates all 4 dimensions (Design, Architecture, Scope, DoD)
6. Skill outputs READY verdict with all checks passing

**Assertions:**
- [ ] Skill reads the referenced ADR file (not just the story)
- [ ] Skill verifies ADR status is `Accepted` (not `Proposed`)
- [ ] Skill reads `tr-registry.yaml` to verify TR-ID exists
- [ ] Output includes check results for all 4 dimensions
- [ ] Verdict is READY when all checks pass
- [ ] Skill does not write any files
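
To make the Case 1 fixture concrete, here is a rough Python sketch of pulling the references out of a story file before the ADR and TR-registry lookups. It assumes the `TR-ID: ...` and `ADR: ...` field syntax shown in the fixture; real story files may format these differently. The `StoryRefs` structure and `parse_story` function are hypothetical names used only for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class StoryRefs:
    """References and sections found in a story file (hypothetical structure)."""
    tr_ids: list = field(default_factory=list)
    adr_paths: list = field(default_factory=list)
    has_acceptance_criteria: bool = False
    has_definition_of_done: bool = False


def parse_story(text: str) -> StoryRefs:
    """Extract what the readiness checks need from raw story markdown."""
    return StoryRefs(
        # Assumed field syntax from the fixture: TR-ID: TR-light-001
        tr_ids=re.findall(r"TR-ID:\s*(TR-[\w-]+)", text),
        # Assumed field syntax from the fixture: ADR: <path ending in .md>
        adr_paths=re.findall(r"ADR:\s*(\S+?\.md)", text),
        has_acceptance_criteria=bool(
            re.search(r"^## Acceptance Criteria\s*$", text, re.MULTILINE)
        ),
        has_definition_of_done=bool(
            re.search(r"^## Definition of Done\s*$", text, re.MULTILINE)
        ),
    )
```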

---

### Case 2: Blocked Path — Referenced ADR is Proposed (not Accepted)

**Fixture:**
- Story file exists at `production/epics/core/story-light-system.md` with `ADR: docs/architecture/adr-005-light-system.md`
- `adr-005-light-system.md` exists but has `Status: Proposed`
- All other story content is complete

**Input:** `/story-readiness production/epics/core/story-light-system.md`

**Expected behavior:**
1. Skill reads the story
2. Skill reads `adr-005-light-system.md` — finds `Status: Proposed`
3. Skill flags this as a BLOCKING issue (cannot implement against unaccepted ADR)
4. Skill outputs BLOCKED verdict
5. Skill recommends resolving the ADR (accept or reject it) before picking up the story

**Assertions:**
- [ ] Verdict is BLOCKED (not NEEDS WORK or READY) when ADR is Proposed
- [ ] Output explicitly names the Proposed ADR as the blocker
- [ ] Output recommends resolving ADR status before proceeding
- [ ] Skill does not output READY regardless of other checks passing
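
The verdict precedence that Case 2 asserts (a blocking issue wins no matter how many other checks pass) can be sketched as below. This is an illustration only: the `(severity, message)` tuple shape and the names `adr_issues` and `overall_verdict` are hypothetical, and treating every status other than `Accepted` as blocking is an assumption, since the spec only covers `Proposed`.

```python
ACCEPTED = "Accepted"


def adr_issues(adr_statuses: dict) -> list:
    """Return one blocking issue per referenced ADR that is not Accepted.

    adr_statuses maps ADR path -> status string read from that ADR file.
    Assumption: any status other than 'Accepted' blocks the story.
    """
    return [
        ("blocking", f"{path} has Status: {status}; resolve it before pickup")
        for path, status in adr_statuses.items()
        if status != ACCEPTED
    ]


def overall_verdict(issues: list) -> str:
    """Combine issues into one verdict: BLOCKED > NEEDS WORK > READY."""
    severities = {severity for severity, _message in issues}
    if "blocking" in severities:
        return "BLOCKED"
    if severities:
        return "NEEDS WORK"
    return "READY"
```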

---

### Case 3: Needs Work — Missing Acceptance Criteria

**Fixture:**
- Story file exists at `production/epics/core/story-oxygen-drain.md` but has no `## Acceptance Criteria` section
- ADR reference exists and is `Accepted`
- TR-ID exists in registry
- Manifest version matches

**Input:** `/story-readiness production/epics/core/story-oxygen-drain.md`

**Expected behavior:**
1. Skill reads the story
2. Skill finds no Acceptance Criteria section
3. Skill flags this as a NEEDS WORK issue (story is incomplete, not blocked)
4. Skill outputs NEEDS WORK verdict
5. Skill names the missing section and suggests adding measurable criteria

**Assertions:**
- [ ] Verdict is NEEDS WORK (not BLOCKED or READY) when Acceptance Criteria section is absent
- [ ] Output identifies the missing Acceptance Criteria section specifically
- [ ] Output suggests adding testable/measurable criteria
- [ ] Skill distinguishes NEEDS WORK (fixable without external dependencies) from BLOCKED (requires outside action)

---

### Case 4: Edge Case — Stale manifest version

**Fixture:**
- Story file at `production/epics/core/story-mirror-rotation.md` has `Manifest Version: 2026-01-15` in its header
- `docs/architecture/control-manifest.md` has `Manifest Version: 2026-03-10`
- Versions do not match (story was created before manifest was updated)

**Input:** `/story-readiness production/epics/core/story-mirror-rotation.md`

**Expected behavior:**
1. Skill reads the story and extracts manifest version `2026-01-15`
2. Skill reads control manifest header and extracts current version `2026-03-10`
3. Skill detects version mismatch
4. Skill flags this as an ADVISORY issue (not blocking, but it still prevents a READY verdict)
5. Skill outputs NEEDS WORK verdict with manifest staleness noted

**Assertions:**
- [ ] Skill reads `docs/architecture/control-manifest.md` to get current version
- [ ] Skill compares story's embedded manifest version against current manifest version
- [ ] Stale manifest version results in NEEDS WORK (not BLOCKED, not READY)
- [ ] Output explains that the story's embedded guidance may be outdated
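
The comparison in Case 4 amounts to extracting one header line from each file and checking for a mismatch. Below is a small Python sketch under the assumption that both files carry a line of the form `Manifest Version: YYYY-MM-DD`, as in the fixture. The helper names `manifest_version` and `is_stale` are hypothetical.

```python
import re
from datetime import date
from typing import Optional

# Assumed header format, taken from the Case 4 fixture.
_VERSION_LINE = re.compile(r"^Manifest Version:\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE)


def manifest_version(text: str) -> Optional[date]:
    """Return the manifest version found in the text, or None if absent."""
    match = _VERSION_LINE.search(text)
    return date.fromisoformat(match.group(1)) if match else None


def is_stale(story_text: str, manifest_text: str) -> bool:
    """True when the story's embedded version differs from the current manifest."""
    story_version = manifest_version(story_text)
    current_version = manifest_version(manifest_text)
    if story_version is None or current_version is None:
        raise ValueError("Manifest Version line not found")
    # The spec tests for a mismatch, not specifically for an older date.
    return story_version != current_version
```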

---

## Protocol Compliance

- [ ] Does NOT use Write or Edit tools (read-only skill)
- [ ] Presents complete check results before verdict
- [ ] Does not ask for approval (no file writes)
- [ ] Ends with recommended next step (fix issues or proceed to implementation)
- [ ] Distinguishes three verdict levels clearly (READY vs NEEDS WORK vs BLOCKED)

---

## Coverage Notes

- Case where TR-ID is missing from the registry entirely is not explicitly
  tested here; it is assumed to follow the same NEEDS WORK pattern as Case 3.
- The "no argument" path (skill auto-detecting the current story) is not
  tested because it depends on `production/session-state/active.md` content,
  which is hard to fixture reliably.
- Stories with multiple ADR references are not tested; behavior is assumed to
  be additive (all ADRs must be Accepted for READY verdict).