Release v0.4.0: /consistency-check, skill fixes, genre-agnostic agents

New skill: /consistency-check - cross-GDD entity registry scanner
New registries: design/registry/entities.yaml, docs/registry/architecture.yaml
Skill fixes: no-arg guards, verdict keywords, AskUserQuestion gates on all team-* skills
Agent fixes: genre-agnostic language in game-designer, systems-designer, economy-designer, live-ops-designer
Docs: skill/template counts corrected, stale references cleaned up

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-27 04:51:46 +00:00 · 2026-03-27 20:06:33 +11:00
parent 04ed5d5c36
commit 6c041ac1be
108 changed files with 2745 additions and 1005 deletions
--- a/.claude/skills/estimate/SKILL.md
+++ b/.claude/skills/estimate/SKILL.md
@@ -6,51 +6,51 @@ user-invocable: true
 allowed-tools: Read, Glob, Grep
 ---

-When this skill is invoked:
+## Phase 1: Understand the Task

-1. **Read the task description** from the argument. If the description is too
-   vague to estimate meaningfully, ask for clarification before proceeding.
+Read the task description from the argument. If the description is too vague to estimate meaningfully, ask for clarification before proceeding.

-2. **Read CLAUDE.md** for project context: tech stack, coding standards,
-   architectural patterns, and any estimation guidelines.
+Read CLAUDE.md for project context: tech stack, coding standards, architectural patterns, and any estimation guidelines.

-3. **Read relevant design documents** from `design/gdd/` if the task relates
-   to a documented feature or system.
+Read relevant design documents from `design/gdd/` if the task relates to a documented feature or system.

-4. **Scan the codebase** to understand the systems affected by this task:
-   - Identify files and modules that would need to change
-   - Assess the complexity of those files (size, dependency count, cyclomatic
-     complexity)
-   - Identify integration points with other systems
-   - Check for existing test coverage in the affected areas
+---

-5. **Read past sprint data** from `production/sprints/` if available:
-   - Look for similar completed tasks and their actual effort
-   - Calculate historical velocity (planned vs actual)
-   - Identify any estimation bias patterns (consistently over or under)
+## Phase 2: Scan Affected Code

-6. **Analyze the following factors**:
+Identify files and modules that would need to change:

-   **Code Complexity**:
-   - Lines of code in affected files
-   - Number of dependencies and coupling level
-   - Whether this touches core/engine code vs leaf/feature code
-   - Whether existing patterns can be followed or new patterns are needed
+- Assess complexity (size, dependency count, cyclomatic complexity)
+- Identify integration points with other systems
+- Check for existing test coverage in the affected areas
+- Read past sprint data from `production/sprints/` for similar completed tasks and historical velocity

-   **Scope**:
-   - Number of systems touched
-   - New code vs modification of existing code
-   - Amount of new test coverage required
-   - Data migration or configuration changes needed
+---

-   **Risk**:
-   - New technology or unfamiliar libraries
-   - Unclear or ambiguous requirements
-   - Dependencies on unfinished work
-   - Cross-system integration complexity
-   - Performance sensitivity
+## Phase 3: Analyze Complexity Factors

-7. **Generate the estimate**:
+**Code Complexity:**
+- Lines of code in affected files
+- Number of dependencies and coupling level
+- Whether this touches core/engine code vs leaf/feature code
+- Whether existing patterns can be followed or new patterns are needed
+
+**Scope:**
+- Number of systems touched
+- New code vs modification of existing code
+- Amount of new test coverage required
+- Data migration or configuration changes needed
+
+**Risk:**
+- New technology or unfamiliar libraries
+- Unclear or ambiguous requirements
+- Dependencies on unfinished work
+- Cross-system integration complexity
+- Performance sensitivity
+
+---
+
+## Phase 4: Generate the Estimate

 ```markdown
 ## Task Estimate: [Task Name]
@@ -65,99 +65,67 @@ Generated: [Date]
 |--------|-----------|-------|
 | Systems affected | [List] | [Core, gameplay, UI, etc.] |
 | Files likely modified | [Count] | [Key files listed below] |
-| New code vs modification | [Ratio, e.g., 70% new / 30% modification] | |
+| New code vs modification | [Ratio] | |
 | Integration points | [Count] | [Which systems interact] |
-| Test coverage needed | [Low / Medium / High] | [Unit, integration, manual] |
-| Existing patterns available | [Yes / Partial / No] | [Can follow existing code or new ground] |
+| Test coverage needed | [Low / Medium / High] | |
+| Existing patterns available | [Yes / Partial / No] | |

 **Key files likely affected:**
 - `[path/to/file1]` -- [what changes here]
- `[path/to/file2]` -- [what changes here]
- `[path/to/file3]` -- [what changes here]

 ### Effort Estimate

 | Scenario | Days | Assumption |
 |----------|------|------------|
-| Optimistic | [X] | Everything goes right, no surprises, requirements are clear |
-| Expected | [Y] | Normal pace, minor issues, one round of review feedback |
-| Pessimistic | [Z] | Significant unknowns surface, blocked for a day, requirements change |
+| Optimistic | [X] | Everything goes right, no surprises |
+| Expected | [Y] | Normal pace, minor issues, one round of review |
+| Pessimistic | [Z] | Significant unknowns surface, blocked for a day |

 **Recommended budget: [Y days]**

-[If historical data is available: "Based on [N] similar tasks that averaged
-[X] days actual vs [Y] days estimated, a [correction factor] adjustment has
-been applied."]
-
 ### Confidence: [High / Medium / Low]

-**High** -- Clear requirements, familiar systems, follows existing patterns,
-similar tasks completed before.
-
-**Medium** -- Some unknowns, touches moderately complex systems, partial
-precedent from previous work.
-
-**Low** -- Significant unknowns, new technology, unclear requirements, or
-cross-cutting concerns across many systems.
-
 [Explain which factors drive the confidence level for this specific task.]

 ### Risk Factors

 | Risk | Likelihood | Impact | Mitigation |
 |------|-----------|--------|------------|
-| [Specific risk] | [High/Med/Low] | [Days added if realized] | [How to reduce] |
-| [Another risk] | [Likelihood] | [Impact] | [Mitigation] |

 ### Dependencies

 | Dependency | Status | Impact if Delayed |
 |-----------|--------|-------------------|
-| [What must be done first] | [Done / In Progress / Not Started] | [How it affects this task] |

 ### Suggested Breakdown

 | # | Sub-task | Estimate | Notes |
 |---|----------|----------|-------|
-| 1 | [Research / spike] | [X days] | [If unknowns need investigation first] |
-| 2 | [Core implementation] | [X days] | [The main work] |
-| 3 | [Integration with system X] | [X days] | [Connecting to existing code] |
-| 4 | [Testing and validation] | [X days] | [Writing tests, manual verification] |
-| 5 | [Code review and iteration] | [X days] | [Review feedback, fixes] |
+| 1 | [Research / spike] | [X days] | |
+| 2 | [Core implementation] | [X days] | |
+| 3 | [Testing and validation] | [X days] | |
 | | **Total** | **[Y days]** | |

-### Historical Comparison
-[If similar tasks exist in sprint history:]
-
-| Similar Task | Estimated | Actual | Relevant Difference |
-|-------------|-----------|--------|-------------------|
-| [Past task 1] | [X days] | [Y days] | [What makes it similar/different] |
-| [Past task 2] | [X days] | [Y days] | [What makes it similar/different] |
-
 ### Notes and Assumptions
 - [Key assumption that affects the estimate]
- [Another assumption]
- [Any caveats about scope boundaries -- what is included vs excluded]
- [Recommendations: e.g., "Consider a spike first if requirement X is unclear"]
+- [Any caveats about scope boundaries]
 ```

-8. **Output the estimate** to the user with a brief summary: recommended
-   budget, confidence level, and the single biggest risk factor.
+Output the estimate with a brief summary: recommended budget, confidence level, and the single biggest risk factor.
+
+This skill is read-only — no files are written. Verdict: **COMPLETE** — estimate generated.
+
+---
+
+## Phase 5: Next Steps
+
+- If confidence is Low: recommend a time-boxed spike (`/prototype`) before committing.
+- If the task is > 10 days: recommend breaking it into smaller stories via `/create-stories`.
+- To schedule the task: run `/sprint-plan update` to add it to the next sprint.

 ### Guidelines

- Always give a range (optimistic / expected / pessimistic), never a single
-  number. Single-point estimates create false precision.
- The recommended budget should be the expected estimate, not the optimistic
-  one. Padding is not dishonest -- it is realistic.
- If confidence is Low, recommend a time-boxed spike or prototype before
-  committing to the full estimate.
- Be explicit about what is included and excluded. Scope ambiguity is the
-  most common source of estimation error.
- Round to half-day increments. Estimating in hours implies false precision
-  for tasks longer than a day.
- If the task is too large to estimate confidently (more than 10 days
-  expected), recommend breaking it into smaller tasks and estimating those
-  individually.
- Do not pad estimates silently. If risk exists, call it out explicitly in
-  the risk factors section so the team can decide how to handle it.
+- Always give a range (optimistic / expected / pessimistic), never a single number
+- The recommended budget should be the expected estimate, not the optimistic one
+- Round to half-day increments — estimating in hours implies false precision for tasks longer than a day
+- Do not pad estimates silently — call out risk explicitly so the team can decide
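
Reviewer note (not part of the commit): the retained guidelines and the new Phase 5 rules amount to a small amount of arithmetic and branching, which can be sketched as below. This is only an illustration under stated assumptions, not something the skill ships. The names `round_half_day`, `Estimate`, and `next_steps` are hypothetical. Rounding ties upward is an assumption, and the 10-day threshold is applied to the expected figure, following the wording of the removed guideline ("more than 10 days expected").

```python
import math
from dataclasses import dataclass


def round_half_day(days: float) -> float:
    """Round to the nearest half-day; ties round upward (an assumption)."""
    return math.floor(days * 2 + 0.5) / 2


@dataclass
class Estimate:
    """Hypothetical container for the three-scenario effort table."""
    optimistic: float
    expected: float
    pessimistic: float
    confidence: str  # "High", "Medium", or "Low"

    def __post_init__(self) -> None:
        # Guideline: round to half-day increments.
        self.optimistic = round_half_day(self.optimistic)
        self.expected = round_half_day(self.expected)
        self.pessimistic = round_half_day(self.pessimistic)
        # Guideline: always a range, so the three values must be ordered.
        if not (self.optimistic <= self.expected <= self.pessimistic):
            raise ValueError("expected optimistic <= expected <= pessimistic")

    @property
    def recommended_budget(self) -> float:
        # Guideline: budget on the expected estimate, not the optimistic one.
        return self.expected


def next_steps(est: Estimate) -> list[str]:
    """Map an estimate to the Phase 5 follow-up commands."""
    steps = []
    if est.confidence == "Low":
        steps.append("/prototype")  # time-boxed spike before committing
    if est.expected > 10:
        steps.append("/create-stories")  # break into smaller stories
    steps.append("/sprint-plan update")  # schedule the task
    return steps


if __name__ == "__main__":
    est = Estimate(optimistic=1.2, expected=2.3, pessimistic=4.25, confidence="Low")
    print(est.recommended_budget)  # 2.5
    print(next_steps(est))         # ['/prototype', '/sprint-plan update']
```

Keeping the rounding inside the constructor means every value that reaches the estimate table is already in half-day increments, so the "Recommended budget" line cannot drift from the "Expected" row.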