DESIGNVAULT
OVERVIEW
Agent Harness Engineering for Game Production
DesignVault is an AI Agent workflow system for game development. It is not just a documentation vault, but a shared working surface where human designers, coding agents, and project knowledge can collaborate around the same source of current truth.
In AI-assisted production, short tasks are often fast, but long-running projects introduce drift: context changes across threads, design and implementation diverge, and agents may redesign while executing. DesignVault turns that instability into a maintainable workflow: design converges first, execution has boundaries, validation produces evidence, and confirmed changes are written back.
Knowledge Layer
A retrieval-friendly Wiki Truth layer adapted from Karpathy's LLM Wiki idea.
Agent Harness
Skills, phase packets, execution logs, handoffs, acceptance, and repair loops.
Production Workflow
Three lanes: /design, /execute, and /bug.
DESIGN DETAILS
1. LLM Wiki + Spec Coding
DesignVault is directly inspired by Karpathy's LLM Wiki concept: instead of relying only on chat history, the LLM works with a searchable, curated, and rule-bound knowledge system.
Wiki
Stable rules, system definitions, UI responsibilities, design boundaries, and terminology.
Longform
Design reasoning, tradeoffs, open questions, and convergence before implementation.
Execution Plan
Implementation phases, required context, validation methods, and stop conditions.
Core principle: Longform shapes the design, Wiki stores the current truth, and Execution Plan drives implementation.
2. Agent Harness Engineering
The core of DesignVault is not a longer prompt, but an agent harness that makes AI work repeatable, recoverable, and verifiable. The harness turns human intent into executable context, keeps agent autonomy inside explicit boundaries, and preserves state at key checkpoints.
The harness design also references ideas from the OpenAI Agents SDK: agents, tools / handoffs, guardrails, sessions, human-in-the-loop control, tracing, and MCP tool calling. DesignVault adapts these agent-engineering ideas into a file-based workflow for game development.
Skill Layering
Reusable skills such as designvault-design, designvault-execute, designvault-bug, designvault-ui-handoff, and designvault-wiki-maintain. Each skill owns a workflow lane, references, scripts, and handoff expectations.
Execution State Machine
/execute is modeled as preflight -> phase -> acceptance -> complete. The parent harness decides the next action, while child agents only execute bounded phase tasks.
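A minimal sketch of how the preflight -> phase -> acceptance -> complete flow could be encoded as a transition table that only the parent harness consults. The names `ExecState`, `TRANSITIONS`, and `advance` are hypothetical, and two of the edges (phase to the next phase, acceptance back to phase for a repair) are assumptions rather than documented behavior.

```python
from enum import Enum


class ExecState(Enum):
    PREFLIGHT = "preflight"
    PHASE = "phase"
    ACCEPTANCE = "acceptance"
    COMPLETE = "complete"


# Hypothetical transition table. PHASE -> PHASE (next phase) and
# ACCEPTANCE -> PHASE (repair) are assumptions made for this sketch.
TRANSITIONS = {
    ExecState.PREFLIGHT: {ExecState.PHASE},
    ExecState.PHASE: {ExecState.PHASE, ExecState.ACCEPTANCE},
    ExecState.ACCEPTANCE: {ExecState.PHASE, ExecState.COMPLETE},
    ExecState.COMPLETE: set(),
}


def advance(current: ExecState, requested: ExecState) -> ExecState:
    """Return the requested state if the move is allowed, otherwise raise."""
    if requested not in TRANSITIONS[current]:
        raise ValueError(
            f"illegal transition: {current.value} -> {requested.value}"
        )
    return requested
```

Keeping the table in the parent means a child agent cannot skip acceptance by declaring its own work complete; it can only report back and let the harness call `advance`.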
Context Packet Generation
Phase packets package required Wiki pages, code entry points, success criteria, non-goals, validation methods, and stop conditions before agent execution.
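As an illustration, a phase packet could be represented as an immutable record whose fields mirror the items listed above. The class name `PhasePacket`, the `missing_fields` helper, and the example paths are hypothetical and not taken from DesignVault itself.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PhasePacket:
    """Hypothetical context packet handed to a child agent for one phase."""

    phase_id: str
    wiki_pages: tuple[str, ...] = ()
    code_entry_points: tuple[str, ...] = ()
    success_criteria: tuple[str, ...] = ()
    non_goals: tuple[str, ...] = ()
    validation_methods: tuple[str, ...] = ()
    stop_conditions: tuple[str, ...] = ()

    def missing_fields(self) -> list[str]:
        """Names of fields that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Example values are invented for illustration only.
packet = PhasePacket(
    phase_id="phase-1",
    wiki_pages=("wiki/combat/dash.md",),
    code_entry_points=("Assets/Scripts/PlayerDash.cs",),
    success_criteria=("Dash distance matches the Wiki value",),
    non_goals=("Do not change stamina rules",),
    validation_methods=("compile", "targeted tests"),
    stop_conditions=("Wiki and plan disagree",),
)
```

A preflight step could refuse to dispatch any packet for which `missing_fields()` is non-empty, so an agent never starts with an open-ended brief.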
Structured Output Normalization
Agent output is normalized into completed work, changed files, verification evidence, risks, Wiki writeback needs, and whether execution should stop.
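One way to picture the normalization step, assuming the agent returns a loosely shaped dictionary: every field is coerced into a fixed report structure so the harness never has to parse free text. `PhaseReport`, `normalize`, and the dictionary keys are hypothetical names chosen for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class PhaseReport:
    """Hypothetical normalized shape for one child agent's output."""

    completed_work: list[str] = field(default_factory=list)
    changed_files: list[str] = field(default_factory=list)
    verification_evidence: list[str] = field(default_factory=list)
    risks: list[str] = field(default_factory=list)
    wiki_writeback: list[str] = field(default_factory=list)
    should_stop: bool = False


def _as_list(value: Any) -> list[str]:
    """Coerce None, a single string, or an iterable into a list of strings."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value] if value.strip() else []
    return [str(item) for item in value]


def normalize(raw: dict[str, Any]) -> PhaseReport:
    """Map a loosely shaped agent result onto the fixed report fields."""
    return PhaseReport(
        completed_work=_as_list(raw.get("completed_work")),
        changed_files=_as_list(raw.get("changed_files")),
        verification_evidence=_as_list(raw.get("verification_evidence")),
        risks=_as_list(raw.get("risks")),
        wiki_writeback=_as_list(raw.get("wiki_writeback")),
        should_stop=bool(raw.get("should_stop", False)),
    )
```

Missing keys become empty lists rather than errors, which keeps a partial agent answer usable while still making the gaps visible to the harness.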
Machine Evidence Adapter
The harness separates agent claims from machine evidence. Completion requires acceptance evidence such as compile results, console status, targeted tests, or Unity Editor evidence.
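The separation of claims from evidence can be sketched as a completion gate that ignores anything the agent merely asserts. The `Evidence` type, its `machine_generated` flag, and `can_complete` are assumptions made for illustration; the source names are examples drawn from the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence record collected during acceptance."""

    source: str              # e.g. "compile", "console", "tests", "unity_editor"
    passed: bool
    machine_generated: bool  # False for statements made by the agent itself


def can_complete(evidence: List[Evidence]) -> bool:
    """Allow completion only if machine evidence exists and all of it passed.

    Agent claims are filtered out first, so they can neither grant
    nor block completion on their own.
    """
    machine = [item for item in evidence if item.machine_generated]
    return bool(machine) and all(item.passed for item in machine)
```

Under this rule, a report that says "everything works" with no compile or test record attached is treated the same as no report at all.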
Stop Protocol + Repair Loop
When plan, Wiki, or implementation context conflicts, the workflow returns a Decision Packet. When acceptance finds a mismatch, it enters a bounded repair loop instead of silently redesigning.
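A bounded repair loop might look like the following sketch. The callables, the `max_repairs` budget, and the choice to escalate to a Decision Packet once the budget is spent are all assumptions; the text above only states that the loop is bounded and that conflicts produce a Decision Packet.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class DecisionPacket:
    """Hypothetical shape of the packet returned to a human for a decision."""

    reason: str
    open_mismatches: List[str]


def run_with_repair(
    execute: Callable[[], None],
    accept: Callable[[], List[str]],
    repair: Callable[[List[str]], None],
    max_repairs: int = 2,
) -> Optional[DecisionPacket]:
    """Run one phase, then allow at most `max_repairs` repair attempts.

    Returns None when acceptance passes, or a DecisionPacket when the
    repair budget is exhausted (an assumed escalation rule).
    """
    execute()
    mismatches: List[str] = []
    for attempt in range(max_repairs + 1):
        mismatches = accept()
        if not mismatches:
            return None
        if attempt < max_repairs:
            repair(mismatches)
    return DecisionPacket(
        reason=f"acceptance still failing after {max_repairs} repairs",
        open_mismatches=mismatches,
    )
```

The repair callable receives only the mismatches reported by acceptance, which keeps each attempt narrow instead of inviting a redesign.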
3. Why It Improves Production
DesignVault is designed around one production reality: most studios already have some form of wiki or design documentation, but those documents often receive the most attention during pre-production. Once implementation starts, maintaining the wiki becomes expensive, designers do not want to constantly rewrite tool docs, and programmers rarely have time to read every design page in full. This is exactly the kind of coordination work an AI agent can absorb.
Design: faster idea clarification
The /design lane uses a Socratic questioning style: instead of asking the designer to write a complete spec upfront, the agent interviews them about edge cases, player feedback, system boundaries, UI surfaces, and failure cases. A rough idea becomes a clearer design contract faster.
Documentation: lower human maintenance cost
DesignVault lets agents read, summarize, index, and update workflow documents. Humans still make design decisions, but agents take over the repetitive burden of keeping Wiki Truth, tool notes, API-style docs, and execution traces searchable.
Implementation: not vibe coding
AI-assisted development is becoming a default choice for many developers, but DesignVault avoids unbounded vibe coding. It follows the same direction recommended in agentic coding practice: explore first, plan before implementation, keep context specific, and give the agent verification criteria.
Validation: separate doing from judging
Execution and acceptance are separated. A phase executor implements within a bounded context, while an acceptance pass checks the result against Wiki Truth, the plan, tests, console output, or Unity evidence. This reduces the risk that the same agent both creates and over-trusts its own work.
This maps directly to current agent engineering practice: OpenAI's Agents SDK emphasizes orchestration, state, approvals, guardrails, handoffs, tracing, and evaluation; Anthropic's Claude Code guidance recommends verification, separating exploration/planning from coding, aggressive context management, and using subagents for investigation.
4. Three Workflow Lanes
/design
Used when the design is not yet stable. The agent reads the minimal Wiki Truth it needs, asks clarifying questions, then produces a longform draft, Wiki updates, and an executable implementation plan.
/execute
Used when the design is stable and ready to implement. It reads the implementation plan, runs preflight, executes phase by phase, performs acceptance, and writes confirmed changes back to the Wiki.
/bug
Used for concrete, observed issues. It starts from the symptom, reads the minimal Wiki Truth it needs, performs a narrow fix and validation, and stops if the root cause is design ambiguity.
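The choice between the three lanes can be summarized as a small routing rule. This is a sketch under stated assumptions: `Lane`, `choose_lane`, and `bug_should_stop` are hypothetical names, and giving an observed issue priority over the other two lanes is an assumption, not a documented rule.

```python
from enum import Enum


class Lane(Enum):
    DESIGN = "/design"
    EXECUTE = "/execute"
    BUG = "/bug"


def choose_lane(design_stable: bool, observed_issue: bool) -> Lane:
    """Pick a lane from two facts about the request.

    Assumption: a concrete observed issue always goes to /bug first.
    """
    if observed_issue:
        return Lane.BUG
    return Lane.EXECUTE if design_stable else Lane.DESIGN


def bug_should_stop(root_cause_is_design_ambiguity: bool) -> bool:
    """The /bug lane stops instead of fixing when the design itself is unclear."""
    return root_cause_is_design_ambiguity
```

When `bug_should_stop` returns True, the natural follow-up would be to reopen the question in /design rather than patch around it in code.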