Spec-Driven Development Evolution

This retrospective traces the move from code-first to spec-driven development. Both projects were LLM-assisted games, but they reflect fundamentally different philosophies about how to use AI in software development.

The key insight: shifting the LLM's role from “code generator” to “intent interpreter” made the second project noticeably more scalable and coherent.

The Projects Compared

Metric	Artcraft (Code-First)	DreamReach (Spec-First)
Type	RTS (Warcraft-style)	FPS roguelike
Lua lines	28,818	18,399
Spec/doc lines	~500	14,033
Commits	77	67
Development span	~1 week	~4 days
Doc:Code ratio	~1:57	~1:1.3
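
The ratios in the last row follow directly from the line counts above. As a quick check, here is a minimal sketch of the arithmetic; the helper name `doc_code_ratio` is made up for illustration, and the Artcraft doc count is the approximate ~500 from the table, so its ratio is only approximate too.

```python
def doc_code_ratio(doc_lines: int, code_lines: int) -> str:
    """Format doc:code as '1:N', where N is code lines per doc line."""
    if doc_lines <= 0:
        raise ValueError("doc_lines must be positive")
    return f"1:{code_lines / doc_lines:.1f}"


# Line counts taken from the comparison table (Artcraft docs are ~500).
artcraft = doc_code_ratio(500, 28_818)
dreamreach = doc_code_ratio(14_033, 18_399)

print("Artcraft  ", artcraft)
print("DreamReach", dreamreach)
```

The gap is the point: Artcraft carries dozens of lines of code per line of documentation, while DreamReach sits close to parity.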

Code-First: What Worked and What Didn’t

The code-first approach excelled at:

Fast initial velocity - working game in days
Immediate feedback loop - see results instantly
Good for learning a domain - discovered what an RTS needs by building one

But it struggled with:

Knowledge trapped in code - understanding requires reading implementation
Refactoring is risky - no contract to validate against
Hard to explain intent to new LLM sessions - each session starts cold

The pivot moment came when I realized: AI assistants need context to be useful, and that context should be injected automatically rather than manually provided each time.

The Spec-First Approach

The structure inverts the code-first pattern:

spec/
├── schema/      # What things ARE (rigid, validates)
├── guidance/    # How to THINK about it (flexible, interpretive)
└── data/        # What EXISTS (creative, varies)

The code is generated FROM this spec. The spec is the source of truth.
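
To make the split between the rigid schema layer and the creative data layer concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the weapon fields, the bounds, and the `validate` helper are hypothetical and are not taken from DreamReach's actual spec.

```python
# Hypothetical schema: field name -> (expected type, optional (min, max) bounds).
WEAPON_SCHEMA = {
    "name": (str, None),
    "damage": (int, (1, 100)),
    "fire_rate": (float, (0.1, 20.0)),
}


def validate(entry: dict, schema: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field, (expected_type, bounds) in schema.items():
        if field not in entry:
            errors.append(f"missing field: {field}")
            continue
        value = entry[field]
        if not isinstance(value, expected_type):
            errors.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
            continue
        if bounds is not None and not (bounds[0] <= value <= bounds[1]):
            errors.append(f"{field}: {value} outside {bounds[0]}..{bounds[1]}")
    # The schema is rigid: fields it does not know about are rejected.
    for field in entry:
        if field not in schema:
            errors.append(f"unknown field: {field}")
    return errors


# Data entries are where the creative variation lives.
good = {"name": "Dream Lance", "damage": 40, "fire_rate": 2.5}
bad = {"name": "Overcharged", "damage": 400, "spread": 3}
```

Here `validate(good, WEAPON_SCHEMA)` returns an empty list, while the `bad` entry is reported for an out-of-range value, a missing field, and an unknown field. The guidance layer has no equivalent check; it is prose the LLM interprets rather than structure a validator enforces.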

The 6k-Before-Code Philosophy

DreamReach reached ~6,000 lines of spec before any code was generated. This wasn’t documentation - it was prompt engineering at scale.

Each round of spec writing refined how the LLM reasoned about the domain. By the time “build” was invoked, the constraints, relationships, and rules were already part of its working context.

The Creative Loop

The development cycle that emerged:

spec → feature → imagination → testing → boundaries → spec

This isn’t waterfall. It’s a spiral where the spec is a living constraint surface that evolves WITH the project.

Key Insights

Intent Density Matters

The metric that matters isn’t doc-to-code ratio. It’s:

Intent Density = (consistent_behaviors_generated) / (spec_lines_required)

High intent density comes from describing constraints (what CAN’T happen), relationships (how things connect), and rules (what governs behavior). The LLM infers implementation from these.
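
The formula above can be sketched as a small function. The numbers in the comparison are invented purely to show how the metric behaves; they are not measurements from either project, and counting "consistent behaviors" is assumed to be a manual judgment.

```python
def intent_density(consistent_behaviors: int, spec_lines: int) -> float:
    """Consistent behaviors generated per line of spec required."""
    if spec_lines <= 0:
        raise ValueError("spec_lines must be positive")
    return consistent_behaviors / spec_lines


# Hypothetical comparison: the same 12 behaviors, specified two ways.
constraint_style = intent_density(12, 40)    # terse constraints and rules
walkthrough_style = intent_density(12, 300)  # step-by-step descriptions

print(f"constraint-style spec:  {constraint_style:.2f}")
print(f"walkthrough-style spec: {walkthrough_style:.2f}")
```

Under these made-up numbers the constraint-style spec scores several times higher, which matches the claim that constraints, relationships, and rules carry more intent per line than descriptions of implementation.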

Vocabulary Is a DSL

The spec is essentially a domain-specific language for describing the project to an LLM. Once the vocabulary is rich enough to describe one thing well, the same terms can be recombined to describe anything else in that domain.

Context Injection Was the Bridge

The insight: if context must be injected anyway, make the context THE specification. Don’t describe code - describe intent and let code be generated.
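
One way to picture "the context IS the spec" is a small loader that reads the spec tree and prepends it to every request. This is a sketch under assumptions, not the tooling actually used: the function names, the `## path` section headers, and the prompt layout are all hypothetical. Only the `schema/`, `guidance/`, `data/` directory names come from the structure described earlier.

```python
from pathlib import Path

# Directory names from the spec layout; the load order is an assumption.
SPEC_SECTIONS = ("schema", "guidance", "data")


def build_context(spec_root: Path) -> str:
    """Concatenate every spec file into one context string."""
    parts = []
    for section in SPEC_SECTIONS:
        section_dir = spec_root / section
        if not section_dir.is_dir():
            continue
        # Sort for a stable order so each session sees the same context.
        files = sorted(p for p in section_dir.rglob("*") if p.is_file())
        for path in files:
            rel = path.relative_to(spec_root).as_posix()
            body = path.read_text(encoding="utf-8").strip()
            parts.append(f"## {rel}\n{body}")
    return "\n\n".join(parts)


def build_prompt(spec_root: Path, request: str) -> str:
    """Prefix a request with the full spec so no session starts cold."""
    return f"{build_context(spec_root)}\n\n# Request\n{request}"
```

A call such as `build_prompt(Path("spec"), "Add a shotgun weapon")` would then hand the LLM the whole spec ahead of the request, with no manual copy-and-paste per session.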

Summary

Aspect	Code-First	Spec-First
LLM role	Code generator	Intent interpreter
Source of truth	The code	The spec
Knowledge location	Embedded in implementation	Explicit in documentation
Session continuity	Requires context injection	Context IS the spec
Scaling	Linear (more code = more complexity)	Leveraged (more spec = more inference)

Both approaches have their place. Code-first is excellent for exploration and learning. Spec-first excels when you understand the domain and want coherence at scale.

The ideal workflow may be: code-first to learn the domain, then spec-first to build it properly.

The question isn’t “how do I get the AI to write good code?”

The question is “how do I express intent so clearly that good code is inevitable?”