Using the LLM to express your intent, and rise above the noise
There’s a debate happening right now between people who view creative work as a labor-intensive ritual and people who view it as a problem-solving exercise. I’ve been on a journey from one camp to the other, and along the way AI slowly unlocked my ability to make bigger leaps and do more interesting things. Here’s how that happened.
When Copilot first showed up, I wasn’t impressed. Autocomplete for code seemed like a cute trick, but most of my work wasn’t about well-trodden atomic paths like base64 conversions. I was managing structure, building architecture (I’m doing real engineering over here!). Often, it wasn’t any faster to ask an LLM for a copy-pastable snippet than to write a small connector function myself. Copilot wasn’t seeing anything in my code—it was just “autocompleting the function.”
I had to tell it what symbols were there anyway. I already knew the language, and it has traditionally been harder to describe algorithms in English than in code. I've never been a huge fan of design docs for solo projects where I'm doing all the work, so I figured I might as well keep going with the language I already knew.
Then ChatGPT arrived and the chatbot wave started. I asked it to solve math problems and to connect disconnected topics into silly articles, like a Vienna Circle commentary on Nirvana. The numerical solutions were usually wrong, and while it could synthesize ideas in an abstract way, the consistency in detail was missing. The numbers didn't add up.
I wasn’t really convinced these new tools were anything radical, but I could see the novelty in text generation and how it might be useful to author game dialog or descriptions of places and events.
Around April-May of 2023, I started messing with ChatGPT’s API because I was starting to see the potential. Not yet the full potential, but I was seeing glimmers of what could be. I needed to get the transformer out of the browser, out of the editor.
I integrated ChatGPT's API into a CLI chatbot. It tracked conversations and worked the same way as the web bots. I started to see something new in the workflow, and I needed a new concept for it: context management.
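For flavor, here's a minimal sketch of that kind of shim, not my original code. It uses the current openai Python client and a placeholder model name; the 2023-era calls were spelled differently, but the shape was the same: a list of messages that rides along with every request.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
messages = []      # the whole conversation rides along with every request

while True:
    user = input("> ").strip()
    if user in {"exit", "quit"}:
        break
    messages.append({"role": "user", "content": user})
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any chat model works here
        messages=messages,
    )
    text = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": text})
    print(text)
```

The whole trick, and the seed of context management, is that `messages` list: the model sees exactly what you choose to keep in it, nothing more.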
In those sessions with my API shim, I started thinking about asking the LLM for just code as output. Pipe the prompt in, get out some code that can be patched in. But I could see this might not scale: how do I get it to see only the relevant parts of the codebase, and all of them?
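The naive version of "just code as output" is to strip the prose and keep only the fenced blocks. Something like this helper (the name is mine, for illustration):

````python
import re

def extract_code(reply: str) -> str:
    """Keep only the fenced code blocks from a chat reply; drop the prose."""
    blocks = re.findall(r"```[\w+-]*\n(.*?)```", reply, flags=re.DOTALL)
    return "\n\n".join(blocks) if blocks else reply  # fall back to the raw text
````

Getting the code out is the easy half; deciding what context to pipe in is the hard half.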
I started thinking about how a virtual machine manages its stack and heap. Were there answers there?
Could I trace through the formalisms in the language to pass the right stack of context, so the AI could see the whole thread of execution?
I was asking the right questions and I knew what the AI needed, but I was still figuring out whether what I was imagining was even necessary. Deadlines called. Bookmark placed.
I kept going to various services, having conversations, trying to get complete artifacts. Then I'd copy-paste the boilerplate into my editor and decorate it with my domain language. A style began to develop around this modular boilerplate workflow: I was generating smaller chunks of code that I'd manually work into my project, and the conversations themselves steered me toward structures that suited the workflow.
The LLM was good at small things: "fire an event when this function is called, create an array of event handlers, call them in round robin." That was about as complex as I could get for a while. Once I hit a third or fourth feature in the same chat, we'd start to see regressions. Anything requiring recursion, the LLM would try to solve in frustrating ways, like repeating conditional branches a fixed number of times. Finite, bounded thinking.
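To make that round-robin request concrete, here's roughly the shape of what a chat would hand back. The names and the toy event type are mine, not a transcript:

```python
from itertools import cycle
from typing import Callable

class RoundRobinEmitter:
    """Fan events out to a pool of handlers, one handler per event."""

    def __init__(self, handlers: list[Callable[[dict], None]]):
        self._rotation = cycle(handlers)  # endless iterator over the handler array

    def fire(self, event: dict) -> None:
        next(self._rotation)(event)  # each event goes to the next handler in the ring

emitter = RoundRobinEmitter([
    lambda e: print("handler A:", e),
    lambda e: print("handler B:", e),
])
emitter.fire({"type": "hit", "damage": 3})  # handler A
emitter.fire({"type": "hit", "damage": 5})  # handler B
emitter.fire({"type": "hit", "damage": 2})  # back to handler A
```

Small, self-contained, stateless beyond one iterator: exactly the size of problem those early chats could handle reliably.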
This is where the Fallacy of the Manual Badge started to become clear to me. The people who complain that using AI is “cheating” often ignore that their preferred techniques were originally technical compromises, not choices. In the 80s, artists like Mark Ferrari used dithering not because they loved the “sweat” of placing dots, but because they only had 16 colors. If Ferrari had a script in 1990 that could reify his intent to build Monkey Island, he probably would have used it.
Eventually the open models I could run through Ollama started catching up to the chatbots, and I did a lot of experimentation with QwQ.
Ollama was immediately parsable to me: the interface was similar to my CLI tool, and QwQ was more verbal about what it was thinking than the bigger models, which made it easier to trace what information was missing from the prompt.
QwQ could generate entire programs that were correct and ran, then produce diffs to fix bugs. Tedious, but it pushed me to keep thinking about context management and constraining output. I produced complete playable versions of Snake and Pong for the web with two or three prompts. When I saw the curve we were on, I knew something had shifted.
I was intrigued by the possibility of having this power on my laptop, with no token fees.
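If you want to reproduce that setup, a one-shot request to a local Ollama server looks roughly like this. It assumes Ollama is running on its default port and that you've pulled a QwQ tag; the generous timeout is because reasoning models think out loud for a while.

```python
import requests

def ask_local(prompt: str, model: str = "qwq") -> str:
    """One-shot completion against a local Ollama server."""
    r = requests.post(
        "http://localhost:11434/api/generate",  # Ollama's default address
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=600,  # local reasoning models take their time
    )
    r.raise_for_status()
    return r.json()["response"]

print(ask_local("Write a complete, playable Pong as a single HTML file."))
```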
From there, I started building Aseprite scripts that would do shading and 3D projection into pixel art. Looking back, this was a bigger deal for my workflow than I realized at the time: I was starting to autonomously generate code from the LLM and test it by using it, not by examining it.
This is where the Force Multiplier Effect really started to take hold. I wasn’t asking the AI to “make me a cool game character”—I was asking it to build a custom tool that let me maintain my specific aesthetic across a higher volume of work. Writing a lighting shader from scratch can take hours of debugging math. Using AI to generate the boilerplate means you spend those hours fine-tuning the mood instead of fighting syntax. You get to put “longer passes” into the work.
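For a sense of what those scripts compute, here's a toy version of palette-quantized directional shading. This is not my actual Aseprite script (those are Lua, and the ramp here is made up); it's the flavor of the math, in Python:

```python
import math

# Hypothetical 4-step ramp, darkest to brightest (RGB) -- not a real game palette.
RAMP = [(34, 32, 52), (69, 40, 60), (143, 86, 59), (223, 113, 38)]
LIGHT_DIR = (-0.5, -0.7, 0.5)  # where the light comes from, in view space

def _normalize(v):
    mag = math.sqrt(sum(c * c for c in v))
    return tuple(c / mag for c in v)

def shade(normal):
    """Pick a ramp color for a surface normal via clamped dot(N, L)."""
    n, l = _normalize(normal), _normalize(LIGHT_DIR)
    intensity = max(0.0, sum(a * b for a, b in zip(n, l)))
    index = min(len(RAMP) - 1, int(intensity * len(RAMP)))  # quantize, don't blend
    return RAMP[index]

print(shade((0.0, 0.0, 1.0)))  # a pixel facing the viewer gets a mid tone
```

Quantizing to the ramp instead of blending is what keeps the output looking like pixel art rather than a smooth render; the fine-tuning of the mood lives in the ramp and the light direction, not the boilerplate.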
In summer 2025, I used AI to price out a real estate market and find opportunities. This was the first time I actually started to believe LLMs might help solve real problems that existing software hadn't solved.
I kept plodding along with my "lightly assisted" workflow for the next few months until I finished Beef Arena. Everything in that game is handcrafted, but I used a fair amount of boilerplate from ChatGPT and Claude, decorated it myself, or used those conversations to work out repeatable patterns for the architecture problems I was facing. One foot in the pond.
The irony of the “sweat fetishists” is that they often produce highly laborious kitsch. They’ll spend 40 hours hand-drawing a scene with zero original intent, while a developer using scripts to automate workflow spends their human energy on a unique mechanical idea. The sweat becomes a shield against the fear that their style is replaceable. But the juice—the real juice—comes from the interdisciplinary leap: knowing enough about 3D, pixel art, and scripting to combine them into something cohesive.
A week after the Beef Arena launch, I started thinking up a new project. I started a conversation with Claude, and we were riffing, jiving, vibing; the conversation stayed coherent for an hour. My mind was blown. Opus 4.5 represented a qualitative leap. We arrived at it incrementally, but Opus can "jump higher" to clear logical hurdles than anything before it.
Now we're seeing that same divide play out in the indie dev community: the labor-as-ritual camp on one side, the problem-solvers on the other.
The most successful “AI-era” art isn’t going to be the stuff that looks like an AI prompt. It’s going to be the stuff that looks like it took a team of 50 people to make, but was actually done by one person who knew how to throw longer passes and then run the distance to throw the next one.
If you're looking to throw longer passes, keep reading this blog, and check out my spec-authoring tool Buildwinkle. By building tools and teaching workflows that constrain the LLM's noise, I hope to help others unfetter their creativity, too.
There isn’t one path through this. Young developers are vibecoding—letting the conversation flow and seeing what emerges. I’m spec coding, because that’s an extension of what I was already doing when I worked with teams: writing specs, reviewing implementations, iterating. Your background shapes your entry point.
But regardless of path, there’s a choice: are you going to dominate the conversation, or let the cultural noise drown you out? The model will happily generate plausible-sounding filler forever. The unlock comes when you learn to steer—when your intent shapes the output instead of the other way around.