Stop Blaming the Agent: Why Your AI Coding Projects Fail Without Guardrails

I had to throw away a few of my code projects recently. And I’m not talking about little throwaway experiments; these were real projects I’d invested time in. They got to a point where they were genuinely unmanageable: duplicated code, duplicated processes, and honestly just a nightmare to work with.

And here’s the thing — it wasn’t the AI agent’s fault.

We’re Giving Agents Too Much Room to Guess

The main problem right now, when we instruct a coding agent, is that we hand it broad, open-ended tasks. That leaves the window open for the agent to interpret things on its own.

Think about it. You tell an agent “build me a user management system” and now it has to figure out where to put things, what patterns to follow, whether to create new services or use existing ones. It doesn’t have the context you have in your head. So it makes decisions. And those decisions stack up.

Before you know it, the agent has reinvented services that already exist in your codebase. It’s completely ignored your Domain-Driven Design boundaries because nobody told it about them. It’s duplicated logic everywhere because it worked on each task in isolation, with no idea about the bigger picture.

This is what I keep seeing, and I think a lot of people working with AI coding agents are hitting the same wall.

The Runtime Interpretation Problem

The way I see it, the core issue is that we’re letting agents do interpretation at runtime. You give it a vague task, and now the agent is making dozens of architectural decisions on the fly. Where should this logic live? Should I create something new or extend what’s already there? What conventions should I follow?

Every single one of those decisions is a point where the agent might go in a completely different direction than what you had in mind. And when you multiply that across an entire feature — yeah, you end up with something that technically works but doesn’t really belong in your application.

The solution isn’t to stop using AI coding agents. The solution is to stop asking them to think about architecture and start giving them clear blueprints to work from.

The Spec-Driven Approach

This is where I’ve landed with a framework called spec-kit. The whole idea is pretty simple: instead of letting the coding agent interpret what you want at runtime, you do that interpretation deliberately, upfront.

The workflow goes like this: Constitution → Specification → Clarification → Planning → Tasking → Implementation.
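In spec-kit, each of those stages maps onto a prompt you run inside your coding agent session. As a rough sketch (the exact command names vary between spec-kit versions and agents, so treat these as illustrative rather than definitive):

    /constitution   -> establish or update the project's core principles
    /specify        -> describe what you want built and why
    /clarify        -> answer the questions the spec leaves open
    /plan           -> decide the technical approach within those constraints
    /tasks          -> break the plan into small, isolated units of work
    /implement      -> have the agent execute the tasks one by one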

Start With a Constitution

Every project needs a Constitution. It’s basically a living document that defines the core principles, vision, and mission of your application. Think of it as the ground truth for your codebase.

It captures things like your architectural patterns, your domain boundaries, what services exist and what they’re responsible for, and the conventions that the agent absolutely has to follow. It’s not something you write once and forget about — it evolves as your application evolves. But it’s always there as the thing that keeps the agent grounded and stops it from drifting off into its own interpretation of what your app should look like.
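To make that concrete, here’s a hypothetical excerpt of the kind of thing a Constitution captures. The service names and boundaries below are invented for illustration; yours will look different:

    Constitution (excerpt)

    Architecture
    - Domain-Driven Design: Identity, Billing, and Notifications are separate bounded contexts.
    - No cross-context database access; contexts talk through their service interfaces.

    Existing services
    - IdentityService owns users, roles, and authentication. Never create a parallel user store.
    - NotificationService owns all outbound email and SMS. New features call it rather than sending directly.

    Conventions
    - New endpoints follow the existing REST naming and error-response format.
    - Every behaviour change ships with tests alongside the code it touches.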

Breaking Work Down Properly

With the Constitution in place, you then write a specification for what needs to be built. That spec is grounded in everything the Constitution defines. From the spec, you create a plan. From the plan, you define tasks.

And here’s the really important bit: once you define your tasks through this process, they are nicely isolated chunks of work that a coding agent can concentrate on fully. The agent isn’t guessing about architecture anymore. It’s not wondering about domain boundaries. It’s not deciding whether to build something new or reuse what’s already there. All of that thinking has been done by you, the human, before the agent touches any code.
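To illustrate the difference, here’s a before-and-after using the same hypothetical services from the Constitution excerpt above:

    Before (open-ended): "Add password reset to the app."

    After (one task out of the plan):
    - Add a POST /auth/password-reset endpoint to IdentityService.
    - Reuse the existing reset-token helper in IdentityService; do not write a new one.
    - Send the reset email through NotificationService.
    - Done when: the endpoint returns 202, tokens expire after 30 minutes, and tests cover both paths.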

Why This Actually Works

It works because it matches what AI coding agents are genuinely good at. They’re brilliant at implementing well-defined, bounded tasks. They’re great at following patterns that are already established. They’re really capable when they know exactly what “done” looks like.

What they’re not good at — and what we really shouldn’t be expecting them to be good at — is holding the full context of a complex application while making architectural trade-offs. That’s your job.

That’s always been your job. By doing the interpretation work upfront instead of at runtime, you get the best of both. You do the thinking, the structuring, the guardrailing. The agent does the execution: fast, consistent, and tireless. Nobody’s doing the other’s job.

Think About It This Way

If you tell a senior engineer to build a system and you give them open-ended requirements with no architecture docs, no domain boundaries, no conventions — they’ll build you something. It’ll probably work. But it won’t be what you had in your head. Now take that same senior engineer and give them clear principles, well-defined domain boundaries, established patterns, and a solid understanding of the existing codebase — and the outcome is completely different. Night and day.

It’s the same with AI agents. We need to start thinking about them the same way we think about developers on our team. The more context we can give them, the more principles and practices we can put in place, the better the outcome. It’s really that simple. An agent with a Constitution, clear specs, and isolated tasks will outperform the same agent running on vibes and vague instructions every single time.

What I’d Suggest

If you’re dealing with AI-generated codebases that have gone off the rails, don’t blame the agent. Look at the instructions you gave it.

Start building a Constitution for your application. Break your work into specs before you ever open a coding agent session. Plan before you task. Task before you implement. And make every task isolated enough that the agent can succeed without needing to understand the entire universe your application lives in.

The agents are ready. The question is whether we’re ready to use them properly.

