When the Rules Stop Working 🦆
It’s funny how predictable this pattern has become.
You start with a clean setup. A small CLAUDE.md. A handful of instructions that feel obvious. The model behaves. It follows the rules. It feels almost… cooperative.
Then you hit your first edge case.
So you add a rule.
Then another. Then a clarification of the previous rule. Then an exception to that clarification. Then a note about how the exception should be interpreted.
And somewhere around the 100-to-150-line mark, things quietly fall apart.
The model is still responding. It’s still referencing your instructions. But it’s no longer reliably constrained by them. The rules start to feel more like suggestions. You get compliance sometimes, drift other times, and it becomes harder to tell which part of your system is actually doing the work.
I’ve been running into the same wall.
Which is why this Reddit thread hit a little too close to home.
Not because it introduced something new, but because it confirmed a pattern I’ve been slowly backing into.
Context: The Rule Accumulation Trap
At first, adding rules feels like progress.
You are capturing intent. Encoding expectations. Documenting how the agent should behave in your environment.
That all makes sense. It mirrors how we write documentation for humans.
But agents don’t consume documentation the same way.
They don’t build a stable internal model of your system over time. They reconstruct it on every invocation from whatever context you give them.
So when the context becomes dense, overlapping, and slightly contradictory, the failure mode is subtle.
Not a crash.
Just drift.
Observation: The Shift From Rules to Systems
The most useful idea from the thread is also the simplest one.
Stop adding rules. Start building infrastructure.
I didn’t land on that cleanly myself. It came out of frustration.
I would tell an agent to run tests. Sometimes it would. Sometimes it would skip them. So I added a rule.
“Always run tests before submitting changes.”
Which worked. Until it didn’t.
And that’s when the realization started to form.
The rule is describing a world that may or may not exist in the execution path.
The system, on the other hand, creates that world.
If tests run automatically on save, or on commit, or inside the execution loop, then there is no decision to be made. The agent doesn’t need to remember. It doesn’t need to interpret.
It just experiences the consequence.
That distinction ends up mattering more than I expected.
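The shape of that idea is easy to sketch in code. Here is a minimal Python sketch of putting validation inside the execution loop instead of inside an instruction. The names (`apply_and_validate`, `run_change`) are my own, not from the thread, and the test command is just a placeholder:

```python
import subprocess

def apply_and_validate(run_change, test_cmd=("pytest", "-q")):
    """Apply a change, then run the test suite unconditionally.

    The agent never decides whether to test: the loop does it,
    and a failing suite rejects the change outright.
    """
    run_change()  # whatever callable actually performs the edit
    result = subprocess.run(test_cmd)
    if result.returncode != 0:
        raise RuntimeError("tests failed; change rejected")
    return "change accepted"
```

Point `test_cmd` at whatever your project actually runs. The shape is what matters: validation is a consequence of acting, not a rule the agent has to remember.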
Pattern: CLAUDE.md as a Router
The other shift that keeps showing up is how people are using CLAUDE.md.
Early on, it becomes a rulebook. A place to centralize everything you want the agent to know.
But as systems grow, that breaks down.
The more stable pattern is treating it as a router.
Instead of trying to encode behavior directly, it points to the places where behavior is defined.
- Relevant documentation
- Task-specific “skills”
- Directories with scoped context
- Scripts and tools that enforce constraints
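Concretely, a router-style CLAUDE.md ends up looking something like this. Every path and file name here is invented for illustration; the point is the shape, not the specifics:

```markdown
# CLAUDE.md

Keep this file short. It points; it doesn't explain.

- Architecture overview: see `docs/architecture.md`
- Writing a migration: follow `skills/migrations.md`
- Frontend conventions: see `web/CLAUDE.md` (scoped context for that directory)
- Before submitting: run `scripts/validate.sh` (it enforces tests and formatting)
```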
This changes how you think about context entirely.
You stop trying to preload everything.
You give the agent a map and let it pull what it needs.
That “progressive disclosure” idea shows up everywhere once you start looking for it.
It also explains why large monolithic context dumps feel worse over time. They flatten everything into the same priority level.
Nothing stands out. Nothing is clearly actionable.
Adjustment: Enforce What Matters
The line from the thread that stuck with me was this:
“The rule tells the model what you think is true; the tooling shows it what is true.”
That’s been the pivot.
When something matters, I try not to describe it anymore.
I try to enforce it.
Tests run because the system runs them.
Documentation updates because something is watching diffs and generating changes.
Formatting is correct because a tool rejects anything that isn’t.
And interestingly, once those pieces are in place, the amount of instruction you need drops significantly.
The agent becomes simpler to reason about because fewer things depend on it making the “right choice.”
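As a toy illustration of "a tool rejects anything that isn't": the function below is my own stand-in for a real formatter gate (in practice you would shell out to something like `black --check`), but the mechanism is the same:

```python
def reject_trailing_whitespace(text: str) -> str:
    """Enforce rather than describe: refuse input that breaks the rule.

    Non-conforming input never gets through, so no instruction
    about formatting needs to exist at all.
    """
    bad = [i + 1 for i, line in enumerate(text.splitlines())
           if line != line.rstrip()]
    if bad:
        raise ValueError(f"trailing whitespace on lines {bad}")
    return text
```

The rule "don't leave trailing whitespace" disappears from the context entirely; the gate carries it instead.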
Where This Lands
I don’t think rules are useless.
They’re still good for orientation. For tone. For high-level expectations.
But they don’t scale as a control mechanism.
At some point, the system has to carry the weight.
That’s the part I’m still actively building out.
Breaking work into clearer tasks. Defining skills that are actually reusable. Tightening the loop between planning, execution, and validation.
It’s slower than writing another paragraph in a config file.
But it’s also the first thing that has felt stable.
And stability, it turns out, is a lot more useful than a really impressive set of rules.
Anyway. That’s where my head is at right now.