The Systems Designer’s Guide to Working with Chaos
Most systems aren’t broken by chaos. They’re broken by the assumption that chaos wasn’t coming.
Here is the thing about chaos: it is not the exception. It is the operating environment.
Every design system I have ever built was designed in a controlled setting and deployed into a world that immediately started trying to break it. Edge cases the spec never anticipated. Implementation teams who interpreted guidelines differently than the people who wrote them. Stakeholders who needed something the system technically allowed but philosophically didn’t account for. Engineers who found ways to be technically compliant while defeating the entire intent.
This is not a failure of the system. This is Tuesday.
The practitioners who struggle most with chaos are the ones who built their mental model around elimination. If we document it clearly enough, they reason, the ambiguity goes away. If the spec is tight enough, the edge cases disappear. If we enforce the rules consistently enough, the chaos gets managed out of existence.
It doesn’t. The documentation just shifts where the chaos lives.
Chaos Isn’t the Problem — Fragility Is
There is a meaningful difference between a system that encounters chaos and breaks, and a system that encounters chaos and bends. The goal of a systems designer isn’t to prevent chaos — it’s to build structures that are elastic enough to survive contact with reality without losing their shape.
Fragility is the actual enemy. Fragility is what happens when you design for ideal conditions.
In software engineering, this is understood through the lens of fault tolerance and graceful degradation: systems that keep functioning when components fail, rather than collapsing entirely. The insight transfers almost directly to design systems, creative workflows, and AI pipelines. Build for the conditions you’ll actually operate in, not the conditions you’d prefer.
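As a concrete sketch of graceful degradation: a lookup that prefers a live source, falls back to a cache, and finally to a documented default, instead of collapsing when one component fails. The fetcher shapes and the fallback value here are hypothetical, not taken from any particular system.

```typescript
// Graceful degradation sketch: failure of the live source is treated as
// an expected condition, not a fatal one. Fetchers and default are
// illustrative stand-ins.
type Fetcher = (key: string) => string | null;

function resilientLookup(
  key: string,
  live: Fetcher,
  cache: Fetcher,
  fallback: string
): string {
  try {
    const fresh = live(key);
    if (fresh !== null) return fresh;
  } catch {
    // Live source is down: degrade rather than collapse.
  }
  // Cache miss degrades one step further, to the documented default.
  return cache(key) ?? fallback;
}
```

The point is not the three-tier shape specifically; it is that every failure path was decided at design time rather than discovered in production.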
What this requires, practically, is designing to the 80%. Your system should handle 80% of cases elegantly and completely. For the remaining 20%, your job isn’t to write more rules — it’s to establish clear decision-making principles so teams can navigate the edge cases without escalating everything to the person who wrote the spec. A system that requires constant interpretation from its author isn’t a system. It’s a dependency.
The Three Ways Brittle Systems Fail
Brittle systems don’t fail randomly. They fail in patterns. Recognizing those patterns early saves an enormous amount of time.
They fail at the seams. Every system has connection points — where one component hands off to another, where one team’s work meets another’s, where the spec ends and judgment begins. Brittle systems are tight at the center and weak at the seams. The core use cases are handled beautifully. The handoffs are a disaster. This is why design tokens break down at implementation, why AI pipeline outputs degrade at post-processing, why brand voice guidelines sound great until someone actually tries to write copy in a new format.
They fail under time pressure. A system that works when teams have adequate time, resources, and context will fail the moment a deadline appears. Time pressure is one of the most reliable chaos accelerators. When people are in a hurry, they don’t carefully follow the spec — they approximate it, guess at it, skip steps they don’t fully understand. If your system isn’t intuitive enough to approximate correctly, the time pressure will surface every fragility at once.
They fail when the context shifts. Every system is designed for a specific context. That context changes — new products, new team members, new platforms, new constraints, new business requirements. A system designed for one set of conditions rarely survives a significant context shift without modification. The mistake isn’t that the context changed. The mistake is building as though it wouldn’t.
What Chaos-Resilient Systems Actually Look Like
They’re loosely coupled. Tight coupling is the architectural equivalent of building a house of cards: a change anywhere propagates everywhere, and the system can only tolerate the most linear, sequential kinds of evolution. Loose coupling builds in the slack that lets components evolve independently without collapsing the whole. In a design system, this means separating tokens from components, components from patterns, patterns from usage guidelines — so any layer can change without forcing a cascade.
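The token-to-component separation above can be sketched in code. In this hypothetical example, a component declares which semantic tokens it depends on rather than embedding raw values, so the token layer can change without touching any component. Token names and values are invented for illustration.

```typescript
// Loose coupling sketch: components reference tokens by semantic name,
// never by raw value. Names and values here are hypothetical.
const tokens = {
  "color.action.primary": "#0052cc",
  "color.text.default": "#172b4d",
  "space.inset.md": "12px",
} as const;

type TokenName = keyof typeof tokens;

// A component declares *which* tokens it uses, not what they resolve to.
function buttonStyle(lookup: (name: TokenName) => string) {
  return {
    background: lookup("color.action.primary"),
    color: lookup("color.text.default"),
    padding: lookup("space.inset.md"),
  };
}

// A rebrand means swapping the lookup, not editing every component.
const style = buttonStyle((name) => tokens[name]);
```

The indirection is the slack: the component layer and the token layer can now evolve on independent schedules.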
They’re observable. A system you can’t see fail is a system you can’t fix. Observability isn’t just a DevOps concept. In design systems, it means clear documentation of what broke and why. In AI pipelines, it means logging inputs and outputs at every stage so you can trace where the signal degraded. In creative workflows, it means building in review checkpoints that surface problems while they’re still cheap to fix. Chaos doesn’t disappear in observable systems — it just becomes visible in time to do something about it.
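The pipeline-logging idea can be made concrete with a small sketch: each stage is wrapped so its input and output are recorded, which is what makes it possible to trace where the signal degraded. The stage names and the trace shape are assumptions for illustration.

```typescript
// Observability sketch: a wrapper records every stage's input and
// output, so a degraded result can be traced to the handoff that
// produced it. Stage names and log shape are illustrative.
type Stage<A, B> = (input: A) => B;

interface TraceEntry {
  stage: string;
  input: unknown;
  output: unknown;
}

function observed<A, B>(
  name: string,
  stage: Stage<A, B>,
  trace: TraceEntry[]
): Stage<A, B> {
  return (input: A) => {
    const output = stage(input);
    trace.push({ stage: name, input, output });
    return output;
  };
}

// Usage: compose wrapped stages; the trace shows every handoff.
const trace: TraceEntry[] = [];
const normalize = observed("normalize", (s: string) => s.trim().toLowerCase(), trace);
const tokenize = observed("tokenize", (s: string) => s.split(/\s+/), trace);
const result = tokenize(normalize("  Hello World  "));
```

Nothing here prevents a stage from misbehaving. It just guarantees the misbehavior is visible at the seam where it happened.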
They assume error. The most expensive assumption a systems designer makes is that everything will work as intended. It won’t. Plan for the fail states. Document what a team should do when the system doesn’t have an answer. Build the recovery path before you need it. A system with a well-designed failure mode is fundamentally more trustworthy than one that pretends failure isn’t coming.
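One way to build the recovery path before you need it is to make failure part of an operation's return type, so the fail state carries its own documented next step. This is a minimal sketch; the operation and the recovery guidance are invented examples.

```typescript
// Assume-error sketch: an explicit outcome type forces the fail state
// and its recovery path to be designed up front. The recovery text is
// illustrative.
type Outcome<T> =
  | { ok: true; value: T }
  | { ok: false; reason: string; recovery: string };

function parsePort(raw: string): Outcome<number> {
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    return {
      ok: false,
      reason: `"${raw}" is not a valid port`,
      // The system tells the team what to do when it has no answer.
      recovery: "fall back to the documented default and log the raw input",
    };
  }
  return { ok: true, value: port };
}
```

The callers of a function like this cannot forget the failure case, because the type system will not let them reach the value without handling it.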
When AI Is in the Loop, This Gets Urgent
AI systems don’t behave deterministically. The same input, run twice, can produce different outputs. The model’s responses are shaped by training, context window, sampling temperature, position in the conversation — variables the designer doesn’t fully control. Every AI-integrated workflow is, by definition, a workflow that contains a source of probabilistic chaos.
This is not a problem to be solved. It’s a condition to be designed for.
The practitioners building reliable AI products aren’t the ones trying to eliminate variance. They’re the ones building systems that tolerate variance at the model layer while maintaining consistency at the output layer — tight validation, structured output formats, human review steps placed at the right points in the chain, fallback behaviors when the model produces something outside acceptable parameters.
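The wrap-around pattern — variance tolerated at the model layer, consistency enforced at the output layer — can be sketched as a validation loop with a bounded retry and a designed fallback. The model function, the output schema, and the fallback value are all hypothetical stand-ins, not a real API.

```typescript
// Variance-tolerant wrapper sketch: validate structured output, retry a
// bounded number of times, then fall back. Model, schema, and fallback
// are hypothetical.
interface Summary {
  title: string;
  bullets: string[];
}

// Structural validation: only well-shaped outputs pass downstream.
function isSummary(x: unknown): x is Summary {
  const s = x as Summary;
  return (
    typeof s === "object" && s !== null &&
    typeof s.title === "string" &&
    Array.isArray(s.bullets) &&
    s.bullets.every((b) => typeof b === "string")
  );
}

function validatedCall(
  model: () => unknown,   // nondeterministic: may return anything
  fallback: Summary,
  maxAttempts = 3
): Summary {
  for (let i = 0; i < maxAttempts; i++) {
    const out = model();
    if (isSummary(out)) return out;  // variance tolerated, shape enforced
  }
  return fallback;                    // fail state designed in advance
}
```

The model stays free to vary; the contract with everything downstream does not.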
The chaos is inside the black box. The system wraps around it.
You can’t build your way out of chaos. But you can build systems that stay coherent inside it — that bend without breaking, that fail gracefully, that give teams the tools to make good decisions in conditions the original spec didn’t anticipate.
That’s not a concession to entropy. That’s what good systems design actually is. The best structures aren’t the ones that hold everything perfectly in place. They’re the ones still standing when everything else shifts.
Linda Brown
Systems Architect building intelligent structures for creative teams — at the intersection of design systems, AI infrastructure, and the stubbornly human parts of creative practice.