We Talk a Lot About Speed in AI. I Think We’re Missing Something Else.
The teams that actually produce better creative work with AI aren’t the fastest ones. They’re the teams who did the slowest work first.
There’s a version of the AI conversation that goes like this: AI is fast. Therefore, it produces more. Therefore, the team that adopts it first wins.
I hear some version of this in almost every enterprise conversation I’m in these days. And I’m not here to argue with the speed part — AI tools really are fast, and the throughput gains are real. But I’ve started noticing that the teams who are actually producing better creative work with AI aren’t the fastest ones. They’re the teams who did the slowest work first.
They took the time to define what “good” actually looks like. Before they generated anything.
The Gap Nobody Talks About
Here’s what I’ve seen happen, repeatedly, with enterprise creative teams that adopt AI tools:
Week one: output volume explodes. Everyone is impressed.
Week three: someone starts noticing the outputs feel “off.”
Week six: the team is back to producing roughly what they were before — but with more tools, more meetings, and more frustration.
The problem isn’t the tools. The problem is that the team never built the quality layer.
By “quality layer,” I don’t mean a style guide or a brand book. Most enterprise teams have those. I mean the operational definition of good for this specific pipeline — specific enough to be evaluable, written in a way that both the humans and the systems involved can actually use.
The brand book says “our tone is warm and confident.” That’s a value statement. The quality layer says: In product copy, warmth means sentences under 18 words. It means second person. It means no passive constructions. Confidence means we don’t hedge with “may” or “could” when we mean “will.”
That’s a standard. And without standards, speed just means generating more things that feel vaguely wrong.
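To make "specific enough to be evaluable" concrete, here is a minimal sketch of how the example rules above could be checked automatically. It is an illustration under assumptions, not a finished linter: the function names, the sentence splitter, and the word lists are all hypothetical, and it deliberately skips the passive-voice rule, which needs real grammatical parsing.

```python
import re

# Hypothetical rule values, taken from the example standard in the text.
MAX_WORDS_PER_SENTENCE = 17  # "sentences under 18 words"
HEDGE_WORDS = {"may", "could"}
SECOND_PERSON = {"you", "your", "yours", "you're", "you'll", "you've"}


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def words(sentence: str) -> list[str]:
    """Lowercased word tokens; curly apostrophes are normalized first."""
    normalized = sentence.lower().replace("’", "'")
    return re.findall(r"[a-z]+(?:'[a-z]+)*", normalized)


def check_copy(text: str) -> list[str]:
    """Return human-readable violations; an empty list means the copy passes."""
    problems = []
    all_words = []
    for sentence in split_sentences(text):
        tokens = words(sentence)
        all_words.extend(tokens)
        if len(tokens) > MAX_WORDS_PER_SENTENCE:
            problems.append(f"too long ({len(tokens)} words): {sentence!r}")
        hedges = sorted(HEDGE_WORDS.intersection(tokens))
        if hedges:
            problems.append(f"hedge word {hedges}: {sentence!r}")
    # Second person is checked across the whole piece, not per sentence.
    if not SECOND_PERSON.intersection(all_words):
        problems.append("no second-person address found")
    return problems
```

Calling `check_copy("This could help teams.")` returns two violations, one for the hedge word and one for the missing second person. The point is less the code than the fact that each rule is written tightly enough for a person or a script to apply it the same way twice.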
What a Quality Standard Actually Is (and Isn’t)
I’ve started thinking about quality standards for AI creative pipelines in three layers. When a team is struggling with inconsistent or off-brand output, it’s almost always because at least one of these is missing.
Layer 1: Output Constraints
These are the hard guardrails — the non-negotiables about what an output should and shouldn’t be. Not stylistic preferences, but firm lines. Character limits. Reading level. Prohibited words or phrases. Structural requirements (for example: “all campaign copy must open with a verb”).
Output constraints are the easiest to document. They’re also the ones teams most often skip, assuming everyone just knows. That assumption is almost always wrong — and the gap only becomes visible when outputs come back wrong at scale.
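One way to stop relying on "everyone just knows" is to write the constraints down as data. The sketch below is an assumption about how that might look, not a prescribed format: the class name, the field names, the 90-character limit, the banned phrases, the placeholder company "Acme", and the verb allowlist are all made up for illustration. Reading level is left out because it needs a proper readability formula.

```python
from dataclasses import dataclass, field


@dataclass
class OutputConstraints:
    """Hard limits for one content type. All values used here are illustrative."""
    max_chars: int
    banned_phrases: list[str] = field(default_factory=list)
    # Hypothetical allowlist; a real pipeline might use a part-of-speech tagger.
    allowed_opening_verbs: set[str] = field(default_factory=set)

    def violations(self, copy: str) -> list[str]:
        """Return every constraint the copy breaks; empty means it passes."""
        found = []
        if len(copy) > self.max_chars:
            found.append(f"length {len(copy)} exceeds {self.max_chars} characters")
        lowered = copy.lower()
        for phrase in self.banned_phrases:
            if phrase.lower() in lowered:
                found.append(f"banned phrase: {phrase!r}")
        if self.allowed_opening_verbs:
            tokens = copy.split()
            first = tokens[0].strip(".,!?:;\"'").lower() if tokens else ""
            if first not in self.allowed_opening_verbs:
                found.append(f"does not open with an approved verb: {first!r}")
        return found


# Example constraint set for campaign copy (made-up values).
campaign = OutputConstraints(
    max_chars=90,
    banned_phrases=["world-class", "At Acme, we believe"],
    allowed_opening_verbs={"build", "start", "discover", "try"},
)
```

With this in place, `campaign.violations("Our world-class platform helps.")` reports both the banned phrase and the missing opening verb, which is the kind of gap that otherwise only shows up at scale.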
Layer 2: Voice Specificity
This goes beyond tone. Voice specificity means you can hand someone — or something — a brief and have them produce something that sounds unmistakably like your brand. Not “we’re conversational” but: We address readers by name in email headers. We use em dashes instead of parentheses. We never open with “At [Company Name], we believe.”
This layer requires the most work. It can’t be copied from a brand book. It has to be built by looking at examples of your best-performing work and asking: what specific decisions made this ours? The answers to that question are your voice spec.
Layer 3: Evaluation Criteria
How do you know when an output is done? This sounds obvious, but I’ve been in more review meetings than I can count where the team couldn’t agree on whether something was finished — because nobody had ever defined what “finished” meant.
An evaluation rubric doesn’t have to be elaborate. Five questions work: Does it sound like us? Does it do the job it was designed to do? Is it accessible to the audience it’s meant for? Are there any hard no’s to flag? Would we be proud to ship it?
What matters is that the rubric exists, is shared, and is applied consistently — not improvised fresh in each review.
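A shared rubric can be as small as a list of questions plus the answer that counts as passing. The following is a rough sketch under that assumption; the `RUBRIC` structure and the `review` function are hypothetical names, and treating unanswered questions as failures is my own design choice, not something the rubric itself requires.

```python
# Each entry pairs a rubric question with the answer that counts as passing.
# Note the fourth question passes on "no": a flagged hard no blocks shipping.
RUBRIC = [
    ("Does it sound like us?", True),
    ("Does it do the job it was designed to do?", True),
    ("Is it accessible to the audience it's meant for?", True),
    ("Are there any hard no's to flag?", False),
    ("Would we be proud to ship it?", True),
]


def review(answers: dict[str, bool]) -> tuple[bool, list[str]]:
    """Compare a reviewer's yes/no answers with the passing answers.

    Returns (passed, failed_questions). Unanswered questions count as
    failures so the rubric cannot be silently skipped.
    """
    failed = [q for q, passing in RUBRIC if answers.get(q) is not passing]
    return (not failed, failed)
```

A reviewer fills in one boolean per question and gets back either a pass or the exact questions that blocked it, so the disagreement in the review meeting is about a named criterion rather than a feeling.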
The Scale Problem Is Actually a Standards Problem
Here’s what I’ve noticed about the teams that genuinely scale their creative output with AI: they’ve already done the work of operationalizing taste.
This is hard to do. It requires the most experienced voices on the team to stop trusting their gut and start externalizing what their gut actually knows. It requires disagreements about quality to happen in writing, not just in hallway conversations. It requires someone to own the standard and update it when it drifts.
None of that is AI work. It’s design work. But it’s the design work that determines whether your AI pipeline becomes generative — or just fast.
There’s a side effect to this process that I’ve come to really appreciate: when you force a team to articulate what “good” means in writing, you discover how much silent disagreement existed before. You find that two senior designers were operating with completely different definitions of brand voice. You find that “we all agreed the old approach wasn’t working” was masking the fact that nobody agreed on what working meant.
The quality layer is clarifying. Sometimes uncomfortably so. But those are good disagreements to have before you scale, not after.
What to Build First
If you’re at the beginning of this work — or if your pipeline has been running and outputs have been feeling increasingly generic — here’s where I’d start.
Gather your 15 best examples. Collect 15–20 pieces of creative output your team is genuinely proud of. Not “good enough to ship” — actually proud of. If you can’t find 15, that’s useful information about where the standard has eroded.
Ask what made each one good. For every example, name the specific decisions that made it land. Not “it was engaging” — what was the actual choice that made it engaging? The opening line? The structure? A word that doesn’t show up in the brand guidelines anywhere? Write these down.
Find the patterns. Across all those specifics, look for what repeats. Those patterns are your operational voice. They’re not the whole picture, but they’re the actionable core — the things that made good work good consistently.
Build a negative list. This is the most underrated step. Document what your best work never does. What phrases never appear? What structural choices always get avoided? What tonal notes are always off-limits? Constraints on the negative space are often more useful than positive instructions, especially for AI systems.
Write a one-page standard. Not a long document, not a guide: one page that a reviewer can reference during a QA pass. It should answer three things: What does good look like here? What disqualifies something immediately? Who makes the call on edge cases?
That’s your quality layer. It isn’t finished — it’ll evolve — but it gives your pipeline somewhere to land.
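The "find the patterns" step can be done on a whiteboard, but if the notes from the previous step are written as short, consistent labels, a few lines of code will surface what repeats. This is a sketch under that assumption: the function name, the 50% threshold, and the example labels below are all hypothetical.

```python
from collections import Counter


def recurring_decisions(
    examples: dict[str, list[str]], min_share: float = 0.5
) -> list[tuple[str, int]]:
    """Return decisions noted in at least `min_share` of the examples.

    `examples` maps a piece's name to the decisions a reviewer wrote down
    for it. Results are sorted by frequency, then alphabetically.
    """
    counts = Counter()
    for decisions in examples.values():
        counts.update(set(decisions))  # count each decision once per piece
    threshold = min_share * len(examples)
    kept = [(d, n) for d, n in counts.items() if n >= threshold]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Made-up review notes for four pieces the team is proud of.
notes = {
    "spring-email": ["opens with a verb", "second person", "one idea per sentence"],
    "pricing-page": ["second person", "no hedging"],
    "launch-banner": ["opens with a verb", "second person"],
    "onboarding-tip": ["opens with a verb", "no hedging"],
}
patterns = recurring_decisions(notes)
```

Here `patterns` keeps the three decisions that show up in at least half of the pieces and drops the one-off. The counting is trivial; the real work is still the human judgment that produced the labels.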
The Standard Is the Work
I want to push back gently on the framing that building quality standards is the slow, expensive part of AI adoption.
It’s not the slow part. It’s the part that makes everything else faster, for longer.
The teams I’ve seen succeed with AI creative pipelines at scale aren’t the ones that generated the most the fastest. They’re the teams that built the quality layer first — and then let speed work in its proper place, which is after you know what good looks like.
AI will continue to get faster. The models will improve. The tooling will get better. But the quality of what you produce will always be a function of the clarity you brought to the system — whether the people building the pipeline knew how to define good before they asked it to generate anything.
That definition is the work. Everything else is just output.
Linda Brown
Linda Brown designs AI systems, creative pipelines, and operational tools for enterprise teams. She writes about the design decisions that happen before the screen.