Why most mid-market AI pilots die (and the three gates that save them)
The demo impresses, the quarter ends, nothing ships. The three gates that separate pilots that change a P&L from pilots that become a slide are a named number, production conditions from day one, and an owner on your team.
Who this is for: CEOs and operators who have watched a promising pilot stall after the demo
The pattern: impressive demo, dead by the next board meeting
The pilot starts with energy. A vendor or an internal champion shows a demo that genuinely works, everyone agrees it’s the future, a budget gets approved. Two quarters later the pilot is a slide in a deck and nothing in the company runs differently.
The autopsy is nearly always the same four findings:
- No P&L outcome was ever defined. The pilot was supposed to “explore the technology,” which means there was no number it could fail to hit, and therefore no number it could hit.
- It was built for demo conditions. Clean sample data, low volume, a happy path. Production has none of those, and the gap between demo and production is where the budget dies.
- Nobody owned it. The vendor’s engagement ended, the internal champion got reassigned, and a system without an owner stops the first time it hiccups.
- Theater replaced outcomes. Somewhere along the way a “maturity score” or a readiness framework was presented instead of a margin, revenue, or capacity result. Buyer-side research is blunt about this: reject pilots without P&L outcomes, and require the work to be evaluated in margin and revenue terms, not maturity scores.
There’s a deeper issue under all four: the pilot fulfilled a stated want instead of the real need. You wanted a pilot; what the business needed was a number to move. That gap is the value distance: the gap between what a client asks for and what they actually need. Closing that gap, not fulfilling the original request, is where the value is, and the three gates below exist to close it before any money is spent.
Gate one: a named number, before any work starts
Every pilot must be assigned one P&L number it exists to move (margin, revenue, or capacity), and that number gets named before anything is built. Not after the demo, not at the readout. Before.
The discipline sounds trivial and isn’t. Naming the number forces three decisions that vague pilots never make: which workflow is actually in scope, what the baseline is today, and what result would justify rolling it out beyond the pilot. If nobody in the room can say “this exists to cut document-handling hours” or “this exists to answer the calls we currently miss,” the pilot is a field trip with a budget.
A useful pressure test from the same buyer research: ask whoever is proposing the pilot, “how many dollars of new profit per dollar spent?” They don’t need a precise answer. They need to be visibly working on that question rather than on the technology.
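The pressure test reduces to simple arithmetic. Here is a minimal sketch in Python, assuming a capacity-type pilot where saved staff hours stand in as a rough proxy for new profit. The `PilotCase` class and every figure in it (hours, hourly cost, pilot cost, horizon) are hypothetical and chosen only for illustration; they do not come from any real engagement.

```python
from dataclasses import dataclass


@dataclass
class PilotCase:
    """Hypothetical inputs for a capacity-type pilot; all figures are made up."""
    metric: str                # the one named P&L number
    baseline_hours: float      # hours per month spent on the workflow today
    target_hours: float        # hours per month the pilot must reach
    loaded_hourly_cost: float  # dollars per hour of staff time (assumption)
    pilot_cost: float          # total dollars spent on the pilot
    months: int = 12           # evaluation horizon (assumption)

    def profit_per_dollar(self) -> float:
        """Dollars saved over the horizon per dollar spent on the pilot."""
        hours_saved = (self.baseline_hours - self.target_hours) * self.months
        return hours_saved * self.loaded_hourly_cost / self.pilot_cost


case = PilotCase(
    metric="document-handling hours",
    baseline_hours=400,
    target_hours=250,
    loaded_hourly_cost=50.0,
    pilot_cost=30_000.0,
)
print(f"{case.metric}: {case.profit_per_dollar():.2f} dollars per dollar spent")
```

The point of writing it down is less the result than the inputs: the proposer cannot fill in the fields without naming the workflow, the baseline, and the target.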
Gate two: production conditions from day one
Demos run on curated data. Businesses run on the other kind. Gate two says the pilot touches real data, at real volumes, with real failure handling from the first week, because the move from demo conditions to production conditions is precisely where most pilots die, and you want to find out in week one rather than month five.
What this looks like in practice: one of our pilots took on turning 170-page genetic lab reports, the messiest paperwork imaginable, into structured product data. It runs against the real thing: 238 markers extracted from a 170-page report, three live lab-format integrations each with their own structure, a 21,297-item reference database behind it, and a production backend serving the results. It’s honestly labeled a pilot (there’s no revenue number attached yet), but it passed gate two on day one because there was never a curated-sample phase to hide in. The documents were real from the first run.
The rule of thumb: if the pilot’s data had to be prepared for it, the pilot hasn’t started yet.
Gate three: a named owner on your team
The pilot must have one named person on the client side who will run the system after handoff: chosen before the build, present during it, trained by the end of it. Not a committee, not “operations,” not the vendor on a maintenance retainer. A name.
This is the gate that separates a delivered system from a dependency. When we wired a phone system into a real business, it shipped to production on the client’s own server, answering their real inbound calls and writing into their own customer database, all on infrastructure they hold the keys to. That’s a client delivery in the literal sense: the thing delivered is theirs, documented, and runnable without us. The engagements that skip this gate produce the opposite: a system that works exactly as long as the consultant’s phone number does.
Gate three is also where the org question hides. Someone owning the system means someone’s job description changed, and someone’s incentives have to reward running it. If your team has never operated systems like this, that’s a teachable skill; running this kind of work like an organization is its own playbook.
What to demand from anyone selling you a pilot
Compress the three gates into four questions to ask before signing anything:
- “Which P&L number does this move, and what’s the baseline?” No number, no pilot.
- “When does it touch our real data at real volume?” Any answer other than “immediately” means you’re buying a demo.
- “Who on my team runs this in month four, and when do you train them?” If the answer involves their retainer, you’ve found the business model.
- “How will we evaluate it?” Margin, revenue, or capacity, in dollars or hours. If the proposal mentions a maturity score, keep your budget.
A pilot that passes all four is cheap insurance. A pilot that can’t is the most expensive way to learn nothing, and the cost compounds, because the failed pilot becomes the reason the company won’t fund the one that would have worked. That’s the real damage, and it’s why decoupling growth from payroll stalls in companies that burned a budget on theater first.
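The four questions can also be written as a literal checklist. The sketch below is one hypothetical way to encode them in Python; the `Proposal` fields and the `failed_gates` helper are illustrative names rather than part of any real tool, and the pass criteria are a simplified reading of the questions above.

```python
from dataclasses import dataclass
from typing import List, Optional

ALLOWED_METRICS = {"margin", "revenue", "capacity"}
ALLOWED_UNITS = {"dollars", "hours"}


@dataclass
class Proposal:
    """Hypothetical summary of a vendor's answers to the four questions."""
    pnl_metric: Optional[str]      # "margin", "revenue", "capacity", or None
    baseline: Optional[float]      # today's value of that number, if known
    weeks_until_real_data: int     # 0 means real data at real volume immediately
    owner_name: Optional[str]      # named person on the client team
    evaluation_unit: str           # e.g. "dollars", "hours", "maturity score"


def failed_gates(p: Proposal) -> List[str]:
    """Return the questions the proposal fails; an empty list means it passes."""
    failures = []
    if p.pnl_metric not in ALLOWED_METRICS or p.baseline is None:
        failures.append("no named P&L number with a baseline")
    if p.weeks_until_real_data > 0:
        failures.append("real data at real volume is deferred")
    if not p.owner_name:
        failures.append("no named owner on the client team")
    if p.evaluation_unit not in ALLOWED_UNITS:
        failures.append("evaluated in something other than dollars or hours")
    return failures


good = Proposal("capacity", 400.0, 0, "A. Named Owner", "hours")
bad = Proposal(None, None, 8, None, "maturity score")
print(failed_gates(good))  # passes every check
print(len(failed_gates(bad)), "checks failed")
```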
| Phase | Doing it yourself | With an operator |
|---|---|---|
| Naming the P&L number | Weeks of meetings, often skipped entirely | Settled in the first working session |
| Real data, real volume | Deferred to 'phase two,' where pilots go to die | A day-one requirement, no curated phase |
| Surviving production failures | 1–2 rebuilds after the demo falls over | Failure handling built in from the start |
| Handoff to a named owner | Rarely happens; the vendor becomes the owner | Owner chosen up front, trained as part of delivery |
The same pilot, run loose versus run through the gates.