Most firms do the opposite.
They use agents to grow revenue and cut cost. Here are the five things they get right.
Thirty minutes with me to talk through your AI agent programme.
We work through where you are now, and what to put in place first. No pitch.
The value from AI agents is real, and a small group of firms are pulling ahead. For everyone else, the question is no longer whether agents work. It is why they are not seeing a return.
1.7× revenue growth, 40% more cost savings
the gap BCG found between AI leaders and laggards.
BCG, 2025
95% are leaving that money on the table
only 1 firm in 20 captures AI value at scale. The rest are stuck.
BCG
Money spent, no return
74% have already pulled a live, customer-facing AI agent back out of production, after a governance failure.
Sinch, 2026
Tick the ones that are true of an AI agent you are running or planning. Each tick is a sign that money is at risk, and a pointer to where to look first.
Across the AI programmes we see, the value rarely stalls on the model. It stalls on five decisions made around it. Here is each one, in plain terms, with what the firms getting it right do instead.
The agent is picked because it demos well, before anyone has worked out what it saves, what it costs to run, or what it is worth. It then costs more to run than the budget expected, and because it does not behave the same way every time, it needs rounds of testing the plan never allowed for. The spend overruns, and the board loses patience.
Work out the value in real numbers. Count how often the work happens and the time and staff cost it removes, then set that against what it will cost to run: the model, the integration and the monitoring. Decide how much risk you are willing to carry, and make a clear go or no-go call before the spend scales up.
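As a rough illustration of that sizing, here is a minimal sketch in Python. The class, the field names and the threshold are hypothetical and the figures are made up; the point is only that saving and run cost sit side by side before the go or no-go call.

```python
from dataclasses import dataclass


@dataclass
class AgentCase:
    """Hypothetical inputs for sizing one agent; all figures are per year."""
    tasks_per_year: int               # how often the work happens
    minutes_saved_per_task: float     # time the agent removes
    staff_cost_per_hour: float        # cost of the people doing it today
    model_cost_per_task: float        # model spend per task
    integration_cost_per_year: float
    monitoring_cost_per_year: float

    def annual_saving(self) -> float:
        # Staff time removed, priced at the staff hourly cost.
        return (self.tasks_per_year * self.minutes_saved_per_task
                * self.staff_cost_per_hour / 60)

    def annual_run_cost(self) -> float:
        # Model, integration and monitoring, as listed in the text above.
        return (self.tasks_per_year * self.model_cost_per_task
                + self.integration_cost_per_year
                + self.monitoring_cost_per_year)

    def net_value(self) -> float:
        return self.annual_saving() - self.annual_run_cost()


def go_no_go(case: AgentCase, min_net_value: float) -> str:
    """Assumed rule: go only if net value clears a floor agreed in advance."""
    return "go" if case.net_value() >= min_net_value else "no-go"


# Made-up example figures, for illustration only.
example = AgentCase(
    tasks_per_year=20_000,
    minutes_saved_per_task=12,
    staff_cost_per_hour=40.0,
    model_cost_per_task=0.50,
    integration_cost_per_year=30_000,
    monitoring_cost_per_year=20_000,
)
decision = go_no_go(example, min_net_value=50_000)
```

The floor passed as `min_net_value` is one simple way to express how much risk you are willing to carry: the less certain the estimates, the higher you might set it.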
It goes live in a regulated process with no independent checks around it. The first time it makes something up, it reaches a customer or a contract before anyone notices. A second slip follows, and the review that lands puts the whole AI investment under board scrutiny.
Put the checks in code, sitting outside the model so it cannot talk its way past them, from the first sprint: every answer checked against its real source, a complete record of everything it read and did, and a gate that blocks an unsafe action before it happens.
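To make "checks in code, outside the model" concrete, here is a minimal sketch in Python. Every name in it (`ProposedAction`, `Gate`, the fields) is hypothetical, and the grounding check is deliberately crude: it only tests that the quoted evidence appears word for word in the named source. A real system would need something stronger.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ProposedAction:
    """What the agent wants to do, plus the evidence it claims to rely on."""
    name: str             # e.g. "send_reply"
    answer: str           # text that would reach the customer
    quoted_evidence: str  # passage the agent says supports the answer
    source_id: str        # which source document it came from


@dataclass
class Gate:
    """Runs outside the model: the agent cannot edit or skip these checks."""
    sources: dict[str, str]      # source_id -> real source text
    allowed_actions: set[str]    # actions this agent may take
    audit_log: list[dict] = field(default_factory=list)

    def review(self, action: ProposedAction) -> bool:
        source_text = self.sources.get(action.source_id, "")
        # Check 1: the quoted evidence must exist in the real source.
        grounded = (bool(action.quoted_evidence)
                    and action.quoted_evidence in source_text)
        # Check 2: the action must be on the allowlist.
        permitted = action.name in self.allowed_actions
        approved = grounded and permitted
        # Record every decision, approved or blocked.
        self.audit_log.append({
            "action": action.name,
            "source_id": action.source_id,
            "grounded": grounded,
            "permitted": permitted,
            "approved": approved,
        })
        return approved


gate = Gate(
    sources={"policy-1": "Refunds are allowed within 30 days of purchase."},
    allowed_actions={"send_reply"},
)
ok = gate.review(ProposedAction(
    name="send_reply",
    answer="You can get a refund within 30 days.",
    quoted_evidence="Refunds are allowed within 30 days",
    source_id="policy-1",
))
blocked = gate.review(ProposedAction(
    name="issue_refund",  # not on the allowlist, so the gate blocks it
    answer="Refund issued.",
    quoted_evidence="Refunds are allowed within 30 days",
    source_id="policy-1",
))
```

The action only runs if `review` returns `True`, and the log keeps a record either way.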
It passed a controlled demo, so it shipped. Real use showed what the demo could not: it was not reliable enough, it cost far more to run than the estimate, and it could be tricked into leaking data or taking the wrong action. With no way to absorb that, the team rolled it back, and leadership lost confidence again.
Test it against a fixed set of real, scored examples. Then run it quietly alongside the people doing the work, then in a small live pilot, then fully. At each step it has to clear a set bar for accuracy, cost and safety, signed off by someone whose job is to find the holes, and it moves on evidence, not on a date.
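The staged release above can be sketched as a simple gate in Python. The stage names, the `Bar` and `Evidence` fields and the exact-match scoring are all assumptions made for illustration; the idea is that an agent moves to the next stage only when the evidence clears the bar and someone has signed it off.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# Assumed stage names, in the order described in the text above.
STAGES = ["offline_eval", "shadow", "pilot", "full"]


@dataclass(frozen=True)
class Bar:
    min_accuracy: float
    max_cost_per_task: float
    max_safety_failures: int


@dataclass(frozen=True)
class Evidence:
    accuracy: float
    cost_per_task: float
    safety_failures: int
    signed_off_by: Optional[str]  # the person whose job is to find the holes


def accuracy(examples: Sequence[Tuple[str, str]],
             agent: Callable[[str], str]) -> float:
    """Score the agent on a fixed set of (input, expected output) examples."""
    correct = sum(1 for question, expected in examples
                  if agent(question) == expected)
    return correct / len(examples)


def clears(evidence: Evidence, bar: Bar) -> bool:
    return (evidence.accuracy >= bar.min_accuracy
            and evidence.cost_per_task <= bar.max_cost_per_task
            and evidence.safety_failures <= bar.max_safety_failures
            and bool(evidence.signed_off_by))


def next_stage(current: str, evidence: Evidence, bar: Bar) -> str:
    """Move on only on evidence; otherwise stay where you are."""
    position = STAGES.index(current)
    if position == len(STAGES) - 1 or not clears(evidence, bar):
        return current
    return STAGES[position + 1]


bar = Bar(min_accuracy=0.95, max_cost_per_task=0.50, max_safety_failures=0)
promoted = next_stage("shadow", Evidence(0.97, 0.40, 0, "red-team lead"), bar)
held = next_stage("shadow", Evidence(0.97, 0.40, 0, None), bar)
```

There is no date anywhere in the function: the only way forward is better evidence.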
The people whose work the agent changes are not consulted, trained, or told how their jobs will change. They start using their own unapproved tools, they quietly work around the agent, and trust never forms. An agent that works technically ends up barely used.
Treat bringing people with you as real work from the start. Map who is affected and the new jobs the system creates, like running the agent, checking its quality, and reviewing its records. Expect the normal reaction to new AI, from early worry, through frustration, to settling into trust, and bring the doubters in rather than working around them.
No one owns the system once it is live. There is no routine to catch new ways it can fail, no way to stop it safely when it does, and no check on whether it still pays its way. It quietly drifts, and within months the agents are switched off and the manual work comes back.
Give the system a named owner and a simple loop: watch what happens in production, catch the failures, add a check for each one, and adjust the agent. Keep a growing list of the ways it can fail, a way to stop it or fall back safely, and a regular review that grows, shrinks, or retires each agent on the evidence.
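One way to picture that loop is the small Python sketch below. The `OperatedAgent` class and its fields are hypothetical, not part of any particular framework. It shows a named owner, a list of checks that grows with each failure found, a register of what went wrong, and a stop switch that falls back to a safe path.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class OperatedAgent:
    owner: str                        # the named owner
    agent: Callable[[str], str]       # the live agent
    fallback: Callable[[str], str]    # safe path, e.g. hand to a person
    # Each check takes (request, output) and returns True if acceptable.
    checks: List[Callable[[str, str], bool]] = field(default_factory=list)
    failure_register: List[Dict[str, str]] = field(default_factory=list)
    stopped: bool = False

    def stop(self) -> None:
        """Stop switch: every request takes the fallback until restarted."""
        self.stopped = True

    def handle(self, request: str) -> str:
        if self.stopped:
            return self.fallback(request)
        output = self.agent(request)
        for check in self.checks:
            if not check(request, output):
                # Catch the failure, record it, and fall back safely.
                self.failure_register.append({
                    "request": request,
                    "output": output,
                    "check": getattr(check, "__name__", repr(check)),
                })
                return self.fallback(request)
        return output


def no_guarantees(request: str, output: str) -> bool:
    """Example check added after a failure was seen in production."""
    return "guarantee" not in output.lower()


operated = OperatedAgent(
    owner="operations lead",
    agent=lambda request: "We guarantee a refund.",
    fallback=lambda request: "Passed to a member of the team.",
)
before = operated.handle("Can I get a refund?")  # no check yet
operated.checks.append(no_guarantees)            # add a check for the failure
after = operated.handle("Can I get a refund?")   # now caught
```

The failure register doubles as the input to the regular review: it shows what the agent got wrong, how often, and whether each new check is still earning its place.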
The good news: not one of these is a limit of the technology. Each is a decision, and there is a clear, practical method for getting it right.
The five things above are five stages, run in order. The leaders work through all five. The firms that miss out skip or rush some of them.
For each stage there is a clear method and a tool. Our Agent Discovery and Design framework runs across all of them, end to end. It gives product, engineering and change teams the knowledge and tools to deliver AI that is reliable, secure, and worth the money.
Stage 1
Decide where an agent fits, and size it
Stage 2
Design a governable agentic system
Stage 3
Engineer it for production
Stage 4
Bring your people with you
Stage 5
Operate the system, keep improving
Bring a programme you are running or considering, and in thirty minutes we will work out where the biggest gap is and what to do first.
Book a 30-minute call
You do not need to rebuild anything to find out where you stand.
Size your most promising agent. Before you spend more on it, work out what it is actually worth and what it costs to run. If no one can say, that is your first gap.
Check your evidence. For the agents already live, what proof do you have that they are accurate, safe, and running at a cost that makes sense? If you could not show it to a sceptic, that is a gap.
Make adoption real work. Name who owns the people side: training, role change, and bringing teams with you. Start it now, not at launch.
Thirty minutes with me. Bring a programme you are running or thinking about, and we will work through what to do first.
You are welcome to come and explore, too, if you are still working out where AI fits.