
Co-authored by Gerard Pietrykiewicz and Achim Klor
A story has been making the rounds about an unnamed company that allegedly ran up a $500 million Claude bill in a single month because nobody set usage limits. Every account traces back to the same Axios report, sourced from an unnamed AI consultant — no named company, no invoice, no primary confirmation.
Gerard and I are not treating the $500M as fact.
But we are treating it as a warning.
Whether the $500M number is real is neither here nor there. The recognisable pattern is more important. AI spend can behave less like traditional software spend and more like cloud spend, where it grows quietly, in places leadership doesn’t see until the bill arrives.
The surface pricing doesn’t help either.
Anthropic publishes Claude pricing in dollars per million tokens, which makes individual usage feel small. What it doesn’t show is how fast the meter runs when there are no guardrails on the workflow.
AI agents retry behind the scenes. They loop. They re-read long context windows. They call tools, spawn more work, generate multiple versions, critique them multiple times, then generate many more. None of that feels dramatic while it’s happening. It just feels like the agent is working.
Until you ask what the work was for.
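To make the meter visible, here is a rough back-of-the-envelope sketch. It assumes every step of an agent loop re-sends the original context plus everything generated so far, and that some fraction of steps gets retried. The function name, the step counts, and the per-million-token prices are illustrative placeholders of ours, not figures from any price list or vendor tool.

```python
def estimate_run_cost(steps, context_tokens, output_tokens_per_step,
                      input_price_per_mtok, output_price_per_mtok,
                      retry_rate=0.0):
    """Estimate the dollar cost of one agent run.

    Assumption: each step re-reads the full context plus all output
    produced by earlier steps, and a fraction `retry_rate` of the work
    is repeated once.
    """
    input_tokens = 0
    output_tokens = 0
    history = context_tokens  # what the model has to re-read on the next step
    for _ in range(steps):
        input_tokens += history
        output_tokens += output_tokens_per_step
        history += output_tokens_per_step  # this step's output joins the context
    base = (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000
    return base * (1 + retry_rate)


# Placeholder prices: $3 per million input tokens, $15 per million output tokens.
single = estimate_run_cost(1, 20_000, 1_000, 3.0, 15.0)
looped = estimate_run_cost(40, 20_000, 1_000, 3.0, 15.0, retry_rate=0.25)
print(f"single call: ${single:.3f}, 40-step loop: ${looped:.2f}")
```

With these made-up numbers, a single call costs under eight cents while the 40-step loop lands near $6.70, close to ninety times more for what looks like one task. The exact ratio depends entirely on the assumptions, but the shape is the point: cost grows with steps and context, not with the number of prompts a person typed.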
This is where the conversation can get lazy.
Someone looks at token costs, compares them with a salary, and concludes agents are obviously cheaper than junior employees. In the narrowest possible sense, that can be true. A bounded task can cost pennies in model usage while a human costs considerably more per hour. The Government of Canada Job Bank puts the median for a computer software engineer at CAD $56.49/hour nationally, CAD $62.50 in British Columbia.
But that comparison ignores what a junior person actually is.
A junior developer — or a junior analyst, or a junior content manager — is someone learning your systems, your customers, your standards, your trade-offs. They ask questions. They make mistakes that require review. They also build context, and over time that context turns into someone who can own work, mentor others, and make better decisions because they understand how things actually fit together.
An agent has its own costs that the tokens-versus-salary math skips over. It also makes mistakes. It can produce output that looks right but is subtly wrong. It can generate work faster than the team can evaluate it. And if the task isn’t clearly defined, it tends to resolve ambiguity by generating volume.
That’s the hidden cost. The agent may be cheap per prompt. The workflow may still be expensive.
If AI is cheaper, cheaper than what, exactly?
Cheaper than a salary line? Cheaper than a task taking three days? Cheaper than a senior person getting interrupted ten times to answer questions a clear brief would have prevented? Cheaper than work not getting done because nobody had time to start it?
Those are different questions. The metric that matters isn’t cost per prompt, tokens consumed, or drafts produced. Not even “hours saved” unless those hours convert into something useful.
The better question is this: What did it cost to produce a completed, reviewed, useful outcome?
If an agent drafts a campaign brief in ten minutes but a marketer spends two hours correcting hallucinated positioning and wrong product names, the cheap part wasn’t the whole cost.
If an agent writes documentation nobody trusts, the output isn’t an asset. If it generates twenty prospect summaries and the team acts on two, the activity looked productive but the value was thin.
On the other hand, if an agent takes a messy sales call transcript, extracts the objections and action items, and gives a rep a better starting point for follow-up, that’s real value. The agent didn’t replace the rep. It removed the worst part of the work and made the human part easier to begin.
That’s where agents are most useful: as a way to make difficult work more startable, more reviewable, and less blocked by repetitive setup.
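The "completed, reviewed, useful outcome" question can be turned into simple arithmetic. The sketch below is one way to do it, under loud assumptions: the model costs and review hours are invented for illustration, the Job Bank median of $56.49/hour is borrowed as a stand-in hourly rate for any reviewer, and everything is treated as a single currency for simplicity. The names `WorkflowRun` and `cost_per_accepted_outcome` are ours.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkflowRun:
    model_cost: float    # spend on model usage (assumed figure)
    review_hours: float  # human time spent checking and correcting
    produced: int        # outputs the agent generated
    accepted: int        # outputs the team actually used


def cost_per_accepted_outcome(run: WorkflowRun, hourly_rate: float) -> Optional[float]:
    """Total cost (model + human review) divided by outputs that were used.

    Returns None when nothing was accepted, since no useful outcome exists
    to spread the cost over.
    """
    if run.accepted == 0:
        return None
    total = run.model_cost + run.review_hours * hourly_rate
    return total / run.accepted


HOURLY_RATE = 56.49  # stand-in rate, taken from the median quoted earlier

# A brief drafted in minutes, then corrected for two hours.
brief = WorkflowRun(model_cost=0.50, review_hours=2.0, produced=1, accepted=1)
# Twenty prospect summaries, two acted on (review time is an assumption).
summaries = WorkflowRun(model_cost=4.00, review_hours=1.5, produced=20, accepted=2)

print(cost_per_accepted_outcome(brief, HOURLY_RATE))
print(cost_per_accepted_outcome(summaries, HOURLY_RATE))
```

Under these assumptions the fifty-cent brief really costs about $113, and each summary the team acted on costs about $44, roughly ten times the figure you get by dividing the same total across all twenty produced. Dividing by accepted output instead of produced output is the whole design choice here.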
This is why the $500M story, accurate or not, resonates.
Anyone who’s been through enterprise software adoption recognises the pattern:
Then finance or security eventually asks the questions that should have been asked earlier:
Those questions aren’t bureaucratic overhead. They’re how you tell adoption from a “free-for-all”.
Microsoft’s Cloud Adoption Framework guidance on AI agents makes the point plainly: agent observability, governance, and security aren’t optional features — they’re requirements. The framework treats every agent as something that must be auditable and controlled throughout its lifecycle.
Anthropic moved in a similar direction when it released enterprise controls for Claude Code — spend limits at the organisation and user level, usage analytics, managed policy settings, and a Compliance API for programmatic access to usage data. Once agents do real work at scale, governance stops being a side feature.
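For teams wondering what a spend limit at two levels looks like in principle, here is a minimal in-process sketch. It is not Anthropic's implementation and does not call any vendor API. `SpendGuard`, `BudgetExceeded`, and the limits are hypothetical names and numbers we made up to show the idea of checking both a per-user and an organisation-wide cap before work is allowed to proceed.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push spend past a configured limit."""


class SpendGuard:
    """Hypothetical guard that tracks spend per user and for the whole org."""

    def __init__(self, org_limit: float, user_limit: float):
        self.org_limit = org_limit
        self.user_limit = user_limit
        self.org_spent = 0.0
        self.user_spent = {}  # user name -> amount spent so far

    def charge(self, user: str, amount: float) -> None:
        """Record a charge, or raise BudgetExceeded and record nothing."""
        new_user_total = self.user_spent.get(user, 0.0) + amount
        new_org_total = self.org_spent + amount
        if new_user_total > self.user_limit:
            raise BudgetExceeded(f"user limit reached for {user}")
        if new_org_total > self.org_limit:
            raise BudgetExceeded("organisation limit reached")
        # Only commit the charge once both checks have passed.
        self.user_spent[user] = new_user_total
        self.org_spent = new_org_total


guard = SpendGuard(org_limit=10.0, user_limit=6.0)
guard.charge("alice", 5.0)      # within both limits
try:
    guard.charge("alice", 2.0)  # would take alice past her user limit
except BudgetExceeded as err:
    print("blocked:", err)
```

A real deployment would enforce this on the provider or gateway side rather than in application code, but the principle carries over: the check happens before the spend, and a refusal leaves a record someone can ask about.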
The leadership question isn’t “how many people can we replace?”
It’s “which parts of the workflow should no human be wasting their best attention on?”
Those lead to very different decisions.
Let agents handle work that is bounded, repeatable, and easy to verify. Scaffold code from a clear spec. Summarise calls. Draft the first version of a brief, a ticket, a proposal. Produce the boring first pass that often prevents people from starting at all.
Keep humans responsible for the work that requires ownership: defining the problem, setting the constraints, judging quality, deciding what ships.
A good junior role isn’t a queue of small tasks. It’s how people learn what good looks like. Remove the repetitive setup with agents — that’s useful. Confuse that with removing the development path, and you’ve cut your pipeline of future senior talent.
More AI output doesn’t automatically mean more business value. Sometimes it’s more material to review, more edge cases to catch, more misplaced confidence to walk back.
Before scaling agents across a team, start with one workflow where the task is bounded and the output can be verified. Put an owner on it. Define what the agent can do alone and what requires human sign-off. Track whether the work gets accepted, reused, shipped, or acted on.
The measure is simple: Did this produce a better outcome, or just more output?
If the answer is better outcomes, expand it. If it’s more output, fix the workflow before adding more usage.
AI isn’t the problem. Unmanaged work is.
This article is AC-A and published on LinkedIn.