In seven days, Anthropic stops eating the bill for your AI agents.
On June 15, the company's flat-rate Claude subscriptions stop covering unlimited Claude Code and Agent SDK usage. After that, those tools run on a credit pool tied to your plan: $20 of credit on Pro, $100 on Max 5x, $200 on Max 20x. Anything beyond that gets billed at API rates, and unused credits don't roll over.
We covered the mechanics in detail last week, so the policy itself isn't the news here. The news is what people are scrambling to do before Monday.
The reason Anthropic had to act shows up in one number. On the $200 Max 20x plan, the heaviest Claude Code users were pulling roughly $35,000 a month in API-equivalent compute, a 175-to-1 subsidy ratio between what they paid and what they actually used. Subscriptions were never designed for agents that run for hours, reread the same context at every step, and spin off sub-tasks while you sleep.
Gartner pegs agentic workloads at five to thirty times the token consumption of a chat session. Stanford's Digital Economy Lab estimates that re-sent context alone accounts for about 62% of an agent's inference bill. The thing that made Claude Code feel like magic is the same thing that made the math impossible.
For developers, the practical playbook is starting to look like this:
- Cache everything you can. Prompt caching cuts the re-sent context tax, which is where most of the bill is coming from.
- Stop defaulting to Opus. At $5 per million input tokens and $25 per million output, Opus 4.7 burns credits fast. Sonnet 4.6 handles most coding tasks at roughly three-fifths the cost.
- Set ceilings on autonomous runs. DigitalApplied flagged a case where an unbounded agentic workflow consumed about $4,200 in tokens over a single weekend before anyone noticed.
- Check your tool integrations. Zed told its users this week that anyone routing Claude through the editor will now pull from the same credit pool, with overages billed directly.
The bigger shift sits one level above the developer's terminal. AI token spend has quietly turned into a finance problem.
"Token costs and efficiency have become a CEO-level concern, not an engineering footnote," J.R. Storment, executive director of the FinOps Foundation, said this week when the group announced a new Tokenomics Foundation to set open standards around this stuff. Jay Litkey at Flexera put it more bluntly, pointing out that cloud waste rose in 2026 for the first time in five years, with AI workloads doing a lot of the damage.
Sanchit Vir Gogia, chief analyst at Greyhound Research, captured the limit of what better tracking can actually do. "The dashboards do not lower the bill," he said. "The architecture lowers the bill. The dashboards merely describe the bill while it arrives."
The timing is also a little awkward for Anthropic. The company spent last week calling on AI labs to coordinate a global pause on frontier development, then turned around and metered its most popular product. OpenAI noticed and offered new business customers two months of free Codex that same week.
For enterprises, the credit overhaul forces a conversation most have been putting off. Mike Eisenstein, a managing director at Accenture, said the hardest question his clients are asking right now isn't whether to adopt AI, it's how to prove the return when token spend keeps climbing. Flat-rate pricing let that conversation stay theoretical. After Monday, every team running Claude Code will have a line item to defend.

The unlimited era of AI tools was always going to end. Anthropic just happened to be the first to admit the math doesn't work when one user can burn $35,000 of compute on a $200 plan. Every other vendor selling AI by the seat is now staring at the same chart Anthropic was staring at, which means anyone still charging a flat rate for "unlimited" anything is running out of runway. By the end of the year, the CFO question about AI won't be how much it costs per user. It'll be how much it cost yesterday, and why nobody set a ceiling.
