It is the same model on both sides, but a very different contract. The same Cowork task can cost several times more through Copilot Credits than on Claude direct at Anthropic token rates. The gap is not Microsoft overcharging for the model. It is the price of the managed agent built around it, and whether to pay it is a build versus buy decision you make per workload.
Take the deployment from the Copilot Credits pillar: 200 users, about 27 million credits a year. Priced through Microsoft at list, that is roughly 271,000 dollars, or about one cent per credit, and prepay barely moves it. The same inference on Claude direct, at Anthropic list rates, runs about 21,000 dollars with Sonnet as the workhorse and about 35,000 dollars using Opus for everything.
That is a gap of roughly a quarter of a million dollars, and prepaying credits does almost nothing to close it because the prepay discount caps at 20 percent. The gap is structural, not a discounting failure.
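The arithmetic behind that gap can be sketched in a few lines. The dollar and credit figures come from the deployment above; the variable names and the `gap` helper are illustrative only, and the prepay case assumes the full 20 percent discount applies to the entire year.

```python
# Figures from the 200 user deployment described in the text.
CREDITS_PER_YEAR = 27_000_000
CREDIT_LIST_COST = 271_000      # dollars per year through Microsoft at list
PREPAY_DISCOUNT_CAP = 0.20      # maximum prepay discount
DIRECT_SONNET = 21_000          # dollars per year, Sonnet as the workhorse
DIRECT_OPUS = 35_000            # dollars per year, Opus for everything

# Implied list rate per credit and the best case cost after prepay.
list_rate = CREDIT_LIST_COST / CREDITS_PER_YEAR
prepay_floor_cost = CREDIT_LIST_COST * (1 - PREPAY_DISCOUNT_CAP)


def gap(credit_cost, direct_cost):
    """Yearly dollars paid through credits beyond the direct inference bill."""
    return credit_cost - direct_cost


print(f"List rate per credit: ${list_rate:.4f}")
print(f"Best case prepaid cost: ${prepay_floor_cost:,.0f}")
for label, direct in (("Sonnet", DIRECT_SONNET), ("Opus", DIRECT_OPUS)):
    print(
        f"{label}: gap at list ${gap(CREDIT_LIST_COST, direct):,.0f}, "
        f"gap at prepay floor ${gap(prepay_floor_cost, direct):,.0f}"
    )
```

Even with the full prepay discount applied, the gap on these figures stays above 180,000 dollars a year, which is why the text calls it structural.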
The gap exists because a credit is not a model price. Each credit bundles the model with four layers you would otherwise build and run yourself: orchestration, retrieval, governance, and identity. On Claude direct you pay only for the inference, and you supply the rest.
This is where the prepay ceiling cannot compete. On the direct API, two optimizations stack on top of the already lower token rate.
Direct optimization versus the prepay floor
| Lever | Effect | Available on credits? |
|---|---|---|
| Prepay discount | Up to 20 percent off, then expiry | Yes, capped |
| Batch processing | About 50 percent off | No |
| Prompt caching | Up to 90 percent off repeated input | No |
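Batch and caching stack, so the effective direct cost of the same work can fall well below what it costs at the credit floor of about eight tenths of a cent per credit, which is the list rate of roughly one cent less the 20 percent prepay cap. That is the structural reason the gap holds at volume, and why a credit prepay is not a substitute for direct optimization.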
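A minimal sketch of how the two direct levers combine, using only the discount figures from the table. The function name, the `input_share` and `cached_share` parameters, and the example workload are assumptions for illustration. The model is deliberately simplified: it treats the caching discount as applying only to repeated input, applies the batch discount to the whole bill, and ignores any cost of writing to the cache.

```python
BATCH_DISCOUNT = 0.50   # "about 50 percent off" (from the table)
CACHE_DISCOUNT = 0.90   # "up to 90 percent off repeated input" (from the table)


def effective_multiplier(input_share, cached_share, batched=True):
    """Fraction of the list token bill still paid after both levers.

    input_share  -- assumed fraction of the list bill that is input tokens
    cached_share -- assumed fraction of that input served from the cache
    """
    if not (0 <= input_share <= 1 and 0 <= cached_share <= 1):
        raise ValueError("shares must be between 0 and 1")
    cached_input = input_share * cached_share * (1 - CACHE_DISCOUNT)
    fresh_input = input_share * (1 - cached_share)
    output = 1 - input_share
    multiplier = cached_input + fresh_input + output
    if batched:
        multiplier *= 1 - BATCH_DISCOUNT
    return multiplier


# Hypothetical workload: 70% of spend is input, and 80% of that input repeats.
m = effective_multiplier(input_share=0.7, cached_share=0.8)
print(f"Share of list token cost still paid: {m:.1%}")
```

On that hypothetical mix the workload pays roughly a quarter of the direct list rate, whereas the best prepay case on credits still pays 80 percent of the credit list rate. Real savings depend on how much of a workload can be batched and how much input actually repeats.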
The gap is real, but it is not free money. To match Cowork on direct Claude you stand up and operate the four layers Microsoft bundles. That is a team and a running cost, not a one time spend.
For 200 users, capturing the quarter of a million can take one to two engineers plus infrastructure, often more than the saving. That is exactly why the decision is per workload, not all or nothing.
Decide per workload, not by default. The question is whether capturing the gap is worth standing up and operating an agent platform for that specific workload.
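One way to frame that per workload test is a simple net saving check: build only when the credit cost exceeds direct inference plus the platform cost attributable to that workload. The sketch below is illustrative, not a pricing model. The `Workload` class is a hypothetical helper, the engineer and infrastructure costs are placeholder assumptions, and the second workload is invented for contrast. Only the 271,000 and 21,000 dollar figures come from the text.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    credit_cost: float       # yearly cost through credits, in dollars
    direct_inference: float  # yearly inference cost on Claude direct
    platform_cost: float     # yearly engineers and infrastructure for this workload

    def net_saving(self) -> float:
        """Dollars saved per year by building instead of buying."""
        return self.credit_cost - (self.direct_inference + self.platform_cost)

    def decision(self) -> str:
        return "build" if self.net_saving() > 0 else "buy"


# Placeholder assumptions, not figures from the text.
ENGINEER_COST = 200_000  # hypothetical fully loaded cost per engineer per year
INFRA_COST = 30_000      # hypothetical yearly infrastructure cost

workloads = [
    # Credit and direct figures from the text; two engineers assumed.
    Workload("whole estate, 200 users", 271_000, 21_000,
             2 * ENGINEER_COST + INFRA_COST),
    # Entirely hypothetical workload that shares an existing platform.
    Workload("high volume batch workload", 150_000, 12_000, 60_000),
]

for w in workloads:
    print(f"{w.name}: net saving ${w.net_saving():,.0f} -> {w.decision()}")
```

Under these assumptions the whole estate case comes out as buy, because the platform cost exceeds the saving, while a single high volume workload that can reuse an existing platform comes out as build.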
The broader managed versus direct analysis is in the direct versus managed comparison, and the credit meter itself is covered in the Copilot Credits pillar.
Most large estates land on both, and the sequence matters. Start where the build is cheapest to justify and expand from there.
The mistake at both extremes is treating this as a single decision for the whole estate. Going all in on credits leaves a large, avoidable premium on your highest volume workloads. Going all in on direct forces you to build and run a platform for workloads that would have been cheaper to buy. The discipline is to keep both paths open, price each significant workload on its own merits, and revisit the split as your volume and your platform maturity change. A workload that is a buy today can become a build next year once volume grows.
Run this per workload before you concentrate spend on credits.
Per unit of inference, Copilot Credits cost more than Claude direct, often by several times. A credit bundles the model with the runtime, retrieval, tools, identity, and governance, so it bills well above the raw token rate. On Claude direct at Anthropic list rates you pay only for the inference.
For 200 users running about 27 million credits a year, the work costs roughly 270,000 dollars through Microsoft credits against about 21,000 to 35,000 dollars on Claude direct at Anthropic rates. The gap is roughly a quarter of a million dollars a year on raw inference.
Credits can still be the right choice because they buy the whole managed agent, not just the model. To match Cowork on direct Claude you build and run orchestration, retrieval, governance, and identity yourself. With no platform team and modest volume, the credit premium is cheaper than the build.
Batch processing and prompt caching are Anthropic optimizations on the direct API. Batch processing runs work at about half price, and prompt caching can cut repeated input cost by up to 90 percent. They stack, so the effective direct rate can fall well below the prepay floor on credits, where the discount caps at 20 percent.
When Claude runs through Microsoft Foundry or Copilot, Microsoft handles identity and billing while Anthropic processes the prompts. That is two processors of your data rather than one. It matters for data protection mapping and is a point to cover in your security review.
You can run both paths, and most large estates should. Run high volume, repeatable workloads on direct Claude with batch and caching, and keep credits for the turnkey Microsoft 365 workflows that are not worth building. Decide per workload rather than picking one path for everything.