AWS Bedrock prices generative AI inference on three commercial shapes: on demand token rates for the standard catalog, Provisioned Throughput Units (PTU) for committed capacity, and model customization for fine tuned variants. This is the buyer side reference for procurement and CIO leaders carrying Bedrock spend in 2026.
Each shape carries different cost dynamics. On demand pays per token consumed. PTU pays for reserved throughput at a fixed monthly fee. Fine tuning carries a one off training charge plus a higher inference rate.
The Bedrock catalog hosts foundation models from multiple vendors under a single AWS contract. Each model carries its own pricing.
Run the use case through three to five models on the same prompts and compare quality, latency, and cost. The cheapest model is rarely the best fit. The most capable model is rarely the cheapest. The right pick depends on the workload.
On demand pricing charges per million tokens consumed. Each model carries a separate input and output token rate.
| Model family | Input USD per M tokens | Output USD per M tokens | Use case |
|---|---|---|---|
| Premium reasoning models | 3.00 to 15.00 | 15.00 to 75.00 | Long context analysis, agent workflows |
| Standard chat models | 0.80 to 3.00 | 4.00 to 15.00 | General chat, summarization |
| Lightweight models | 0.20 to 0.80 | 1.00 to 4.00 | Classification, retrieval, routing |
| Open weight models | 0.30 to 1.50 | 0.60 to 3.00 | Self hosted or replicated workloads |
| Embedding models | 0.02 to 0.20 | n/a | RAG retrieval |
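As a rough illustration of how the separate input and output rates combine, the sketch below prices a hypothetical monthly workload against one row of the table. The workload volumes and the names `TokenRate` and `on_demand_cost` are assumptions made for illustration, and the rate is the top of the standard chat band, not a quoted AWS price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRate:
    """On demand rate in USD per million tokens (illustrative values only)."""
    input_usd_per_m: float
    output_usd_per_m: float


def on_demand_cost(rate: TokenRate, input_tokens: int, output_tokens: int) -> float:
    """Return the on demand charge in USD for the given token counts."""
    input_cost = (input_tokens / 1_000_000) * rate.input_usd_per_m
    output_cost = (output_tokens / 1_000_000) * rate.output_usd_per_m
    return input_cost + output_cost


# Assumption: top of the "Standard chat models" band from the table above.
standard_chat = TokenRate(input_usd_per_m=3.00, output_usd_per_m=15.00)

# Assumption: a hypothetical month of 200M input tokens and 50M output tokens.
monthly_cost = on_demand_cost(standard_chat, 200_000_000, 50_000_000)
print(f"Monthly on demand cost: USD {monthly_cost:,.2f}")
```

In this example the output tokens are a fifth of the volume but more than half of the bill, which is why the input and output rates need to be modeled separately.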
PTU pricing reserves model capacity at a fixed monthly fee. It suits workloads with predictable throughput requirements or latency sensitivity.
PTU wins when the workload runs consistent throughput at scale and the on demand cost would exceed the PTU reservation cost. The breakeven sits around forty percent utilization of the reserved capacity for most models.
PTU economics only work for sustained throughput at scale. Most enterprise Bedrock workloads burst on demand and sit idle most hours of the day. On demand pricing wins for those workloads. PTU is a tool for production inference at high volume.
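The breakeven logic can be sketched as a simple ratio: the PTU fee divided by what the same reserved capacity would cost on demand if it ran fully loaded. The figures below are hypothetical and were chosen so the result lands on the forty percent mark quoted above. The function names are illustrative, and a real comparison would use the customer's own PTU quote and token rates.

```python
def breakeven_utilization(ptu_monthly_fee_usd: float,
                          on_demand_cost_at_full_use_usd: float) -> float:
    """Fraction of reserved capacity at which PTU and on demand cost the same."""
    if on_demand_cost_at_full_use_usd <= 0:
        raise ValueError("on demand cost at full use must be positive")
    return ptu_monthly_fee_usd / on_demand_cost_at_full_use_usd


def cheaper_shape(utilization: float, breakeven: float) -> str:
    """Pick the cheaper commercial shape for a given average utilization."""
    return "ptu" if utilization > breakeven else "on demand"


# Assumption: hypothetical monthly PTU fee and the on demand bill for the
# same token volume if the reservation ran at 100 percent utilization.
ptu_fee = 20_000.0
full_use_on_demand = 50_000.0

breakeven = breakeven_utilization(ptu_fee, full_use_on_demand)
bursty = cheaper_shape(0.15, breakeven)     # workload idle most of the day
sustained = cheaper_shape(0.70, breakeven)  # steady production inference

print(f"Breakeven utilization: {breakeven:.0%}")
print(f"Bursty workload: {bursty}, sustained workload: {sustained}")
```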
Bedrock supports model fine tuning for select models. Fine tuning carries a one off training charge plus a higher per token inference rate on the customized model.
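To make the two cost components concrete, the sketch below spreads a one off training charge over an assumed amortization period and adds inference at the higher customized rate, then compares the result with the base model. All figures, the single blended per token rate, and the function names are assumptions for illustration, not AWS prices.

```python
def fine_tuned_monthly_cost(training_charge_usd: float,
                            amortization_months: int,
                            monthly_tokens_m: float,
                            customized_rate_usd_per_m: float) -> float:
    """Amortized training charge plus inference on the customized model."""
    if amortization_months <= 0:
        raise ValueError("amortization_months must be positive")
    amortized_training = training_charge_usd / amortization_months
    inference = monthly_tokens_m * customized_rate_usd_per_m
    return amortized_training + inference


def base_monthly_cost(monthly_tokens_m: float, base_rate_usd_per_m: float) -> float:
    """Inference on the uncustomized base model at a blended rate."""
    return monthly_tokens_m * base_rate_usd_per_m


# Assumptions: hypothetical training charge, 12 month amortization,
# 100M tokens per month, and blended rates in USD per million tokens.
tuned = fine_tuned_monthly_cost(12_000.0, 12, 100.0, 6.00)
base = base_monthly_cost(100.0, 4.00)
premium = tuned - base

print(f"Fine tuned: USD {tuned:,.2f} per month")
print(f"Base model: USD {base:,.2f} per month")
print(f"Monthly premium for customization: USD {premium:,.2f}")
```

The premium is the amount the quality gain from customization has to justify each month.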
Bedrock economics live in the prompt design, not the contract. A team that cuts output token counts in half saves real money on the EDP commit. A team that fine tunes when retrieval would do pays twice for the same answer quality.
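A quick way to see the effect of output length is to reprice the same workload with half the output tokens. The sketch below uses hypothetical volumes and the top of the standard chat band from the rate table. The numbers are illustrative, not a benchmark.

```python
def monthly_cost(input_tokens_m: float, output_tokens_m: float,
                 input_rate: float, output_rate: float) -> float:
    """On demand cost in USD; token volumes in millions, rates per million."""
    return input_tokens_m * input_rate + output_tokens_m * output_rate


# Assumption: 200M input and 50M output tokens per month at 3.00 / 15.00.
before = monthly_cost(200, 50, 3.00, 15.00)

# Same prompts, but responses constrained to half the output length.
after = monthly_cost(200, 25, 3.00, 15.00)

saving_usd = before - after
saving_share = saving_usd / before

print(f"Before: USD {before:,.2f}, after: USD {after:,.2f}")
print(f"Saving: USD {saving_usd:,.2f} ({saving_share:.1%} of the bill)")
```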
AWS Bedrock spend counts toward the Enterprise Discount Program commit on most EDP contracts. The customer can negotiate a Bedrock specific discount layer on top of the broader EDP discount.
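The effect of a layered discount can be sketched as follows. This assumes the Bedrock specific discount is applied multiplicatively to the rate that remains after the EDP discount. Actual contracts may define the stacking differently, and the percentages and the `effective_rate` name are hypothetical.

```python
def effective_rate(list_rate_usd_per_m: float,
                   edp_discount: float,
                   bedrock_discount: float) -> float:
    """Rate after an EDP discount and a Bedrock layer, stacked multiplicatively.

    Assumption: the Bedrock layer applies to the already discounted rate.
    """
    for discount in (edp_discount, bedrock_discount):
        if not 0.0 <= discount < 1.0:
            raise ValueError("discounts must be fractions in [0, 1)")
    return list_rate_usd_per_m * (1 - edp_discount) * (1 - bedrock_discount)


# Assumptions: list rate of USD 3.00 per million input tokens,
# a 10 percent EDP discount, and a 5 percent Bedrock layer.
list_rate = 3.00
rate = effective_rate(list_rate, edp_discount=0.10, bedrock_discount=0.05)
combined_discount = 1 - rate / list_rate

print(f"Effective rate: USD {rate:.3f} per million tokens")
print(f"Combined discount: {combined_discount:.1%}")
```

Under multiplicative stacking the combined discount is slightly less than the sum of the two percentages, which is worth checking against the order form wording.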
AWS EDP contracts run on three to five year terms. The Bedrock pricing position should be reopened at every renewal.
The seven step checklist below is the buyer side starting position before any Bedrock renewal conversation.
Because Bedrock spend counts toward the EDP commit on most contracts, the Bedrock spend forecast should be included in the year on year EDP commit ramp at signature, not added later as an overlay.
PTU pays off only for sustained throughput at scale, when the reserved capacity runs above forty percent utilization.
Bedrock is sometimes cheaper than buying direct from the model vendor. Bedrock token rates for Claude models often match the direct Anthropic API, and the discount layer in an EDP commit can push the Bedrock rate below the direct API rate. The procurement question is whether the AWS contract overlay plus enterprise discount delivers a better total cost than the direct vendor relationship at the customer's scale.
Pin the model list and version retirement schedule in the EDP order form. AWS deprecates older Bedrock model versions on a rolling schedule. Without explicit version language, the customer can find a production workload running on a deprecated model that AWS plans to retire on three months' notice.
Fine tuning is rarely the right first option. Retrieval augmented generation and few shot prompting solve most quality problems at lower cost and lower lock in. Fine tuning makes sense when the workload requires very specific style, format, or domain knowledge that prompting cannot achieve. Run the RAG and prompting path first.
Redress runs AWS Bedrock advisory inside the Vendor Shield subscription and the Renewal Program. Every engagement is led by a former AWS commercial executive on the buyer side and supported by the Bedrock benchmark we maintain across recent EDP renewals at similar scale and workload profile.