Cloud infrastructure representing Amazon Bedrock foundation model hosting
Amazon Bedrock 2026

Amazon Bedrock pricing what it really costs in 2026.

Every model on Bedrock carries its own token rate. Here is how the pricing works, when each buying mode wins, and the levers that cut your spend.

Contact Us AWS Practice
500+Enterprise clients
$2B+Under advisory
Industry Recognized
500+ Enterprise Clients
$2B+ Under Advisory
11 Vendor Practices
100% Buyer Side Independent

Amazon Bedrock prices each foundation model on its own token rate, with two buying modes. The mode you pick decides whether you pay for what you use or for capacity you reserve.

Key takeaways

  • Bedrock charges per input and output token, priced separately by model.
  • On demand pricing bills only the tokens you actually process.
  • Provisioned throughput reserves model capacity for a fixed hourly fee.
  • Each foundation model on Bedrock has its own token rate.
  • Bedrock spend can be folded into an AWS EDP commitment.
  • Prompt and output size discipline is the largest controllable cost lever.

This guide is for cloud architects and FinOps teams budgeting Amazon Bedrock. Read it with the AWS Bedrock pricing guide and the AWS EDP pillar.

Bedrock looks like a single service, but every model on it carries its own price. Budgeting means knowing the model rates and the two buying modes, then matching the mode to the workload.

How is Amazon Bedrock priced in 2026?

Bedrock charges for model inference by the token, with input and output tokens often priced at different rates. Each foundation model sets its own rate, so the bill depends on which models you call.

What is a token and why does it matter?

A token is a chunk of text, roughly a few characters. You pay per thousand or per million tokens, split between input and output. The Amazon Bedrock pricing page lists the current per model rates.

  • Input tokens: the text you send to the model.
  • Output tokens: the text the model returns, often priced higher.
  • Per model: each foundation model has its own rate.

What are the two buying modes?

On demand bills the tokens you process with no commitment. Provisioned throughput reserves dedicated capacity for an hourly fee. The right mode depends on whether your traffic is steady or spiky.

When does on demand pricing make sense?

On demand is the default. You pay only for the tokens you process, which suits variable or early stage workloads where volume is hard to predict.

Which workloads suit on demand?

Bursty, low volume, or experimental workloads fit on demand. You avoid paying for idle capacity and the cost tracks usage directly, which keeps early projects cheap to run.

Where does on demand get expensive?

At steady high volume, on demand token costs can exceed the cost of reserved capacity. A workload that runs constantly is usually cheaper on provisioned throughput once volume is predictable.

Bedrock buying modes compared

Mode Billing Best for
On demandPer token usedVariable or early workloads
Provisioned throughputHourly per model unitSteady high volume
EDP foldedCounts to commitmentPredictable enterprise spend
Bedrock looks like one service, but every model carries its own token rate. The bill follows the models you call and the prompts you send.

How does provisioned throughput pricing work?

Provisioned throughput reserves model units for a committed period at an hourly rate. It trades flexibility for a lower effective cost per token at scale.

What are model units?

A model unit is a block of guaranteed throughput for a specific model. You pay for the unit by the hour whether you use the full capacity or not, so it rewards steady, high utilization.

  • Reserved capacity: guaranteed throughput per model unit.
  • Hourly fee: charged regardless of utilization.
  • Commitment terms: longer terms lower the hourly rate.

Which workloads justify provisioned capacity?

High volume, latency sensitive, production workloads justify provisioned throughput. Once a model runs near capacity for most of the day, the reserved rate beats on demand token pricing.

What buyer side moves cut Bedrock cost?

The levers are model choice, prompt discipline, mode selection, and EDP placement. Most of them sit with the engineering team, not the AWS account manager.

Why is prompt size the biggest lever?

Every token costs money. Trimming bloated prompts, capping output length, and caching repeated context cut token volume directly. This is the largest lever a team controls without touching a contract.

How does the AWS EDP affect Bedrock?

Bedrock spend counts toward an AWS Enterprise Discount Program commitment. Folding predictable Bedrock volume into the EDP can improve the overall discount, so model the spend before the EDP is sized.

Suggested reading

What to do next

  1. List the foundation models your workloads call on Bedrock.
  2. Pull the current per model token rates for input and output.
  3. Classify each workload as variable or steady high volume.
  4. Put variable workloads on on demand and steady ones on provisioned throughput.
  5. Audit prompts and output limits to cut token volume at the source.
  6. Model predictable Bedrock spend against your AWS EDP commitment.
  7. Set monthly token monitoring per model and per workload.

Frequently asked questions

How is Amazon Bedrock priced?

Amazon Bedrock charges per token for model inference, with input and output tokens often priced at different rates. Each foundation model sets its own rate, so the bill depends on which models you call and how much text you process.

What is the difference between on demand and provisioned throughput?

On demand bills only the tokens you process with no commitment, while provisioned throughput reserves dedicated model capacity for an hourly fee. On demand suits variable workloads; provisioned suits steady high volume.

What is a model unit in Bedrock?

A model unit is a block of guaranteed throughput for a specific model under provisioned throughput. You pay for it by the hour regardless of utilization, so it rewards steady, high use workloads.

Can Bedrock spend count toward an AWS EDP?

Yes, Bedrock spend counts toward an AWS Enterprise Discount Program commitment. Folding predictable Bedrock volume into the EDP can improve the overall discount, so model the spend before sizing the commitment.

What is the biggest lever to cut Bedrock cost?

Prompt and output discipline. Trimming bloated prompts, capping output length, and caching repeated context cut token volume directly, which is the largest lever a team controls without touching a contract.

Is on demand or provisioned cheaper?

It depends on volume. On demand is cheaper for variable or low volume workloads, while provisioned throughput is cheaper once a model runs near capacity for most of the day at predictable, steady volume.

AWS Bedrock Licensing Guide

The full aws bedrock licensing guide framework from the AWS Practice.

Amazon Bedrock pricing models, EDP commit interaction, model unit economics, and the buyer side moves across the full AWS estate.

Used across more than five hundred enterprise engagements. Independent. Buyer side. Built for procurement leaders running the next renewal cycle.

No spam. We will only email you about this download. Privacy.
Run the software spend health check on your AWS estate in under five minutes.
Open the Tool →
500+
Enterprise Clients
$2B+
Under Advisory
11
Vendor Practices
30+
AWS Engagements
100%
Buyer Side

The biggest Bedrock saving is rarely in the contract. It is in the prompts your engineers send every second.

Fredrik Filipsson
Co Founder and Group CEO, ex Oracle, IBM, SAP
Deep Library

More on this topic.

AWS Practice →
AWS Bedrock pricing guide
AWS
AWS Bedrock pricing guide
The deeper reference on Bedrock model rates and modes.
9 min read
Bedrock versus Azure OpenAI
AWS
Bedrock versus Azure OpenAI
How the two enterprise model platforms compare on cost.
9 min read
AWS EDP comprehensive pillar
AWS
AWS EDP comprehensive pillar
How the Enterprise Discount Program shapes AWS spend.
12 min read
Editorial boardroom interior

The advisor your vendors do not want.

500+ enterprise clients. 11 vendor practices. Industry recognized. One conversation can change what you pay for the next three years.

AWS cost intelligence, monthly.

Bedrock token math, EDP levers, and FinOps moves that hold spend down. No noise.