Bedrock prices per token, per provisioned unit, and per model provider. The bill is controllable, but only with the right structure. Here is the buyer side map.
AWS Bedrock bills per token on demand, per model unit when provisioned, and through your existing AWS agreement, which makes it the AI spend that is easiest to govern and easiest to ignore.
Bedrock charges for model inference per 1,000 input and output tokens on demand, with rates set per model provider and size. There is no platform subscription; the meter is the contract. The full rate card sits on the AWS Bedrock pricing page and changes as providers reprice.
Output tokens typically cost several times input tokens, and long context workloads multiply both sides of the meter. Prompt design is a procurement concern here, not just an engineering one.
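The asymmetry between input and output rates, and the effect of long context, can be made concrete with a small estimator. This is a minimal sketch: the `Rate` type, the `on_demand_cost` function, and the rate values are all hypothetical placeholders, not figures from the AWS rate card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """On-demand price in USD per 1,000 tokens (placeholder values only)."""
    input_per_1k: float
    output_per_1k: float


def on_demand_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Return the on-demand cost in USD of one request."""
    input_cost = (input_tokens / 1000) * rate.input_per_1k
    output_cost = (output_tokens / 1000) * rate.output_per_1k
    return input_cost + output_cost


# Hypothetical rate with output priced at five times input. NOT a real AWS price.
EXAMPLE_RATE = Rate(input_per_1k=0.003, output_per_1k=0.015)

# Same answer length, but the second request carries a long context window.
short_prompt = on_demand_cost(EXAMPLE_RATE, input_tokens=500, output_tokens=300)
long_context = on_demand_cost(EXAMPLE_RATE, input_tokens=50_000, output_tokens=300)

print(f"short prompt: ${short_prompt:.4f}  long context: ${long_context:.4f}")
```

With these placeholder rates the long context request costs roughly 25 times the short one for an identical answer, which is why context length belongs in the procurement conversation.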
Provisioned throughput wins when production load is steady, measured, and latency sensitive; it loses everywhere else because model units bill by the hour regardless of use. The breakeven sits where the tokens a model unit actually serves in an hour would cost as much on demand as the unit's hourly charge; below that sustained utilization, on demand is cheaper.
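The breakeven can be sketched as simple arithmetic. The function names, the hourly unit price, the blended on-demand rate, and the unit capacity below are all assumptions chosen for illustration; substitute measured figures and the current rate card.

```python
def break_even_tokens_per_hour(unit_hourly_usd: float,
                               blended_on_demand_per_1k: float) -> float:
    """Tokens per hour at which one model unit costs the same as on demand.

    blended_on_demand_per_1k is the workload's average on-demand price per
    1,000 tokens, weighted by its real input/output mix.
    """
    if blended_on_demand_per_1k <= 0:
        raise ValueError("blended on-demand rate must be positive")
    return unit_hourly_usd / blended_on_demand_per_1k * 1000


def break_even_utilization(unit_hourly_usd: float,
                           blended_on_demand_per_1k: float,
                           unit_capacity_tokens_per_hour: float) -> float:
    """Fraction of a unit's capacity that must be used, sustained, to break even.

    A result above 1.0 means the unit can never pay for itself at these prices.
    """
    tokens = break_even_tokens_per_hour(unit_hourly_usd, blended_on_demand_per_1k)
    return tokens / unit_capacity_tokens_per_hour


# Hypothetical inputs, not AWS prices or published throughput figures.
tokens_needed = break_even_tokens_per_hour(40.0, 0.008)
utilization_needed = break_even_utilization(40.0, 0.008, 8_000_000)

print(f"break-even: {tokens_needed:,.0f} tokens/hour "
      f"({utilization_needed:.1%} sustained utilization)")
```

The useful output is the utilization figure: compare it with measured, sustained load over a full business cycle, not with peak load.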
Bedrock pricing modes, buyer view
| Mode | Billing basis | Best for | Risk |
|---|---|---|---|
| On demand | Per 1,000 tokens | Variable and unproven load | Cost spikes without caps |
| Batch | Per token, discounted | Asynchronous processing | None beyond latency |
| Provisioned | Per model unit hour | Steady production inference | Idle units bill anyway |
Run new workloads on demand with budget alerts for a full business cycle before buying provisioned capacity. In our engagement file, the estates that inverted this order carried 25 to 40 percent of their provisioned units idle.
Bedrock consumption counts toward AWS commit programs, which cuts both ways. It helps retire an existing commit, and it gives AWS a growth story to anchor a larger renewal ask. The Bedrock platform is strategic for AWS, and that makes your AI roadmap negotiating currency.
The standard advice is to negotiate a bigger AWS discount to bring AI costs down. We disagree. In roughly 8 of the 10 to 14 AWS AI engagements we advised in 2024 to 2025, model routing moved 30 to 60 percent of the bill while incremental discount moved single digits. The buyer side move is an internal model routing standard, smallest sufficient model per task class, enforced in the application layer. No discount percentage survives comparison with not sending the tokens at all.
Source: Redress Compliance advisory engagement file, 2024 to 2025.
The cheapest token is the one a smaller model handled. Routing policy is the only AI discount that compounds.
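A routing standard of this kind can be sketched as a lookup enforced in the application layer. The task classes, tier names, and the `resolve_model` function below are hypothetical; a real standard would map to actual Bedrock model identifiers and include an approved exception process.

```python
from enum import Enum
from typing import Optional


class TaskClass(Enum):
    CLASSIFICATION = "classification"
    EXTRACTION = "extraction"
    SUMMARIZATION = "summarization"
    REASONING = "reasoning"


# Hypothetical tiers, cheapest first. Replace with real model identifiers.
MODEL_TIERS = ["small-model", "medium-model", "large-model"]

# The standard: smallest sufficient model per task class (illustrative mapping).
ROUTING_STANDARD = {
    TaskClass.CLASSIFICATION: "small-model",
    TaskClass.EXTRACTION: "small-model",
    TaskClass.SUMMARIZATION: "medium-model",
    TaskClass.REASONING: "large-model",
}


def resolve_model(task: TaskClass, requested: Optional[str] = None) -> str:
    """Return the model a call may use under the routing standard.

    Callers may request a cheaper tier than the standard allows, but a request
    for a more expensive tier is silently capped at the standard.
    """
    allowed = ROUTING_STANDARD[task]
    if requested is None:
        return allowed
    if MODEL_TIERS.index(requested) <= MODEL_TIERS.index(allowed):
        return requested
    return allowed


print(resolve_model(TaskClass.CLASSIFICATION, requested="large-model"))
```

Capping rather than rejecting keeps applications running while the standard is rolled out; a stricter policy could raise an error and log the attempted upgrade instead.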
The commitment terms sit in the provisioned throughput documentation and the Bedrock FAQ, including the term options and their discount steps. Read them before the account team models the purchase; the no commitment option is the one their proposal rarely leads with.
Five levers, in order of impact: model routing, prompt and context discipline, batch conversion, caching, and only then commercial structure. The first four are engineering policies with procurement consequences; the fifth is where the EDP and private pricing land.
The moves below turn the Bedrock meter into a governed, negotiable cost line.
Bedrock has no license or subscription; it bills for usage, per 1,000 tokens on demand, discounted for batch, or per model unit hour for provisioned throughput. Costs ride your existing AWS agreement and count toward commit programs.
Rates vary by an order of magnitude across model providers and sizes, with output tokens costing several times input tokens. The current rate card on the AWS Bedrock pricing page is the only reliable reference because providers reprice frequently.
Bedrock consumption does count toward an EDP: it retires commit like other AWS service spend, which makes measured AI growth projections useful currency in commit negotiations and renewals.
Buy provisioned throughput only after a full business cycle of measured production load that is steady and latency sensitive. Model units bill hourly whether used or not, and early buyers in our file carried 25 to 40 percent idle capacity.
The controls that contain Bedrock spend are budget alerts and hard caps per application, a routing standard that assigns the smallest sufficient model per task, and batch pricing for asynchronous work. Routing alone cut 30 to 60 percent in the deployments we benchmarked.
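A per-application alert and hard cap can be sketched in the application layer as follows. The `AppBudget` class, its thresholds, and the dollar figures are hypothetical, and the state here lives in process memory only; a production control would persist spend and would normally sit alongside account-level tooling such as AWS Budgets.

```python
class BudgetExceeded(Exception):
    """Raised when a call would push an application past its hard cap."""


class AppBudget:
    """Tracks one application's monthly spend against an alert level and a cap."""

    def __init__(self, monthly_cap_usd: float, alert_fraction: float = 0.8):
        self.monthly_cap_usd = monthly_cap_usd
        self.alert_threshold_usd = monthly_cap_usd * alert_fraction
        self.spent_usd = 0.0

    def authorize(self, estimated_cost_usd: float) -> None:
        """Refuse the call before tokens are sent if it would breach the cap."""
        if self.spent_usd + estimated_cost_usd > self.monthly_cap_usd:
            raise BudgetExceeded(
                f"cap {self.monthly_cap_usd:.2f} USD would be exceeded"
            )

    def record(self, actual_cost_usd: float) -> bool:
        """Add metered spend; return True once the alert threshold is reached."""
        self.spent_usd += actual_cost_usd
        return self.spent_usd >= self.alert_threshold_usd


# Hypothetical figures: 100 USD monthly cap, alert at 80 percent.
budget = AppBudget(monthly_cap_usd=100.0)
budget.authorize(50.0)
first_alert = budget.record(50.0)    # 50 USD spent, below the alert level
second_alert = budget.record(35.0)   # 85 USD spent, alert level reached
```

Checking the cap before the request is sent is the point of the design: an alert reports spend after the fact, while the cap prevents it.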
Whether Bedrock is cheaper than contracting directly with a model provider depends on the model and volume; direct provider contracts can undercut Bedrock at scale, while Bedrock wins on EDP integration and operational simplicity. Price both against your routed workload mix before committing either way.
Bedrock is the easiest AI spend to govern because the meter is honest. It is also the easiest to ignore until the invoice is not.