AWS Bedrock licensing, tokens, throughput, and the EDP.

Bedrock prices per token, per provisioned unit, and per model provider. The bill is controllable, but only with the right structure. Here is the buyer side map.

AWS Bedrock bills per token on demand, per model unit when provisioned, and through your existing AWS agreement, which makes it the AI spend that is easiest to govern and easiest to ignore.

Key takeaways

  • Three pricing modes: on demand per token, batch at roughly half the on demand rate, and provisioned throughput per model unit.
  • The model choice is the price: rates vary by an order of magnitude across model providers and sizes on the same task.
  • Bedrock spend feeds the EDP: consumption counts toward AWS commit programs, which makes it negotiating currency.
  • Provisioned is a commitment: model units bill by the hour whether used or not; buy them against measured load only.
  • Batch is the free discount: roughly half price for any workload that tolerates asynchronous processing.
  • Routing is the lever: matching each task to the cheapest sufficient model cuts 30 to 60 percent off naive deployments.

How does AWS Bedrock pricing actually work?

Bedrock charges for model inference per 1,000 input and output tokens on demand, with rates set per model provider and size. There is no platform subscription; the meter is the contract. The full rate card sits on the AWS Bedrock pricing page and changes as providers reprice.

Output tokens typically cost several times input tokens, and long context workloads multiply both sides of the meter. Prompt design is a procurement concern here, not just an engineering one.

  • On demand: per token, no commitment, the right default while load is unmeasured.
  • Batch: roughly half the on demand rate for asynchronous jobs.
  • Provisioned throughput: hourly model units for steady, latency sensitive production load.
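
A minimal sketch of the on demand and batch arithmetic described above. The rates and token volumes are hypothetical placeholders, not figures from the AWS rate card, and the 50 percent batch discount is the rough figure quoted in this article rather than a guaranteed term.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """Hypothetical per-model rate, in USD per 1,000 tokens."""
    input_per_1k: float
    output_per_1k: float


def on_demand_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """On demand spend: input and output tokens are metered separately."""
    return (input_tokens / 1000) * rate.input_per_1k + (
        output_tokens / 1000
    ) * rate.output_per_1k


def batch_cost(rate: Rate, input_tokens: int, output_tokens: int,
               batch_discount: float = 0.5) -> float:
    """Batch spend, assuming a flat discount off the on demand rate."""
    return on_demand_cost(rate, input_tokens, output_tokens) * (1 - batch_discount)


# Assumed example: output priced at five times input, one month of volume.
EXAMPLE_RATE = Rate(input_per_1k=0.003, output_per_1k=0.015)
monthly_on_demand = on_demand_cost(EXAMPLE_RATE, 200_000_000, 40_000_000)
monthly_batch = batch_cost(EXAMPLE_RATE, 200_000_000, 40_000_000)

print(f"on demand: ${monthly_on_demand:,.2f}  batch: ${monthly_batch:,.2f}")
```

With these assumed rates, the 40 million output tokens cost as much as the 200 million input tokens, which is why output length belongs in the cost review alongside prompt length.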

When does provisioned throughput beat on demand?

Provisioned throughput wins when production load is steady, measured, and latency sensitive; it loses everywhere else because model units bill by the hour regardless of use. The breakeven sits where the tokens a model unit actually serves in an hour would cost more on demand than the unit's hourly rate.
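
The breakeven can be sketched as a single division. The hourly unit price and the blended on demand rate below are hypothetical, and the sketch ignores the throughput ceiling of a model unit: load above that ceiling needs additional units, each with its own breakeven.

```python
def breakeven_tokens_per_hour(unit_hourly_usd: float,
                              on_demand_usd_per_1k: float) -> float:
    """Tokens per hour at which on demand spend equals one unit's hourly price."""
    if unit_hourly_usd < 0 or on_demand_usd_per_1k <= 0:
        raise ValueError("unit price must be >= 0 and on demand rate must be > 0")
    return unit_hourly_usd / on_demand_usd_per_1k * 1000


def provisioned_is_cheaper(measured_tokens_per_hour: float,
                           unit_hourly_usd: float,
                           on_demand_usd_per_1k: float) -> bool:
    """True when sustained measured load clears the breakeven volume."""
    threshold = breakeven_tokens_per_hour(unit_hourly_usd, on_demand_usd_per_1k)
    return measured_tokens_per_hour > threshold


# Assumed figures for illustration only.
UNIT_HOURLY_USD = 40.0   # hypothetical model unit price per hour
BLENDED_PER_1K = 0.005   # hypothetical blended input/output on demand rate
threshold = breakeven_tokens_per_hour(UNIT_HOURLY_USD, BLENDED_PER_1K)

print(f"breakeven: {threshold:,.0f} tokens per hour")
```

The measured figure to feed in is the sustained average across the whole billing period, not the peak hour, because the unit bills through the quiet hours too.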

Bedrock pricing modes, buyer view

Mode         | Billing basis          | Best for                     | Risk
On demand    | Per 1,000 tokens       | Variable and unproven load   | Cost spikes without caps
Batch        | Per token, discounted  | Asynchronous processing      | None beyond latency
Provisioned  | Per model unit hour    | Steady production inference  | Idle units bill anyway

Measure before you commit

Run new workloads on demand with budget alerts for a full business cycle before buying provisioned capacity. The estates that inverted this order carried 25 to 40 percent idle provisioned units in our file.
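
One way to turn a measured cycle into a sizing check is to replay the hourly token counts against a candidate number of units and read off the idle share. This is a sketch under assumptions: the per-unit hourly capacity is a hypothetical input you would take from your own load tests, not a published figure.

```python
from typing import Sequence


def idle_share(hourly_tokens: Sequence[float],
               unit_capacity_per_hour: float,
               units: int) -> float:
    """Fraction of purchased capacity left unused over the measured period."""
    if not hourly_tokens or unit_capacity_per_hour <= 0 or units <= 0:
        raise ValueError("need measurements, positive capacity, and at least one unit")
    hourly_capacity = unit_capacity_per_hour * units
    # Load above capacity cannot be served by these units, so cap each hour.
    used = sum(min(tokens, hourly_capacity) for tokens in hourly_tokens)
    return 1 - used / (hourly_capacity * len(hourly_tokens))


# Assumed four-hour sample; a real review would use a full business cycle.
sample_hours = [100.0, 50.0, 0.0, 50.0]
share = idle_share(sample_hours, unit_capacity_per_hour=100.0, units=1)

print(f"idle share: {share:.0%}")
```

Running the same measurements against one, two, and three units shows how quickly idle share climbs when capacity is sized to the peak hour.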

How does Bedrock interact with an AWS EDP?

Bedrock consumption counts toward AWS commit programs, which cuts both ways. It helps retire an existing commit, and it gives AWS a growth story to anchor a larger renewal ask. The Bedrock platform is strategic for AWS, and that makes your AI roadmap negotiating currency.

  • Inside an EDP: Bedrock spend retires commit at your discounted effective rate.
  • At renewal: bring measured AI projections, not vendor enthusiasm, into the commit sizing.
  • Private pricing: sustained Bedrock volume justifies service specific terms beyond the EDP percentage.
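
A rough sketch of how a measured Bedrock projection feeds commit sizing. It assumes, as the first bullet states, that spend retires commit at the discounted rate; check how your own agreement counts spend before relying on it. All figures are hypothetical.

```python
from typing import Iterable


def commit_remaining(commit_usd: float,
                     monthly_list_spend_usd: Iterable[float],
                     edp_discount: float) -> float:
    """Commit left after projected spend, counted net of the EDP discount."""
    if not 0 <= edp_discount < 1:
        raise ValueError("edp_discount must be in [0, 1)")
    retired = sum(spend * (1 - edp_discount) for spend in monthly_list_spend_usd)
    return max(commit_usd - retired, 0.0)


# Assumed projection: flat monthly spend at list price for one year.
projection = [100.0] * 12
shortfall = commit_remaining(1000.0, projection, edp_discount=0.5)

print(f"commit still unretired: ${shortfall:,.2f}")
```

A nonzero result means the projected workload does not retire the commit on its own, which is the number to have in hand before the commit is sized upward.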

Where the common advice on Bedrock costs is wrong

The standard advice is to negotiate a bigger AWS discount to bring AI costs down. We disagree. In roughly 8 of the 10 to 14 AWS AI engagements we advised on in 2024 to 2025, model routing moved 30 to 60 percent of the bill while incremental discount moved single digits. The buyer side move is an internal model routing standard: the smallest sufficient model per task class, enforced in the application layer. No discount percentage survives comparison with not sending the tokens at all.
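
The comparison can be sketched with a two-class workload. The model names, rates, token volumes, and the five point incremental discount are all hypothetical; the point is the shape of the arithmetic, not the specific result.

```python
from typing import Dict


def blended_cost(ktokens_by_class: Dict[str, float],
                 rate_per_1k_by_model: Dict[str, float],
                 routing: Dict[str, str]) -> float:
    """Total spend when each task class is sent to its assigned model."""
    return sum(
        ktokens * rate_per_1k_by_model[routing[task_class]]
        for task_class, ktokens in ktokens_by_class.items()
    )


# Assumed blended rates in USD per 1,000 tokens.
RATES = {"frontier": 0.015, "small": 0.001}
# Assumed monthly volume, in thousands of tokens, per task class.
WORKLOAD = {"extraction": 500_000, "reasoning": 500_000}

NAIVE = {"extraction": "frontier", "reasoning": "frontier"}
ROUTED = {"extraction": "small", "reasoning": "frontier"}

naive_cost = blended_cost(WORKLOAD, RATES, NAIVE)
routed_cost = blended_cost(WORKLOAD, RATES, ROUTED)
discounted_naive_cost = naive_cost * (1 - 0.05)  # five more discount points

routing_saving = 1 - routed_cost / naive_cost
discount_saving = 1 - discounted_naive_cost / naive_cost

print(f"routing saves {routing_saving:.0%}, extra discount saves {discount_saving:.0%}")
```

The two levers also stack: any negotiated discount applies to the routed bill, so routing first does not give up the commercial conversation.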

Token meters reward engineering discipline the way power meters reward insulation: the cheapest optimization is always the request you stopped sending.

  • 30 to 60%: overspend from unrouted frontier model use
  • 25 to 40%: idle share of prematurely provisioned units
  • ~50%: batch discount vs on demand rates

Source: Redress Compliance advisory engagement file, 2024 to 2025.

The cheapest token is the one a smaller model handled. Routing policy is the only AI discount that compounds.

Where does AWS document the commitment mechanics?

The commitment terms sit in the provisioned throughput documentation and the Bedrock FAQ, including the term options and their discount steps. Read them before the account team models the purchase; the no commitment option is the one the account team's model rarely leads with.

What levers actually cut Bedrock spend?

Five levers, in order of impact: model routing, prompt and context discipline, batch conversion, caching, and only then commercial structure. The first four are engineering policies with procurement consequences; the fifth is where the EDP and private pricing land.

  • Routing standard: classify tasks and assign the smallest sufficient model per class.
  • Context discipline: trim prompts and retrieval payloads; long context multiplies the meter.
  • Batch conversion: move every asynchronous workload to batch pricing.
  • Caching: deduplicate repeated inference at the application layer.
  • Commercial structure: EDP integration, budget caps, and private pricing on sustained volume.
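
The routing standard and the caching lever can be sketched together in the application layer. Everything named here is hypothetical: the task classes, the model labels, and the `invoke` callable, which stands in for whatever client function your application uses to call a model. Caching identical requests is only appropriate where a repeated answer is acceptable.

```python
import hashlib
from typing import Callable, Dict

# Hypothetical routing standard: smallest sufficient model per task class.
ROUTING_STANDARD: Dict[str, str] = {
    "classification": "small-model",
    "summarization": "mid-model",
    "reasoning": "frontier-model",
}


class RoutedClient:
    """Applies the routing standard and deduplicates repeated requests."""

    def __init__(self, invoke: Callable[[str, str], str]) -> None:
        self._invoke = invoke            # (model, prompt) -> completion text
        self._cache: Dict[str, str] = {}
        self.calls_sent = 0              # requests that actually hit the meter

    def complete(self, task_class: str, prompt: str) -> str:
        if task_class not in ROUTING_STANDARD:
            # Unclassified work is rejected rather than defaulted to the largest model.
            raise KeyError(f"no routing rule for task class: {task_class}")
        model = ROUTING_STANDARD[task_class]
        key = hashlib.sha256(f"{model}\x00{prompt}".encode("utf-8")).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._invoke(model, prompt)
            self.calls_sent += 1
        return self._cache[key]


def fake_invoke(model: str, prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the sketch."""
    return f"[{model}] {prompt[:20]}"


client = RoutedClient(fake_invoke)
first = client.complete("classification", "Is this invoice overdue?")
second = client.complete("classification", "Is this invoice overdue?")
print(first, client.calls_sent)
```

Refusing unclassified task classes is a deliberate choice in this sketch: a silent fallback to the frontier model is how unrouted spend returns.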

What to do next

The moves below turn the Bedrock meter into a governed, negotiable cost line.

A sequence you can run this quarter

  1. Tag all Bedrock usage by application and task class in cost allocation reports.
  2. Run a routing review: which task classes can drop to smaller or cheaper models.
  3. Convert every asynchronous workload to batch processing.
  4. Set budget alerts and hard caps per application before scaling anything.
  5. Hold provisioned throughput purchases until a full cycle of measured load exists.
  6. Bring measured AI projections into the next EDP conversation as commit currency.
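
Steps 1 and 4 can be sketched as a rollup of tagged usage records checked against per application caps. The record fields, application names, caps, and the 80 percent alert threshold are assumptions for illustration; real records would come from your cost allocation reports.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def spend_by_application(records: Iterable[Mapping[str, object]]) -> Dict[str, float]:
    """Sum cost per application tag across usage records."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[str(record["application"])] += float(record["cost_usd"])
    return dict(totals)


def budget_status(totals: Mapping[str, float],
                  caps: Mapping[str, float],
                  alert_at: float = 0.8) -> Dict[str, str]:
    """Classify each application as ok, alert, cap reached, or uncapped."""
    status: Dict[str, str] = {}
    for app, spent in totals.items():
        cap = caps.get(app)
        if cap is None:
            status[app] = "no cap set"
        elif spent >= cap:
            status[app] = "cap reached"
        elif spent >= alert_at * cap:
            status[app] = "alert"
        else:
            status[app] = "ok"
    return status


# Hypothetical tagged usage records.
usage = [
    {"application": "support-bot", "task_class": "classification", "cost_usd": 450.0},
    {"application": "support-bot", "task_class": "reasoning", "cost_usd": 400.0},
    {"application": "doc-search", "task_class": "summarization", "cost_usd": 100.0},
    {"application": "pilot", "task_class": "reasoning", "cost_usd": 30.0},
]
caps = {"support-bot": 1000.0, "doc-search": 500.0}

totals = spend_by_application(usage)
status = budget_status(totals, caps)
print(status)
```

The "no cap set" state is worth surfacing on its own, since uncapped pilots are where spend tends to scale before anyone has measured it.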

Frequently asked questions

How is AWS Bedrock licensed?

Bedrock has no license or subscription; it bills for usage: per 1,000 tokens on demand, at a discount for batch, or per model unit hour for provisioned throughput. Costs ride your existing AWS agreement and count toward commit programs.

What does Bedrock cost per token?

Rates vary by an order of magnitude across model providers and sizes, with output tokens costing several times input tokens. The current rate card on the AWS Bedrock pricing page is the only reliable reference because providers reprice frequently.

Does Bedrock usage count toward an AWS EDP?

Yes. Bedrock consumption retires EDP commit like other AWS service spend, which makes measured AI growth projections useful currency in commit negotiations and renewals.

When should you buy provisioned throughput on Bedrock?

Only after a full business cycle of measured production load that is steady and latency sensitive. Model units bill hourly whether used or not, and early buyers in our file carried 25 to 40 percent idle capacity.

How do you stop Bedrock costs from running away?

Budget alerts and hard caps per application, a routing standard that assigns the smallest sufficient model per task, and batch pricing for asynchronous work. Routing alone cut 30 to 60 percent in the deployments we benchmarked.

Is Bedrock cheaper than going direct to model providers?

It depends on the model and volume; direct provider contracts can undercut Bedrock at scale, while Bedrock wins on EDP integration and operational simplicity. Price both against your routed workload mix before committing either way.

Bedrock is the easiest AI spend to govern because the meter is honest. It is also the easiest to ignore until the invoice is not.

Fredrik Filipsson
Co Founder and Group CEO. Ex Oracle, IBM, SAP.