The 99.9 covers availability only. Latency, retirement risk, and support severity are bought separately, and all three are negotiable.
Azure OpenAI ships a 99.9 percent availability SLA that says nothing about latency, throughput, or model quality. The terms that protect a production AI workload live in provisioned throughput, support tiers, and the contract you negotiate around the SLA.
The Azure OpenAI service carries a 99.9 percent availability SLA, and availability is the entire scope. The online services SLA terms define the metric, the credit ladder, and the claim mechanics.
Latency, throughput, token generation speed, and model output quality sit outside the SLA. For a production AI feature, those are usually the failure modes that matter.
What the SLA covers versus what production needs
| Risk | In the SLA? | Where protection actually comes from |
|---|---|---|
| Service unavailable | Yes, 99.9 percent | Credit claim after breach |
| Slow responses, latency spikes | No | Provisioned throughput units |
| Throttling at high load | No | PTU reservation sizing |
| Model quality regression | No | Deployment pinning and eval gates |
| Model retirement | No | Migration planning, contract notice terms |
A breach pays out as service credits against future spend, and only after you file a claim with evidence inside the claim window. Build the monitoring evidence trail before the incident, not after.
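One way to keep that evidence trail is a synthetic probe log that can be summarized per billing month. The sketch below is a minimal illustration, not Microsoft's official uptime formula: the `Probe` record, its fields, and the simple success ratio are all assumptions, and the SLA terms define the calculation that governs an actual claim.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Probe:
    """One synthetic request against the deployment (hypothetical record shape)."""
    timestamp: datetime   # UTC time the probe was sent
    ok: bool              # True if the service returned a successful response
    status_code: int      # HTTP status observed by the probe
    request_id: str       # correlation ID kept so support can trace the call


def availability_percent(probes):
    """Share of successful probes, as a percentage (assumed proxy for uptime)."""
    if not probes:
        raise ValueError("no probes recorded for this period")
    successes = sum(1 for p in probes if p.ok)
    return 100.0 * successes / len(probes)


def failed_probes(probes):
    """Failed probes in time order, ready to attach to a claim."""
    return sorted((p for p in probes if not p.ok), key=lambda p: p.timestamp)


# Example: 1,000 probes in the period, two of them failed.
start = datetime(2025, 1, 1, tzinfo=timezone.utc)
log = [Probe(start, True, 200, f"req-{i}") for i in range(998)]
log += [Probe(start, False, 503, f"req-fail-{i}") for i in range(2)]

observed = availability_percent(log)
below_sla = observed < 99.9   # 99.9 is the SLA figure quoted in this article
```

Storing the raw probe records, not just the monthly percentage, is the point: the claim needs timestamps and request identifiers, and those cannot be reconstructed after the incident.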
Provisioned throughput is the latency instrument. Provisioned throughput units reserve model processing capacity with consistent latency, where standard pay as you go shares capacity and absorbs the noisy neighbor problem.
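Because latency sits outside the SLA, any comparison between standard and provisioned deployments has to come from your own measurements. A minimal sketch, assuming you already record per-request latency in milliseconds; the function names and the nearest-rank percentile method are illustrative choices, not anything Azure prescribes.

```python
def percentile(samples_ms, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not samples_ms:
        raise ValueError("no latency samples")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples_ms)
    # Integer ceiling of pct/100 * n, which avoids float rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_summary(samples_ms):
    """Median, tail, and worst-case latency for one deployment."""
    return {
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
        "max": max(samples_ms),
    }


# Example: latencies of 1..100 ms, one sample each.
summary = latency_summary(list(range(1, 101)))
```

Running the same summary over a standard deployment and a provisioned one during peak hours gives a like-for-like view of the tail latency the reservation is meant to stabilize.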
Target above 60 percent sustained utilization before adding capacity. The 35 to 55 percent utilization we found in our reviews means a third to half of reserved AI capacity was paid headroom.
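The 60 percent rule can be expressed as a simple gate on a utilization history. This is a sketch under stated assumptions: "sustained" is taken to mean every one of the last seven samples exceeds the threshold, the samples are fractions between 0 and 1, and `sustained_above` is a hypothetical helper, not part of any Azure SDK.

```python
from typing import Sequence


def sustained_above(utilization: Sequence[float],
                    threshold: float = 0.60,
                    window: int = 7) -> bool:
    """True only if each of the last `window` samples exceeds the threshold."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(utilization) < window:
        return False  # not enough history to call the load sustained
    return all(u > threshold for u in utilization[-window:])


# Example: daily PTU utilization for two hypothetical deployments.
underused = [0.40, 0.45, 0.50, 0.52, 0.48, 0.55, 0.51]
saturating = [0.62, 0.65, 0.70, 0.68, 0.66, 0.71, 0.64]

add_capacity_underused = sustained_above(underused)
add_capacity_saturating = sustained_above(saturating)
```

The window length and sampling interval are choices to tune against your own traffic pattern; a deployment with sharp daily peaks may need hourly samples, not daily ones.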
The SLA is not support. Severity response times, escalation paths, and engineering access come from your Azure support plan, purchased separately or wrapped into a Unified agreement.
Azure OpenAI is an Azure service under Azure support terms. A Microsoft 365 support relationship does not carry severity commitments for your Azure AI deployment.
The SLA itself rarely moves, but the commercial frame around it does. In our 2024 to 2025 reviews, PTU reservation pricing, term flexibility, and migration support moved 15 to 25 percent of committed AI spend.
Azure OpenAI consumption draws down a Microsoft Azure Consumption Commitment like any Azure service, which makes the AI line part of your negotiated discount fabric instead of a side purchase.
The standard advice is to scrutinize the SLA percentage and negotiate it upward. We disagree. In roughly 12 to 18 Azure OpenAI commitment reviews Fredrik Filipsson ran in 2024 to 2025, not one production incident that caused business damage was an availability breach; they were latency and throughput degradations the SLA does not cover at any percentage. The buyer side move is to spend the negotiation capital on PTU reservation pricing, term flexibility, and retirement notice terms, and to treat the 99.9 number as marketing furniture. A nine never paid for a slow checkout.
Three cuts of our advisory engagement file frame the size of the opportunity.
Source: Redress Compliance advisory engagement file, 2024 to 2025.
Five moves turn this analysis into a lower invoice on the next renewal.
A 99.9 percent availability SLA with service credits as the remedy. It covers whether the service responds, not how fast it responds or how well the model performs, so latency and quality risk need separate instruments.
Latency, throttling, and token generation speed are out of SLA scope at any tier. Provisioned throughput units are the mechanism for predictable performance, and they are bought, not promised.
File a claim through Azure support within the claim window defined in the online services SLA terms, with your own monitoring evidence of the breach. Credits offset future spend; they are never paid proactively.
PTU pricing is negotiable. Monthly and yearly PTU reservations discount steeply against hourly rates, and reservation pricing moved 15 to 25 percent in commitment reviews we advised in 2024 to 2025 when tied to broader Azure growth.
Azure OpenAI consumption draws down a Microsoft Azure Consumption Commitment like any Azure service, so route the AI line through the MACC to earn your negotiated discount structure.
Deployments on retired versions are forced to migrate on Microsoft's lifecycle schedule, and the SLA pays nothing for the migration work. Negotiate notice periods and migration assistance into the agreement before committing.
The SLA tells you when Microsoft owes you an apology. The PTU reservation tells you when your customers get an answer. Fund the second.