Negotiating Google Cloud CUDs: Securing Maximum Savings Without Overexposure
Google Cloud's Committed Use Discounts offer significant compute savings — but the commitment framework is more rigid than its AWS and Azure equivalents. This paper delivers the negotiation strategy, discount mapping, and purchasing methodology to maximise CUD savings while protecting against stranded spend.
Executive Summary
Google Cloud Platform's Committed Use Discount programme is the primary instrument for reducing compute costs below on-demand rates — offering discounts of 28–70% depending on commitment type, term, and resource category. For enterprises spending $1M+ annually on GCP compute, the difference between an optimised CUD portfolio and an unmanaged one is measured in hundreds of thousands of dollars per year. Yet GCP's commitment framework carries specific rigidities that AWS's Reserved Instances and Azure's Reserved VM Instances do not — making the purchasing decision higher stakes and the overcommitment risk more consequential.
The core challenge is that GCP CUDs commit you to specific resource quantities — vCPUs, memory, GPUs — in specific regions for 1 or 3 years, with no marketplace for resale, no exchange mechanism, and no early termination. Unlike AWS (where Standard RIs can be sold on the Marketplace and Convertible RIs can be exchanged) and Azure (where reservations can be exchanged or refunded with limited penalty), GCP CUDs are genuinely locked once purchased. This rigidity demands more precise sizing, more conservative purchasing, and more sophisticated portfolio management than the equivalent decision on AWS or Azure.
GCP's resource-based CUDs are materially more rigid than AWS Reserved Instances and Azure Reserved VM Instances — with no secondary market, no exchange mechanism, and no refunds.
This rigidity makes the commitment decision higher stakes. Over-commit and you pay for stranded resources with no exit. Under-commit and you leave savings on the table. The optimal portfolio requires more precise utilisation analysis and more conservative sizing than equivalent AWS or Azure commitments.
Spend-based CUDs — Google's answer to AWS Compute Savings Plans — offer broader flexibility but shallower discounts, and their application mechanics create unexpected gaps.
Spend-based CUDs commit to a dollar-per-hour spend level across eligible services. They're more flexible than resource-based CUDs but apply to a narrower service set than AWS Compute Savings Plans (which cover EC2, Fargate, and Lambda). Understanding which GCP services are and aren't covered is essential to avoiding coverage gaps.
Google's Sustained Use Discounts (SUDs) provide automatic savings for steady-state workloads — but they interact with CUDs in ways that most FinOps teams don't model, potentially reducing the incremental value of CUD purchases.
SUDs provide up to a 30% automatic discount for instances that run for a significant portion of the billing month. Usage covered by a CUD does not also earn a SUD — the committed rate replaces the SUD-discounted rate — so a CUD's incremental value must be measured against the SUD baseline the workload already receives. Modelling the interaction correctly is the difference between a CUD that delivers 40% incremental savings and one that delivers 15%.
GPU and AI accelerator CUDs carry the deepest discounts (up to 70%) but also the highest stranding risk — AI workload patterns are evolving faster than any 1-year commitment can accommodate.
A100 and H100 GPU CUDs offer compelling unit economics, but committing to specific GPU types for 1–3 years in a market where TPU v5, H200, and next-generation accelerators are launching continuously creates significant technology obsolescence risk.
GCP's commercial negotiation flexibility on CUDs is greater than most enterprises realise — particularly within the context of a broader spend-based commitment agreement.
Google Cloud's enterprise sales organisation has authority to offer custom CUD terms that aren't available through self-service purchasing: enhanced discount tiers, CUD commitment flexibility provisions, cross-project application, and custom term structures. These terms are only available through commercial negotiation — not the console.
Google Cloud's Commitment Architecture
Google Cloud offers three distinct commitment mechanisms, each with different scope, flexibility, and discount characteristics. Understanding the architecture of each — and how they interact with Google's automatic Sustained Use Discounts — is essential to building an optimal commitment portfolio.
Resource-based CUDs commit to specific quantities of vCPUs, memory, GPUs, or local SSD in a specific region for 1 or 3 years. They offer the deepest discounts — up to 57% for general-purpose compute (3-year) and up to 70% for GPUs (3-year). The commitment is to a resource quantity, not a specific machine type — providing some flexibility to change instance sizes within the committed resource pool.
Resource-based CUDs cannot be cancelled, exchanged, or resold. Once purchased, you pay for the committed resources regardless of utilisation. They are scoped to a single region and apply automatically to running instances in the billing project or across projects via shared CUD scope.
Best for: Stable, predictable compute workloads with high confidence in region, resource type, and capacity for the commitment term. Production databases, core application tiers, and steady-state infrastructure.
Spend-based CUDs commit to a minimum hourly spend level on eligible GCP services for 1 or 3 years. The discount is applied automatically against qualifying consumption. Unlike resource-based CUDs, spend-based commitments are not locked to specific resource types, instances, or regions — they apply across a defined set of eligible services.
Discount depth is lower than resource-based CUDs (typically 20–40% depending on spend level and term). Eligible services include Compute Engine, Cloud SQL, GKE, and select other services — but notably do not include BigQuery, Cloud Run, or many managed services, creating coverage gaps that must be understood before committing.
Best for: The flexible base layer of your commitment portfolio — covering compute spend that is predictable in aggregate but where specific resource composition may shift. Also suitable for organisations early in GCP adoption or with evolving architectures.
Sustained Use Discounts are automatic, incremental discounts applied to Compute Engine instances that run for more than 25% of the billing month. At 100% monthly usage, SUDs reach approximately 30% on N1-series instances and 20% on N2, N2D, and C2 instances. SUDs apply automatically with no commitment — they're Google's mechanism for rewarding steady-state usage without requiring a purchasing decision.
Critically, SUDs and CUDs interact in a specific way: usage covered by a CUD is billed at the committed rate and is not eligible for SUDs, while usage above the committed level on SUD-eligible series continues to earn SUDs automatically. A CUD's true incremental value is therefore the gap between the committed rate and the SUD-discounted rate the workload already receives, and that interaction must be modelled precisely to calculate the real value of a CUD purchase.
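A minimal sketch of the incremental-value calculation, assuming the configuration where the CUD rate replaces the SUD on covered usage. All rates below are hypothetical placeholders, not Google's published prices:

```python
# Illustrative model of CUD incremental value over the SUD baseline.
# Rates are hypothetical placeholders, not Google's published prices.

def effective_rate(on_demand: float, sud_discount: float, cud_discount: float,
                   cud_covered: bool) -> float:
    """Effective hourly rate for one unit of usage.

    Usage covered by a CUD is billed at the committed rate and earns no SUD;
    uncovered usage on SUD-eligible series earns the automatic SUD instead.
    """
    if cud_covered:
        return on_demand * (1 - cud_discount)
    return on_demand * (1 - sud_discount)

on_demand = 0.04  # hypothetical $/vCPU-hour
sud = 0.30        # automatic discount the workload already earns (N1-class)
cud = 0.37        # 1-year resource-based CUD headline discount

baseline = effective_rate(on_demand, sud, cud, cud_covered=False)
with_cud = effective_rate(on_demand, sud, cud, cud_covered=True)
incremental = 1 - with_cud / baseline
print(f"headline CUD discount {cud:.0%}, incremental vs SUD baseline {incremental:.1%}")
```

With these illustrative numbers, a 37% headline discount delivers only about 10% incremental savings for a workload already earning the full SUD — exactly the modelling gap the text above warns about.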
Important: SUDs are being phased out for newer machine series (C3, C3D, H3, and newer). For workloads on these series, CUDs or spend-based commitments become more critical because there's no automatic discount floor. This transition is the single most important change in GCP's commitment economics.
The SUD Phase-Out: A Structural Change
Google is progressively removing Sustained Use Discounts from newer machine series. N1 and N2 instances retain SUDs; C3, C3D, H3, N4, and future series do not. This means that workloads migrating to newer, more performant machine series lose the automatic 30% discount floor — making the gap between on-demand and committed pricing significantly larger. For enterprises planning machine series upgrades, CUD purchasing becomes not just an optimisation opportunity but a necessity to avoid a material cost increase when moving off SUD-eligible instances.
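The magnitude of the jump is simple arithmetic. Both on-demand rates below are hypothetical placeholders, and the comparison ignores per-vCPU performance differences between series:

```python
# Illustrative arithmetic for the SUD phase-out cost jump.
# Rates are hypothetical placeholders, not Google's published prices.

n1_od = 0.032   # hypothetical N1 on-demand $/vCPU-hour (SUD-eligible)
c3_od = 0.034   # hypothetical C3 on-demand $/vCPU-hour (no SUD)
sud = 0.30      # automatic discount a full-month N1 workload earns

n1_effective = n1_od * (1 - sud)        # what the workload pays today
increase = c3_od / n1_effective - 1     # uncommitted C3 vs SUD-discounted N1
print(f"moving to C3 without a CUD: {increase:+.0%} effective cost change")

# A 1-year CUD on the C3 (illustrative 37% discount) closes the gap:
c3_cud_rate = c3_od * (1 - 0.37)
print(f"C3 with 1-year CUD vs N1 with SUD: {c3_cud_rate / n1_effective - 1:+.0%}")
```

The point is structural rather than numeric: without a CUD, the migration surrenders the automatic discount floor entirely, so the commitment decision must be made as part of the migration plan, not after it.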
CUD Discount Tiers: Mapping Savings by Resource Type
CUD discounts vary significantly by resource type, commitment term, and machine series. The following breakdown maps the discount landscape and identifies where the deepest savings and highest risks concentrate.
| Resource Category | 1-Year CUD Discount | 3-Year CUD Discount | SUD Available? | Stranding Risk |
|---|---|---|---|---|
| General-Purpose (N2, N2D) | 28–32% | 52–57% | Yes — stacks with CUD | Low — stable demand, broad applicability |
| General-Purpose (C3, C3D, N4) | 28–35% | 52–60% | No — SUDs phased out | Low-Medium — no SUD floor increases CUD importance |
| Memory-Optimised (M2, M3) | 28–32% | 52–55% | Varies by series | Medium — workload-specific, less fungible |
| Compute-Optimised (C2, H3) | 28–35% | 52–60% | C2: Yes / H3: No | Medium — performance-specific workloads |
| NVIDIA A100 GPU | 50–55% | 63–70% | No | High — AI hardware evolving rapidly |
| NVIDIA H100 GPU | 45–52% | 60–67% | No | Very High — next-gen GPUs arriving |
| TPU v4/v5 | Negotiated | Negotiated | No | Very High — TPU generations evolve rapidly |
| Local SSD | 28–32% | 52–57% | N/A | Low — tied to instance commitment |
The GPU CUD Paradox
GPU CUDs present the most compelling discount depth (up to 70%) but also the highest stranding risk. Enterprise AI workloads are evolving faster than any previous technology cycle — the GPU you commit to today (A100 or H100) may be supplanted by a newer, more cost-effective accelerator (H200, B100, or future TPU generations) within the commitment term. A 3-year A100 CUD that becomes stranded after 18 months still bills for the remaining 18 months at the committed rate, and once the foregone price-performance of the newer hardware is counted, the purchase can deliver negative ROI despite the 70% headline discount.
The resolution is term management: GPU CUDs should almost always be 1-year terms (despite the lower discount) unless the enterprise has exceptionally high confidence in both the workload stability and the GPU architecture for the full 3-year period. For most AI workloads in 2025–2026, that confidence doesn't exist.
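The term arithmetic can be made concrete. The discounts below are the headline figures from the table above; the monthly on-demand figure is a hypothetical illustration:

```python
# Break-even and stranding arithmetic behind GPU CUD term choice.

def breakeven_months(term_months: int, discount: float) -> float:
    """Months of actual use after which a commitment beats pure on-demand.

    Commitment cost = rate * (1 - discount) * term; on-demand cost for m
    months of use = rate * m. Setting them equal, the rate cancels out.
    """
    return term_months * (1 - discount)

def stranded_exposure(term_months: int, discount: float,
                      months_used: int, monthly_on_demand: float) -> float:
    """Dollars paid for committed months after the workload has migrated."""
    stranded_months = max(term_months - months_used, 0)
    return monthly_on_demand * (1 - discount) * stranded_months

print(breakeven_months(36, 0.70))   # 3-year GPU CUD: ~10.8 months of use
print(breakeven_months(12, 0.50))   # 1-year GPU CUD: ~6 months of use

# Hypothetical A100 capacity at $15,000/month on-demand, workload migrates
# at month 18: the 3-year term leaves 18 committed months stranded (~$81K).
print(stranded_exposure(36, 0.70, 18, 15_000.0))
```

The break-even favours the 3-year term, but the downside is asymmetric: the 1-year term caps stranded exposure at the remainder of a single year, which is why term management, not discount depth, drives the GPU decision.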
Negotiated CUD Terms
For enterprises with significant GCP spend ($5M+ annually), Google's enterprise sales team can offer custom CUD terms that aren't available through self-service: deeper discounts at higher commitment volumes, shorter minimum terms for GPU CUDs, CUD commitment flexibility provisions (ability to downgrade by a percentage with notice), and cross-region CUD application. These negotiated terms materially change the risk-reward calculus — particularly for GPU commitments. Always negotiate CUD terms as part of the broader GCP commercial agreement rather than purchasing through the console.
GCP CUDs vs. AWS & Azure: Comparative Analysis
Understanding how GCP's commitment architecture compares to AWS and Azure is essential for multi-cloud enterprises and for benchmarking GCP's commercial flexibility. The differences are significant and affect purchasing strategy directly.
| Attribute | GCP Resource-Based CUD | AWS Standard RI | Azure Reserved VM |
|---|---|---|---|
| Maximum Discount (3-year) | Up to 70% (GPU) / 57% (compute) | Up to 72% | Up to 72% |
| Commitment Basis | Resources (vCPU, memory, GPU) | Instance type (size-flexible within family) | VM series and size |
| Cancellation / Early Exit | None — fully locked | No cancellation, but Marketplace resale | Exchange, or refund (capped at $50K per rolling 12 months) |
| Exchange Mechanism | None | Convertible RI exchange (equal or greater value) | Exchange for different VM size, series, or region |
| Secondary Market | None | RI Marketplace — sell unused RIs | None (but exchange provides similar flexibility) |
| Flexible Alternative | Spend-based CUDs (dollar commitment) | Compute Savings Plans (dollar commitment) | Azure Savings Plans (dollar commitment) |
| Automatic Discount | SUDs (being phased out for new series) | None | None |
| Service Coverage (flexible) | Compute Engine, Cloud SQL, GKE (limited) | EC2, Fargate, Lambda | Compute, App Service, Functions |
The key takeaway: GCP's commitment framework is the most rigid of the three hyperscalers. The absence of a marketplace, exchange mechanism, and cancellation option means that every CUD purchase must be sized with more precision and purchased with more confidence than the equivalent commitment on AWS or Azure. The consequence of over-commitment on GCP is irrecoverable stranded spend; on AWS and Azure, there are escape mechanisms that partially mitigate the same mistake.
This rigidity has a second-order effect on purchasing strategy: GCP CUD portfolios should be sized more conservatively than AWS or Azure equivalents — targeting 65–75% of stable capacity rather than the 75–85% that's appropriate for AWS. The residual on-demand spend is higher, but the stranding risk is materially lower. The lost discount on the conservatively-uncommitted 10–15% is typically less costly than stranding the over-purchased 10–15%, which has no recovery path on GCP.
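One way to see why GCP warrants lower coverage is marginal break-even arithmetic. The `recovery` parameter below is an assumption standing in for the fraction of a stranded commitment recoverable via resale or exchange (near zero on GCP, partial on AWS); it is not a published figure:

```python
# Marginal break-even: commit one more unit of capacity only if the
# probability that demand for it holds exceeds this threshold.

def confidence_threshold(discount: float, recovery: float) -> float:
    """P(demand holds) above which committing a marginal unit pays off.

    Expected gain if demand holds: the discount d per on-demand dollar.
    Expected loss if it drops: the committed rate (1 - d), less whatever
    a resale or exchange recovers. Break-even where the two are equal.
    """
    loss = (1 - discount) * (1 - recovery)
    return loss / (discount + loss)

gcp = confidence_threshold(0.57, recovery=0.0)   # no exit mechanism
aws = confidence_threshold(0.57, recovery=0.5)   # assumed partial resale value
print(f"GCP: commit only where P(hold) > {gcp:.0%}; AWS: > {aws:.0%}")
```

At the same discount level, GCP's zero-recovery position demands meaningfully higher confidence per committed unit than AWS's, which is precisely why the optimal coverage band sits lower.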
"On AWS, an over-commitment is a recoverable error. On GCP, it's a sunk cost. Size your GCP CUD portfolio for the workload you're certain about — not the workload you hope for."
— Redress Compliance, Cloud & FinOps Practice
Overcommitment Protection: Terms That Matter
Because GCP CUDs lack the exit mechanisms available on AWS and Azure, contractual protections negotiated at the agreement level become the primary defence against overcommitment. The following provisions should be pursued in every enterprise GCP commercial negotiation.
CUD Commitment Flexibility Provision
Negotiate the ability to reduce CUD commitments by 15–25% with 90 days' notice. Google's self-service CUDs are fully locked, but enterprise agreements can include flexibility provisions that allow commitment reduction — effectively creating a partial exit mechanism. This provision is the single most valuable protection for managing workload volatility on GCP.
Cross-Region CUD Application
Standard resource-based CUDs are region-locked. If workloads migrate between regions (common during disaster recovery events, performance optimisation, or regulatory changes), the CUD doesn't follow. Negotiate cross-region application provisions that allow CUD credits to apply to equivalent resources in different regions — or at minimum, the ability to transfer CUDs between regions with notice.
Machine Series Migration Protection
CUDs committed to N2 vCPUs don't automatically apply to C3 or N4 vCPUs. When Google releases new machine series with better price-performance (which they do regularly), enterprises face a choice: stay on the committed older series or migrate to the newer series and strand the CUD. Negotiate provisions that allow CUD commitments to migrate to successor machine series at equivalent or better discount terms.
GPU CUD Term Flexibility
Standard GPU CUD terms are 1-year and 3-year. For AI workloads where GPU architecture is evolving rapidly, even 1-year terms carry significant technology risk. Negotiate shorter GPU CUD terms (6-month or quarterly) or provisions that allow GPU CUD conversion to newer GPU types when they become available — preserving the commitment economics while accommodating hardware evolution.
CUD Scope Expansion
By default, CUDs are scoped to the purchasing project or shared across the billing account. For enterprises with complex organisational structures (multiple billing accounts, separate production and development organisations), CUD application scope can create gaps where committed resources in one billing context don't apply to consumption in another. Negotiate organisation-wide CUD scope that allows commitments to apply across all projects and billing accounts under the enterprise's GCP organisation.
The CUD Negotiation Framework
Negotiating GCP CUDs effectively requires combining self-service purchasing decisions with enterprise-level commercial negotiation. The framework below addresses both dimensions — the portfolio construction (what to buy) and the commercial terms (how to protect against overexposure).
Utilisation Baselining
Extract 90 days of Compute Engine utilisation data across all projects and regions. For each resource type (vCPUs, memory, GPUs), calculate the stable floor — the minimum resource consumption maintained consistently across the analysis period. The stable floor is your safe CUD commitment level. Anything above the floor is variable demand that should remain on-demand or be covered by spend-based CUDs.
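A sketch of the stable-floor calculation over synthetic hourly samples; in practice the series would be exported from Cloud Monitoring or the Cloud Billing export rather than generated:

```python
# Stable-floor sketch over 90 days of synthetic hourly vCPU counts:
# a steady base of ~400 vCPUs plus business-hours bursts.
import random

random.seed(7)
samples = [400 + random.randint(-10, 10) + (200 if h % 24 in range(9, 18) else 0)
           for h in range(90 * 24)]

def stable_floor(samples, percentile=5):
    """Low-percentile consumption: the level sustained almost all the time.

    A low percentile rather than the absolute minimum avoids letting a
    single maintenance window drag the commitment level to zero.
    """
    ordered = sorted(samples)
    idx = max(0, int(len(ordered) * percentile / 100) - 1)
    return ordered[idx]

floor = stable_floor(samples)
print(f"safe resource-based CUD level: ~{floor} vCPUs "
      f"(peaks up to {max(samples)} stay on-demand or spend-based)")
```

The burst capacity above the floor is exactly the demand that belongs in the spend-based or on-demand layers, never in a resource-based commitment.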
SUD Interaction Modelling
Model the interaction between existing Sustained Use Discounts and proposed CUDs for every resource category. Calculate the incremental discount value of CUDs above the SUD baseline for SUD-eligible machine series (N1, N2). For non-SUD series (C3, C3D, N4), the full CUD discount represents incremental savings versus on-demand — making CUDs significantly more impactful for these workloads.
Portfolio Construction
Build the CUD portfolio in layers. Layer 1: Spend-based CUDs covering 50–60% of the projected stable compute spend — maximum flexibility, minimum stranding. Layer 2: Resource-based CUDs for high-confidence workloads where the resource type, region, and capacity are locked — maximum discount. Layer 3: On-demand for everything else. Target 65–75% total committed coverage (more conservative than the 75–85% target for AWS).
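The layering above reduces to simple arithmetic. The share parameters below are illustrative mid-points of the ranges given, not prescriptions:

```python
# Three-layer portfolio split: spend-based CUD, resource-based CUD, on-demand.

def build_portfolio(stable_hourly_spend: float, total_hourly_spend: float,
                    spend_based_share: float = 0.55,
                    resource_based_share: float = 0.15) -> dict:
    """Split hourly spend across the three layers described above.

    Shares apply to the *stable* spend; everything above the stable level
    stays on-demand by construction.
    """
    spend_cud = stable_hourly_spend * spend_based_share
    resource_cud = stable_hourly_spend * resource_based_share
    committed = spend_cud + resource_cud
    return {
        "spend_based_cud": spend_cud,
        "resource_based_cud": resource_cud,
        "on_demand": total_hourly_spend - committed,
        "coverage_of_stable": committed / stable_hourly_spend,
    }

portfolio = build_portfolio(stable_hourly_spend=100.0, total_hourly_spend=130.0)
print(portfolio)
```

With these mid-point shares, committed coverage lands at 70% of the stable floor — inside the 65–75% target band — while all variable demand above the floor remains on-demand.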
Negotiate Enterprise Terms
Present the CUD portfolio plan to Google's enterprise sales team as part of the broader GCP commercial discussion. Negotiate the five overcommitment protection provisions described above, custom discount tiers for higher commitment volumes, and integration with spend-based commitment agreements. CUD terms negotiated at the enterprise level are materially better than self-service purchases.
Competitive Benchmarking
Present AWS RI/Savings Plan and Azure Reserved VM pricing for equivalent workloads alongside GCP CUD proposals. The comparative data establishes whether GCP's CUD pricing is competitive — and where it isn't, provides factual leverage for requesting improved terms. Google's pricing flexibility is highest when competitive alternatives are documented and costed.
Implement Governance
Establish monthly CUD portfolio monitoring: commitment utilisation rate, stranding exposure, upcoming expirations, and on-demand spend eligible for new CUDs. Quarterly, review the portfolio against evolving workload patterns and machine series adoption. The governance cadence prevents the silent accumulation of stranded commitments that makes GCP's CUD rigidity costly.
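A minimal sketch of the monthly metrics named above, computed over hypothetical figures:

```python
# Monthly governance metrics for one resource-based CUD.

def cud_metrics(committed_vcpus: float, used_committed_vcpus: float,
                monthly_commitment_cost: float) -> tuple[float, float]:
    """Commitment utilisation rate and stranded spend for the month."""
    utilisation = used_committed_vcpus / committed_vcpus
    stranded = monthly_commitment_cost * (1 - utilisation)
    return utilisation, stranded

util, stranded = cud_metrics(committed_vcpus=500,
                             used_committed_vcpus=430,
                             monthly_commitment_cost=6_000.0)
print(f"utilisation {util:.0%}, stranded ${stranded:,.0f}/month")
```

A sustained utilisation below roughly 90% is an early stranding signal worth escalating: since GCP offers no resale or exchange, the only remediation lever is steering new workloads onto the committed resources before the gap compounds.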
6 Common CUD Mistakes & How to Avoid Them
Committing Based on Current Consumption Without Forecasting
Ignoring the SUD Phase-Out
3-Year GPU CUDs in a 12-Month Innovation Cycle
Purchasing CUDs Through the Console Instead of Enterprise Agreement
100% Resource-Based CUDs, Zero Spend-Based
No Coordination Between CUDs and Spend-Based Agreement
Recommendations: 7 Priority Actions
Size CUD Portfolios More Conservatively Than AWS or Azure
Target 65–75% committed coverage on GCP versus the 75–85% appropriate for AWS. The absence of a marketplace, exchange, and refund mechanism means that over-commitment on GCP is irrecoverable. The cost of under-committing by 10% (paying on-demand rates on the uncommitted portion) is almost always less than the cost of stranding 10% of committed capacity.
Model the SUD Phase-Out Before Committing
Map every workload against SUD eligibility. For workloads on SUD-eligible series (N1, N2), calculate the true incremental CUD value above the SUD floor. For workloads migrating to non-SUD series (C3, C3D, N4), include CUD purchasing as a mandatory component of the migration plan: losing a 30% SUD raises the effective rate by over 40%, so an uncommitted migration is a material cost increase.
Layer Spend-Based and Resource-Based CUDs
Build a layered portfolio: spend-based CUDs as the flexible foundation (50–60% of total commitment), resource-based CUDs for high-confidence stable workloads (20–30%), and on-demand for the remainder. This blended approach trades some discount depth for resilience — the correct trade-off given GCP's rigid commitment framework.
Limit GPU CUDs to 1-Year Terms
The AI hardware cycle is shorter than any CUD term. Until GPU architecture stabilises (which won't happen in the 2025–2027 timeframe), commit to GPUs for 1 year maximum. Negotiate GPU migration provisions through the enterprise agreement so that when successor GPUs launch, your commitment transitions rather than strands.
Negotiate Enterprise CUD Terms — Don't Self-Service
For any CUD commitment above $100K annual value, negotiate through Google's enterprise sales team rather than purchasing through the console. Enterprise-negotiated CUDs deliver deeper discounts, flexibility provisions (commitment reduction, cross-region application, machine series migration), and integration with spend-based agreements that self-service purchasing cannot provide.
Use AWS and Azure Pricing as Negotiation Benchmarks
Present AWS RI/Savings Plan and Azure Reserved VM pricing for equivalent workloads alongside GCP CUD proposals. Google is competing for enterprise cloud share — and their pricing flexibility is highest when competitive alternatives are costed and credible. The benchmark doesn't require a multi-cloud deployment; it requires a multi-cloud pricing comparison.
Implement Monthly CUD Portfolio Monitoring
Track commitment utilisation, stranding exposure, upcoming expirations, and on-demand spend eligible for new CUDs monthly. Because GCP CUDs have no exit mechanism, early detection of stranding risk is the only mitigation available — and it requires continuous monitoring, not periodic review. Integrate CUD monitoring with your broader FinOps governance cadence.
How Redress Can Help
Redress Compliance's Cloud & FinOps Practice provides independent advisory on Google Cloud commitment optimisation — from utilisation analysis and CUD portfolio design through enterprise term negotiation and ongoing governance. We maintain zero commercial relationships with Google, AWS, or Azure.
CUD Portfolio Assessment
Comprehensive audit of existing CUD commitments — utilisation rates, stranding exposure, SUD interaction analysis, and spend-based agreement coordination with specific optimisation recommendations.
Commitment Strategy Design
Data-driven CUD portfolio construction — layered by instrument type, resource category, and term — with financial modelling of savings, stranding risk, and SUD interaction effects.
Enterprise Term Negotiation
Negotiation of enterprise CUD terms — flexibility provisions, cross-region application, machine series migration, GPU term management, and spend-based agreement integration.
Cross-Provider Benchmarking
Comparative pricing analysis across GCP CUDs, AWS RIs/Savings Plans, and Azure Reserved VMs — producing the competitive data that maximises Google's pricing flexibility.
Right-Sizing Analysis
Instance-level right-sizing across your GCP estate — identifying over-provisioned resources that should be resized before CUD purchasing to avoid committing to waste.
Ongoing FinOps Governance
Monthly CUD portfolio monitoring, stranding detection, utilisation reporting, and purchasing recommendations — maintaining optimal commitment coverage as workload patterns evolve.
100% Independent Advisory
Redress maintains zero commercial relationships with Google Cloud, AWS, Azure, or any FinOps tooling vendor. Our only relationship is with you — ensuring our recommendations maximise your cloud savings, not any provider's revenue.
Book a Meeting
Schedule a confidential consultation with our Cloud & FinOps Practice team. We'll review your current GCP commitment portfolio and identify specific opportunities to maximise savings without overexposure.