REDRESSCOMPLIANCE
White Paper — Cloud & FinOps Practice

Negotiating Google Cloud CUDs: Securing Maximum Savings Without Overexposure

Google Cloud's Committed Use Discounts offer significant compute savings — but the commitment framework is more rigid than AWS and Azure equivalents. This paper delivers the negotiation strategy, discount mapping, and purchasing methodology to maximise CUD savings while protecting against stranded spend.

28–70% CUD Discount Range · 3 CUD Types · 6 Common Mistakes · 7 Priority Actions
Section 01

Executive Summary

Google Cloud Platform's Committed Use Discount programme is the primary instrument for reducing compute costs below on-demand rates — offering discounts of 28–70% depending on commitment type, term, and resource category. For enterprises spending $1M+ annually on GCP compute, the difference between an optimised CUD portfolio and an unmanaged one is measured in hundreds of thousands of dollars per year. Yet GCP's commitment framework carries specific rigidities that AWS's Reserved Instances and Azure's Reserved VM Instances do not — making the purchasing decision higher stakes and the overcommitment risk more consequential.

The core challenge is that GCP CUDs commit you to specific resource quantities — vCPUs, memory, GPUs — in specific regions for 1 or 3 years, with no marketplace for resale, no exchange mechanism, and no early termination. Unlike AWS (where Standard RIs can be sold on the Marketplace and Convertible RIs can be exchanged) and Azure (where reservations can be exchanged or refunded with limited penalty), GCP CUDs are genuinely locked once purchased. This rigidity demands more precise sizing, more conservative purchasing, and more sophisticated portfolio management than the equivalent decision on AWS or Azure.

1

GCP's resource-based CUDs are structurally more rigid than AWS Reserved Instances and Azure Reserved VM Instances — with no secondary market, no exchange, and no refund mechanism.

This rigidity makes the commitment decision higher stakes. Over-commit and you pay for stranded resources with no exit. Under-commit and you leave savings on the table. The optimal portfolio requires more precise utilisation analysis and more conservative sizing than equivalent AWS or Azure commitments.

2

Spend-based CUDs — Google's answer to AWS Compute Savings Plans — offer broader flexibility but shallower discounts, and their application mechanics create unexpected gaps.

Spend-based CUDs commit to a dollar-per-hour spend level across eligible services. They're more flexible than resource-based CUDs but apply to a narrower service set than AWS Compute Savings Plans (which cover EC2, Fargate, and Lambda). Understanding which GCP services are and aren't covered is essential to avoiding coverage gaps.

3

Google's Sustained Use Discounts (SUDs) provide automatic savings for steady-state workloads — but they interact with CUDs in ways that most FinOps teams don't model, potentially reducing the incremental value of CUD purchases.

SUDs provide up to 30% automatic discount for instances that run for a significant portion of the billing month. CUDs stack on top of SUDs in some configurations but replace them in others. Modelling the interaction correctly is the difference between a CUD that delivers 40% incremental savings and one that delivers 15%.

4

GPU and AI accelerator CUDs carry the deepest discounts (up to 70%) but also the highest stranding risk — AI workload patterns are evolving faster than any 1-year commitment can accommodate.

A100 and H100 GPU CUDs offer compelling unit economics, but committing to specific GPU types for 1–3 years in a market where TPU v5, H200, and next-generation accelerators are launching continuously creates significant technology obsolescence risk.

5

GCP's commercial negotiation flexibility on CUDs is greater than most enterprises realise — particularly within the context of a broader spend-based commitment agreement.

Google Cloud's enterprise sales organisation has authority to offer custom CUD terms that aren't available through self-service purchasing: enhanced discount tiers, CUD commitment flexibility provisions, cross-project application, and custom term structures. These terms are only available through commercial negotiation — not the console.

Section 02

Google Cloud's Commitment Architecture

Google Cloud offers three distinct commitment mechanisms, each with different scope, flexibility, and discount characteristics. Understanding the architecture of each — and how they interact with Google's automatic Sustained Use Discounts — is essential to building an optimal commitment portfolio.

Resource-Based CUDs
Deepest discount — locked to specific resources

Resource-based CUDs commit to specific quantities of vCPUs, memory, GPUs, or local SSD in a specific region for 1 or 3 years. They offer the deepest discounts — up to 57% for general-purpose compute (3-year) and up to 70% for GPUs (3-year). The commitment is to a resource quantity, not a specific machine type — providing some flexibility to change instance sizes within the committed resource pool.

Resource-based CUDs cannot be cancelled, exchanged, or resold. Once purchased, you pay for the committed resources regardless of utilisation. They are scoped to a single region and apply automatically to running instances in the billing project or across projects via shared CUD scope.

Best for: Stable, predictable compute workloads with high confidence in region, resource type, and capacity for the commitment term. Production databases, core application tiers, and steady-state infrastructure.

Spend-Based CUDs
Broader flexibility — dollar-based commitment

Spend-based CUDs commit to a minimum hourly spend level on eligible GCP services for 1 or 3 years. The discount is applied automatically against qualifying consumption. Unlike resource-based CUDs, spend-based commitments are not locked to specific resource types, instances, or regions — they apply across a defined set of eligible services.

Discount depth is lower than resource-based CUDs (typically 20–40% depending on spend level and term). Eligible services include Compute Engine, Cloud SQL, GKE, and select other services — but notably do not include BigQuery, Cloud Run, or many managed services, creating coverage gaps that must be understood before committing.

Best for: The flexible base layer of your commitment portfolio — covering compute spend that is predictable in aggregate but where specific resource composition may shift. Also suitable for organisations early in GCP adoption or with evolving architectures.

Sustained Use Discounts
Automatic — no commitment required

Sustained Use Discounts are automatic, incremental discounts applied to Compute Engine instances that run for more than 25% of the billing month. At 100% monthly usage, SUDs provide approximately 30% discount on eligible instances. SUDs apply automatically with no commitment — they're Google's mechanism for rewarding steady-state usage without requiring a purchasing decision.

Critically, SUDs interact with CUDs in specific ways: for eligible N1 and N2 machine types, SUDs and CUDs stack (CUD discount is applied to the SUD-discounted rate). For some configurations, CUDs replace SUDs entirely. The interaction must be modelled precisely to calculate the true incremental value of a CUD purchase.
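The interaction described above can be sketched in a few lines of arithmetic. The 30% SUD and 37% CUD rates below are illustrative placeholders, not published GCP pricing:

```python
# Sketch: incremental value of a CUD above the SUD baseline, under both
# interaction modes. Discount rates are illustrative, not GCP price-list values.

def effective_rate(on_demand, sud=0.0, cud=0.0, stacks=True):
    """Hourly rate after discounts. stacks=True applies the CUD to the
    SUD-discounted rate; stacks=False means the CUD replaces the SUD."""
    if stacks:
        return on_demand * (1 - sud) * (1 - cud)
    return on_demand * (1 - cud)

on_demand = 1.00                                    # normalised on-demand rate
sud_only = effective_rate(on_demand, sud=0.30)      # SUD baseline: 0.70

stacked  = effective_rate(on_demand, sud=0.30, cud=0.37, stacks=True)
replaced = effective_rate(on_demand, sud=0.30, cud=0.37, stacks=False)

# Incremental saving versus the SUD-only baseline — the figure that
# actually justifies the purchase, not the headline CUD percentage
incr_stacked  = 1 - stacked / sud_only    # 37% incremental
incr_replaced = 1 - replaced / sud_only   # only 10% incremental
```

The same 37% headline CUD is worth 37% incrementally when it stacks but only 10% when it merely replaces the SUD, which is why the interaction must be modelled per machine series before purchasing.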

Important: SUDs are being phased out for newer machine series (C3, C3D, H3, and newer). For workloads on these series, CUDs or spend-based commitments become more critical because there's no automatic discount floor. This transition is the single most important change in GCP's commitment economics.

The SUD Phase-Out: A Structural Change

Google is progressively removing Sustained Use Discounts from newer machine series. N1 and N2 instances retain SUDs; C3, C3D, H3, N4, and future series do not. This means that workloads migrating to newer, more performant machine series lose the automatic 30% discount floor — making the gap between on-demand and committed pricing significantly larger. For enterprises planning machine series upgrades, CUD purchasing becomes not just an optimisation opportunity but a necessity to avoid a material cost increase when moving off SUD-eligible instances.

Section 03

CUD Discount Tiers: Mapping Savings by Resource Type

CUD discounts vary significantly by resource type, commitment term, and machine series. The following breakdown maps the discount landscape and identifies where the deepest savings and highest risks concentrate.

Resource Category | 1-Year CUD Discount | 3-Year CUD Discount | SUD Available? | Stranding Risk
General-Purpose (N2, N2D) | 28–32% | 52–57% | Yes — stacks with CUD | Low — stable demand, broad applicability
General-Purpose (C3, C3D, N4) | 28–35% | 52–60% | No — SUDs phased out | Low-Medium — no SUD floor increases CUD importance
Memory-Optimised (M2, M3) | 28–32% | 52–55% | Varies by series | Medium — workload-specific, less fungible
Compute-Optimised (C2, H3) | 28–35% | 52–60% | C2: Yes / H3: No | Medium — performance-specific workloads
NVIDIA A100 GPU | 50–55% | 63–70% | No | High — AI hardware evolving rapidly
NVIDIA H100 GPU | 45–52% | 60–67% | No | Very High — next-gen GPUs arriving
TPU v4/v5 | Negotiated | Negotiated | No | Very High — TPU generations evolve rapidly
Local SSD | 28–32% | 52–57% | N/A | Low — tied to instance commitment

The GPU CUD Paradox

GPU CUDs present the most compelling discount depth (up to 70%) but also the highest stranding risk. Enterprise AI workloads are evolving faster than any previous technology cycle — the GPU you commit to today (A100 or H100) may be supplanted by a newer, more cost-effective accelerator (H200, B100, or future TPU generations) within the commitment term. A 3-year A100 CUD purchased today that becomes stranded in 18 months when workloads migrate to H200s delivers negative ROI despite the 70% headline discount.

The resolution is term management: GPU CUDs should almost always be 1-year terms (despite the lower discount) unless the enterprise has exceptionally high confidence in both the workload stability and the GPU architecture for the full 3-year period. For most AI workloads in 2025–2026, that confidence doesn't exist.
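A simple expected-cost model makes the term decision concrete. Every number here (discounts, migration probability, timing) is an illustrative assumption, not GCP pricing:

```python
# Sketch: expected 36-month cost of back-to-back GPU CUDs at two term
# lengths, under an assumed risk that the workload migrates to successor
# hardware mid-term. All inputs are illustrative assumptions.

def expected_cost(term_months, discount, p_migrate, migrate_month=12,
                  horizon=36, on_demand=1.0):
    """Expected cost (in on-demand-rate units) of serving the workload for
    `horizon` months via consecutive CUDs of `term_months` each. On
    migration, the in-flight commitment must still be paid out in full and
    the successor hardware is assumed to run at on-demand rates."""
    committed_rate = on_demand * (1 - discount)
    stay = horizon * committed_rate                            # never migrates
    paid_out = -(-migrate_month // term_months) * term_months  # ceil: months locked in
    go = paid_out * committed_rate + (horizon - migrate_month) * on_demand
    return (1 - p_migrate) * stay + p_migrate * go

# 3-year GPU CUD at 65% vs rolling 1-year CUDs at 52%, with an assumed 60%
# chance the workload moves to a successor GPU at month 12
cost_3yr = expected_cost(term_months=36, discount=0.65, p_migrate=0.6)
cost_1yr = expected_cost(term_months=12, discount=0.52, p_migrate=0.6)
# cost_1yr < cost_3yr: flexibility beats the deeper headline discount
```

With the migration probability set to zero the 3-year term wins comfortably; estimating where the crossover sits for your own workloads is the analysis worth doing before committing.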

Negotiated CUD Terms

For enterprises with significant GCP spend ($5M+ annually), Google's enterprise sales team can offer custom CUD terms that aren't available through self-service: deeper discounts at higher commitment volumes, shorter minimum terms for GPU CUDs, CUD commitment flexibility provisions (ability to downgrade by a percentage with notice), and cross-region CUD application. These negotiated terms materially change the risk-reward calculus — particularly for GPU commitments. Always negotiate CUD terms as part of the broader GCP commercial agreement rather than purchasing through the console.

Section 04

GCP CUDs vs. AWS & Azure: Comparative Analysis

Understanding how GCP's commitment architecture compares to AWS and Azure is essential for multi-cloud enterprises and for benchmarking GCP's commercial flexibility. The differences are significant and affect purchasing strategy directly.

Attribute | GCP Resource-Based CUD | AWS Standard RI | Azure Reserved VM
Maximum Discount (3-year) | Up to 70% (GPU) / 57% (compute) | Up to 72% | Up to 72%
Commitment Basis | Resources (vCPU, memory, GPU) | Instance type (size-flexible within family) | VM series and size
Cancellation / Early Exit | None — fully locked | No cancellation, but Marketplace resale | Exchange or refund (refunds capped at $50K per year)
Exchange Mechanism | None | Convertible RI exchange (equal or greater value) | Exchange for different VM size, series, or region
Secondary Market | None | RI Marketplace — sell unused RIs | None (but exchange provides similar flexibility)
Flexible Alternative | Spend-based CUDs (dollar commitment) | Compute Savings Plans (dollar commitment) | Azure Savings Plans (dollar commitment)
Automatic Discount | SUDs (being phased out for new series) | None | None
Service Coverage (flexible) | Compute Engine, Cloud SQL, GKE (limited) | EC2, Fargate, Lambda | Compute, App Service, Functions

The key takeaway: GCP's commitment framework is the most rigid of the three hyperscalers. The absence of a marketplace, exchange mechanism, and cancellation option means that every CUD purchase must be sized with more precision and purchased with more confidence than the equivalent commitment on AWS or Azure. The consequence of over-commitment on GCP is irrecoverable stranded spend; on AWS and Azure, there are escape mechanisms that partially mitigate the same mistake.

This rigidity has a second-order effect on purchasing strategy: GCP CUD portfolios should be sized more conservatively than AWS or Azure equivalents — targeting 65–75% of stable capacity rather than the 75–85% that's appropriate for AWS. The residual on-demand spend is higher, but the stranding risk is materially lower. The lost discount on the conservatively-uncommitted 10–15% is almost always less costly than stranded commitments on the over-purchased 10–15%.
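A toy model illustrates the asymmetry. The discount level and the three equally likely demand scenarios are assumptions chosen only to show the mechanics:

```python
# Toy model: irrevocable commitments under uncertain demand. The 35% discount
# and the demand scenarios are illustrative assumptions, not GCP pricing.

def annual_cost(committed, demand, discount=0.35, on_demand=1.0):
    """Committed units are paid for regardless of use (no exit mechanism);
    demand above the commitment runs at on-demand rates."""
    return (committed * on_demand * (1 - discount)
            + max(demand - committed, 0) * on_demand)

scenarios = [60.0, 80.0, 120.0]   # possible demand outcomes vs a 100-unit forecast

def expected_cost(committed):
    return sum(annual_cost(committed, d) for d in scenarios) / len(scenarios)

aggressive   = expected_cost(85.0)   # ~85% of forecast committed
conservative = expected_cost(70.0)   # ~70% of forecast committed
# conservative < aggressive: each stranded unit costs 0.65, while each
# uncommitted-but-used unit only forfeits the 0.35 discount
```

The asymmetry follows directly from the discount depth: at anything below a 50% discount, a stranded committed unit costs more than the discount forgone on an uncommitted one, so downside demand scenarios punish aggressive coverage harder than upside scenarios reward it.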

"On AWS, an over-commitment is a recoverable error. On GCP, it's a sunk cost. Size your GCP CUD portfolio for the workload you're certain about — not the workload you hope for."
Redress Compliance — Cloud & FinOps Practice
Section 05

Overcommitment Protection: Terms That Matter

Because GCP CUDs lack the exit mechanisms available on AWS and Azure, contractual protections negotiated at the agreement level become the primary defence against overcommitment. The following provisions should be pursued in every enterprise GCP commercial negotiation.

01

CUD Commitment Flexibility Provision

Negotiate the ability to reduce CUD commitments by 15–25% with 90 days' notice. Google's self-service CUDs are fully locked, but enterprise agreements can include flexibility provisions that allow commitment reduction — effectively creating a partial exit mechanism. This provision is the single most valuable protection for managing workload volatility on GCP.

Negotiation target: "We can reduce resource-based CUD commitments by up to 20% of the committed quantity with 90 days' written notice, with the reduction effective at the next quarterly anniversary."
02

Cross-Region CUD Application

Standard resource-based CUDs are region-locked. If workloads migrate between regions (common during disaster recovery events, performance optimisation, or regulatory changes), the CUD doesn't follow. Negotiate cross-region application provisions that allow CUD credits to apply to equivalent resources in different regions — or at minimum, the ability to transfer CUDs between regions with notice.

Negotiation target: "Resource-based CUDs can be reassigned to a different region with 30 days' notice, up to 2 reassignments per year per CUD."
03

Machine Series Migration Protection

CUDs committed to N2 vCPUs don't automatically apply to C3 or N4 vCPUs. When Google releases new machine series with better price-performance (which they do regularly), enterprises face a choice: stay on the committed older series or migrate to the newer series and strand the CUD. Negotiate provisions that allow CUD commitments to migrate to successor machine series at equivalent or better discount terms.

Negotiation target: "When a committed machine series is superseded by a direct successor series, the CUD commitment can be converted to the successor series at the same discount percentage with 60 days' notice."
04

GPU CUD Term Flexibility

Standard GPU CUD terms are 1-year and 3-year. For AI workloads where GPU architecture is evolving rapidly, even 1-year terms carry significant technology risk. Negotiate shorter GPU CUD terms (6-month or quarterly) or provisions that allow GPU CUD conversion to newer GPU types when they become available — preserving the commitment economics while accommodating hardware evolution.

Negotiation target: "A100 GPU CUDs can be converted to H100 or successor GPU CUDs at equivalent or better pricing when successor GPUs become generally available in the committed region."
05

CUD Scope Expansion

By default, CUDs are scoped to the purchasing project or shared across the billing account. For enterprises with complex organisational structures (multiple billing accounts, separate production and development organisations), CUD application scope can create gaps where committed resources in one billing context don't apply to consumption in another. Negotiate organisation-wide CUD scope that allows commitments to apply across all projects and billing accounts under the enterprise's GCP organisation.

Negotiation target: "All CUDs apply across the entire GCP organisation, regardless of billing account or project boundary, with automatic optimisation by the platform."
Section 06

The CUD Negotiation Framework

Negotiating GCP CUDs effectively requires combining self-service purchasing decisions with enterprise-level commercial negotiation. The framework below addresses both dimensions — the portfolio construction (what to buy) and the commercial terms (how to protect against overexposure).

Phase 1

Utilisation Baselining

Extract 90 days of Compute Engine utilisation data across all projects and regions. For each resource type (vCPUs, memory, GPUs), calculate the stable floor — the minimum resource consumption maintained consistently across the analysis period. The stable floor is your safe CUD commitment level. Anything above the floor is variable demand that should remain on-demand or be covered by spend-based CUDs.
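The floor calculation itself is straightforward. This sketch uses a synthetic usage series in place of a real Cloud Billing or Monitoring export:

```python
# Sketch: deriving the "stable floor" from 90 days of hourly vCPU usage.
# In practice the series comes from billing/monitoring exports; here it is
# synthesised purely for illustration.

import random

random.seed(7)
# 90 days of hourly vCPU counts: a steady ~400-vCPU base plus bursty demand
usage = [400 + random.randint(0, 250) for _ in range(90 * 24)]

def stable_floor(series, percentile=2):
    """Consumption level maintained essentially all the time. A low
    percentile rather than the strict minimum stops a single maintenance
    window from dragging the floor down artificially."""
    s = sorted(series)
    return s[int(len(s) * percentile / 100)]

floor = stable_floor(usage)       # safe resource-based CUD commitment level
variable = max(usage) - floor     # burst band: on-demand or spend-based CUDs
```

The percentile parameter is the conservatism dial: a stricter (lower) percentile yields a smaller, safer commitment, which is the right default given that GCP commitments cannot be unwound.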

Phase 2

SUD Interaction Modelling

Model the interaction between existing Sustained Use Discounts and proposed CUDs for every resource category. Calculate the incremental discount value of CUDs above the SUD baseline for SUD-eligible machine series (N1, N2). For non-SUD series (C3, C3D, N4), the full CUD discount represents incremental savings versus on-demand — making CUDs significantly more impactful for these workloads.

Phase 3

Portfolio Construction

Build the CUD portfolio in layers. Layer 1: Spend-based CUDs covering 50–60% of the projected stable compute spend — maximum flexibility, minimum stranding. Layer 2: Resource-based CUDs for high-confidence workloads where the resource type, region, and capacity are locked — maximum discount. Layer 3: On-demand for everything else. Target 65–75% total committed coverage (more conservative than the 75–85% target for AWS).
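The layering translates directly into commitment quantities. The spend figure below is a placeholder, and the shares sit at the conservative end of the ranges above:

```python
# Sketch: turning the layer targets into concrete commitment levels.
# The $/hour figure is an illustrative placeholder.

projected_stable_spend = 1000.0    # projected stable compute $/hour

layers = {
    "spend_based_cud":    0.50,  # Layer 1: flexible foundation (50–60% range)
    "resource_based_cud": 0.20,  # Layer 2: high-confidence workloads (20–30% range)
}

commitments = {name: projected_stable_spend * share
               for name, share in layers.items()}
on_demand_residual = projected_stable_spend - sum(commitments.values())

coverage = sum(commitments.values()) / projected_stable_spend
# coverage lands at 0.70 — inside the 65–75% target band for GCP
```

Starting at the low end of each range and ratcheting up quarterly, as utilisation data confirms the floor, is safer on GCP than committing to the top of the range on day one.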

Phase 4

Negotiate Enterprise Terms

Present the CUD portfolio plan to Google's enterprise sales team as part of the broader GCP commercial discussion. Negotiate the five overcommitment protection provisions (Section 05), custom discount tiers for higher commitment volumes, and integration with spend-based commitment agreements. CUD terms negotiated at the enterprise level are materially better than self-service purchases.

Phase 5

Competitive Benchmarking

Present AWS RI/Savings Plan and Azure Reserved VM pricing for equivalent workloads alongside GCP CUD proposals. The comparative data establishes whether GCP's CUD pricing is competitive — and where it isn't, provides factual leverage for requesting improved terms. Google's pricing flexibility is highest when competitive alternatives are documented and costed.

Phase 6

Implement Governance

Establish monthly CUD portfolio monitoring: commitment utilisation rate, stranding exposure, upcoming expirations, and on-demand spend eligible for new CUDs. Quarterly, review the portfolio against evolving workload patterns and machine series adoption. The governance cadence prevents the silent accumulation of stranded commitments that makes GCP's CUD rigidity costly.
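The monthly metrics reduce to simple aggregations over the commitment inventory. All identifiers, dates, and dollar figures below are hypothetical:

```python
# Sketch: the monthly governance metrics over an assumed CUD inventory.
# Every name, date, and value is an illustrative placeholder.

from datetime import date

TODAY = date(2025, 10, 1)   # fixed reporting date for the example

cuds = [
    # (commitment id, committed $/hr, matched usage $/hr, expiry date)
    ("n2-us-central1",  40.0, 38.5, date(2026, 3, 1)),
    ("c3-europe-west4", 25.0, 16.0, date(2025, 11, 15)),
    ("a100-us-east4",   60.0, 60.0, date(2026, 7, 1)),
]
cud_eligible_on_demand = 30.0   # steady on-demand $/hr a new CUD could cover

committed = sum(c[1] for c in cuds)
matched   = sum(c[2] for c in cuds)

utilisation_rate   = matched / committed      # portfolio-wide utilisation
stranding_exposure = committed - matched      # $/hr paid for unused capacity
expiring_in_90d    = [c[0] for c in cuds if (c[3] - TODAY).days <= 90]
```

A commitment like the hypothetical c3-europe-west4 entry, running at 64% utilisation, is exactly the early signal this cadence exists to catch: with no exit mechanism, the only remedies are steering more workload onto it or letting it expire without renewal.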

Section 07

6 Common CUD Mistakes & How to Avoid Them

1

Committing Based on Current Consumption Without Forecasting

The Mistake
Purchasing CUDs based on today's resource consumption without modelling the 1–3 year outlook. Workloads decommission, migrate to different machine series, containerise, or shift to serverless — each change potentially strands the CUD. GCP's lack of exit mechanisms makes this mistake irrecoverable.
The Fix
Model demand for the full CUD term before purchasing. Factor in planned machine series migrations, containerisation roadmaps, application decommissions, and organic growth. Commit to the confident floor, not the current consumption. On GCP, conservative sizing is always the right strategy.
2

Ignoring the SUD Phase-Out

The Mistake
Planning CUD purchases without accounting for the SUD phase-out on newer machine series. When workloads migrate from N2 (SUD-eligible) to C3 (no SUDs), the on-demand effective rate increases by ~30%. If CUDs aren't purchased to replace the lost SUD discount, the migration creates a material cost increase.
The Fix
Map all planned machine series migrations against SUD eligibility. For every workload moving to a non-SUD series, model the cost impact and include CUD purchasing for the new series as part of the migration plan. The CUD purchase should be a planned component of the migration, not an afterthought.
3

3-Year GPU CUDs in a 12-Month Innovation Cycle

The Mistake
Purchasing 3-year GPU CUDs (A100 or H100) to capture the maximum 70% discount when GPU architecture is evolving on a 12–18 month cycle. The 3-year term locks you into hardware that may be technically and economically superseded within 18 months.
The Fix
Limit GPU CUDs to 1-year terms unless you've negotiated GPU migration provisions (Section 05). The incremental discount of a 3-year term (10–15% above 1-year) is almost always less than the cost of stranding when the next-generation GPU arrives. For AI workloads, flexibility has more value than discount depth.
4

Purchasing CUDs Through the Console Instead of Enterprise Agreement

The Mistake
Using self-service CUD purchasing in the GCP console rather than negotiating CUD terms through the enterprise commercial agreement. Console-purchased CUDs receive standard discount tiers with no flexibility provisions, no custom terms, and no integration with spend-based commitments.
The Fix
All CUD purchases above $100K annual value should be negotiated through Google's enterprise sales team. Enterprise-negotiated CUDs offer deeper discounts, flexibility provisions, cross-region application, and integration with the broader commercial agreement. The negotiation investment is trivial relative to the multi-year savings improvement.
5

100% Resource-Based CUDs, Zero Spend-Based

The Mistake
Purchasing only resource-based CUDs (maximum discount) without a spend-based CUD foundation layer (maximum flexibility). When workloads change — different machine types, different regions, different resource ratios — the rigid resource-based CUDs strand while on-demand spend grows in the new configuration.
The Fix
Layer the portfolio: spend-based CUDs as the flexible foundation (50–60%), resource-based CUDs for high-confidence stable workloads (20–30%), and on-demand for the remainder. The blended portfolio sacrifices some discount depth for resilience against workload change — the right trade-off on GCP's rigid commitment framework.
6

No Coordination Between CUDs and Spend-Based Agreement

The Mistake
Purchasing CUDs independently from the enterprise spend-based commitment agreement. CUD spend counts toward spend-based agreement attainment — but without coordination, you may over-commit on CUDs in a way that inflates spend-based attainment (triggering a higher future commitment) or under-commit in a way that threatens attainment thresholds.
The Fix
Model CUD purchasing within the context of your spend-based commitment. Ensure CUD purchases support — but don't unnecessarily inflate — your spend-based attainment trajectory. The two commitment instruments should be managed as a coordinated portfolio, not independent purchasing decisions.
Section 08

Recommendations: 7 Priority Actions

1

Size CUD Portfolios More Conservatively Than AWS or Azure

Target 65–75% committed coverage on GCP versus the 75–85% appropriate for AWS. The absence of a marketplace, exchange, and refund mechanism means that over-commitment on GCP is irrecoverable. The cost of under-committing by 10% (paying on-demand rates on the uncommitted portion) is almost always less than the cost of stranding 10% of committed capacity.

2

Model the SUD Phase-Out Before Committing

Map every workload against SUD eligibility. For workloads on SUD-eligible series (N1, N2), calculate the true incremental CUD value above the SUD floor. For workloads migrating to non-SUD series (C3, C3D, N4), include CUD purchasing as a mandatory component of the migration plan to avoid the 30% effective cost increase when SUDs are lost.

3

Layer Spend-Based and Resource-Based CUDs

Build a layered portfolio: spend-based CUDs as the flexible foundation (50–60% of total commitment), resource-based CUDs for high-confidence stable workloads (20–30%), and on-demand for the remainder. This blended approach trades some discount depth for resilience — the correct trade-off given GCP's rigid commitment framework.

4

Limit GPU CUDs to 1-Year Terms

The AI hardware cycle is shorter than any CUD term. Until GPU architecture stabilises (which won't happen in the 2025–2027 timeframe), commit to GPUs for 1 year maximum. Negotiate GPU migration provisions through the enterprise agreement so that when successor GPUs launch, your commitment transitions rather than strands.

5

Negotiate Enterprise CUD Terms — Don't Self-Service

For any CUD commitment above $100K annual value, negotiate through Google's enterprise sales team rather than purchasing through the console. Enterprise-negotiated CUDs deliver deeper discounts, flexibility provisions (commitment reduction, cross-region application, machine series migration), and integration with spend-based agreements that self-service purchasing cannot provide.

6

Use AWS and Azure Pricing as Negotiation Benchmarks

Present AWS RI/Savings Plan and Azure Reserved VM pricing for equivalent workloads alongside GCP CUD proposals. Google is competing for enterprise cloud share — and their pricing flexibility is highest when competitive alternatives are costed and credible. The benchmark doesn't require a multi-cloud deployment; it requires a multi-cloud pricing comparison.

7

Implement Monthly CUD Portfolio Monitoring

Track commitment utilisation, stranding exposure, upcoming expirations, and on-demand spend eligible for new CUDs monthly. Because GCP CUDs have no exit mechanism, early detection of stranding risk is the only mitigation available — and it requires continuous monitoring, not periodic review. Integrate CUD monitoring with your broader FinOps governance cadence.

Section 09

How Redress Can Help

Redress Compliance's Cloud & FinOps Practice provides independent advisory on Google Cloud commitment optimisation — from utilisation analysis and CUD portfolio design through enterprise term negotiation and ongoing governance. We maintain zero commercial relationships with Google, AWS, or Azure.

CUD Portfolio Assessment

Comprehensive audit of existing CUD commitments — utilisation rates, stranding exposure, SUD interaction analysis, and spend-based agreement coordination with specific optimisation recommendations.

Commitment Strategy Design

Data-driven CUD portfolio construction — layered by instrument type, resource category, and term — with financial modelling of savings, stranding risk, and SUD interaction effects.

Enterprise Term Negotiation

Negotiation of enterprise CUD terms — flexibility provisions, cross-region application, machine series migration, GPU term management, and spend-based agreement integration.

Cross-Provider Benchmarking

Comparative pricing analysis across GCP CUDs, AWS RIs/Savings Plans, and Azure Reserved VMs — producing the competitive data that maximises Google's pricing flexibility.

Right-Sizing Analysis

Instance-level right-sizing across your GCP estate — identifying over-provisioned resources that should be resized before CUD purchasing to avoid committing to waste.

Ongoing FinOps Governance

Monthly CUD portfolio monitoring, stranding detection, utilisation reporting, and purchasing recommendations — maintaining optimal commitment coverage as workload patterns evolve.

100% Independent Advisory

Redress maintains zero commercial relationships with Google Cloud, AWS, Azure, or any FinOps tooling vendor. Our only relationship is with you — ensuring our recommendations maximise your cloud savings, not any provider's revenue.

Section 10

Book a Meeting

Schedule a confidential consultation with our Cloud & FinOps Practice team. We'll review your current GCP commitment portfolio and identify specific opportunities to maximise savings without overexposure.