Why This Article Exists

In 2025, research showed that 34.8% of employee inputs to ChatGPT contained sensitive data, up from 11% in 2023. Meanwhile, 27% of consumer ChatGPT messages were work-related — meaning corporate data is flowing into AI systems whether your organisation has sanctioned it or not. Your contract is the only enforceable mechanism that governs what happens to that data once it enters a vendor’s system. A well-negotiated data privacy framework in your AI contract is not a compliance exercise; it is your organisation’s last line of defence against data exposure, regulatory penalty, and reputational damage.

1. The Trust Page Is Not Your Contract

Every major AI vendor maintains a public-facing trust or privacy page that describes their data practices. These pages make clear, reassuring statements: data is encrypted, models are not trained on enterprise data, content is not reviewed by humans. Procurement teams and CIOs frequently cite these trust pages as evidence that their data privacy concerns have been addressed.

This is a dangerous assumption. Trust pages are unilateral statements of current practice, not contractual commitments. They differ from your enterprise agreement in three critical ways.

Trust pages can change without notice. Vendors routinely update their privacy practices, data retention policies, and terms of service. Unless your contract incorporates the trust page by reference (which most do not), changes to the trust page do not breach any obligation to you. The vendor that promises “we never train on your data” on their trust page today can update that page tomorrow, and your only recourse is whatever your signed contract says.

Trust pages use qualified language. Read carefully and you will find hedging: “we do not use customer data to train our models by default” (which implies there are non-default configurations where they do); “data is retained for up to 30 days” (which means anywhere from zero to 30 days at the vendor’s discretion); “enterprise-grade security” (which is a marketing term with no legal definition). These qualifications are intentional. They preserve the vendor’s flexibility while creating an impression of commitment.

Trust pages cover “the service” generically. Your organisation may use multiple tiers, multiple models, multiple access methods (API, subscription, embedded), and multiple regions. The trust page typically describes the most favourable data practices (usually the Enterprise tier) without clearly delineating which practices apply to which product, tier, or configuration. A commitment that applies to your Enterprise subscription may not apply to your API usage, your developers’ free-tier experimentation, or your third-party SaaS tools that call AI models in the background.

The contract — specifically, the master service agreement (MSA), data processing addendum (DPA), and any negotiated amendments — is the only document that creates enforceable obligations. Everything else is context, not commitment.

2. The Twelve Provisions Your Contract Must Include

The following twelve provisions represent the minimum contractual framework for enterprise AI data privacy. Each provision addresses a specific risk that trust pages leave unresolved. For each, we describe the risk, the contract language required, and the common gaps we find in standard enterprise AI agreements.

3. Provision 1: Data Training Exclusion

The risk: Your prompts, responses, documents, and data could be used to train, fine-tune, evaluate, or otherwise improve the vendor’s models — making your proprietary information a permanent (and irrecoverable) part of a model that serves your competitors.

What your contract must say: An explicit, unconditional prohibition on using any customer data (including prompts, completions, uploaded documents, embeddings, and metadata) for model training, fine-tuning, evaluation, safety testing, or general model improvement. The prohibition must cover all tiers, all models, all access methods, and all processing environments under the agreement. It must survive termination of the agreement.

The common gap: Many standard agreements exclude training “by default” or “for enterprise customers” without defining what constitutes an enterprise customer or what happens if your account is misconfigured. Some agreements carve out “safety and abuse monitoring” from the training exclusion, which creates an ambiguous pathway for data to be processed by the vendor’s systems in ways that may not align with your expectations. Insist on a blanket exclusion with no carve-outs, or negotiate specific, narrow carve-outs with documented scope and data handling procedures.

4. Provision 2: Data Retention and Deletion

The risk: Your data persists in the vendor’s systems long after the interaction that generated it, creating ongoing exposure to breach, subpoena, or unauthorised access.

What your contract must say: A defined maximum retention period for all categories of customer data: prompts, completions, uploaded documents, conversation logs, metadata, and any cached or intermediate data. The contract should specify what happens at the end of the retention period (automatic deletion), your right to request deletion on demand before the retention period expires, and the vendor’s obligation to provide a deletion certificate or equivalent written confirmation.

The common gap: Standard agreements often state retention periods of “up to 30 days” for abuse monitoring without specifying exactly what is retained, in what form, or who can access it during the retention window. Some vendors retain metadata (timestamps, token counts, model used) indefinitely even after content data is deleted. For zero-data-retention (ZDR) configurations, verify that ZDR applies to all data streams: prompts, completions, system logs, and intermediate processing data — not just the primary input/output.

Zero Data Retention Is Not Always Zero

Some vendors’ “zero data retention” policies still retain metadata, billing records, and system telemetry. Others process data in memory but write temporary files to disk during inference. Confirm exactly what “zero retention” means in your vendor’s implementation, and ensure the contract definition matches the technical reality. Request a technical architecture document describing the data flow through the inference pipeline and identify every point where data touches persistent storage, even temporarily.
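Negotiated retention terms only protect you if someone tracks them. As a minimal sketch (all category names and retention values below are hypothetical placeholders, not any vendor's actual terms), a deletion deadline can be computed per data category, with categories lacking a contractual limit flagged as open negotiation items:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical contract retention terms per data category.
# Use the periods actually negotiated in your agreement;
# None means the contract sets no limit -- an unresolved gap.
RETENTION_DAYS = {
    "prompts": 30,
    "completions": 30,
    "uploaded_documents": 30,
    "conversation_logs": 30,
    "metadata": None,
    "system_telemetry": None,
}

def deletion_deadlines(captured_at: datetime) -> dict:
    """Return the deletion-due date per category, or 'UNDEFINED'
    where the contract sets no retention limit."""
    deadlines = {}
    for category, days in RETENTION_DAYS.items():
        if days is None:
            deadlines[category] = "UNDEFINED"
        else:
            due = captured_at + timedelta(days=days)
            deadlines[category] = due.date().isoformat()
    return deadlines

captured = datetime(2025, 6, 1, tzinfo=timezone.utc)
for category, due in deletion_deadlines(captured).items():
    print(f"{category}: {due}")
```

Every `UNDEFINED` in the output corresponds to a data stream that the ZDR or retention clause does not yet cover — exactly the gap the callout above warns about.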

5. Provision 3: Data Residency and Processing Location

The risk: Your data is processed in jurisdictions that do not meet your regulatory requirements, or is transferred across borders without appropriate legal mechanisms.

What your contract must say: The specific geographic region(s) where your data will be processed (inference) and stored (at rest). If cross-border transfers are necessary (e.g., for certain model operations), the contract must specify: which transfers occur, to which jurisdictions, under which legal mechanism (Standard Contractual Clauses for GDPR, adequacy decisions, or binding corporate rules), and what additional safeguards apply.

The common gap: Several vendors offer “data residency” for data at rest but process inference requests globally or in a centralised region. This means your stored data stays in the EU, but every time a user asks a question, the prompt and response may be processed on GPU infrastructure outside the EU. Some vendors have disclosed that fine-tuning operations may involve temporary data relocation outside the selected region. Your contract should specify residency for both data at rest and data in transit/processing. Regional endpoint options are available from most major providers, sometimes at a pricing premium (Anthropic charges 10% more for US-only inference; Google and AWS charge regional premiums on certain configurations).

6. Provision 4: Sub-Processor Controls

The risk: The AI vendor shares your data with third-party sub-processors (cloud infrastructure providers, model providers, content moderation services) without your knowledge or consent.

What your contract must say: A current list of all sub-processors who may access your data, with descriptions of their role and the data they access. A notification mechanism (written notice, typically 30–60 days in advance) before any new sub-processor is engaged. Your right to object to new sub-processors and, if the objection cannot be resolved, to terminate the affected service without penalty. Flow-down obligations that require sub-processors to be bound by data protection terms at least as restrictive as those in your agreement.

The common gap: Standard DPAs often reference a sub-processor list hosted on the vendor’s website, which can be updated without individual notice. The notification period may be as short as 10 days, or the contract may say “reasonable notice” without defining what is reasonable. Insist on a minimum 30-day advance notice and an explicit objection right with a contractual remedy (not just “the parties will discuss in good faith”).

7. Provision 5: Derived Data and Model Artifacts

The risk: Even if your raw data is deleted, information derived from your data — embeddings, cached features, fine-tuned model weights, summarisation outputs, vector representations — persists in the vendor’s systems and may be accessible to other customers through the model’s outputs.

What your contract must say: A definition of “derived data” that explicitly includes embeddings, vector representations, fine-tuned model weights, cached intermediate outputs, model checkpoints, and any other artifact generated from your data. The same retention, deletion, and training exclusion obligations that apply to raw data must apply to derived data. If you have fine-tuned a model with your data, the contract must guarantee that the fine-tuned model is for your exclusive use and is deleted upon termination.

The common gap: This is the single most underspecified area in enterprise AI contracts. Most standard agreements define “customer data” as the data you input, not the data derived from your input. A vendor can truthfully say “we deleted your data” while retaining embeddings or model weights that encode information from your data. The derived data definition must be negotiated explicitly.

8. Provision 6: Access Controls and Human Review

The risk: Vendor employees or automated systems access your data for purposes beyond what you authorised — abuse monitoring, content moderation, model evaluation, debugging, or quality assurance.

What your contract must say: A clear statement of who (by role, not by name) within the vendor’s organisation may access your data, under what circumstances, and for what purposes. Whether human reviewers can read your prompts and completions (and if so, under what conditions). Your right to opt out of human review entirely for sensitive workloads. Logging and audit requirements for any access to your data by vendor personnel.

The common gap: Standard agreements often reserve broad rights for the vendor to access data for “service improvement,” “safety monitoring,” or “abuse prevention” without specifying the scope or controls on that access. Some vendors offer a “Limited Access” or “no human review” mode but only upon application and approval, not as a default contract right. If your use case involves sensitive data (financial, legal, healthcare, HR), negotiate no-human-review as a contractual right, not an optional programme you must apply for separately.

9. Provision 7: Breach Notification

The risk: A data breach occurs and you are not notified in time to meet your own regulatory obligations (GDPR requires notification within 72 hours) or to mitigate the impact.

What your contract must say: A maximum notification timeline (48–72 hours from the vendor’s discovery of the breach). A description of what the notification must include: nature of the breach, categories and volume of data affected, likely consequences, measures taken or proposed. The vendor’s obligation to cooperate with your investigation, provide forensic data, and support your regulatory notifications. A commitment that the vendor will not publicly disclose the breach in a way that identifies your organisation without your prior consent (except where legally required).

The common gap: Standard agreements often use language like “without undue delay” or “promptly” rather than specifying a numeric timeline. Some agreements notify you of breaches only if the vendor determines that your data was “likely” affected, giving the vendor discretion over whether to notify at all. Insist on a defined timeline (72 hours maximum) and notification for any breach that “may have” affected your data, not only those where the vendor has confirmed impact.
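The interaction between the vendor's contractual window and your own GDPR clock is worth making concrete. In this sketch (the 72-hour figures are the negotiated and statutory windows discussed above; the timestamps are invented for illustration), the GDPR clock runs from when you become aware of the breach — so a vendor that uses its full contractual window delays your remediation, even though it does not shorten your regulatory window:

```python
from datetime import datetime, timedelta, timezone

CONTRACT_NOTICE_HOURS = 72  # negotiated vendor-to-customer window
GDPR_NOTICE_HOURS = 72      # controller-to-regulator window, GDPR Art. 33

def notification_deadlines(vendor_discovery: datetime,
                           your_awareness: datetime):
    """Latest time the vendor must notify you under the contract,
    and latest time you must notify your supervisory authority."""
    vendor_deadline = vendor_discovery + timedelta(hours=CONTRACT_NOTICE_HOURS)
    regulator_deadline = your_awareness + timedelta(hours=GDPR_NOTICE_HOURS)
    return vendor_deadline, regulator_deadline

# Worst case: the vendor uses its entire contractual window.
discovered = datetime(2025, 3, 10, 9, 0, tzinfo=timezone.utc)
notified_to_you = discovered + timedelta(hours=CONTRACT_NOTICE_HOURS)
v_dl, r_dl = notification_deadlines(discovered, notified_to_you)
print("Vendor must notify you by:      ", v_dl.isoformat())
print("You must notify regulator by:   ", r_dl.isoformat())
```

A shorter negotiated vendor window (48 hours rather than 72) buys you an earlier start on forensics and remediation, which is the practical reason to push for a numeric timeline rather than "without undue delay."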

10. Provision 8: Data Processing Addendum

The risk: Your AI vendor processes personal data without a legally adequate data processing agreement, exposing your organisation to regulatory penalties under GDPR, CCPA, and other privacy laws.

What your contract must say: A signed Data Processing Addendum (DPA) that establishes your organisation as the data controller and the AI vendor as the data processor. The DPA must specify: the categories of data processed, the purposes of processing (limited to providing the AI service), the legal basis for processing, applicable data protection law(s), Standard Contractual Clauses for international data transfers, data subject rights cooperation (the vendor must assist you in responding to access, deletion, and portability requests within regulatory timelines), and data protection impact assessment (DPIA) cooperation.

The common gap: Most major AI vendors offer a standard DPA, but these are template documents designed to be minimally sufficient. Negotiate amendments that: narrow the scope of processing to only what your specific use case requires, add specific timelines for responding to data subject requests (not just “reasonable assistance”), and include your right to audit the vendor’s compliance with the DPA (either directly or through a third-party auditor).

11. Provision 9: HIPAA and Sector-Specific Compliance

The risk: Your organisation is subject to sector-specific data protection requirements (HIPAA for healthcare, PCI-DSS for payment data, GLBA for financial data) that generic AI vendor contracts do not address.

What your contract must say: For HIPAA: a signed Business Associate Agreement (BAA) that binds the AI vendor to HIPAA obligations. The BAA must prohibit the use of Protected Health Information (PHI) for model training, specify data retention limits for PHI, require encryption of PHI at rest and in transit, establish breach notification procedures specific to HIPAA timelines (60 days), and flow down HIPAA obligations to all sub-processors. For PCI-DSS: contractual acknowledgement that the vendor’s processing environment for payment card data meets PCI-DSS requirements. For financial services: compliance with applicable GLBA, SOX, and regulator-specific data handling requirements.

The common gap: AI vendors are increasingly willing to sign BAAs and sector-specific addenda, but the scope of coverage varies. Some BAAs cover only the API endpoint configured for zero data retention, not the subscription interface. Some BAAs exclude specific features (such as web search or fine-tuning). Verify that the BAA covers every AI service and feature your organisation uses, not just the primary inference endpoint.


12. Provision 10: Data Portability and Exit

The risk: When your contract ends, you cannot extract your data, configurations, or customisations from the vendor’s platform, creating de facto lock-in and potential data loss.

What your contract must say: Your right to export all customer data (conversation logs, uploaded documents, knowledge base content, agent configurations, fine-tuned model weights) in a standard, machine-readable format within a defined period (30–90 days) after contract termination. The vendor’s obligation to delete all customer data and derived data (as defined in Provision 5) within a defined period after export is complete, with written confirmation of deletion. A transition assistance period during which the vendor maintains service availability to support your migration to an alternative provider.

The common gap: Standard agreements rarely specify the export format, the export timeline, or the scope of exportable data. Without explicit portability terms, you may find that conversation histories, agent logic, knowledge base configurations, and fine-tuned models are effectively non-exportable. Negotiate these terms before signing, when you have maximum leverage.

13. Provision 11: Indemnification for Data Incidents

The risk: A data privacy incident caused by the vendor’s failure (breach, unauthorised training, data residency violation) generates regulatory fines, legal costs, and reputational damage for your organisation, with no contractual allocation of responsibility.

What your contract must say: The vendor indemnifies your organisation against losses arising from the vendor’s breach of its data protection obligations under the agreement. The indemnification should cover: regulatory fines and penalties (to the extent legally permissible), legal defence costs, data breach notification and remediation costs, and third-party claims arising from the vendor’s data handling. The indemnification obligation should be uncapped or, at minimum, capped at a level that reflects the severity of potential data incidents (not capped at the contract value, which is typically inadequate for a major data breach).

The common gap: AI vendor standard agreements typically cap total liability at 12 months of fees paid, which may be wholly inadequate for a data breach affecting thousands of records. Some agreements exclude regulatory fines from the indemnification entirely. Negotiate a higher liability cap for data protection breaches (a “super cap”) that reflects the actual risk exposure, and ensure regulatory fines are within scope to the extent permitted by law.

14. Provision 12: Terms Change Notification

The risk: The vendor changes its privacy practices, terms of service, or data processing procedures during your contract term without notice, undermining the protections you negotiated.

What your contract must say: The vendor must provide written notice (minimum 60–90 days) of any material change to its data processing practices, privacy policy, security controls, sub-processor list, or terms of service. Your right to review and approve material changes before they take effect. If a change materially degrades your data protection, your right to terminate the affected service without penalty within a defined window (typically 30–60 days after the change). A “most favourable terms” clause that guarantees you will always receive data protection terms at least as strong as the vendor’s current standard enterprise terms, even if those standard terms improve after your contract was signed.

The common gap: Most standard agreements reserve the vendor’s right to modify terms with minimal notice (sometimes as little as “posting an update on our website”). Without a contractual change-notification right, your negotiated protections can be effectively diluted by the vendor updating their standard practices. This provision ensures that your contract represents a floor, not a ceiling, for data protection.

15. The Current Vendor Privacy Landscape

Anthropic (Claude)

Enterprise plans exclude customer data from model training. Offers HIPAA-ready configuration with BAA. US-only inference available at 10% pricing premium. Custom data retention configurable for Enterprise. Compliance API for audit logs. The gap: dynamic usage limits are opaque, making it difficult to predict when and how data is processed under throttling conditions. Negotiate explicit retention terms and a DPA with audit rights.

OpenAI (ChatGPT/API)

Enterprise and API plans exclude customer data from training. Zero Data Retention (ZDR) available for API customers. BAA available for ChatGPT for Healthcare and API. Data residency at rest in multiple regions (US, EU, UK, Japan, and others). In-region GPU inference available for US and EU. The gap: standard API retains data for up to 30 days for abuse monitoring unless ZDR is configured. Fine-tuning may involve temporary cross-region data processing. Ensure ZDR is contractually guaranteed, not just configured.

Google (Gemini/Vertex AI)

Gemini Enterprise Standard and Plus exclude customer data from training. Workspace Enterprise offers data regions, DLP, and advanced compliance controls. Vertex AI processes data within the selected Google Cloud region. The gap: the entry-level Gemini Enterprise Starter edition does permit data use for service improvement. Consumer Gemini plans (AI Pro, AI Ultra) do not provide enterprise data governance. Ensure your contract explicitly covers all Gemini products your organisation uses, including Workspace-embedded features.

AWS (Bedrock)

Bedrock does not use customer data to train foundation models by default. Data is processed within the selected AWS region. Encryption at rest and in transit. PrivateLink available for network-level isolation. The gap: model provider-specific data practices vary — while AWS does not train on your data, confirm that each third-party model provider on Bedrock (Anthropic, Meta, Cohere) is also contractually bound not to use data received through Bedrock for training. AWS’s shared responsibility model means some data governance obligations remain with you.

Microsoft (Azure OpenAI)

Enterprise data not used for model training. Data at rest stored in the selected Azure region. SOC 2, ISO 27001, HIPAA BAA, FedRAMP available. Limited Access programme provides zero logging and no human review for approved customers. The gap: fine-tuning operations may involve temporary data relocation outside the selected geography. Standard abuse monitoring retains data for 30 days unless Limited Access is approved. Ensure your contract addresses both at-rest and in-processing data residency, and negotiate Limited Access as a contractual right.

16. Pre-Signature Checklist

Before You Sign Any Enterprise AI Agreement

Use this checklist to verify that all twelve provisions are addressed. Any “no” represents an unresolved risk that should be negotiated before signature.


1. Data training exclusion: Is there an explicit, unconditional, contractual prohibition on using your data (including derived data) for model training, evaluation, or improvement? Does it cover all products, tiers, and access methods under the agreement?

2. Retention and deletion: Is the maximum retention period defined in the contract (not just the trust page)? Do you have the right to request deletion on demand with written confirmation?

3. Data residency: Does the contract specify where data is processed (inference) and stored (at rest)? Are cross-border transfers documented with appropriate legal mechanisms?

4. Sub-processors: Do you have a current sub-processor list, advance notice of changes (30+ days), and an objection right with a contractual remedy?

5. Derived data: Does the contract define derived data (embeddings, model weights, cached outputs) and apply the same protections as raw data?

6. Access controls: Is vendor access to your data limited by role and purpose? Do you have the right to opt out of human review?

7. Breach notification: Is there a defined timeline (72 hours maximum)? Does the notification trigger on “may have” affected, not only confirmed impact?

8. DPA: Is a signed DPA in place that establishes controller/processor roles, limits processing purposes, includes SCCs for international transfers, and provides audit rights?

9. Sector compliance: If required, is a BAA (HIPAA), PCI-DSS acknowledgement, or sector-specific addendum signed and in scope for all AI services used?

10. Data portability: Can you export all data and configurations in a standard format within a defined period after termination? Is post-termination deletion guaranteed with written confirmation?

11. Indemnification: Does the vendor indemnify for data protection breaches? Is the liability cap adequate (not just 12 months of fees)?

12. Change notification: Will you receive 60–90 days’ written notice of material changes to data practices? Do you have the right to terminate without penalty if changes degrade your protections?
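The twelve checklist items above can be tracked as a simple pre-signature gate. This is an illustrative sketch, not a compliance tool — the yes/no answers below are placeholders that your review team would populate from the actual contract:

```python
# Each entry: provision number -> (name, addressed in contract?).
# The True/False values here are hypothetical examples.
PROVISIONS = {
    1: ("Data training exclusion", True),
    2: ("Retention and deletion", True),
    3: ("Data residency", True),
    4: ("Sub-processor controls", False),
    5: ("Derived data", False),
    6: ("Access controls / human review", True),
    7: ("Breach notification", True),
    8: ("Data Processing Addendum", True),
    9: ("Sector compliance", True),
    10: ("Data portability and exit", True),
    11: ("Indemnification", False),
    12: ("Terms change notification", True),
}

def unresolved(provisions: dict) -> list:
    """Return provisions still answered 'no' -- each is a risk
    to negotiate before signature."""
    return [name for _, (name, ok) in sorted(provisions.items()) if not ok]

gaps = unresolved(PROVISIONS)
if gaps:
    print("Do not sign yet. Unresolved provisions:")
    for name in gaps:
        print(" -", name)
else:
    print("All twelve provisions addressed.")
```

The point of the gate is the stopping rule: any remaining `False` blocks signature until it is either negotiated away or explicitly risk-accepted in writing.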

17. FAQ

Do all enterprise AI vendors offer DPAs?

Yes. All major providers (Anthropic, OpenAI, Google, AWS, and Microsoft) offer standard DPAs. However, these are template documents designed for broad applicability. For organisations with specific regulatory requirements, the standard DPA will need amendments — particularly around processing scope, data subject request timelines, and audit rights. Request the DPA early in the procurement process and involve your legal team in reviewing it before contract negotiations begin.

Is a trust page commitment enough for audit purposes?

No. Auditors and regulators will look for contractual commitments, not website statements. A trust page is useful context but cannot be cited as evidence of an enforceable obligation. Your audit trail should reference the MSA, DPA, BAA (if applicable), and any negotiated amendments.

Can we require zero data retention for all AI services?

For API services, most vendors offer ZDR as a configurable option. For subscription services (the interactive chat interface), ZDR is typically not available because conversation history is a core feature. You can negotiate shorter retention periods (7–14 days vs. the standard 30 days) and the right to delete conversation history on demand. For HIPAA-regulated data, ZDR on the API is generally a requirement.

What is the biggest contractual gap in standard AI agreements?

Derived data (Provision 5). Almost every standard AI agreement defines “customer data” as the data you input, without addressing the artifacts derived from that data. This means a vendor can comply with its deletion obligations by removing your raw prompts and documents while retaining embeddings, model weights, and cached representations that encode your proprietary information. Negotiate an explicit derived data definition and matching protections.

How do we handle employees using consumer AI plans with corporate data?

Consumer AI plans (ChatGPT Plus, Claude Pro, Gemini Advanced) do not provide enterprise data protections. Data entered into these plans may be used for model training, retained indefinitely, and accessed by vendor employees. The contractual solution is two-fold: (1) provide sanctioned enterprise AI tools with better capabilities and equivalent convenience, and (2) include an acceptable use policy in your employment terms that prohibits using consumer AI tools for corporate data. The governance solution is to monitor expense reports for consumer AI subscriptions and redirect users to sanctioned platforms.

Should we negotiate these provisions ourselves or use external help?

The twelve provisions described in this guide are achievable by a well-prepared procurement and legal team. However, the negotiation is more effective with vendor-specific intelligence: knowing what each vendor’s standard terms actually say, where they are flexible, and what other enterprises have successfully negotiated. Redress Compliance provides this intelligence for Anthropic, OpenAI, Google, AWS, and Microsoft AI agreements through its independent GenAI advisory services.