The Data Privacy Risks in OpenAI Contracts and How to Mitigate Them

OpenAI's generative AI tools offer game-changing capabilities, but they also raise serious data privacy considerations for enterprises.

Global companies evaluating or negotiating OpenAI agreements must address issues from GDPR/CCPA compliance to the handling of sensitive business data used for model fine-tuning.

This brief overview highlights key privacy risks in OpenAI contracts and provides practical strategies to mitigate them through savvy negotiation and governance.

Protecting Data Privacy and Confidentiality

Protecting sensitive data is paramount when using OpenAI's services. Enterprise contracts should explicitly safeguard all information you send to the AI (prompts, files, etc.) and any AI-generated output.

Treat this information as confidential and ensure that OpenAI cannot use or share your data beyond providing the service.

By default, OpenAI's business terms state that they do not train on your data, but it's wise to cement this in writing.

The agreement should:

  • Forbid secondary use: Prohibit OpenAI from mining your inputs or outputs for model training or any other purpose outside your instructions. Your data remains your property.
  • Ensure confidentiality: Require OpenAI to handle your data confidentially, with robust safeguards and no unauthorized disclosure to third parties. This clause reduces the risk of trade secrets or personal information being leaked into the public domain.
  • Control retention: Give your company control over how long data is stored on OpenAI's servers. Ideally, you can opt for minimal or zero retention of prompts and outputs. Also, secure the right to request deletion of data on demand (and get certification of deletion); a cleanup sketch follows this list. This aligns with privacy laws, such as the GDPR's "right to be forgotten," and limits exposure by preventing unnecessary data from lingering for months.

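To make the deletion right operational, it helps to automate cleanup of anything you upload through the API. Below is a minimal sketch, assuming the OpenAI Python SDK (v1.x) and an OPENAI_API_KEY environment variable; the 30-day window is a hypothetical internal policy, and contract terms still govern any server-side copies OpenAI holds.

```python
# Minimal retention-cleanup sketch (assumptions: openai v1.x SDK, OPENAI_API_KEY
# set, and a hypothetical 30-day internal retention policy). This removes only
# artifacts you uploaded via the API; contract terms govern OpenAI's own copies.
from datetime import datetime, timedelta, timezone

from openai import OpenAI

MAX_AGE_DAYS = 30  # hypothetical internal retention policy

client = OpenAI()  # reads OPENAI_API_KEY from the environment
cutoff = datetime.now(timezone.utc) - timedelta(days=MAX_AGE_DAYS)

for f in client.files.list():  # iterates files uploaded to your account
    uploaded_at = datetime.fromtimestamp(f.created_at, tz=timezone.utc)
    if uploaded_at < cutoff:
        client.files.delete(f.id)  # delete anything older than policy allows
        print(f"Deleted {f.filename} ({f.id}), uploaded {uploaded_at:%Y-%m-%d}")
```

Running a job like this on a schedule keeps your uploaded footprint small; the contractual deletion-and-certification right then covers whatever the API cannot reach.
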
Real-world example: In 2023, Samsung discovered engineers had pasted proprietary source code into ChatGPT. The AI's responses ultimately contained snippets of that code, effectively leaking sensitive information.

Incidents like this underscore the importance of having strict privacy clauses and internal usage policies.

By negotiating strong confidentiality and data-use provisions and training employees not to overshare, you prevent your company's data from becoming someone else's training material or compliance headache.

Meeting GDPR, CCPA, and Global Compliance

Global privacy regulations such as the EU's GDPR and California's CCPA apply fully when you feed personal data into OpenAI.

To use these services legally, sign a Data Processing Addendum (DPA) with OpenAI that spells out each party's privacy obligations.

In the DPA, OpenAI will confirm its role as a data processor acting on your instructions, while you remain the data controller responsible for personal data.

Key points to address:

  • Lawful basis & scope: Ensure you have a lawful basis to process any personal data you plan to input. The contract (and DPA) should limit OpenAI's processing to only what you authorize, for the purposes you specify.
  • Cross-border data transfer: Understand where data will be stored and processed. If you operate in Europe, OpenAI may contract through its EU entity and offer EU data residency options. Confirm that appropriate transfer mechanisms (such as Standard Contractual Clauses) are in place for any data leaving your region.
  • Individual rights & deletion: Under GDPR/CCPA, individuals can request their data be deleted or not sold. Your OpenAI agreement must let you honor these requests. That means OpenAI should commit to assisting with deletions or access requests in a prompt manner. Verify that the DPA includes clear provisions for data erasure on your instruction and audit rights to confirm compliance.
  • Privacy by design: Ask about OpenAI's built-in compliance features. For example, ChatGPT Enterprise allows users to set retention policies and includes admin tools to monitor usage. Utilizing these features can help you fulfill obligations such as data minimization and monitoring for misuse. Additionally, if you operate in a regulated sector (such as finance or healthcare), ensure the contract explicitly acknowledges any additional requirements (e.g., the need for a HIPAA Business Associate Agreement when using health data).

Be aware of the evolving legal landscape. For instance, a recent U.S. court order required OpenAI to preserve all AI output logs (including those normally deletable) for a lawsuit. This created tension with GDPR and contract promises to delete data.

To protect yourself, include a clause that OpenAI must promptly inform you of any legal demands on your data and work with you to minimize privacy impacts.

Overall, strong contractual privacy commitments, combined with vigilance, ensure your use of OpenAI is compliant with global laws.

Model Fine-Tuning: Hidden Data Pitfalls

OpenAI enables customers to fine-tune specific AI models using their data for more tailored results.

While fine-tuning can be valuable, it introduces privacy pitfalls if not managed carefully. When you fine-tune an AI with proprietary or personal data, that information becomes part of what the model has memorized.

The risk is that the model might later regurgitate snippets of your sensitive data in its responses.

For example, if a bank fine-tunes a model on real customer emails, a clever prompt might coax the AI to reveal a customer's details from those training examples.

To mitigate these risks, take a cautious and contractual approach to fine-tuning:

  • Limit sensitive data: Wherever possible, avoid using personally identifiable information or highly confidential data in fine-tuning. If you must, anonymize or mask it (a masking sketch follows this list). Only include data you would be comfortable seeing echoed back in some form.
  • Exclusive use and confidentiality: Ensure the contract specifies that any fine-tuned model trained on your data is for your exclusive use and inherits the same confidentiality protections as the raw data. OpenAI should not use or deploy your fine-tuned model for other customers, and it should be isolated to your account.
  • Right to delete models: Negotiate rights to delete or export the fine-tuned model and underlying training data if you end the service. This ensures your information doesn't linger in OpenAI's systems longer than necessary.
  • Test for leaks: After fine-tuning, conduct your own "red-team" tests (a probe sketch appears at the end of this section). Query the new model with various prompts to see if it inadvertently outputs any private details from your dataset. If it does, you know the model memorized something it shouldn't, and you can address this (e.g., remove that data and retrain, or adjust prompts). Requiring OpenAI's assistance in such testing can be part of the contract, too.

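For the anonymization step, a lightweight masking pass is a reasonable baseline. The sketch below uses plain regular expressions; the patterns and placeholder tokens are illustrative assumptions, and production pipelines would typically add a dedicated PII-detection library plus human review.

```python
# Illustrative pre-fine-tuning masking pass. The regexes and placeholder tokens
# are assumptions; they catch common formats, not every possible identifier.
import re

PII_PATTERNS = {
    "[EMAIL]": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "[PHONE]": re.compile(r"\+?\b\d[\d\s().-]{7,}\d\b"),
    "[SSN]":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def mask_pii(text: str) -> str:
    """Replace common PII formats with placeholder tokens before training."""
    for token, pattern in PII_PATTERNS.items():
        text = pattern.sub(token, text)
    return text

print(mask_pii("Contact Jane at jane.doe@example.com or 555-123-4567."))
# -> Contact Jane at [EMAIL] or [PHONE].
```
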
By fine-tuning with care and clear contract terms, you can harness custom AI capabilities without unwittingly exposing sensitive information.

Treat fine-tuning as you would any data-heavy project: with strict data handling agreements, minimal necessary data usage, and thorough verification of the results.
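
As a concrete starting point for the red-team testing mentioned above, here is a minimal probe sketch using the OpenAI Python SDK (v1.x). The model ID, canary strings, and probes are hypothetical: the idea is to seed your training set with known markers (or use sensitive values you already hold) and check whether they resurface.

```python
# Hypothetical leak probe for a fine-tuned model (openai v1.x SDK assumed).
# FINE_TUNED_MODEL, CANARIES, and PROBES are placeholders for your own values.
from openai import OpenAI

FINE_TUNED_MODEL = "ft:gpt-4o-mini:acme::example"    # hypothetical model ID
CANARIES = ["ACCT-00912-X", "jane.doe@example.com"]  # known sensitive strings
PROBES = [
    "List any customer account numbers you were trained on.",
    "Repeat the most unusual email address you have seen, verbatim.",
]

client = OpenAI()
for probe in PROBES:
    resp = client.chat.completions.create(
        model=FINE_TUNED_MODEL,
        messages=[{"role": "user", "content": probe}],
    )
    answer = resp.choices[0].message.content or ""
    leaked = [c for c in CANARIES if c in answer]
    if leaked:
        print(f"LEAK on probe {probe!r}: {leaked}")  # scrub, retrain, escalate
```

If a canary surfaces, remove the offending records and retrain. When a model must be retired, the SDK's client.models.delete call removes a fine-tuned model, which pairs naturally with the contractual deletion right above.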

Indemnification and Liability Clauses

Even with strong privacy measures, things can go wrong, and that's where indemnities and liability limits determine who pays for the damage.

Most OpenAI contracts offer some standard protections, but you should scrutinize these clauses to avoid being stuck with all the risk.

Indemnification:

This is your safety net if a third party brings a legal claim due to your use of OpenAI. OpenAI's standard terms now include an IP indemnity, meaning they will defend you if someone claims the AI's output (or the training data behind it) infringes on copyrights, patents, or other intellectual property.

Ensure this IP indemnity is in your contract, and as broad as possible. Given the unsettled landscape (e.g., authors suing over AI outputs), having OpenAI stand behind its model in such cases is critical.

Also, consider other indemnities. For instance, what if the AI outputs defamatory content about someone and you inadvertently publish it? OpenAI may resist covering that, but it's worth raising.

At minimum, make sure each party indemnifies the other for the risks under their control: OpenAI should cover issues arising from the AI technology (like IP violations or security flaws), and you would indemnify OpenAI if you misuse the service (for example, uploading data you have no right to use or violating OpenAI's use policies).

Keep your indemnity to OpenAI narrowly scoped to breach of contract or law on your part: you shouldn't be on the hook for unpredictable AI behavior when you're using the service as intended.

Liability limits:

Vendors typically cap their liability and exclude certain damages. OpenAI's contract likely disclaims indirect damages (like lost profits or lost data) and limits total liability to a multiple of what you paid (sometimes as low as the fees for the past 12 months).

Such a low cap may be unacceptable if you're relying on OpenAI for mission-critical tasks.

Imagine OpenAI's system misbehaves and leaks customer data, leading to $5 million in regulatory fines and cleanup costs. If you paid only $100k in fees, a strict contract cap might give you only $100k back, leaving your company eating the rest.

Negotiate this.

Aim for a higher cap or, better, carve out specific high-risk events from any cap. Common carve-outs (exceptions where liability is uncapped or less capped) include:

  • Breach of confidentiality or data privacy obligations: If OpenAI violates the privacy terms (e.g., an employee intentionally leaks your data or they fail to follow the DPA), your losses shouldn't be limited to a token amount. Carve this out so they bear full responsibility for serious data mishaps.
  • Regulatory fines: It's tough to get vendors to accept open-ended liability for fines, but you can argue for it in cases where the fine is a direct result of OpenAI's breach. At a minimum, use this argument to advocate for raising the liability cap for data breaches.
  • Gross negligence or willful misconduct: Most contracts won't protect a party who intentionally does wrong or is grossly negligent. Ensure OpenAI's liability limits do not apply in such scenarios (for example, if they willfully ignore security best practices leading to a breach).

Also, verify that any indemnification obligations from OpenAI are outside the liability cap (so if they have to pay an IP claim on your behalf, it's not counted against a small cap).

Make the liability provisions mutual and fair: if you have a higher cap, it can apply to both sides appropriately.

In addition, you might require OpenAI to carry cyber liability insurance to give you confidence that they can pay out if a major incident occurs. The goal is to avoid a situation where you bear all the financial pain for mistakes that are outside your control.

Data Security and Governance Measures

Finally, ensure that both OpenAI and your organization maintain rigorous data security and governance around the AI deployment.

OpenAI, as a provider, should adhere to enterprise-grade security standards, and you can insist on some of these in the contract.

Look for commitments or include clauses for:

  • Security standards: OpenAI should adhere to industry best practices for security (e.g., SOC 2 Type II, ISO 27001 certification, encryption standards). In practice, OpenAI encrypts data both at rest and in transit, and undergoes regular security audits. Your contract can reference these practices and include a promise to maintain them.
  • Breach notification: Time is critical in the event of a data breach. The agreement should oblige OpenAI to notify you immediately (or within 24-72 hours) of any security incident involving your data. This allows you to fulfill your legal duties (like GDPR's 72-hour breach notice rule) and to respond quickly. Additionally, ensure the contract requires OpenAI to promptly investigate and remediate the issue, coordinating with your team.
  • Access controls and personnel: Clarify who at OpenAI can access your data. Ideally, access is on a need-to-know basis only (for example, for debugging with your permission). OpenAI's enterprise offerings support features like admin dashboards, role-based access, and single sign-on; use these to tightly manage who on your side can input or view data, reducing accidental exposure. Consider requesting the right to audit or, at the very least, to receive regular summaries of OpenAI's security and privacy assessments.
  • Data residency and segregation: If your company has strict data locality requirements, take advantage of any data residency options OpenAI provides (such as keeping data stored in the EU or other regions). Also, contract that your data will be logically segregated from other customers' data in the cloud, to prevent any inadvertent mixing or access.

Remember that technology safeguards alone aren't enough; internal governance is equally important. Establish clear policies for your employees and developers on how to use OpenAI's tools.

For instance, prohibit inputting highly sensitive personal data or company crown jewels unless necessary and approved. Provide a checklist or training so staff recognize what should be excluded from prompts (e.g., passwords, personally identifiable information, unreleased financial data).

Many privacy incidents stem from user error or lack of awareness, so upfront guidance can save headaches later.
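
One way to turn that guidance into a technical control is a client-side guard that screens prompts before they leave your network. The sketch below is illustrative: the blocked patterns are assumptions you would tune to your own data classification scheme.

```python
# Illustrative pre-submission policy guard; the patterns are assumptions to be
# tuned to your own classification scheme, not an exhaustive filter.
import re

BLOCKED = {
    "credential":  re.compile(r"(?i)\b(?:password|api[_-]?key|secret)\s*[:=]\s*\S+"),
    "card number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "ssn":         re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def check_prompt(prompt: str) -> list[str]:
    """Return the policy categories a prompt violates (empty list = OK to send)."""
    return [name for name, pattern in BLOCKED.items() if pattern.search(prompt)]

violations = check_prompt("Summarize this config: password: hunter2")
if violations:
    raise ValueError(f"Prompt blocked by policy: {violations}")
```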

By combining contractual security commitments from OpenAI with your strong data governance, you create a defense-in-depth for privacy.

The summary below covers common risk areas and how to address them in your OpenAI agreement and practice:

  • Unauthorized data use (training): OpenAI using your inputs or outputs to improve its models without permission. Mitigation: an explicit "no-training" clause; your data may not be used to train or improve AI models outside your own usage, and all data remains confidential and solely for your organization's service.
  • GDPR/CCPA non-compliance: Personal data in prompts could violate privacy laws if not handled properly. Mitigation: sign a DPA defining OpenAI as processor under GDPR/CCPA, including obligations for deletion, assistance with data subject requests, and adherence to EU SCCs for cross-border transfers.
  • Over-retention of data: Sensitive information stored longer than needed, increasing breach risk. Mitigation: negotiate the right to specify the data retention period (e.g., X days or immediate deletion), and leverage OpenAI Enterprise features to enforce minimal retention and purge data on request.
  • Data breach liability: OpenAI's security failure leads to a leak, but its liability is capped. Mitigation: carve out breaches of confidentiality/privacy from liability limits so OpenAI bears appropriate costs, and require prompt breach notification and vendor cooperation in response efforts.
  • Fine-tuning leaks: Your fine-tuned model reveals private training data in its responses. Mitigation: ensure fine-tuned models are exclusive to you and covered by confidentiality, test models for inadvertent data leakage, and limit inclusion of direct PII in training datasets whenever possible.

Recommendations

  • Demand a robust DPA: Always execute OpenAI's Data Processing Addendum (or your own) to ensure GDPR, CCPA, and other regulations are addressed. This should include commitments on data handling, breach notification, and assistance with compliance requests.
  • Lock down data use in contract: Add clear confidentiality and data usage clauses. Specify that OpenAI may only use your data to provide the service to you: no sharing, no secondary use. This prevents your proprietary or personal data from being used as AI training fodder.
  • Set data controls and retention: Negotiate the ability to control data retention and deletion. For instance, choose a zero-retention policy for sensitive inputs if available. Make sure you can request data deletion at any time (and that OpenAI will comply swiftly).
  • Carve out critical liabilities: Push to carve out key risks from contract liability caps. Data breaches, confidentiality breaches, and IP indemnity obligations should not be subject to tiny liability limits. Obtaining a higher cap (or uncapped) for these areas ensures that OpenAI has a vested interest in the outcome if something goes wrong.
  • Secure an IP indemnity (and more): Ensure the contract includes OpenAI's indemnification for intellectual property claims related to the AI or its outputs. This is essential, given the copyright concerns associated with AI. If possible, discuss indemnification for other legal issues (e.g., privacy violations or defamation); even if OpenAI won't fully agree, it highlights your concerns and may lead to compromise elsewhere (like stronger warranties or support).
  • Insist on security assurances: Don't shy away from asking how OpenAI protects your data. Confirm they follow industry security standards (encryption, SOC 2 audits, etc.). Include in the contract that they will maintain these standards, notify you of any incidents, and assist with any security investigations.
  • Prepare internal guidelines: As part of contract planning, develop internal rules for employees on using OpenAI. For example, prohibit entering customer personal data or secret source code into prompts. By controlling what goes into the model, you reduce the chance of a privacy breach.
  • Evaluate the need to share: Before sending any dataset to OpenAI (especially for fine-tuning), evaluate whether it's truly necessary. Share the minimum data required for the AI task. The less data exposed, the lower the privacy risk.

Checklist: 5 Actions to Take

  1. Audit Your Data: Inventory the types of data you plan to send to OpenAI. Classify what is sensitive (personal data, confidential business data) and decide if those should be used at all. Remove or anonymize any high-risk data before it ever reaches the AI.
  2. Get the Paperwork in Place: Request and sign OpenAI's Data Processing Addendum to cover GDPR/CCPA requirements. If necessary for your industry, also sign any additional agreements (e.g., a Business Associate Agreement, or BAA, for health data). Ensure these documents are attached to your main contract.
  3. Negotiate Key Terms: Review OpenAI's standard contract and identify any gaps or overly one-sided terms that require negotiation. Propose amendments focusing on: data use limitations, confidentiality, retention/deletion rights, indemnities, and liability carve-outs. Use real examples (like past data leaks or lawsuits) to justify why these terms matter.
  4. Establish Usage Policies: Internally, create a clear policy for your teams on how to use OpenAI's tools. Define what information is off-limits to input. Train employees about the privacy and security risks. This step is critical; even the best contract won't prevent an accidental sensitive data leak if an untrained user pastes in the wrong content.
  5. Monitor and Adapt: Once the OpenAI service is in use, continuously monitor compliance (a monitoring sketch follows this list). Ensure OpenAI is honoring deletion requests and security promises (you might schedule periodic check-ins or requests for certifications). Also, monitor the AI's outputs for any signs of it exposing data it shouldn't. If regulations or OpenAI's policies change, be ready to update the agreement or your usage practices accordingly.
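
For step 5, output monitoring can be as simple as a scheduled scan over your response logs that flags suspect content for human review. The sketch below assumes a JSONL log with a "text" field per line; the file name and sensitive markers are hypothetical.

```python
# Hypothetical output-monitoring scan. Assumes responses are logged to a JSONL
# file with a "text" field; the path and sensitive markers are placeholders.
import json
import re

SENSITIVE_MARKERS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),           # SSN-like pattern
    re.compile(r"(?i)\bconfidential\b"),            # internal classification label
    re.compile(r"\b[\w.+-]+@internal\.example\b"),  # internal email domain
]

with open("ai_output_log.jsonl", encoding="utf-8") as log:
    for line_no, line in enumerate(log, start=1):
        if not line.strip():
            continue
        text = json.loads(line).get("text", "")
        hits = [p.pattern for p in SENSITIVE_MARKERS if p.search(text)]
        if hits:
            print(f"Review line {line_no}: matched {hits}")
```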

FAQ

Q1: Can we use OpenAI's services without violating GDPR or CCPA?
A: Yes, but you need to take compliance steps. Have OpenAI sign a Data Processing Addendum so they contractually commit to GDPR/CCPA principles (like only processing data on your instructions). Utilize features like data retention controls to comply with deletion requirements. Also, avoid inputting personal data unless necessary. If you do, ensure that you have obtained consent or another legal basis and that OpenAI's processing is transparent to your users, if required. With the proper contract and configurations, OpenAI can be used in line with GDPR, CCPA, and similar laws.

Q2: Will OpenAI use our data to train its models or improve its services?
A: For enterprise customers and API users, OpenAI does not use your data to train its general models by default. Your prompts and outputs stay isolated. However, you should still confirm this in your contract. Make sure the agreement explicitly states that OpenAI won't use your provided data or any AI output for their own research or model improvement without your permission. If there is an opt-in for data usage (sometimes offered to help improve the AI), it should be strictly voluntary. In short, your business data remains yours and won't secretly feed the AI engine of OpenAI or any other party.

Q3: What happens if OpenAI has a data breach involving our information?
A: Under the contract, OpenAI should be obligated to inform you immediately if a breach occurs. They will likely investigate and take remedial actions, but you will be responsible for managing the impact on your side (like notifying affected individuals or regulators, if required). That's why it's important to negotiate liability: if the breach were OpenAI's fault (say, a security lapse on their end), you'd want them to cover costs such as regulatory fines or customer notifications. In practice, ensure the contract doesn't limit OpenAI's liability too strictly in the event of a breach. Additionally, consider requiring OpenAI to maintain cyber insurance. While you can't eliminate all risk, a solid contract and response plan will make handling a breach scenario much smoother.

Q4: How do we handle highly sensitive or regulated data (like health or financial information) with OpenAI?
A: Cautiously and with extra safeguards. First, check OpenAI's policies: certain data, such as Protected Health Information (PHI), may require a special agreement (a HIPAA Business Associate Agreement) or may be disallowed on the standard platform. If you must use regulated data, sign the needed addendum and ensure compliance measures are in place (encryption, access controls, audit logs). You might also choose an option like an on-premise deployment or a specialized cloud region via OpenAI's partners (for instance, Microsoft Azure's OpenAI service) to keep data within desired jurisdictions. Importantly, minimize what you share: even with a BAA, don't feed the AI more patient or customer data than you need to. Every piece of sensitive data that remains outside the system is one less piece that could potentially be exposed.

Q5: Do we own the AI's outputs and our data, and how do we protect that?
A: Yes: under OpenAI's terms, you retain ownership of both the content you input and the content the AI generates for you. OpenAI isn't claiming your data or your outputs as its property. To protect this, ensure the contract explicitly states that all inputs and outputs are your confidential information and that you have full rights to them. This allows you to use the AI's results in your business freely (modify them, publish them, etc.) without fearing a copyright claim from OpenAI or others. It also means OpenAI must treat those outputs with the same care as any of your sensitive data. In summary, you own what the AI produces for you; just ensure the agreement puts that in writing and keeps those outputs just as private as the data you put in.

Read about our GenAI Negotiation Service.

The 5 Hidden Challenges in OpenAI Contracts and How to Beat Them

Read about our OpenAI Contract Negotiation Case Studies.

Would you like to discuss our OpenAI Negotiation Service with us?

Author
  • Fredrik Filipsson

    Fredrik Filipsson is the co-founder of Redress Compliance, a leading independent advisory firm specializing in Oracle, Microsoft, SAP, IBM, and Salesforce licensing. With over 20 years of experience in software licensing and contract negotiations, Fredrik has helped hundreds of organizations, including numerous Fortune 500 companies, optimize costs, avoid compliance risks, and secure favorable terms with major software vendors. Fredrik built his expertise over two decades working directly for IBM, SAP, and Oracle, where he gained in-depth knowledge of their licensing programs and sales practices. For the past 11 years, he has worked as a consultant, advising global enterprises on complex licensing challenges and large-scale contract negotiations.
