Privacy Is Not a Feature You Toggle
In May 2025, a federal magistrate judge in New York ordered OpenAI to preserve and segregate every ChatGPT output log that would otherwise be deleted, including conversations users had explicitly asked to delete. The order, issued in the copyright case brought by The New York Times, overrode OpenAI's own published promise to permanently remove deleted chats within 30 days. For roughly four months, "delete" did not mean delete. It meant retain, in case it's needed later.
Most people never heard about it. The ones who did mostly shrugged. But sit with it for a second, because it's the cleanest illustration of a truth the AI era keeps trying to paper over:
The delete button was always a promise, not a guarantee.
A promise can be revoked: by a court order, a policy change, a breach, a subpoena, or a quiet line in an updated terms of service. The only data that genuinely cannot be produced later is data that was never kept in the first place. Everything else is a question of who is asking and how hard.
This essay is about the gap between the privacy we feel we have when we talk to AI and the privacy we actually have, and about what it really takes to close that gap. Not the marketing version. The expensive, unglamorous, architectural version.
We treat these tools like they forget. They mostly don't.
Talking to a chatbot feels ephemeral, like thinking out loud or muttering to yourself. That feeling is doing a lot of dangerous work.
The behavioral data is stark. In a January 2025 TELUS Digital survey of 1,000 employees at large U.S. companies, 57% admitted entering confidential information into public AI tools like ChatGPT, Gemini, and Copilot, and 68% were doing it through personal accounts, outside any corporate oversight (TELUS Digital, reported by Tech Monitor). A separate enterprise security analysis found that 77% of employees paste data into generative-AI tools, with 82% of that activity flowing through unmanaged personal accounts, making generative AI the single largest channel for corporate data leaving the building (LayerX). The National Cybersecurity Alliance put the share of workers who admit sharing sensitive information with AI at 43%.
We already have the canonical cautionary tale. In 2023, Samsung engineers pasted proprietary semiconductor source code and internal meeting notes into ChatGPT to debug and summarize them. The company's reaction was not a memo. It was a ban. The reason they gave was telling: once the data was transmitted, it was "impossible to retrieve." Amazon, JPMorgan, and a wave of banks reached the same conclusion and clamped down too.
The instinct that AI is a private space is understandable. It's also, by default, wrong.
Retention is the default. Privacy is the exception.
Here's the part that surprises people: the major AI providers are not being sneaky. They tell you, in their own documentation, that they keep your data. You just have to read it.
- OpenAI's API retains inputs and outputs for up to 30 days for abuse monitoring before deletion (OpenAI docs). On the consumer side, deleted chats persist for ~30 days, and free-tier conversations can be used to improve models.
- Anthropic reduced its API log retention from 30 days to 7 in September 2025, a genuine improvement, and also a reminder that the number is a dial the vendor controls, not a law of physics (data overview).
Both companies offer a stronger option, Zero Data Retention (ZDR), where prompts and responses are processed and then immediately discarded, never logged. But ZDR is not the default. It is gated behind enterprise approval and sales conversations, available to qualifying business accounts that ask for it and accept additional terms. The privacy-maximizing setting exists; it's just not the one most people are actually using.
This is the quiet shape of the whole industry: privacy is opt-in, effortful, and mostly unclaimed. The default is retention, because retention is useful: for debugging, for safety, for product improvement, and, increasingly, for litigation hold.
The four levels of "private"
"Private AI" is marketed as a single checkbox. It's really a ladder, and most products sit far lower on it than their copy implies.
Level 0, "Trust me." The privacy is a policy statement. Your data is retained; the company promises to behave. The delete button lives here. So does every tool whose privacy story is a paragraph rather than an architecture. The NYT order is what Level 0 looks like when the promise meets a courtroom.
Level 1, "We won't train on it." A real improvement, and the toggle most enterprises rely on. But "not used for training" is not "not stored." The data still sits in logs for some window, reachable by the company, a breach, or a subpoena.
Level 2, Zero Data Retention. Now the architecture, not the policy, does the work. If nothing is written down, there is nothing to leak, subpoena, or be ordered to preserve. ZDR is the one configuration under which that 2025 court order would have been moot. You cannot retain what does not exist. This is the floor for anyone serious about privacy, and it's still the exception rather than the rule.
Level 3, Regulated-data grade. This is the level almost nobody talks about honestly, because it's the one that costs real money. When the data is health information, legal matters, or anything covered by HIPAA, GDPR, or state privacy law, "we don't retain it" is necessary but nowhere near sufficient. You need a chain of legal and technical controls, and that's where most "HIPAA-friendly AI" marketing quietly falls apart.
What Level 3 actually costs
Say you wanted to build AI that could responsibly handle Protected Health Information. Here is the real bill of materials, the one that doesn't fit in a landing-page bullet.
A signed legal chain, end to end. Under HIPAA, every service that touches the data is a "Business Associate" and must sign a Business Associate Agreement (BAA). Not just your app, but every link: your database, your file storage, your hosting, your background-processing host, and your model provider. Miss one link and the chain is broken. And these aren't free: a HIPAA-eligible managed database tier runs into four figures a month; a HIPAA-enabled hosting workspace adds its own premium; PHI passing through your web functions can require yet another paid BAA add-on. The platform floor alone is comfortably four figures per month before a single user shows up.
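The "miss one link" rule is easy to state as a check. The sketch below is illustrative only: the Vendor record, its fields, and the example stack are hypothetical names invented here, not a real compliance tool or anyone's actual vendor list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vendor:
    """One service that touches PHI (hypothetical record for illustration)."""
    name: str
    role: str
    baa_signed: bool


def unsigned_links(vendors):
    """Return the names of every vendor in the data path without a signed BAA."""
    return [v.name for v in vendors if not v.baa_signed]


def chain_is_intact(vendors):
    """The chain holds only if no vendor in the path is missing a BAA."""
    return not unsigned_links(vendors)


# Hypothetical stack: every entry is made up for the example.
stack = [
    Vendor("managed-db", "database", True),
    Vendor("object-store", "file storage", True),
    Vendor("app-host", "hosting", True),
    Vendor("job-runner", "background processing", True),
    Vendor("model-gateway", "model routing", False),
]

print(unsigned_links(stack))
print(chain_is_intact(stack))
```

Four signatures out of five still fail the check, which is the whole point: the result is a property of the chain, not of any single vendor.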
An inference path that can actually sign. This is the trap. Many products reach frontier models through an aggregator or gateway, a single API that routes to OpenAI, Anthropic, Google, and others. Convenient, and often privacy-respecting via ZDR. But these routing layers generally don't sign BAAs, and the responsible ones say so plainly: ZDR is a technical policy, not a legal substitute for a BAA. So the moment you need to handle regulated PHI as a Business Associate, your entire inference layer has to change: you go provider-direct, under provider-specific BAAs (Azure OpenAI, AWS Bedrock, Google Vertex), and you give up the very abstraction that made things simple. Most "we're HIPAA-ready" claims never survive this sentence.
De-identification before the cloud. The most elegant version of Level 3 is to make sure identifiable data never leaves your boundary at all: detect names, record numbers, and identifiers; replace them with tokens; send only the tokenized text to the model; and re-inject the real values locally after the response comes back. Done well, the model literally never sees a patient. Done carelessly, you've added a false sense of safety on top of the same exposure.
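Here is a minimal sketch of that detect, tokenize, send, re-inject loop, under loud assumptions: the two regular expressions, the token format, and the function names are all hypothetical, and real PHI detection (names especially) needs far more than pattern matching. The model call itself is left out; only the text returned by tokenize would cross the boundary.

```python
import re

# Hypothetical patterns for illustration only. Two regexes are nowhere near
# sufficient coverage for real PHI; names, dates, and addresses are not handled.
PATTERNS = {
    "MRN": re.compile(r"\bMRN[- ]?\d{6,}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def tokenize(text):
    """Replace detected identifiers with placeholder tokens.

    Returns the tokenized text plus a token-to-value mapping that must
    stay on the local side of the boundary.
    """
    mapping = {}   # token -> original value
    reverse = {}   # original value -> token, so repeats reuse one token

    for label, pattern in PATTERNS.items():
        def substitute(match, label=label):
            value = match.group(0)
            if value not in reverse:
                token = f"[{label}_{len(mapping) + 1}]"
                mapping[token] = value
                reverse[value] = token
            return reverse[value]

        text = pattern.sub(substitute, text)
    return text, mapping


def reinject(text, mapping):
    """Restore the original values in a response, locally."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


note = "Patient MRN-1234567 called from 555-867-5309 about MRN-1234567."
safe_text, local_mapping = tokenize(note)
print(safe_text)
print(reinject(safe_text, local_mapping))
```

The "done carelessly" failure mode is visible even here: any identifier the patterns miss passes through untouched, so the mapping gives no assurance about what was not detected.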
Audit logging and key management. Who accessed what, when, and why, written to an append-only ledger you keep for years. Field-level encryption with managed keys, so that even at rest the sensitive columns are unreadable without the key. None of this is visible to a user. All of it is the difference between being compliant on paper and being confident in practice.
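One common way to make a ledger tamper-evident is to have each entry commit to the hash of the one before it. The sketch below shows that idea only; it is an in-memory illustration, not a description of any particular product's logging, and the AuditLog class and its field names are assumptions made for the example. Durable storage, retention, and the field-level encryption side are out of scope here.

```python
import hashlib
import json
import time


class AuditLog:
    """In-memory, append-only log where each entry commits to the previous one."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(record):
        # Hash every field except the stored hash itself, in a stable key order.
        body = {k: v for k, v in record.items() if k != "hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def append(self, actor, action, resource, reason, timestamp=None):
        """Record who did what to which resource, and why."""
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        record = {
            "actor": actor,
            "action": action,
            "resource": resource,
            "reason": reason,
            "timestamp": time.time() if timestamp is None else timestamp,
            "prev_hash": prev_hash,
        }
        record["hash"] = self._digest(record)
        self._entries.append(record)
        return record["hash"]

    def verify(self):
        """Return True only if no entry was altered, removed, or reordered."""
        prev_hash = self.GENESIS
        for record in self._entries:
            if record["prev_hash"] != prev_hash:
                return False
            if record["hash"] != self._digest(record):
                return False
            prev_hash = record["hash"]
        return True


log = AuditLog()
log.append("dr.lee", "read", "chart/42", "scheduled visit")
log.append("billing-job", "export", "chart/42", "monthly claim")
print(log.verify())
```

Editing any earlier entry changes its digest and breaks every link after it, so tampering shows up on verification rather than silently succeeding. The chain detects changes; it does not prevent them, which is why the ledger still needs write-restricted storage underneath.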
The point of laying this out isn't to scare anyone off. It's to make the economics legible: real privacy at the regulated tier is not a feature you ship in a sprint. It's an infrastructure cost you take on, deliberately, and pay for every month. When a vendor implies otherwise, the honest read is usually that they've stopped at Level 1 and hoped you wouldn't ask.
Why this is worth caring about, beyond compliance
It would be easy to read all of this as a procurement problem, a checklist for legal and security teams. But there's something more human underneath.
Privacy is the precondition for honest thought. People behave differently when they believe they're being watched; they round off the edges, avoid the embarrassing question, self-censor the half-formed idea. Decades of research on the "chilling effect" describe exactly this. And we are now, collectively, beginning to route our most consequential thinking (our health worries, our legal exposure, our financial fears, our creative drafts, our late-night what-ifs) through systems that, by default, remember.
If those systems remember, we will, slowly and unconsciously, ask them smaller questions. We'll bring them the safe version of the problem instead of the real one. The cost isn't just a data breach somewhere down the line; it's a quiet narrowing of what we're willing to think about with the most capable tools we've ever built.
Privacy, in that light, isn't about having something to hide. It's about preserving a space to think out loud without the thought becoming a permanent record. That's worth defending on purpose.
Where we are, and where we're going
A brief, honest note about our own position, since we build in this space.
Today, our product runs on zero-data-retention inference by default, which puts it at Level 2. We route only to endpoints with an enforced ZDR policy, so prompts and responses are processed and forgotten rather than logged. And we are deliberate about what we don't claim: we are not a HIPAA Business Associate, we don't sign BAAs, and we ask people not to put PHI into the product. We'd rather say a clear "not yet" than imply a Level 3 we haven't built.
That "not yet" is a roadmap, not a shrug. The reason we've mapped the full Level 3 architecture above in such detail is that it's the work in front of us for ARMES Enterprise: taking on the BAA chain, the provider-direct inference path, the de-identification pipeline, field-level encryption, and audit logging required to serve regulated work responsibly. It's expensive and slow on purpose. We think that's the only honest way to do it.
The questions to ask anything you type into
You don't need to read a single architecture diagram to protect yourself. You need three questions, and the willingness to be unsatisfied with vague answers:
- Is it retained? Not "do you train on it." Is it stored at all, and for how long?
- Who can compel it? A court, a parent company, a future acquirer, a breach. If it exists, someone can ask for it.
- What actually happens when I hit delete? Immediately, or after a window? From backups too? Everywhere, or just from my view?
Privacy in the age of AI isn't a setting you flip once and forget. It's an architecture someone chose to build, and a cost someone chose to carry. Demand the kind you can verify, not the kind you're merely promised.
Written by
ARMES Team
From the team building ARMES — private AI that puts every frontier model in one place.