Your IDE is the new HIPAA risk.
Most healthtech compliance reviews look at the database, the API, the cloud BAAs. The IDE rarely gets opened. That's where I keep finding leaks.
I run a software agency that builds in regulated industries. After 10 years of this work, I keep finding myself in the same conversation. A founder shows me their healthtech app, already in production with patient data flowing through it. Their team uses Cursor or Copilot. They've never thought about what those tools do with the codebase they see.
When an engineer asks Cursor to refactor a function, the surrounding context goes through Cursor's AWS infrastructure, then to OpenAI, Anthropic, Google, or Fireworks depending on the model. If indexing is on, embeddings persist in Turbopuffer. If web search runs, Exa sees the query. None of those vendors have signed a Business Associate Agreement with you. Specode wrote up the full data flow if you want to see the diagram.
People assume PHI means patient names and diagnoses. The Security Rule covers more than that. It protects the systems that handle PHI, which means a database schema with patient identifiers, an error log carrying a full request body, a URL with a record ID in the query string, even a .env with a connection string to a PHI database. All of it is regulated. All of it flows out of the machine the moment your engineer hits tab.
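To make that concrete, here is a minimal sketch of a check you could run over a file or snippet before it becomes model context. The pattern names, the parameter names (`patient_id`, `record_id`, `mrn`), and the function `find_phi_adjacent` are all hypothetical, chosen for illustration. A real ruleset would need tuning to your own schema and URL conventions, and regexes alone will miss plenty.

```python
import re

# Hypothetical patterns for illustration only; tune these to your own stack.
PATTERNS = {
    # Connection strings such as postgres://user:pass@host/db
    "db_connection_string": re.compile(
        r"\b(?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?)://[^\s'\"]+", re.IGNORECASE
    ),
    # Record identifiers carried in a URL query string (parameter names are assumed)
    "record_id_in_url": re.compile(
        r"[?&](?:patient_id|record_id|mrn)=[^&\s'\"]+", re.IGNORECASE
    ),
    # Strings shaped like a US Social Security number
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_phi_adjacent(text):
    """Return (label, matched_text) pairs for every pattern hit in text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits


sample = (
    "DATABASE_URL=postgres://app:secret@db.internal:5432/patients\n"
    "GET /api/chart?patient_id=12345&view=full\n"
)
for label, value in find_phi_adjacent(sample):
    print(label, "->", value)
```

A check like this only flags the obvious cases. It is a tripwire for a pre-commit hook or an ignore-file review, and it does not replace keeping those files out of the tool's context in the first place.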
The current state of the major tools is more confusing than the marketing suggests.
Cursor doesn't sign BAAs. Privacy Mode prevents your code from being used to train models, which is useful, but every keystroke still routes through Cursor's servers and those of outside vendors. HIPAA applies to that transmission regardless of training settings. SOC 2 Type II covers general security and provides no useful information about HIPAA compliance.
GitHub Copilot Enterprise can be covered, but only by negotiating a BAA directly with Microsoft. Most teams running Copilot today are on Business or Personal plans, where it isn't available.
Claude Code can be covered, but the setup is specific: a Claude Enterprise plan, HIPAA toggled in admin settings, a signed BAA, and Zero Data Retention enabled at the org level. Self-serve doesn't get you there.
Replit, Bolt, and Lovable don't offer BAAs at all. Codeium and Tabnine sidestep the question by offering self-hosted enterprise tiers that keep code entirely within your infrastructure. If a vendor's BAA isn't available, taking their data flow out of the equation is the defensible alternative.
The newer surface, and probably the more dangerous one, is MCP.
Model Context Protocol lets an AI agent talk directly to an EHR, a billing system, a patient CRM. Elegant in design, almost completely unaudited in production. In 2025 alone, CVE-2025-54135 ("CurXecute") let a malicious Slack message, when summarized by Cursor's agent, rewrite an MCP config and execute arbitrary commands. Asana's MCP server had a bug that exposed data across customer instances. Neither is theoretical.
MCP is harder to govern than the IDE because of where its traffic flows. It runs within the client process and never touches your network the way an API call does, so your DLP and SIEM tools have no visibility into it. The audit obligations still apply.
A few approaches I've seen work in practice:
- Replacing real PHI with synthetic data early in development (Synthea generates fake patient records that pass for real ones in testing, which closes off most accidental exposure).
- Moving the codebase onto a stack with an unbroken BAA chain, which in practice has meant Claude Code on Enterprise with Zero Data Retention, or Copilot Enterprise with the BAA negotiated, or self-hosted Codeium for the few teams that wanted maximum isolation.
- Treating MCP like any other production dependency, with allowlisted servers, auth on every connection, and audit logs that survive the six years HIPAA requires.
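The MCP item is the easiest of the three to turn into an automated check. Below is a rough sketch, assuming the client stores its servers under an `mcpServers` key with a `url` for remote servers and optional `headers`, which is the shape several MCP clients use. Verify it against your own tool's config before relying on it. The server names, the allowlist, and the function `audit_mcp_config` are hypothetical.

```python
import json

# Hypothetical allowlist; in practice this would live in version control.
ALLOWED_SERVERS = {"internal-ehr-readonly", "billing-sandbox"}
# Six years of retention, rounded up to cover leap days.
MIN_AUDIT_RETENTION_DAYS = 6 * 366


def audit_mcp_config(config, audit_retention_days):
    """Return a list of human-readable problems found in an MCP client config."""
    problems = []
    for name, server in config.get("mcpServers", {}).items():
        if name not in ALLOWED_SERVERS:
            problems.append(f"{name}: not on the allowlist")
        # Assumed convention: remote servers carry a "url" and authenticate via headers.
        if "url" in server and not server.get("headers", {}).get("Authorization"):
            problems.append(f"{name}: remote server without an Authorization header")
    if audit_retention_days < MIN_AUDIT_RETENTION_DAYS:
        problems.append(
            f"audit log retention of {audit_retention_days} days is under "
            f"{MIN_AUDIT_RETENTION_DAYS}"
        )
    return problems


config = json.loads("""
{
  "mcpServers": {
    "internal-ehr-readonly": {
      "url": "https://mcp.example.internal/ehr",
      "headers": {"Authorization": "Bearer PLACEHOLDER"}
    },
    "random-community-server": {
      "url": "https://mcp.example.com/tools"
    }
  }
}
""")

for problem in audit_mcp_config(config, audit_retention_days=365):
    print(problem)
```

Run in CI against every checked-in MCP config, a check like this catches unreviewed servers before they reach an engineer's machine. It does not inspect what an allowlisted server actually does with the data it sees.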
A BAA covers the legal layer, no more. The technical configuration is a separate problem, and that's where I see most teams stuck. Every AI coding tool I've worked with optimizes for speed by default. That serves most users well, but healthcare priorities run in the opposite direction, and the defaults haven't been adjusted to match.
HIPAA penalties run between $141 and $2.1M per violation, and a full PHI flow audit takes a Tuesday afternoon. The teams that have done one are not the ones I worry about.
Where has your team landed on this? Still on personal-tier IDEs, or already migrated to a BAA-covered stack?