Palo Alto’s Unit 42 Just Found a Way to Hijack AI Agent Conversations - And Your Users Can’t See It Happening

In November 2025, Palo Alto Networks’ Unit 42 research team published a new attack technique they called “Agent Session Smuggling.” The attack exploits a fundamental property of multi-agent systems: agents remember their recent conversations.

The security conversation around AI agent protocols has focused heavily on authentication. Can the sending agent prove its identity? Is it authorized to make this request? Standards like Google’s Agent2Agent (A2A) protocol and Anthropic’s Model Context Protocol (MCP) have invested significant effort in identity verification. The assumption is straightforward: if you know who’s calling and they’re authorized to call, the interaction is safe.

Unit 42’s Agent Session Smuggling research demolished that assumption. The attack doesn’t exploit a vulnerability in A2A’s authentication. It doesn’t bypass credential checks. It weaponizes something more fundamental: the fact that LLM-based agents maintain conversational context across interactions, and that context shapes their behavior in ways that authentication alone cannot control.

How Agent Session Smuggling works

Traditional protocol attacks target authentication or authorization. You forge a credential, escalate a privilege, or intercept a session token. Agent Session Smuggling is different because it targets the agent’s conversational memory - the accumulated context that influences how an LLM-based agent interprets and responds to instructions.

In A2A-based multi-agent systems, agents interact through multi-turn conversational exchanges. Unlike MCP, which follows a stateless client-server model where each request is independent, A2A is inherently stateful. Agents build up context over the course of a conversation. An instruction that would be rejected as suspicious in isolation may be accepted as perfectly reasonable given the right conversational history.
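The difference between the two interaction models can be sketched in a few lines. This is an illustrative toy, not code from either protocol; the class names and methods are invented for the comparison:

```python
class StatelessServer:
    """MCP-style: every request is judged in isolation."""

    def handle(self, request: str) -> str:
        # No memory of prior requests: a sensitive instruction looks
        # just as suspicious on turn 10 as it would on turn 1.
        return f"handled: {request}"


class StatefulAgent:
    """A2A-style: context accumulates across turns and shapes behavior."""

    def __init__(self):
        self.history: list[str] = []

    def handle(self, request: str) -> str:
        self.history.append(request)
        # The agent's interpretation of this request is conditioned on
        # everything said before it -- the property session smuggling exploits.
        return f"handled: {request} (context: {len(self.history)} turns)"
```

In the stateful case, each response depends on the accumulated history; an attacker who controls earlier turns controls the lens through which later turns are interpreted.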

Unit 42’s researchers demonstrated this by building proof-of-concept attacks using Google’s Agent Development Kit (ADK). In one scenario, they set up a multi-agent system with a research assistant agent and a financial assistant agent, both communicating via A2A. The research assistant was compromised - either through a supply chain attack on its MCP tool connections or through a direct compromise of its hosting environment.

The compromised research assistant didn’t attack the financial assistant directly. Instead, it engaged the financial assistant in what appeared to be a legitimate multi-turn conversation, gradually building context that made subsequent requests seem reasonable. Over several exchanges, the research assistant extracted sensitive configuration data, tool schemas, and session history from the financial assistant. Then, leveraging the established conversational context, it escalated to unauthorized stock trades.

The critical insight: the smuggled exchanges were invisible to the end user. In production agent-to-agent UIs, intermediate agent conversations typically aren’t displayed. The user sees the final result - the completed task, the generated report, the executed transaction - without visibility into the agent-to-agent conversations that produced it. The smuggled instructions exist in a layer of the system that users never observe.

Why this is different from prompt injection

The security community’s first reaction to any AI attack vector is to categorize it as a variant of prompt injection. Agent Session Smuggling shares some surface-level similarities - an attacker is manipulating what an AI agent perceives as its instructions. But the mechanism and the implications are fundamentally different.

Prompt injection typically involves a single malicious input - a carefully crafted string embedded in a document, email, or user message - that causes an AI to deviate from its intended behavior in a single interaction. The defense model for prompt injection focuses on input sanitization, output validation, and instruction-data separation within a single request-response cycle.

Agent Session Smuggling is a multi-turn conversational attack. The malicious intent is distributed across multiple exchanges, each of which may appear individually benign. The attack builds context gradually, establishing trust and precedent through legitimate-seeming interactions before leveraging that context for unauthorized actions. There is no single malicious input to filter. The attack surface is the conversation itself.

Mark Fernandes, a cybersecurity researcher at Hodeitek who analyzed the Unit 42 findings, described the distinction: the attack “doesn’t exploit a protocol vulnerability” in A2A but instead “weaponizes the implicit trust relationships between agents that arise from their conversational interactions.” The agents are behaving exactly as designed. The design itself is the vulnerability.

This distinction matters for defense planning. Prompt injection defenses - input filters, guardrails, output validators - offer limited protection against an attack that operates across multiple legitimate-looking exchanges. Defending against session smuggling requires monitoring the conversational trajectory, not just individual messages.

The A2A versus MCP security model

Unit 42’s research implicitly highlighted a security divergence between the two dominant agent communication protocols that enterprises need to understand.

MCP follows a stateless client-server model. Each request from a client to an MCP server is independent. The server doesn’t maintain conversational context between requests. This means session smuggling - as described by Unit 42 - is structurally harder against pure MCP interactions. An attacker can still exploit individual requests (as demonstrated by MCP’s own vulnerability history), but they can’t build up manipulative context across exchanges because MCP doesn’t maintain that context.

A2A, by design, is stateful and conversational. Agents engage in multi-turn task negotiations where context accumulates across interactions. This is what makes A2A powerful for complex collaborative tasks - agents can refer back to earlier exchanges, build on previous agreements, and adapt their behavior based on conversational history. It’s also what makes A2A vulnerable to session smuggling: the conversational memory that enables sophisticated collaboration is the same mechanism that enables multi-turn manipulation.

Cybersecurity News covered the Unit 42 disclosure and noted that the stateful nature of A2A creates a fundamentally different threat model than stateless protocols: “The implicit trust built through multi-turn interactions creates opportunities for sophisticated manipulation that single-request architectures simply don’t expose.”

This doesn’t mean MCP is more secure overall - its stateless model has its own vulnerabilities, particularly around tool poisoning and SSRF attacks. But it means the security architectures for MCP-based and A2A-based systems need to be different. An enterprise running both protocols - which is increasingly common as organizations adopt multiple agent frameworks - needs to understand the distinct threat models and apply protocol-appropriate defenses.

The invisible layer problem

Perhaps the most operationally significant aspect of Agent Session Smuggling is the visibility gap it exploits. In most production multi-agent deployments, the user interacts with a front-end agent and receives results. The agent-to-agent communications that produce those results happen in a layer of the system that is, for practical purposes, invisible.

This isn’t a bug in any particular system. It’s a user experience design choice. Showing users every intermediate agent conversation would create an overwhelming, unusable interface. But from a security perspective, this invisible layer is where session smuggling lives. The attack operates in the space between what the user requests and what the system delivers, in conversations the user never sees and cannot audit.

eSecurity Planet’s analysis of the Unit 42 findings emphasized this point: intermediate conversations between agents are “not typically displayed in production agent-to-agent UIs,” creating a layer where “manipulated exchanges can occur without any visible indication to the end user or operator.”

For enterprise security teams, this means that monitoring agent-to-agent communications is not optional - it’s a core security requirement. If you can’t see what your agents are saying to each other, you can’t detect session smuggling. And if you can’t detect it, you can’t prevent a compromised agent from manipulating trusted agents into executing unauthorized actions.

Building defenses for a conversational attack surface

I work with multi-agent architectures daily. The design challenge that Agent Session Smuggling surfaces is that we’ve been applying network security mental models to conversational systems. In network security, you inspect packets at boundaries. In agent security, you need to inspect conversations over time.

Here’s what that looks like in practice:

Implement session integrity verification at every handoff. When Agent A passes a task to Agent B, Agent B shouldn’t just verify Agent A’s identity. It should verify that the conversational context is consistent with the task’s declared scope. If a research assistant suddenly introduces financial transaction requests into a conversation that started as a literature review, that context shift should trigger an alert.
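A minimal sketch of that handoff check, assuming each task carries a declared scope and each incoming message can be tagged with a topic (a production system would use a classifier or an LLM judge for the tagging; the task names and topic sets here are invented):

```python
# Hypothetical mapping from a task's declared scope to the request
# topics that scope permits.
ALLOWED_TOPICS = {
    "literature_review": {"search", "summarize", "cite"},
    "portfolio_report": {"quote", "summarize", "report"},
}


def verify_scope(declared_task: str, message_topic: str) -> bool:
    """Return True if the message topic fits the task's declared scope."""
    return message_topic in ALLOWED_TOPICS.get(declared_task, set())


def handle_handoff(declared_task: str, message_topic: str) -> str:
    if not verify_scope(declared_task, message_topic):
        # A trade request inside a literature review is exactly the
        # kind of context shift Unit 42's scenario leveraged.
        return "ALERT: context shift outside declared scope"
    return "accepted"
```

The point is that the check keys on the task's declared scope, not on the sender's identity: a fully authenticated agent still trips the alert when its requests drift.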

Establish behavioral baselines for agent conversations. Just as UEBA (User and Entity Behavior Analytics) systems establish normal behavior patterns for human users, multi-agent systems need conversation behavior analytics. What topics does each agent typically discuss? What types of requests does it typically make? What data does it typically request access to? Deviations from these baselines indicate potential smuggling.
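A toy version of that baseline, using nothing more than request-type frequencies (real conversation behavior analytics would model topics, recipients, and data-access patterns; the threshold and class name here are assumptions):

```python
from collections import Counter


class ConversationBaseline:
    """Flag request types an agent has rarely or never made before,
    analogous to a UEBA baseline for a human user."""

    def __init__(self, min_seen: int = 3):
        self.counts = Counter()
        self.min_seen = min_seen

    def observe(self, request_type: str) -> None:
        """Record one request made by the agent during normal operation."""
        self.counts[request_type] += 1

    def is_anomalous(self, request_type: str) -> bool:
        """A request type seen fewer than min_seen times is a deviation."""
        return self.counts[request_type] < self.min_seen
```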

Log instruction-level detail, not just outcomes. Most agent monitoring systems log the inputs and outputs of each agent - what was requested and what was produced. Session smuggling lives in the intermediate instructions. You need to log every inter-agent message, including the conversational context that accompanied each request. Without instruction-level logging, forensic analysis of a session smuggling attack is impossible.
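A sketch of what instruction-level logging captures per message, assuming an append-only log; the field names are illustrative, but the principle is that every inter-agent message is recorded with the context it carried, not just the final input and output:

```python
import time


def log_message(log: list, sender: str, receiver: str,
                content: str, context_turns: int) -> None:
    """Record one inter-agent message with enough detail for forensics:
    who said what to whom, when, and how much prior context it carried."""
    log.append({
        "ts": time.time(),
        "sender": sender,
        "receiver": receiver,
        "content": content,
        "context_turns": context_turns,
    })
```

With records like these, an investigator can replay the conversational trajectory that preceded an unauthorized action; without them, the smuggled turns simply don't exist anywhere.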

Monitor for context escalation patterns. Session smuggling follows a predictable pattern: initial exchanges establish trust and precedent, followed by requests that leverage the established context to access data or execute actions beyond the original scope. The specific detection capability needed is pattern matching for gradual scope expansion across multi-turn agent conversations.
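A deliberately simple escalation detector along those lines, assuming each request can be assigned a sensitivity tier (the tier table and threshold are invented for illustration; a real detector would score trajectories, not just compare extremes):

```python
# Hypothetical sensitivity tiers for request types.
SENSITIVITY = {
    "chat": 0,
    "read_config": 1,
    "read_history": 2,
    "execute_trade": 3,
}


def detect_escalation(requests: list[str], max_rise: int = 1) -> bool:
    """Return True if the conversation's sensitivity climbed more than
    max_rise tiers above its opening turn -- gradual scope expansion."""
    tiers = [SENSITIVITY.get(r, 0) for r in requests]
    return bool(tiers) and max(tiers) - tiers[0] > max_rise
```

Run against Unit 42's scenario shape, a conversation that opens with benign chatter and ends in a trade trips the detector, while a conversation that stays near its opening scope does not.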

Design agent architectures with conversation boundaries. Rather than allowing unlimited conversational history between agents, implement mandatory context resets at defined intervals or task boundaries. When an agent completes one task and begins another, its conversational history with collaborating agents should be cleared or compartmentalized. This limits the window within which smuggling can accumulate context.
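One way to sketch that compartmentalization is to key conversational history by task and clear it on completion; the class and method names are assumptions, but the mechanism is the mandatory reset described above:

```python
class BoundedAgentSession:
    """Per-task context compartments: history is keyed by task id and
    destroyed when the task completes, so no context carries over."""

    def __init__(self):
        self.contexts: dict[str, list[str]] = {}

    def add_turn(self, task_id: str, message: str) -> None:
        """Append a conversational turn to the given task's compartment."""
        self.contexts.setdefault(task_id, []).append(message)

    def complete_task(self, task_id: str) -> None:
        # Mandatory reset: nothing said during this task can seed the
        # context of the next one.
        self.contexts.pop(task_id, None)
```

The trade-off is real: agents lose the ability to build on earlier tasks, so the reset interval becomes a tunable balance between collaboration depth and the window available for smuggling.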

The broader architectural implication

Agent Session Smuggling is the first demonstrated attack that specifically targets the conversational nature of multi-agent systems. It won’t be the last. As A2A adoption grows and multi-agent systems become more sophisticated, the conversational attack surface will expand.

The lesson from Unit 42’s research is that authentication and authorization - the traditional pillars of access control - are necessary but insufficient for securing multi-agent systems. Identity tells you who’s in the conversation. Authorization tells you what they’re allowed to do. Neither tells you whether the conversation itself has been manipulated to make unauthorized actions appear authorized.

Enterprise security teams deploying multi-agent systems need to add a third pillar to their security architecture: conversational integrity. The ability to verify that the context of an agent interaction hasn’t been manipulated is as important as verifying the identity of the agents involved.

The organizations that recognize this and build conversational monitoring into their agent architectures now will be in a dramatically better position than the ones that learn about session smuggling from their incident response reports.