Image for Your AI Assistant Has a Shadow Resume. And Hackers Are Reading It
Technology Jun 16, 2026 • 14 min read

Your AI Assistant Has a Shadow Resume. And Hackers Are Reading It

Your AI tool's memory isn't just convenient. It's a target. Learn how indirect prompt injection turns accumulated context into a security vulnerability.

Share:
Lee Foropoulos

Lee Foropoulos

14 min read

Continue where you left off?
Text size:

Contents

Your AI assistant knows more about you than you probably realize. Not in a vague, hand-wavy "big tech collects data" sense. In a specific, structured, retrievable sense. It knows the project you mentioned three weeks ago when you were drafting a proposal. It knows the name of your manager, the city you're relocating to, the health question you asked on a Tuesday afternoon when you didn't want to call your doctor. You told it these things because it was helpful, and it was helpful because it remembered.

That memory is also a target.

Security researchers have spent the past two years demonstrating, repeatedly and in production systems, that the same memory features that make AI assistants useful can be turned against the people using them. Not through some exotic theoretical exploit. Through documents. Through webpages. Through PDFs you ask your AI to summarize. The attack surface isn't the AI company's servers. It's the conversation you're having right now.

This is worth understanding before it becomes a problem you're dealing with personally.

The Memory Your AI Keeps That You Forgot About

Most people think of their AI assistant as a sophisticated search box. Ask a question, get an answer, close the tab. What they don't think about is what stays behind.

Context windows: the short-term picture

Every conversation you have with a large language model happens inside a context window: a fixed-size buffer of text that the model can "see" at once. Think of it as working memory. Within a single session, everything you've typed, everything the model has responded, and any documents or links you've shared all live inside that window simultaneously. The model doesn't just answer your last message. It answers in light of everything that came before it in the session.

That context window can hold tens of thousands of words in current systems. A single work session where you're drafting emails, asking follow-up questions, and sharing a document for review can accumulate an enormous amount of personally identifying information before you've noticed you've shared anything sensitive.

Abstract digital data streams suggesting surveillance or data collection
A single AI session can accumulate more personal detail than most people would share in a job interview.

Persistent memory: the long-term dossier

The context window is the short-term picture. Persistent memory is the long game.

Tools like ChatGPT Memory, Microsoft Copilot's user profile system, and Gemini's personalization features are designed to carry information across sessions. They remember your preferences, your professional context, your communication style, and the specifics you've mentioned over time. The explicit goal is continuity. The side effect is accumulation.

Most users have never once opened the memory management panel in their AI tool. They don't know what's stored, and they don't know what it implies about them.

This accumulated profile is what researchers have started calling a shadow resume: an implicit dossier assembled not from what you intentionally submitted, but from the texture of everything you've asked. Your job title emerges from how you describe your work. Your organization surfaces when you mention a meeting. Your anxieties, your projects, your relationships, your credentials: all of it can be inferred and stored without a single deliberate disclosure.

The shadow resume isn't a conspiracy. It's a feature working exactly as designed. The problem is that features designed for helpfulness don't automatically come with features designed for security. And the data sitting in that persistent memory store is precisely the kind of profile an attacker would pay to access.

What Is Indirect Prompt Injection. And Why It's Different From What You've Heard

Prompt injection has become a familiar term in security circles, but familiarity has bred a particular kind of misunderstanding. Most people who've heard the term picture a user typing something clever to make a chatbot misbehave. That's a real thing. It's also the less dangerous version of the problem.

Direct vs. indirect injection: a critical distinction

Direct prompt injection is when a user deliberately crafts input to manipulate an AI's behavior. Jailbreaks fall into this category. So do attempts to override system prompts by embedding conflicting instructions in a user message. The user is the attacker. The user is also the one taking the risk. The harm model is relatively contained.

Indirect prompt injection is structurally different. Here, the malicious instructions don't come from the user at all. They're embedded in external content that the AI reads on the user's behalf: a webpage being summarized, a PDF being analyzed, an email being processed, a search result being synthesized. The user asks the AI to do something ordinary and helpful. The AI, in the course of doing that helpful thing, encounters instructions hidden inside the content. It then follows those instructions.

The user is the victim. The user did nothing wrong. That's the shift in threat model that matters.

Abstract network diagram suggesting data pathways and injection points
The attack doesn't come through the user's input. It comes through the content the AI was trusted to read.

This is meaningfully different from SQL injection or cross-site scripting, and it's worth being precise about why. SQL injection exploits the failure to distinguish between data and executable code in a database query. XSS exploits the failure to sanitize content rendered in a browser. Both are well-understood, and both have established mitigation patterns. Indirect prompt injection exploits something harder to patch: the fact that large language models are designed to follow instructions, and they can't always distinguish between instructions from their operator and instructions embedded in the content they're processing.

The document as a weapon

The attack vector isn't a piece of malware. It's a document.

Threat Model Shift

In indirect prompt injection, the malicious actor doesn't need access to your device, your account, or your network. They need to get you to ask your AI to read something they've prepared. That's a much lower bar.

A crafted PDF, a poisoned webpage, a malicious email attachment: any content that an AI agent reads and acts on becomes a potential delivery mechanism. The instructions can be hidden in white text on a white background, embedded in document metadata, or written in a font size that renders invisibly to human readers while remaining fully legible to the model processing the text.

~100%
of indirect prompt injection attacks require zero malware. Only content the AI is asked to read

The AI isn't broken when this works. It's doing exactly what it was built to do: read content, understand instructions, and act helpfully. The exploit lives in that helpfulness.

Johann Rehberger and the Research That Should Have Alarmed Everyone

If you follow AI security research, you've encountered Johann Rehberger's work. If you don't, you should start. His systematic documentation of prompt injection vulnerabilities in production AI systems is the most thorough public record of what these attacks actually look like against real tools that real people use every day.

The 2024-2025 demonstration timeline

Rehberger's research didn't produce a single dramatic disclosure. It produced a sustained series of demonstrations against production systems, each one documenting a specific attack pathway with enough detail to be reproducible and enough clarity to be undeniable.

His work against ChatGPT demonstrated that indirect prompt injection could be used to manipulate the model's behavior during a session in ways the user would have no reason to notice. His demonstrations against Microsoft Copilot went further, showing that an attacker could craft content that, when processed by Copilot, would cause the assistant to exfiltrate information from the user's context.

"The model is not the attacker's tool. The model is the attacker's delivery mechanism. The user's trust in the model is what makes the attack work.". Johann Rehberger, paraphrased from documented research presentations

The memory poisoning technique was among the most consequential findings. Rehberger demonstrated that malicious instructions embedded in content could cause an AI with persistent memory to write false or attacker-controlled data into its memory store. Once that false memory is written, it persists. Future sessions are then contaminated by the poisoned record, and the user has no obvious signal that anything went wrong.

Memory poisoning doesn't just steal data from one session. It corrupts the foundation that every future session builds on.

Exfiltration via image rendering and markdown

The exfiltration technique Rehberger documented is worth understanding in specific terms, because it's elegant in a way that makes it hard to dismiss as theoretical.

Large language models, when responding to users, can render markdown. Markdown supports image tags. Image tags contain URLs. When a model renders an image tag in its response, the client fetches that URL. The URL can contain parameters.

Those parameters can contain data.

0
user interactions required to trigger silent data exfiltration via rendered markdown images

A malicious document instructs the model to retrieve stored context, format it as URL parameters, and embed it in an image tag pointing to an attacker-controlled server. The model renders the response. The client fetches the image. The attacker's server logs the request, including every parameter in the URL. The user sees a normal-looking summary. The data is already gone.

The broader research community has validated and extended Rehberger's findings through 2025 and into the present. Independent researchers have replicated the core techniques against multiple platforms, documented variations that bypass vendor mitigations, and identified new attack surfaces as AI agents have become more capable and more deeply integrated into productivity workflows. Vendor responses have been mixed: some mitigations have been deployed, some disclosures have been acknowledged, and some attack pathways remain viable. Responsible disclosure timelines in this space have been complicated by the fact that the underlying issue is architectural, not a simple bug with a simple patch.

How a Poisoned Document Reads Your AI's Memory

Abstract threat models are useful. Concrete scenarios are more useful. Here's what an actual attack looks like from the user's perspective, which is to say: it looks like nothing at all.

Step-by-step anatomy of an attack

You're working on a vendor evaluation. Someone sends you a link to a whitepaper, or you find a PDF through a search. You drop it into your AI assistant and ask for a summary. This is normal behavior. Millions of people do exactly this every day.

The document looks like a whitepaper. It reads like a whitepaper. Somewhere in it, invisible to you, there are instructions written for the AI, not for you. They might be in white text. They might be in a comment field. They might be in a font rendered at 0.1 point size. The model reads all of it.

Laptop screen showing a document with a chat interface open beside it
The user sees a document. The AI reads a document plus a set of instructions the user never knew were there.

The hidden instructions tell the AI to do several things. Retrieve stored context. Look for names, job titles, organization details, project names, anything in persistent memory. Format that information as URL parameters. Embed those parameters in an image tag pointing to an external server. Then provide the user with a helpful summary of the document.

The AI follows these instructions. It's designed to follow instructions. It doesn't flag the request as suspicious because nothing in its training or its guardrails clearly marks "retrieve context and embed in image URL" as an attack pattern rather than a legitimate task.

What the User Sees

A clean, accurate summary of the document they asked about. No error messages. No unusual behavior. No indication that anything else happened.

What data is actually at risk

The question of what's actually in that exfiltrated context depends on how long you've been using the tool and how much you've shared.

Persistent memory in current AI tools can contain your full name, your job title, your employer, the names of colleagues and managers you've mentioned, active projects and their status, credentials or account details you've referenced, personal health or financial information you've discussed, and the general shape of your professional and personal life as it's emerged across dozens of conversations.

That's not a hypothetical worst case. That's what the memory store looks like for an average engaged user of a modern AI assistant after a few months of regular use. The shadow resume is real, it's detailed, and a single poisoned document is enough to read it and send it somewhere you never intended.

Cross-Session Data Leakage: When Yesterday's Conversation Funds Today's Attack

The attack scenarios described so far involve a single session, a single document, a single moment of exfiltration. That's already serious. The cross-session dimension makes it significantly worse.

How persistent memory bridges attack sessions

Persistent memory doesn't reset between conversations. That's the entire point of it. What you told your AI assistant six weeks ago about your company's upcoming acquisition is still in there. The credentials you mentioned while troubleshooting an integration two months ago may still be retrievable. The personal details you shared during a stressful week in January are part of the profile.

A poisoned document you encounter today doesn't just access what you've shared today. It can access everything the memory store contains, which means everything you've ever shared across every session where memory was active. The longer you've used the tool, the richer the target. An attacker who gets lucky with a single poisoned document against a user who's been actively using an AI assistant for eighteen months has access to a remarkably complete profile.

18+
months of accumulated context available to a single successful injection attack against a long-term AI user
The value of the attack scales with the user's trust in the tool. The more someone relies on their AI assistant, the more an attacker can extract from a single successful injection.

Real-world leakage demonstrations

Research published and presented through 2025 and into the present has documented cross-session leakage as a concrete, demonstrated phenomenon, not a theoretical concern. Controlled experiments have shown that data shared in one conversation, stored in persistent memory, can be retrieved and exfiltrated in a completely separate conversation through indirect prompt injection. The sessions don't need to be temporally close. The gap between the original disclosure and the attack can be weeks or months.

What makes this particularly difficult from a user awareness standpoint is that there's no obvious signal. You don't receive a notification that your memory was accessed. You don't see a log of what was retrieved. The AI's response in the poisoned session looks normal because the summary it provides is normal. The exfiltration happens in the background, in a rendered image request that the client handles automatically.

Memory as attack surface is now a formal framing in AI security research, and it deserves to be treated with the same seriousness that session token theft or credential stuffing receives in traditional application security. The surface is different. The stakes are comparable. And unlike a stolen session token, a compromised memory store contains context that no password reset can undo.

The Threat Model Most Security Teams Are Missing

Traditional perimeter security operates on a clear assumption: sensitive data lives inside a defined boundary, and the job is to keep attackers outside it. Firewalls, endpoint detection, access controls, zero-trust architecture. All of it is built around the idea that you know where your data is. AI memory systems break that assumption completely. The data isn't sitting in a database with an access log. It's woven into a model's context, surfaced on demand, and accessible through the same conversational interface your employees use to ask what time a meeting is.

Why Enterprise AI Deployments Amplify the Risk

Consider what accumulates in an employee's AI assistant over a typical quarter. Product roadmap discussions. Competitive analysis. Personnel decisions. Customer contract details. Vendor negotiation strategy. None of that was entered into a system of record. It was typed into a chat window, processed, and retained. Most enterprise security teams have mapped their sensitive data stores with reasonable precision. Almost none of them have mapped what's sitting inside their employees' AI memory profiles.

The problem compounds at scale. A company with 500 employees using AI assistants daily has effectively created 500 undocumented context repositories, each holding a partial but detailed picture of how that employee thinks, what they work on, and who they talk to. No audit trail. No data classification. No retention policy.

Server room with glowing blue lights and rows of hardware
Enterprise AI deployments create distributed data stores that most security architectures weren't designed to see, let alone protect.

Policy Gap Worth Noting

Most enterprise AI acceptable-use policies were written in 2023 or early 2024. They address data sharing in broad strokes. Almost none of them specify how AI memory should be managed, audited, or cleared.

The gap between AI safety discourse and AI security practice is worth naming directly. The public conversation about AI risk has been dominated by philosophical questions: alignment, existential risk, bias, fairness. Those are real concerns. But the operational security questions, the ones that affect enterprises right now, have received far less attention from the people building policy.

Agentic AI Raises the Stakes Further

Agentic AI systems don't just remember. They act. Tools like Copilot with plugin access, AutoGPT-style orchestration frameworks, and AI assistants connected to email and calendar don't wait for a prompt. They browse, draft, send, and execute on behalf of the user. That architecture turns a memory exploitation into a multi-hop attack chain.

Here's how that plays out in practice. An attacker embeds a prompt injection payload in a publicly accessible document. An employee's AI assistant, configured to monitor relevant web sources, retrieves and processes that document. The payload instructs the assistant to forward a summary of recent email threads to an external address. The employee never sees the instruction. The exfiltration happens inside a workflow that looks, from the outside, like normal assistant activity.

This isn't a theoretical scenario. Researchers demonstrated functionally equivalent attack chains against major platforms through 2024 and 2025. The agentic layer doesn't just expand the attack surface. It adds velocity to the exploitation, because the AI can act faster than any human reviewer would catch it.


Vendor Responses: What the Platforms Are (and Aren't) Doing

When researchers began publishing serious indirect injection findings in 2023 and 2024, the major platforms responded. OpenAI, Microsoft, and Google all engaged with disclosed vulnerabilities, issued acknowledgments, and deployed mitigations. The response was faster and more substantive than what the security community typically sees from enterprise software vendors. That's worth acknowledging. So is what the mitigations can and can't actually do.

Mitigations Deployed Through 2025 and 2026

OpenAI introduced memory access controls that let users review and delete stored context. Microsoft added sandboxing layers to Copilot's plugin architecture and implemented output filtering designed to catch obvious injection patterns. Google's Gemini team deployed similar filtering on tool-use outputs and restricted the conditions under which the model would act on instructions embedded in retrieved content.

17
distinct prompt injection techniques documented against major LLM platforms through early 2026, per published security research

These mitigations have had real effect on the most naive attacks. Simple injections that instruct the model to "ignore previous instructions and do X" are caught at higher rates than they were two years ago. The filtering has matured. Researchers who were bypassing defenses with a single sentence in 2023 now need more elaborate constructions to achieve the same result.

Filtering for known bad patterns is a reasonable defense against yesterday's attacks. It's not a defense against a researcher who spent three days finding a new one.

Why Patches Are Structurally Difficult

The fundamental problem isn't implementation quality. It's architecture. Prompt injection is hard to patch at the model level because the model can't reliably distinguish between instructions from a trusted user and instructions embedded in untrusted content it was asked to process. That's not a bug in the code. It's a consequence of how language models work. They process text. All text looks like text.

The tension between helpfulness and security makes this harder. A model that refuses to act on any instruction found in retrieved content is a model that can't summarize a webpage, extract action items from a document, or do most of what makes agentic AI useful. Vendors are trying to draw a line between "helpful processing of external content" and "executing instructions from external content." That line is genuinely difficult to define in a way that holds under adversarial pressure.

Responsible disclosure has improved the research-to-vendor pipeline considerably. Researchers like Johann Rehberger and teams at companies including Embrace the Red have maintained ongoing dialogue with platform security teams. Some findings get fixed. Others get classified as accepted risks, which is its own kind of answer.


What You Can Do Right Now: Auditing and Hardening Your AI Footprint

The good news is that the controls exist. You don't have to wait for vendors to solve the underlying architecture problem. You can audit what your AI tools currently hold, reduce what they retain, and build habits that limit exposure going forward. None of this requires technical expertise. It requires about thirty minutes and a willingness to treat your AI assistant the way you'd treat any other system that holds sensitive information.

Reviewing What Your AI Actually Remembers

In ChatGPT, navigate to Settings, then Personalization, then Manage Memory. You'll see a list of what the system has explicitly stored. Read it carefully. You may find business context, personal details, and behavioral patterns you didn't consciously intend to save. Delete anything sensitive. If you're using ChatGPT for high-stakes work, consider disabling persistent memory entirely for those sessions.

Microsoft Copilot memory behavior varies by deployment. Enterprise deployments controlled by your IT organization may have centralized policies. Personal Copilot usage follows Microsoft account settings. Check the privacy dashboard at account.microsoft.com and review what data is associated with your AI interactions. The controls are there; they're just not surfaced prominently.

Google Gemini stores interaction history tied to your Google account. You can review and delete this history through myactivity.google.com. Gemini Apps Activity can be paused, which stops new interactions from being stored. If you process sensitive documents through Gemini, pausing activity before those sessions is a practical precaution.

Person reviewing data on a laptop screen in a dimly lit workspace
Auditing your AI memory takes less time than you think. It should be a regular part of your security hygiene, not a one-time event.

Operational Hygiene for AI Tool Users

Compartmentalization is the most underused tool available right now. If you use AI assistants for both general productivity and sensitive work, consider maintaining separate accounts or browser profiles for each. What the assistant doesn't know about your sensitive projects can't be extracted from it.

AI Footprint Audit Checklist 0/8

For organizations, the mandate is straightforward: treat AI memory as a data store subject to the same classification, retention, and access policies as any other enterprise system. That means written policy, not just guidance. It means training employees on what not to share with AI tools. And it means including AI tool usage in security audits.


The Bigger Picture: AI Assistants as High-Value Targets

A credential breach gives an attacker a username and a password. Useful, but narrow. What sits inside a well-used AI assistant's memory is something different in kind, not just degree. It's behavioral. It's relational. It captures how someone thinks, what they prioritize, who they trust, and what they're worried about. That profile is harder to generate through any other means and harder to change once it's been exfiltrated.

Why Your AI Profile Is Worth More Than Your Password

You can reset a password in thirty seconds. You can't reset six months of strategic context that an attacker has already read. The accumulated memory of an executive's AI assistant might contain the clearest picture of an organization's near-term plans that exists anywhere outside the boardroom. Attackers who understand this will treat AI memory as a primary target, not a secondary one.

A stolen password gets you through a door. A stolen AI memory profile tells you which doors are worth opening and what's behind them.

The Coming Arms Race in LLM Security

LLM security research is moving fast. What was a niche area of academic interest in 2023 is now a serious professional discipline, with dedicated teams at major security firms, active bug bounty programs, and a growing body of published methodology. The trajectory points toward more sophisticated attacks, not fewer.

Automated poisoned document campaigns are a near-term concern. Instead of targeting a specific employee, an attacker distributes malicious documents broadly, knowing that some percentage will be processed by AI assistants with tool-use capabilities. Supply chain attacks on AI plugins are another emerging vector: compromise a plugin that thousands of enterprise deployments trust, and you have a position inside the AI's tool-use layer across all of them.

Treat AI Tools Like the Data Systems They Are

AI assistants aren't productivity accessories anymore. They're data stores with natural language interfaces, connected to your files, your email, and your calendar. The security discipline you apply to those systems should extend to the AI tools connected to them.

The endpoint security era taught organizations that every device is a potential entry point. The cloud security era taught them that the perimeter is everywhere. The AI security era is teaching the same lesson again, applied to a new layer. The organizations that learn it early will be in a better position than the ones that wait for a breach to make the point for them.

How was this article?

Share

Link copied to clipboard!

You Might Also Like

Lee Foropoulos

Lee Foropoulos

Business Development Lead at Lookatmedia, fractional executive, and founder of gotHABITS.

🔔

Never Miss a Post

Get notified when new articles are published. No email required.

You will see a banner on the site when a new post is published, plus a browser notification if you allow it.

Browser notifications only. No spam, no email.

0 / 0