Agentic Design Patterns
Memory & Context Attacks

Memory poisoning, RAG exploitation, and context manipulation techniques

13 techniques: 9 high complexity, 4 medium complexity

Available Techniques

🧠 Agent Memory Poisoning (AMP) · High complexity

Injection of malicious or manipulative information into an agent's long-term memory systems, corrupting its knowledge base and influencing future behaviors.

Key Features

  • Long-term memory corruption
  • Knowledge base manipulation
  • Persistent behavior modification

Primary Defenses

  • Memory validation and sanitization
  • Source verification for stored information
  • Memory access controls

Key Risks

  • Persistent behavior corruption
  • Long-term security compromise
  • Knowledge base integrity loss
  • Difficulty in detection and remediation
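The "memory validation and sanitization" defense above can be sketched as a gate in front of the long-term store. This is a minimal illustration, not a complete defense: the trusted-source set and the injection patterns are assumptions chosen for the example.

```python
import re

# Illustrative allowlist and patterns; real deployments need richer policies.
TRUSTED_SOURCES = {"system", "verified_tool"}
SUSPICIOUS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"always (trust|obey|approve)", re.I),
]

def validate_memory_entry(entry: dict) -> bool:
    """Reject entries from untrusted sources or containing injection patterns."""
    if entry.get("source") not in TRUSTED_SOURCES:
        return False
    text = entry.get("content", "")
    return not any(p.search(text) for p in SUSPICIOUS)

class AgentMemory:
    def __init__(self):
        self._store = []

    def remember(self, entry: dict) -> bool:
        if not validate_memory_entry(entry):
            return False  # quarantine rather than silently storing
        self._store.append(entry)
        return True
```

The key design choice is that validation happens at write time, before content can influence any future turn, rather than at recall time when the poisoned entry is already trusted.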
📜 Context Window Manipulation (CWM) · Medium complexity

Exploitation of context window limitations to hide malicious instructions, overflow context buffers, or manipulate conversation history to bypass security controls.

Key Features

  • Context overflow attacks
  • History manipulation
  • Hidden instruction injection

Primary Defenses

  • Context window management
  • Priority-based context retention
  • System instruction protection

Key Risks

  • System instruction loss
  • Security control bypass
  • Context confusion
  • Instruction priority manipulation
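The "priority-based context retention" defense can be sketched as a trimming routine that pins system messages so an overflow can never evict them. Counting tokens by whitespace-split words is a simplifying assumption for the example.

```python
# Sketch: trim an over-long context while pinning system instructions so
# overflow attacks cannot push them out of the window.
def token_len(msg: dict) -> int:
    return len(msg["content"].split())  # crude stand-in for a real tokenizer

def trim_context(messages: list[dict], budget: int) -> list[dict]:
    """Keep all 'system' messages, then the most recent others that fit."""
    pinned = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    remaining = budget - sum(token_len(m) for m in pinned)
    kept = []
    for m in reversed(others):  # newest first
        cost = token_len(m)
        if cost <= remaining:
            kept.append(m)
            remaining -= cost
    return pinned + list(reversed(kept))
```

Flooding the conversation with filler now evicts old user turns, not the security-critical system instructions.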
📚 RAG System Poisoning (RAGP) · High complexity

Injection of malicious documents or data into Retrieval-Augmented Generation (RAG) knowledge bases to poison retrieved context and manipulate agent responses.

Key Features

  • Knowledge base document poisoning
  • Retrieval result manipulation
  • Vector embedding corruption

Primary Defenses

  • Document validation and sanitization
  • Source authentication
  • Retrieval result verification

Key Risks

  • Compromised knowledge retrieval
  • Poisoned agent responses
  • Misinformation propagation
  • Source attribution manipulation
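The "source authentication" defense can be sketched as an index that refuses documents lacking a valid MAC from the trusted ingestion pipeline. The key handling and class names here are illustrative assumptions, not a production design.

```python
import hashlib
import hmac

# Illustrative shared key; a real system would use a managed secret.
INGEST_KEY = b"demo-key"

def sign_document(text: str) -> str:
    """Tag issued by the trusted ingestion pipeline for an approved document."""
    return hmac.new(INGEST_KEY, text.encode(), hashlib.sha256).hexdigest()

class VectorIndex:
    def __init__(self):
        self.docs = []

    def add(self, text: str, tag: str) -> bool:
        if not hmac.compare_digest(tag, sign_document(text)):
            return False  # unauthenticated document: refuse to index
        self.docs.append(text)
        return True
```

An attacker who drops a document into the corpus without going through the signing pipeline cannot get it retrieved, because it never enters the index.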
🔑 Agent Session Hijacking (ASH) · Medium complexity

Unauthorized takeover of an ongoing agent conversation or session, gaining access to conversation history, context, and the ability to inject malicious instructions.

Key Features

  • Session token exploitation
  • Context injection mid-conversation
  • History access and manipulation

Primary Defenses

  • Strong session management
  • Token encryption and rotation
  • Session validation on each request

Key Risks

  • Conversation takeover
  • Sensitive data access
  • Context manipulation
  • Identity impersonation
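The "token rotation" and "session validation on each request" defenses can be sketched together: every request must present the current token, and validation mints a fresh one, so a stolen token dies after a single legitimate request. The token scheme is a deliberately simple stand-in.

```python
import hashlib
import hmac
import secrets

SERVER_KEY = b"demo-secret"  # illustrative; use a managed secret in practice

def mint_token(session_id: str, nonce: str) -> str:
    msg = f"{session_id}:{nonce}".encode()
    return nonce + "." + hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()

class SessionStore:
    def __init__(self):
        self._nonces = {}  # session_id -> currently valid nonce

    def open(self, session_id: str) -> str:
        nonce = secrets.token_hex(8)
        self._nonces[session_id] = nonce
        return mint_token(session_id, nonce)

    def validate_and_rotate(self, session_id: str, token: str):
        nonce = self._nonces.get(session_id)
        if nonce is None or not hmac.compare_digest(token, mint_token(session_id, nonce)):
            return None  # forged, expired, or replayed token
        return self.open(session_id)  # rotate: the old token is now dead
```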
💉 Direct Memory Injection (DMI) · High complexity

Direct injection of malicious content into an agent's memory storage systems, bypassing normal conversation flows to insert false memories or corrupted knowledge.

Key Features

  • Direct storage manipulation
  • Memory API exploitation
  • Conversation bypass

Primary Defenses

  • Strict API authentication
  • Memory write access controls
  • Input validation on storage operations

Key Risks

  • Unauthorized memory modification
  • Knowledge base corruption
  • Security bypass through memory
  • Data integrity compromise
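The "memory write access controls" defense can be sketched as a storage API that only honors writes carrying a valid capability, so direct manipulation outside the conversation flow is denied. The capability table and caller names are hypothetical.

```python
# Illustrative capability table; real systems would use signed, expiring
# credentials rather than static strings.
WRITE_CAPABILITIES = {"conversation-engine": "cap-123"}

class GuardedMemory:
    def __init__(self):
        self._store = {}

    def write(self, caller: str, cap: str, key: str, value: str) -> bool:
        if WRITE_CAPABILITIES.get(caller) != cap:
            return False  # unauthenticated caller: deny the write
        self._store[key] = value
        return True

    def read(self, key: str):
        return self._store.get(key)
```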
☣️ Context Contamination Attack (CCA) · Medium complexity

Gradual contamination of conversation context through subtle injections across multiple interactions, slowly corrupting the agent's understanding and behavior.

Key Features

  • Gradual contamination
  • Multi-turn corruption
  • Subtle manipulation

Primary Defenses

  • Continuous context validation
  • Contamination detection algorithms
  • Periodic context sanitization

Key Risks

  • Slow undetected corruption
  • Cumulative security degradation
  • Difficult remediation
  • Cross-session contamination
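Because each individual turn looks benign, "contamination detection algorithms" typically need to accumulate evidence across the conversation. A minimal sketch, with cue patterns and threshold chosen purely for illustration:

```python
import re

# Weak cues: none flags a single turn, but the running score exposes
# gradual multi-turn manipulation.
CUES = [re.compile(p, re.I) for p in
        (r"\btrust me\b", r"\bno need to verify\b", r"\bskip the check\b")]

class ContaminationMonitor:
    def __init__(self, threshold: int = 3):
        self.score = 0
        self.threshold = threshold

    def observe(self, turn: str) -> bool:
        """Return True once accumulated cues suggest contamination."""
        self.score += sum(bool(c.search(turn)) for c in CUES)
        return self.score >= self.threshold
```

A real detector would use semantic drift measures rather than keyword cues, but the cumulative-scoring structure is the point.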
🔄 Episodic Memory Replay Attack (EMRA) · High complexity

Manipulation of episodic memory replay mechanisms to reinforce malicious patterns, false information, or harmful behaviors through repeated memory activation.

Key Features

  • Memory replay manipulation
  • Pattern reinforcement exploitation
  • Memory consolidation abuse

Primary Defenses

  • Replay validation
  • Consolidation security controls
  • Recall frequency limits

Key Risks

  • Reinforced malicious patterns
  • Behavior entrenchment
  • Difficult pattern removal
  • Memory priority corruption
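The "recall frequency limits" defense can be sketched as a per-memory replay cap within a consolidation window, blunting attacks that reinforce one planted memory through repeated activation. The cap value is an illustrative assumption.

```python
from collections import Counter

class ReplayGuard:
    def __init__(self, max_replays: int = 2):
        self.max_replays = max_replays
        self._counts = Counter()

    def allow_replay(self, memory_id: str) -> bool:
        if self._counts[memory_id] >= self.max_replays:
            return False  # further reinforcement of this memory is rate-limited
        self._counts[memory_id] += 1
        return True
```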
🗃️ Semantic Memory Corruption (SMC) · High complexity

Corruption of an agent's semantic memory (general knowledge and facts) through injection of false information that becomes part of the agent's core understanding.

Key Features

  • Factual knowledge corruption
  • Concept relationship manipulation
  • General knowledge poisoning

Primary Defenses

  • Fact verification systems
  • Knowledge source tracking
  • Semantic consistency checks

Key Risks

  • Fundamental knowledge corruption
  • Persistent false beliefs
  • System-wide incorrect behaviors
  • Difficult detection and correction
♾️ Memory Persistence Exploitation (MPE) · Medium complexity

Exploitation of memory persistence mechanisms to ensure malicious content remains in agent memory across resets, updates, or cleanup operations.

Key Features

  • Cleanup bypass
  • Reset resistance
  • Update persistence

Primary Defenses

  • Comprehensive memory cleanup
  • Complete reset procedures
  • Persistence validation

Key Risks

  • Persistent compromise
  • Incomplete remediation
  • Long-term security degradation
  • Hidden malicious content
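The "complete reset procedures" and "persistence validation" defenses can be sketched as a reset routine that wipes every known memory tier and then audits each one, so content surviving an incomplete cleanup is detected rather than silently carried forward. Tier names are illustrative.

```python
class MemoryTiers:
    def __init__(self):
        self.tiers = {"short_term": {}, "long_term": {}, "cache": {}}

    def full_reset(self) -> None:
        for tier in self.tiers.values():
            tier.clear()

    def audit_empty(self) -> list:
        """Return the names of tiers that still hold data after a reset."""
        return [name for name, tier in self.tiers.items() if tier]
```

The audit step matters: a cleanup routine that misses a tier (the attack this technique exploits) shows up immediately in the audit result.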
🔓 Cross-Session Memory Leakage (CSML) · High complexity

Exploitation of memory isolation vulnerabilities to access or leak information from other users' sessions or conversations through shared memory systems.

Key Features

  • Session boundary bypass
  • Memory isolation exploitation
  • Cross-user data access

Primary Defenses

  • Strong memory isolation
  • Session-specific memory spaces
  • Access control enforcement

Key Risks

  • Privacy violations
  • Sensitive data exposure
  • Cross-user information leakage
  • Compliance violations
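The "session-specific memory spaces" defense can be sketched as a store that forces every read and write through a session-scoped namespace, so one session cannot address another's keys even if it guesses them. A minimal illustration:

```python
class IsolatedMemory:
    def __init__(self):
        self._spaces = {}  # session_id -> that session's private namespace

    def _space(self, session_id: str) -> dict:
        return self._spaces.setdefault(session_id, {})

    def put(self, session_id: str, key: str, value) -> None:
        self._space(session_id)[key] = value

    def get(self, session_id: str, key: str):
        # Lookup never crosses the session boundary.
        return self._space(session_id).get(key)
```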
🎓 Learning Process Exploitation (LPE) · High complexity

Attacking agent learning processes by introducing biased, incomplete, or malicious data during incremental updates, online learning, or feedback loops, causing the agent to learn harmful behaviors or incorrect patterns.

Key Features

  • Biased training data injection
  • Incremental learning manipulation
  • Feedback loop exploitation

Primary Defenses

  • Learning input validation and sanitization
  • Anomaly detection for learning updates
  • Learning rate limits and boundaries

Key Risks

  • Systematic bias introduction
  • Harmful behavior learning
  • Model degradation over time
  • Safety alignment compromise
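The "learning rate limits and boundaries" defense can be sketched as a norm bound on each incremental update: oversized updates (a common signature of a poisoned batch) are dropped, limiting how far any single batch can move the model. The threshold is an illustrative assumption, and real systems combine this with robust aggregation.

```python
def accept_update(update: list, max_norm: float = 1.0) -> bool:
    """Reject updates whose L2 norm exceeds the allowed bound."""
    norm = sum(x * x for x in update) ** 0.5
    return norm <= max_norm

def apply_updates(weights: list, updates: list, max_norm: float = 1.0) -> list:
    for u in updates:
        if not accept_update(u, max_norm):
            continue  # oversized (possibly poisoned) update is dropped
        weights = [w + d for w, d in zip(weights, u)]
    return weights
```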
🔄 Knowledge Update Mechanism Vulnerability (KUMV) · High complexity

Exploiting vulnerabilities in the agent's knowledge update mechanisms by injecting unauthorized updates, bypassing authentication and integrity checks, or manipulating version control systems to introduce malicious knowledge.

Key Features

  • Unauthorized update injection
  • Authentication bypass
  • Version control manipulation

Primary Defenses

  • Strong update authentication
  • Cryptographic integrity checks
  • Version control with audit trails

Key Risks

  • Unauthorized knowledge modification
  • Persistent malicious updates
  • Knowledge base corruption
  • Authentication system bypass
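The "cryptographic integrity checks" and "version control with audit trails" defenses can be sketched together: each update must carry a valid MAC over its content and version number, versions must advance in order (blocking replays), and accepted updates are logged. Key handling is simplified for illustration.

```python
import hashlib
import hmac

UPDATE_KEY = b"publisher-key"  # illustrative; use a managed signing key

def sign_update(version: int, content: str) -> str:
    msg = f"{version}:{content}".encode()
    return hmac.new(UPDATE_KEY, msg, hashlib.sha256).hexdigest()

class KnowledgeBase:
    def __init__(self):
        self.version = 0
        self.content = ""
        self.audit = []  # append-only trail of accepted updates

    def apply(self, version: int, content: str, sig: str) -> bool:
        if version != self.version + 1:
            return False  # out-of-order or replayed version
        if not hmac.compare_digest(sig, sign_update(version, content)):
            return False  # failed integrity check
        self.version, self.content = version, content
        self.audit.append((version, sig))
        return True
```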
🔗 Cross-Agent Knowledge Poisoning (CAKP) · High complexity

Attacking shared knowledge bases used by multiple agents to create systemic poisoning, where corrupted knowledge propagates across interconnected agents, causing cascading errors and compromised decision-making throughout the agent network.

Key Features

  • Shared knowledge base poisoning
  • Cross-agent propagation
  • Systemic knowledge corruption

Primary Defenses

  • Shared knowledge validation
  • Cross-referencing between agents
  • Knowledge source verification

Key Risks

  • Systemic knowledge corruption
  • Multi-agent compromise
  • Cascading decision errors
  • Network-wide false information
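The "cross-referencing between agents" defense can be sketched as a quorum rule: a claim only enters the shared knowledge base if a supermajority of independent agents confirm it, so a single compromised agent cannot propagate corruption network-wide. The quorum fraction is an illustrative choice.

```python
def quorum_accept(claim: str, agent_votes: dict, quorum: float = 2 / 3) -> bool:
    """Accept a claim into shared knowledge only with supermajority agreement.

    agent_votes maps agent id -> True (confirms) / False (disputes).
    """
    if not agent_votes:
        return False  # no independent confirmation at all
    approvals = sum(agent_votes.values())
    return approvals / len(agent_votes) >= quorum
```

The design assumes voters fail independently; if attackers can compromise a correlated majority, quorum checks alone are insufficient.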

Ethical Guidelines for Memory & Context Attacks

When working with these memory and context attack techniques, always follow these ethical guidelines:

  • Only test on systems you own or have explicit written permission to test
  • Focus on building better defenses, not conducting attacks
  • Follow responsible disclosure practices for any vulnerabilities found
  • Document and report findings to improve security for everyone
  • Consider the potential impact on users and society
  • Ensure compliance with all applicable laws and regulations
