Domain-Camouflaged Injection Attacks Evade Detection in Multi-Agent LLM Systems
sbulaev
35 points
4 comments
May 22, 2026
Related Discussions
- Document poisoning in RAG systems: How attackers corrupt AI's sources · aminerj · 98 pts · March 12, 2026 · 51% similar
- Simple Sabotage of Agents · Tallain · 11 pts · April 26, 2026 · 50% similar
- My minute-by-minute response to the LiteLLM malware attack · Fibonar · 336 pts · March 26, 2026 · 50% similar
- LLMs can unmask pseudonymous users at scale with surprising accuracy · Gagarin1917 · 42 pts · March 04, 2026 · 49% similar
- CrabTrap: An LLM-as-a-judge HTTP proxy to secure agents in production · pedrofranceschi · 106 pts · April 21, 2026 · 49% similar
Discussion Highlights (4 comments)
BarryMilo
This is an "uh oh" moment, isn't it?
simonw
It concerns me that anyone with anything important to protect might trust what this paper calls "Injection detectors deployed to protect LLM agents" - Llama Guard and the like. There are unlimited combinations of tokens that can be used to attack an LLM system. The idea that some kind of "detector" can catch them all just feels inherently absurd to me.
buppermint
The paper title is a bit misleading. The tested detectors and models here are small and rather dated (Llama 3.1 8B and Gemini Flash 2.0; these are basically at the level of a modern 1B model), and the actual paper says this only shows vulnerability in small-model systems.
dwa3592
Why weren't these attacks tested on the frontier models? The models they tested these on can also be fooled by poems and rhymes.