Skip to content
ATR-2026-00011highTool Poisoningexperimental

Instruction Injection via Tool Output

Detects hidden instructions embedded in tool outputs that attempt to manipulate the agent's subsequent behavior. Tool responses may contain injected directives disguised as data that instruct the agent to perform unauthorized actions, change behavior, or exfiltrate information. Detection covers: urgency-prefixed directives addressing the agent, direct agent manipulation commands, information suppression directives, tool invocation instructions, data exfiltration commands, hidden instruction tags, response injection directives, conversational steering, system-pretending tokens, fake API response structures, subtle action-required patterns, and steganographic instruction embedding. Patterns are designed to require multiple signals where possible to reduce false positives.

嚴重度
high
類別
Tool Poisoning
掃描目標
mcp
作者
ATR Community

建議回應

block toolalertsnapshotescalate

參考資料

OWASP Agentic
ASI02:2026 - Tool Misuse and Exploitation
OWASP LLM
LLM01:2025 - Prompt InjectionLLM05:2025 - Improper Output Handling
MITRE ATLAS
AML.T0053 - LLM Plugin CompromiseAML.T0051.001 - Indirect Prompt Injection