ATR-2026-00001 | high | Prompt Injection | experimental
Direct Prompt Injection via User Input
Detects direct prompt injection, in which a user embeds malicious instructions in their input to override the agent's intended behavior. The rule applies layered detection covering instruction-override verbs paired with target nouns, persona switching, temporal behavioral overrides, fake system delimiters, restriction removal, encoding-wrapped payloads (base64, hex, Unicode homoglyphs), and zero-width character obfuscation of injection keywords. For evasion resistance, the patterns use word-boundary anchors, flexible whitespace matching, and synonym coverage drawn from published attack taxonomies.
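The layered approach described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the rule's actual pattern set: the names `PATTERNS`, `normalize`, `decoded_candidates`, and `scan` are hypothetical, the regexes cover only a few synonyms per layer, and hex decoding is omitted. NFKC normalization folds compatibility forms such as fullwidth letters but does not map cross-script homoglyphs (for example Cyrillic look-alikes), which would need a separate confusables table.

```python
import base64
import re
import unicodedata

# Zero-width / invisible characters sometimes used to split keywords.
ZERO_WIDTH = re.compile(r"[\u200b\u200c\u200d\u2060\ufeff]")

# Hypothetical, illustrative patterns; NOT the regexes shipped with the rule.
PATTERNS = {
    # Override verb, optional qualifiers, then a target noun.
    "instruction_override": re.compile(
        r"\b(?:ignore|disregard|forget|override|bypass)\s+"
        r"(?:(?:all|any|the|your|previous|prior|above|earlier)\s+)*"
        r"(?:instructions?|rules?|guidelines?|prompts?|directives?)\b",
        re.IGNORECASE,
    ),
    "persona_switch": re.compile(
        r"\b(?:you\s+are\s+now|act\s+as|pretend\s+to\s+be)\b", re.IGNORECASE
    ),
    "temporal_override": re.compile(r"\bfrom\s+now\s+on\b", re.IGNORECASE),
    "fake_system_delimiter": re.compile(
        r"<\|?\s*system\s*\|?>|\[\s*system\s*\]", re.IGNORECASE
    ),
    "restriction_removal": re.compile(
        r"\b(?:without|remove|disable)\s+(?:any\s+|all\s+)?"
        r"(?:restrictions?|filters?|limitations?)\b",
        re.IGNORECASE,
    ),
}

# Runs long enough to plausibly be a base64-wrapped payload.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def normalize(text):
    """Strip zero-width characters, then fold compatibility forms (NFKC)."""
    return unicodedata.normalize("NFKC", ZERO_WIDTH.sub("", text))


def decoded_candidates(text):
    """Yield UTF-8 text decoded from base64-looking runs; skip invalid ones."""
    for match in BASE64_RUN.finditer(text):
        chunk = match.group(0)
        padded = chunk + "=" * (-len(chunk) % 4)
        try:
            yield base64.b64decode(padded, validate=True).decode("utf-8")
        except ValueError:  # covers binascii.Error and UnicodeDecodeError
            continue


def scan(text):
    """Return the sorted names of all layers that match the input."""
    clean = normalize(text)
    views = [clean, *decoded_candidates(clean)]
    hits = set()
    for view in views:
        for name, pattern in PATTERNS.items():
            if pattern.search(view):
                hits.add(name)
    return sorted(hits)
```

In this sketch, normalization runs before matching so that a keyword split by a zero-width character is rejoined, and each decoded base64 run is scanned with the same patterns as the raw input. A caller might map a non-empty result from `scan` to the response actions listed for the rule.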
Severity
high
Category
Prompt Injection
Scan Target
mcp
Author
ATR Community
Recommended Response
block input, alert, snapshot, escalate
References
OWASP Agentic
ASI01:2026 - Agent Goal Hijack
OWASP LLM
LLM01:2025 - Prompt Injection
MITRE ATLAS
AML.T0051 - LLM Prompt Injection
AML.T0051.000 - Direct