Skip to content
ATR-2026-00001highPrompt Injectionexperimental

Direct Prompt Injection via User Input

Detects direct prompt injection attempts where a user embeds malicious instructions within their input to override the agent's intended behavior. This rule uses layered detection covering: instruction override verbs with target nouns, persona switching, temporal behavioral overrides, fake system delimiters, restriction removal, encoding- wrapped payloads (base64, hex, unicode homoglyphs), and zero-width character obfuscation of injection keywords. Patterns are designed for evasion resistance with word boundary anchors, flexible whitespace, and synonym coverage based on published attack taxonomies.

嚴重度
high
類別
Prompt Injection
掃描目標
mcp
作者
ATR Community

建議回應

block inputalertsnapshotescalate

參考資料

OWASP Agentic
ASI01:2026 - Agent Goal Hijack
OWASP LLM
LLM01:2025 - Prompt Injection
MITRE ATLAS
AML.T0051 - LLM Prompt InjectionAML.T0051.000 - Direct