Open standard · draft RFC · call for collaborators · companion to ATR

Agentic Threat Detection

The open, executable detection standard for agent-native threats — the runnable layer beneath OWASP, MITRE ATLAS, CWE and AVID.

Editor's Draft (pre-governance) · Version 0.1.0 · Date 2026-06-13 · Canonical /atd · Editor Adam Lin

  • 24 techniques cataloged
  • 9 CVE-backed
  • 9 with a live rule
  • 100% recall, 0 FP (skill corpus, n=341)
  • 98% recall (garak in-the-wild, n=650)

ATD's detection rules ship in ATR — in production in Microsoft Agent Governance Toolkit and Cisco AI Defense, and mapped in MISP/CIRCL. Consumers integrate ATR rules; this is not an endorsement of this draft standard. DOI 10.5281/zenodo.19178002.

Abstract · Informative

Agentic Threat Detection (ATD) is an open, machine-readable, executable enumeration of agent-native threat techniques. Each technique is bound to detection logic and mapped to the established prose frameworks — OWASP Agentic Top 10, MITRE ATLAS, CWE, and AVID.

Those frameworks tell you what can go wrong with an agent. ATD is the layer underneath that tells you how to detect it at runtime — and ships with a measurable false-positive rate. It is to agentic security what Sigma is to SIEM detection.

Status: an Editor's Draft — a request for comments and an open invitation to co-author. It is not yet a ratified standard; it earns that title when it meets the §8 neutrality bar. We publish it openly to gather collaborators and mapping partners, not to claim authority.

Scope: what ATD is and is not · Normative

ATD enumerates agent-runtime threat techniques detectable in agent I/O: prompt/content, tool calls, tool responses, inter-agent messages, memory operations, traces, screen state, and payment mandates.

  • ATD MUST NOT mint vulnerability instance identifiers. A specific CVE in a specific MCP server lives in CVE / GHSA / OSV / AVID; ATD references it and does not replace it.
  • ATD MUST NOT present itself as a competing top-level risk taxonomy. The "top ten agentic risks" framing belongs to OWASP ASI; ATD sits beneath it and makes it runnable.
  • ATD MUST map every technique to at least one upstream framework, or document the gap where none exists yet.

This boundary is deliberate. Vulnerability registries that mirrored CVE's territory (e.g. the now-archived Global Security Database, GSD) fragmented and stalled. ATD fills the uncovered executable detection layer; it does not re-fight a settled lane.

Conceptual model · Normative

ATD uses two axes, borrowed from proven standards:

  • Tactic / Technique matrix (MITRE ATLAS model): a Tactic is the adversary's goal-phase; a Technique is a concrete, detectable attack pattern; a Sub-technique is a variant.
  • Abstraction hierarchy (CWE model): each technique is tagged pillar / class / base / variant.

Identifiers. Tactics are ATD-TA1 through ATD-TA9. Techniques are ATD-T0001 (zero-padded, sequential, permanent, never reused); sub-techniques use dotted notation, ATD-T0001.001. Each individual detection rule additionally carries a UUIDv4 so that rules survive renaming. Techniques are the catalog; rules are the executable instances bound to them.
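The identifier shapes described above can be checked mechanically. The following is a minimal sketch, assuming the grammar is exactly what the examples in this section show; the regular expressions and the function names are illustrative and are not taken from the ATD schema.

```python
import re
import uuid
from typing import Optional, Tuple

# Patterns inferred from the examples in this section (assumption):
# tactics ATD-TA1..ATD-TA9, techniques ATD-T####, sub-techniques ATD-T####.###
TACTIC_RE = re.compile(r"ATD-TA([1-9])")
TECHNIQUE_RE = re.compile(r"ATD-T(\d{4})(?:\.(\d{3}))?")


def is_tactic_id(value: str) -> bool:
    """True when value has the ATD-TA# shape."""
    return TACTIC_RE.fullmatch(value) is not None


def parse_technique_id(value: str) -> Tuple[int, Optional[int]]:
    """Return (technique_number, sub_technique_number or None)."""
    match = TECHNIQUE_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not an ATD technique id: {value!r}")
    number = int(match.group(1))
    sub = int(match.group(2)) if match.group(2) else None
    return number, sub


def is_uuid4(value: str) -> bool:
    """True when value parses as a version-4 UUID, as rule ids are said to be."""
    try:
        return uuid.UUID(value).version == 4
    except ValueError:
        return False
```

For example, `parse_technique_id("ATD-T0001.001")` yields the technique number and the sub-technique number as a pair, while a malformed id raises `ValueError` rather than being silently accepted.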

Maturity. Every technique and every rule carries a status on the ladder experimental → test → stable (plus deprecated). Production consumers SHOULD auto-sync only stable.
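A consumer-side reading of the maturity ladder might look like the sketch below. It assumes each rule is a dictionary with a `status` key holding one of the four values named above; the dictionary shape and the function name are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class Status(str, Enum):
    """The maturity ladder, plus the deprecated state."""
    EXPERIMENTAL = "experimental"
    TEST = "test"
    STABLE = "stable"
    DEPRECATED = "deprecated"


def rules_to_auto_sync(rules: List[Dict]) -> List[Dict]:
    """Keep only stable rules; an unknown status raises ValueError."""
    return [rule for rule in rules if Status(rule["status"]) is Status.STABLE]
```

Rejecting an unknown status outright, instead of skipping it, is a design choice in this sketch: it surfaces schema drift early rather than quietly dropping rules.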

Technique catalog · Normative

24 techniques across 9 tactics. Each maps to OWASP ASI and CWE, and to MITRE ATLAS where a slot exists, and cites a real CVE or research source; 9 already ship a live ATR detection rule. Entries with no public instance are marked research or aspirational. Framework IDs were verified against primary sources (2026-06-14).

ATD-TA1 · Protocol & Interconnect (9)

ATD-T0001 · Shell metacharacter injection through MCP tool parameters · live rule

Unsanitized MCP tool input reaches execSync/exec, yielding RCE on the server host.

CVE-2025-53355 · ASI05 · ASI02 · AML.T0053 · CWE-77

ATD-T0002 · curl-fallback command injection in an MCP server · live rule

A failed fetch falls back to exec'ing curl with an unsanitized URL, enabling RCE.

CVE-2025-53967 · ASI05 · ASI02 · AML.T0053 · CWE-420

ATD-T0003 · Command injection in a scaffolded MCP stdio server · live rule

Generated server concatenates tool input into exec(), giving RCE to anything built from it.

CVE-2025-54994 · ASI04 · ASI05 · AML.T0010.005 · CWE-78

ATD-T0004 · Line-jumping: tool-description injection at listing time · live rule

Hidden instructions in a tool description enter the model context at tools/list, before any call.

research · ASI04 · ASI01 · AML.T0110 · AML.T0104 · CWE-1427

ATD-T0005 · Rug pull: silent mutation of an approved tool · live rule

A server approved once later changes a tool's definition with no integrity re-check.

research · ASI04 · AML.T0109 · AML.T0110 · CWE-494

ATD-T0006 · Missing-auth MCP proxy command execution

No auth between client and MCP proxy lets any local/web-driven request spawn MCP processes.

CVE-2025-49596 · ASI03 · ASI05 · CWE-306

ATD-T0007 · RCE from a malicious upstream MCP server

A client is RCE'd via a crafted authorization_endpoint URL in an untrusted server's response.

CVE-2025-6514 · ASI04 · ASI05 · AML.T0010.005 · CWE-78

ATD-T0008 · DNS-rebinding to a localhost MCP server

A malicious page rebinds DNS to reach an unauthenticated localhost MCP server cross-origin.

CVE-2025-66416 · ASI07 · ASI03 · CWE-1188

ATD-T0009 · Session ID / auth token placed in a URL query string · live rule

A credential in the query string leaks via server logs, proxies, CDNs, history, and Referer.

research · ASI03 · ASI07 · CWE-598
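As an illustration of how a TA1 technique turns into runtime detection, ATD-T0009 can be approximated by inspecting URLs seen in tool input. This is a sketch only, not the shipped ATR rule: the parameter-name list and the function name are assumptions chosen for the example.

```python
from typing import List
from urllib.parse import parse_qsl, urlsplit

# Hypothetical list of parameter names that suggest a credential.
SENSITIVE_PARAMS = {
    "token", "access_token", "session", "session_id", "sessionid",
    "api_key", "apikey",
}


def credential_params_in_url(url: str) -> List[str]:
    """Return the query parameter names that look like credentials."""
    query = urlsplit(url).query
    return [
        name
        for name, value in parse_qsl(query)
        if name.lower() in SENSITIVE_PARAMS and value
    ]
```

A non-empty result would be the signal to raise; a name-based check like this one misses credentials hidden under unusual parameter names, so a production rule would likely also look at value shape.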

ATD-TA2 · Memory & Context Integrity (2)

ATD-T0010 · Serialized-object smuggling through an LLM response field

Injected output carries a serialization marker; deserialization rehydrates it as trusted and exfiltrates secrets.

CVE-2025-68664 · ASI06 · ASI01 · AML.T0051.001 · CWE-502

ATD-T0011 · Persistent memory / context-store poisoning

Attacker-controlled content is written into data the agent later reads back as trusted context.

research · ASI06 · AML.T0080 · CWE-349
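For ATD-T0010, one plausible detection surface is the parsed model output itself: any nested object that carries a reserved serialization marker is suspicious before it reaches a deserializer. The sketch below assumes the marker is a dictionary key supplied by the caller; the key name `"lc"` in the usage note is only a placeholder for whatever marker the consuming framework treats as a serialized object.

```python
from typing import Any, FrozenSet, Iterator


def marker_paths(value: Any, marker_keys: FrozenSet[str], path: str = "$") -> Iterator[str]:
    """Yield a JSON-style path for every dict that carries a marker key."""
    if isinstance(value, dict):
        if any(key in value for key in marker_keys):
            yield path
        for key, child in value.items():
            yield from marker_paths(child, marker_keys, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from marker_paths(child, marker_keys, f"{path}[{index}]")
```

Calling `list(marker_paths(parsed_response, frozenset({"lc"})))` on a parsed response returns the locations to quarantine or escape; an empty list means no marker was found at any depth.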

ATD-TA3 · Goal, Planning & Reasoning (2)

ATD-T0012 · Indirect prompt injection via tool / API response · live rule

Malicious text in returned tool data overrides the agent's plan and redirects its actions.

research · ASI01 · AML.T0051.001 · AML.T0099 · CWE-1427

ATD-T0013 · System-prompt / guardrail extraction to plan evasion

Crafted queries coerce the agent to reveal its hidden system prompt, exposing control logic.

research · ASI01 · ASI09 · AML.T0056 · CWE-200

ATD-TA4 · Identity, Authz & Delegation (1)

ATD-T0014 · Confused-deputy token passthrough in an MCP server

A server forwards a held token to a downstream API without audience validation, escalating privilege.

research · ASI03 · CWE-441

ATD-TA5 · Tool & Supply Chain (2)

ATD-T0015 · Agent reads .env / secret files without consent · live rule

An agent tool reads credential files (.env, credentials, .npmrc) outside any user-approved scope.

research · ASI02 · ASI06 · AML.T0053 · CWE-538

ATD-T0016 · Hallucinated-dependency squatting (slopsquatting)

The model recommends a fabricated package name an attacker pre-registers, pulling code into the agent env.

research · ASI04 · ASI08 · AML.T0060 · AML.T0062 · CWE-1427

ATD-TA6 · Execution & Autonomy (3)

ATD-T0017 · Path-traversal blacklist bypass via non-canonical paths · live rule

Exact-string path checks are bypassed with ../, /./, redundant slashes to reach sensitive files.

CVE-2025-66689 · ASI02 · ASI03 · AML.T0053 · CWE-22

ATD-T0018 · MCP filesystem sandbox escape via symlink following

A symlink inside an allowed directory resolves to an out-of-scope path, granting system file access.

CVE-2025-53109 · ASI02 · ASI05 · AML.T0053 · CWE-59

ATD-T0019 · Prompt-injection-to-RCE via an agent's file-write capability

Injected instructions drive the agent to write a startup/config file that yields persistent code execution.

research · ASI05 · ASI01 · AML.T0053 · AML.T0051.001 · CWE-94
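ATD-T0017 and ATD-T0018 share a root cause: the path is judged as a string rather than as the file it resolves to. A minimal sketch of the canonical check, assuming a POSIX filesystem and a single allowed root directory (the function name is illustrative):

```python
import os


def is_within_root(root: str, candidate: str) -> bool:
    """True when candidate, fully resolved, stays inside root.

    realpath collapses ../, /./ and redundant slashes and follows
    symlinks, so the comparison is made on the real target.
    """
    real_root = os.path.realpath(root)
    real_candidate = os.path.realpath(os.path.join(real_root, candidate))
    return os.path.commonpath([real_root, real_candidate]) == real_root
```

A tool call whose path argument fails this check while passing the server's own string comparison is the kind of mismatch a detection rule for these techniques could flag. The check is still subject to races if the filesystem changes between the check and the actual file access.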

ATD-TA7 · Multi-Agent Dynamics (2)

ATD-T0020 · Agent Card poisoning to capture A2A task routing

A rogue A2A agent advertises an instruction-laden Agent Card so the orchestrator routes tasks to it.

research · ASI07 · ASI10 · AML.T0051.001 · CWE-345

ATD-T0021 · Cross-agent injection propagation (cascading compromise)

One compromised agent emits content that injects the next downstream agent, cascading through the swarm.

research · ASI08 · ASI07 · ASI10 · AML.T0051.001 · AML.T0080 · CWE-1427

ATD-TA8 · Model-Intrinsic & Governance (1)

ATD-T0022 · Trace tampering / non-tamper-evident agent audit logs

An agent logs reasoning but not the actual tool call, or logs are mutable, defeating after-the-fact audit.

research · ASI10 · CWE-778
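The property ATD-T0022 describes as missing, tamper evidence, is commonly obtained with a hash chain. The sketch below shows that general technique under stated assumptions: records are plain dictionaries, SHA-256 is the digest, and the all-zero genesis value is arbitrary. It is not a format defined by ATD.

```python
import hashlib
import json
from typing import Dict, List, Optional

GENESIS = "0" * 64  # assumed starting value for the chain


def _digest(prev_hash: str, event: Dict) -> str:
    body = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(log: List[Dict], event: Dict) -> Dict:
    """Append event, binding it to the hash of the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    record = {"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)}
    log.append(record)
    return record


def first_tampered_index(log: List[Dict]) -> Optional[int]:
    """Return the index of the first record that breaks the chain, else None."""
    prev_hash = GENESIS
    for index, record in enumerate(log):
        if record["prev"] != prev_hash or record["hash"] != _digest(prev_hash, record["event"]):
            return index
        prev_hash = record["hash"]
    return None
```

A chain like this only detects edits by someone who cannot recompute every later hash; anchoring the latest hash in a separate system is what closes that gap.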

ATD-TA9 · Agentic Commerce (forward) (2)

ATD-T0023 · Adversarial transaction steering of a purchasing agent

Injected content in a listing/page steers an autonomous-commerce agent to overpay or leak payment authority.

aspirational · ASI01 · ASI02 · ASI09 · CWE-1427

ATD-T0024 · Payment-mandate forgery in an agent-to-agent handshake

A rogue agent spoofs delegated payment authority or mandate scope in an agentic-commerce exchange.

aspirational · ASI03 · ASI07 · CWE-345

Entry schema · Normative

A technique entry is a YAML/JSON document validated against a normative JSON Schema (Draft 2020-12), shipped in-repo. It is designed to be compatible with the OSV and Sigma ecosystems. Required fields:

atd_id            ATD-T####            permanent
schema_version    SemVer, no "v"       additive-minor guarantee (OSV)
title             short imperative phrase
tactic            ATD-TA#
abstraction       pillar | class | base | variant
status            experimental | test | stable | deprecated
description       the technique and its attack mechanism
detection_surface content | tool_input | tool_response |
                  inter_agent_msg | memory_op | trace | screen | payment_mandate
mappings          owasp_asi[] · mitre_atlas[] · cwe[] · avid[] · maestro_layer[]
references        advisory / CVE / research URLs (evidence it is real)
detection_rules   UUIDv4[] into the rule corpus
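To make the field list above concrete, here is a sketch of one entry expressed as a Python dictionary, with a hand-rolled shape check standing in for the normative JSON Schema. The title, description and mappings come from the ATD-T0001 catalog entry; the `schema_version`, `abstraction`, `status` and `detection_surface` values, the choice of a list for `detection_surface`, and the rule UUID are assumptions made for illustration.

```python
import re
import uuid
from typing import Dict, List

REQUIRED_FIELDS = (
    "atd_id", "schema_version", "title", "tactic", "abstraction", "status",
    "description", "detection_surface", "mappings", "references", "detection_rules",
)
ABSTRACTIONS = {"pillar", "class", "base", "variant"}
STATUSES = {"experimental", "test", "stable", "deprecated"}
SURFACES = {"content", "tool_input", "tool_response", "inter_agent_msg",
            "memory_op", "trace", "screen", "payment_mandate"}


def entry_problems(entry: Dict) -> List[str]:
    """Return human-readable problems; an empty list means the shape looks right."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in entry]
    if problems:
        return problems
    if not re.fullmatch(r"ATD-T\d{4}", entry["atd_id"]):
        problems.append("atd_id is not ATD-T####")
    if not re.fullmatch(r"ATD-TA\d", entry["tactic"]):
        problems.append("tactic is not ATD-TA#")
    if entry["schema_version"].startswith("v"):
        problems.append('schema_version must not start with "v"')
    if entry["abstraction"] not in ABSTRACTIONS:
        problems.append("abstraction is not pillar/class/base/variant")
    if entry["status"] not in STATUSES:
        problems.append("status is not a known maturity level")
    if not set(entry["detection_surface"]) <= SURFACES:
        problems.append("detection_surface has an unknown value")
    return problems


example = {
    "atd_id": "ATD-T0001",
    "schema_version": "0.1.0",            # assumed value
    "title": "Shell metacharacter injection through MCP tool parameters",
    "tactic": "ATD-TA1",
    "abstraction": "base",                # assumed value
    "status": "experimental",             # assumed value
    "description": "Unsanitized MCP tool input reaches execSync/exec, "
                   "yielding RCE on the server host.",
    "detection_surface": ["tool_input"],  # assumed value and assumed list shape
    "mappings": {"owasp_asi": ["ASI05", "ASI02"],
                 "mitre_atlas": ["AML.T0053"],
                 "cwe": ["CWE-77"]},
    "references": ["CVE-2025-53355"],
    "detection_rules": [str(uuid.uuid4())],  # placeholder, not a real rule id
}
```

In practice the in-repo JSON Schema is the authority; a check like `entry_problems` is useful mainly as a fast pre-commit guard before full schema validation runs.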

A detection rule reuses the ATR rule schema already in production: regex/conditions, true_positives and true_negatives (a precision test), false-positive notes, response actions, and a valid agent_source.type.

Mappings & interoperability · Informative

Legitimacy comes from interoperability, not from declaring authority. Every ATD technique maps to OWASP ASI, MITRE ATLAS and CWE where a slot exists; AVID and the MAESTRO layer are added where they apply. ATD borrows the authority of these frameworks and feeds its gaps back to them as proposed techniques and case studies.

ATD is designed to interoperate with — not compete against — AVID, which already operates a governed, submission-open AI vulnerability registry. ATD supplies the executable detection AVID's "Detection" report type expects.

Conformance · Normative

  • A conformant technique entry MUST validate against the JSON Schema, carry at least one framework mapping (or a documented gap), and cite at least one reference.
  • A conformant detection rule MUST pass the precision gate: every declared true-positive matches, zero false positives on the published benign corpus, and no cross-rule conflict.
  • A conformant consumer MUST honor the maturity field (status) and SHOULD expose the ATD-T#### id on every detection it emits.
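The first two conditions of the precision gate for detection rules can be sketched as a small check. This assumes a rule reduces to a single regular expression, which is a simplification of the ATR rule schema; the third condition (no cross-rule conflict) is not modeled here, and the function name and report shape are hypothetical.

```python
import re
from typing import Dict, List


def precision_gate(pattern: str, true_positives: List[str], benign_corpus: List[str]) -> Dict:
    """Check that every true positive matches and no benign sample does."""
    rule = re.compile(pattern)
    missed = [sample for sample in true_positives if not rule.search(sample)]
    false_positives = [sample for sample in benign_corpus if rule.search(sample)]
    return {
        "passed": not missed and not false_positives,
        "missed": missed,
        "false_positives": false_positives,
    }
```

For instance, a hypothetical shell-metacharacter pattern such as ``r"[;&|`]|\$\("`` passes against samples like `pod-1; cat /etc/passwd` with a benign corpus of plain resource names, while an overly loose pattern such as `r"pod"` fails because it also matches the benign name `pod-1`.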

Governance & status · Informative

ATD is governed by a written charter and published as an open draft and an invitation to co-author. It earns the title standard when it meets a concrete, public neutrality bar adopted from the OpenSSF project lifecycle — at least three maintainers across at least two organizations — and we are actively seating them.

Governance follows a Minimal Viable Governance charter: lazy consensus of maintainers, a simple-majority Technical Steering Committee fallback, charter amendments by a two-thirds vote, and no single-person veto. The specification is CC-BY-4.0 and forkable by design — no party can hold it hostage — with a committed roadmap to a neutral foundation home (an OpenSSF working group, or the OWASP GenAI Security Project).

Call for collaborators — if your team works on agent security, map your tooling to ATD, submit a technique, or take a maintainer seat. See /contribute.

Downloads & artifacts · Informative


Editor: Adam Lin · Specification CC-BY-4.0 · Schema & tooling Apache-2.0 · Rule corpus MIT · 2026-06-13 (ISO 8601) · Editor's Draft, pre-governance; not a ratified standard.