RFC-001 · Quality Standard

The first AI agent rule standard
with provenance tracking

Every rule has a confidence score you can compute yourself. Every mapping has a provenance you can audit. No black boxes, no vendor lock-in — just a public formula, open-source code, and wild-validated data.

34 · ATR rules merged into Cisco AI Defense
96,096 · Real agent skills scanned across 6 registries
99.6% · Precision on PINT adversarial benchmark
100% · Recall on SKILL.md corpus, 0.20% FP rate
The Formula

Confidence is a number, not an opinion

Every component is computed from measurable facts. Run it yourself — the formula is public.

confidence = 0.4 × precision + 0.3 × wild + 0.2 × coverage + 0.1 × evasion
Precision · 40%
(1 − wild_fp_rate) × 100

Measured false-positive rate on real-world corpora.

Wild validation · 30%
min(wild_samples / 10,000, 1) × 100

How much real data the rule has survived.

Coverage · 20%
min(conditions / 5, 1) × 100

Detection depth — distinct attack layers covered.

Evasion docs · 10%
min(documented_evasions / 5, 1) × 100

Honest acknowledgment of known bypass techniques.

90–100 · Very High
Safe to block in production
60–79 · Medium
Alert-only with monitoring
<40 · Draft
Do not deploy
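The formula and component caps above can be expressed as a short sketch. Field names such as `wildFpRate` are illustrative assumptions for this example; the reference implementation is `computeConfidence` in `agent-threat-rules/quality`.

```typescript
// Hypothetical input shape; the real library defines its own rule type.
interface RuleStats {
  wildFpRate: number;         // measured false-positive rate, 0..1
  wildSamples: number;        // real-world samples the rule has survived
  conditions: number;         // distinct detection conditions
  documentedEvasions: number; // known bypass techniques documented
}

// RFC-001 confidence: 0.4·precision + 0.3·wild + 0.2·coverage + 0.1·evasion
function confidence(stats: RuleStats): number {
  const precision = (1 - stats.wildFpRate) * 100;
  const wild = Math.min(stats.wildSamples / 10_000, 1) * 100;
  const coverage = Math.min(stats.conditions / 5, 1) * 100;
  const evasion = Math.min(stats.documentedEvasions / 5, 1) * 100;
  return 0.4 * precision + 0.3 * wild + 0.2 * coverage + 0.1 * evasion;
}

// A rule with a 0.2% wild FP rate, 10,000 samples, 5 conditions, and
// 3 documented evasions: 0.4×99.8 + 0.3×100 + 0.2×100 + 0.1×60 = 95.92
const score = confidence({
  wildFpRate: 0.002,
  wildSamples: 10_000,
  conditions: 5,
  documentedEvasions: 3,
});
```

Note how the caps work: past 10,000 wild samples, 5 conditions, or 5 documented evasions, a component saturates at 100 — more of the same does not inflate the score.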
The Differentiator

Two-dimensional compliance model

An industry first: separating 'does the rule have the metadata?' from 'who verified it?'.

Dimension 1 · Technical compliance

Does the rule have the required metadata? Detection conditions, test cases, OWASP and MITRE references, false positive documentation. Machine-verifiable in under a millisecond.

validateRuleMeetsStandard(rule)
Dimension 2 · Trust compliance

Who verified the metadata? human-reviewed, community-contributed, auto-generated, or llm-generated. Stable promotion requires verified provenance — not just presence.

metadata_provenance: { mitre_atlas: human-reviewed }
Why this matters

Traditional rule standards (Sigma, YARA, OWASP CRS) treat compliance as binary — either the metadata is there or it is not. This creates a perverse incentive: vendors pad metadata to pass the check without doing the underlying review work.

ATR separates the two dimensions. Auto-generated mappings can pass the experimental gate for fast iteration. Stable promotion — the level enterprises block in production — requires human review. Fast iteration and honest trust, at the same time.
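The two dimensions can be sketched as independent checks. Type and field names here are assumptions for illustration, not the library's actual API.

```typescript
type Provenance =
  | 'human-reviewed'
  | 'community-contributed'
  | 'auto-generated'
  | 'llm-generated';

// Hypothetical rule shape for this example.
interface Rule {
  conditions: string[];
  testCases: { tp: number; tn: number };
  owaspRefs: string[];
  mitreRefs: string[];
  fpNotes: string;
  provenance: Record<string, Provenance>; // per-field provenance
}

// Dimension 1: is the required metadata present at all?
function technicallyCompliant(rule: Rule): boolean {
  return (
    rule.conditions.length > 0 &&
    rule.testCases.tp > 0 &&
    rule.testCases.tn > 0 &&
    rule.owaspRefs.length > 0 &&
    rule.mitreRefs.length > 0 &&
    rule.fpNotes.length > 0
  );
}

// Dimension 2: has every mapping been verified by a human?
function trustCompliant(rule: Rule): boolean {
  return Object.values(rule.provenance).every(p => p === 'human-reviewed');
}

// An auto-generated rule can pass dimension 1 while failing dimension 2:
const autoRule: Rule = {
  conditions: ['tool-call pattern', 'payload regex', 'context check'],
  testCases: { tp: 3, tn: 3 },
  owaspRefs: ['LLM01:2025'],
  mitreRefs: ['AML.T0051'],
  fpNotes: 'documented',
  provenance: { mitre_atlas: 'auto-generated' },
};
```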

The Ladder

Every rule has an explicit gate to climb

Promotion requires passing specific, mechanical criteria. Demotion is automatic on quality regression.

Draft
Promotion gate

Valid schema · ≥1 TP + 1 TN · no ReDoS

Deployment guidance

Not deployed

Experimental
Promotion gate

3+ TP + 3+ TN · CI pass · OWASP + MITRE mapping encouraged (not required) · evasion tests encouraged (not required)

Deployment guidance

Alert-only

Stable
Promotion gate

Wild-validated (1,000+ samples) · FP rate ≤ 0.5% · human-verified provenance · ≥3 evasion tests

Deployment guidance

Block in production

Automatic demotion

Stable rules with a wild false positive rate above 2%, or three unresolved false positive reports within 30 days, are automatically demoted to experimental. No human decision required. The system self-corrects.
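The demotion trigger is simple enough to sketch directly from the criteria above. Field names are illustrative assumptions.

```typescript
// Health signals for a stable rule (hypothetical shape).
interface StableRuleHealth {
  wildFpRate: number;             // measured in the wild, 0..1
  unresolvedFpReports30d: number; // open FP reports in the last 30 days
}

// Demote to experimental on FP rate > 2% OR 3+ unresolved reports.
// No human decision in the loop.
function shouldDemote(h: StableRuleHealth): boolean {
  return h.wildFpRate > 0.02 || h.unresolvedFpReports30d >= 3;
}
```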

The Gauntlet

Six stages before a rule reaches production

An LLM-drafted rule passes through six independent verification stages before it ever protects a user. Each stage has mechanical, public criteria.

Stage 1

LLM Drafter

Claude Sonnet generates a YAML rule against a strict prompt requiring 3+ conditions, 5+ TP/TN, 3+ evasion tests, and OWASP + MITRE mapping.

Stage 2

Syntax Gate

Regex extraction, invalid pattern rejection, PCRE-to-JS normalization. Broken rules are dropped with logged reasons.

Stage 3

Quality Gate

The RFC-001 formula runs: detection depth, test coverage, reference mapping, documentation completeness. Below the bar — rejected.

Stage 4

Canary Observation

Accepted rules enter a canary window. Independent confirmations and wild FP measurements gate further promotion.

Stage 5

Human Review

Provenance starts as llm-generated. Human review upgrades to human-reviewed before the rule can reach stable.

Stage 6

Upstream PR

Promoted rules open pull requests against the public ATR repository for open review and merge.
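The six stages can be modeled as a fail-fast pipeline: each stage either passes a candidate through or rejects it with a reason. Stage names come from the list above; the individual checks are illustrative placeholders, not the project's real gate logic.

```typescript
// Hypothetical candidate shape for this sketch.
interface Candidate {
  yaml: string;
  conditions: number;
  evasionTests: number;
  wildFpRate: number | null; // null until the canary window has run
  provenance: string;
}

type Stage = { name: string; pass: (c: Candidate) => boolean };

const gauntlet: Stage[] = [
  { name: 'llm-drafter', pass: c => c.yaml.length > 0 },
  { name: 'syntax-gate', pass: c => !c.yaml.includes('(?R)') }, // e.g. reject recursive PCRE
  { name: 'quality-gate', pass: c => c.conditions >= 3 && c.evasionTests >= 3 },
  { name: 'canary', pass: c => c.wildFpRate !== null && c.wildFpRate <= 0.005 },
  { name: 'human-review', pass: c => c.provenance === 'human-reviewed' },
  { name: 'upstream-pr', pass: () => true }, // PR opened only after all gates pass
];

// Returns the first failing stage, so rejections are logged with a reason.
function runGauntlet(c: Candidate): { passed: boolean; failedAt?: string } {
  for (const stage of gauntlet) {
    if (!stage.pass(c)) return { passed: false, failedAt: stage.name };
  }
  return { passed: true };
}
```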

Live Crystallization Output · Gate Passed · ATR-2026-DRAFT-8f3c9a72

Hidden Credential Exfiltration with Silent Execution Override

severity · critical
5 · Detection layers
5 + 5 · TP + TN cases
3 · Evasion tests
100% · Required fields
OWASP
LLM01:2025 — Prompt Injection
ASI01:2026 — Agent Behaviour Hijack
MITRE ATLAS
AML.T0051 — LLM Prompt Injection
Provenance
llm-generated

Tagged honestly as LLM-generated. Confidence capped at 70 until human review upgrades it.
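The cap described above can be sketched as a clamp on the computed score. The function name is illustrative, not the library's actual API.

```typescript
// An llm-generated rule's confidence is clamped at 70 until human review
// upgrades its provenance to human-reviewed.
function cappedConfidence(raw: number, provenance: string): number {
  const cap = provenance === 'llm-generated' ? 70 : 100;
  return Math.min(raw, cap);
}
```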

The Landscape

How ATR compares to existing rule standards

Sigma, YARA, OWASP CRS, and Suricata solved this for malware, SIEM, WAF, and IDS. Nobody had solved it for AI agents — until now.

Feature · ATR · Sigma · YARA · OWASP CRS · Suricata
Maturity ladder with explicit gates
Formula-based confidence score (0–100)
Wild validation required for production
Per-field provenance tracking
Automatic demotion on quality regression
Open-source reference implementation

ATR is the only standard requiring wild-scan validation with measured FP rates and automatic demotion on quality regression.

Verify It Yourself

Don't trust us — run the validator

Every function is pure, open-source, and documented. Score your own rules — or ours — in under a minute.

Install
npm install agent-threat-rules
Score any rule
import {
  parseATRRule,
  computeConfidence,
  validateRuleMeetsStandard,
} from 'agent-threat-rules/quality';

const rule = parseATRRule(yamlContent);
const score = computeConfidence(rule);
const gate = validateRuleMeetsStandard(rule, 'stable');

console.log('Confidence:', score.total);    // 0-100
console.log('Passes stable:', gate.passed);
console.log('Issues:', gate.issues);

Measurable. Auditable. Open.

The ATR Quality Standard is live, in production, and ready to adopt. Any scanner — ATR, Cisco, Snyk, Microsoft AGT, or yours — can score rules on the same axes with the same library.