The first AI agent rule standard
with provenance tracking
Every rule has a confidence score you can compute yourself. Every mapping has a provenance you can audit. No black boxes, no vendor lock-in — just a public formula, open-source code, and wild-validated data.
Confidence is a number, not an opinion
Every component is computed from measurable facts. Run it yourself — the formula is public.
- Measured false-positive rate on real-world corpora.
- How much real data the rule has survived.
- Detection depth: distinct attack layers covered.
- Honest acknowledgment of known bypass techniques.
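As a minimal sketch, the four components can be combined into a 0-100 score. The weights, saturation curve, and field names below are illustrative assumptions, not the published RFC-001 formula:

```typescript
// Illustrative confidence sketch -- weights and shapes are assumptions,
// not the real RFC-001 formula published in the ATR repository.
interface ConfidenceInputs {
  fpRate: number;        // measured wild false-positive rate, 0..1
  samplesTested: number; // wild samples the rule has survived
  layersCovered: number; // distinct detection layers covered, 0..4
  knownBypasses: number; // documented, unresolved bypass techniques
}

function sketchConfidence(c: ConfidenceInputs): number {
  const precision = Math.max(0, 1 - c.fpRate * 50);                   // FP rate is penalized hard
  const exposure = Math.min(1, Math.log10(1 + c.samplesTested) / 4);  // saturates near 10k samples
  const depth = Math.min(1, c.layersCovered / 4);
  const honesty = 1 / (1 + c.knownBypasses * 0.25);                   // known bypasses cap the score
  return Math.round(
    100 * (0.35 * precision + 0.25 * exposure + 0.25 * depth + 0.15 * honesty)
  );
}
```

The point is the shape, not the constants: every input is a measurable fact, so two parties running the same function on the same rule get the same number.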
Two-dimensional compliance model
An industry first: separating 'does the rule have the metadata' from 'who verified it'.
Does the rule have the required metadata? Detection conditions, test cases, OWASP and MITRE references, false positive documentation. Machine-verifiable in under a millisecond.
`validateRuleMeetsStandard(rule)`

Who verified the metadata? human-reviewed, community-contributed, auto-generated, or llm-generated. Stable promotion requires verified provenance, not just presence.

`metadata_provenance: { mitre_atlas: human-reviewed }`

Traditional rule standards (Sigma, YARA, OWASP CRS) treat compliance as binary: either the metadata is there or it is not. This creates a perverse incentive: vendors pad metadata to pass the check without doing the underlying review work.
ATR separates the two dimensions. Auto-generated mappings can pass the experimental gate for fast iteration. Stable promotion — the level enterprises block in production — requires human review. Fast iteration and honest trust, at the same time.
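A sketch of how the two dimensions might sit side by side in a rule file. Only `metadata_provenance` is taken from the text above; every other field name here is illustrative, not the ATR schema:

```yaml
# Illustrative rule fragment -- field names other than metadata_provenance
# are assumptions for this example, not the ATR schema.
references:
  owasp: LLM01                    # dimension 1: the mapping is present
  mitre_atlas: AML.T0051
metadata_provenance:              # dimension 2: who verified each mapping
  owasp: llm-generated            # enough for the experimental gate
  mitre_atlas: human-reviewed     # required before stable promotion
```

The presence check is mechanical; the provenance check records who did the review work behind each field.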
Every rule has an explicit gate to climb
Promotion requires passing specific, mechanical criteria. Demotion is automatic on quality regression.
| Promotion criteria | Deployment |
|---|---|
| Valid schema · ≥1 TP + 1 TN · no ReDoS | Not deployed |
| 3+ TP + 3+ TN · CI pass · OWASP + MITRE mapping encouraged (not required) · evasion tests encouraged (not required) | Alert-only |
| Wild-validated (1,000+ samples) · FP rate ≤ 0.5% · human-verified provenance · ≥3 evasion tests | Block in production |
Stable rules with a wild false positive rate above 2%, or three unresolved false positive reports within 30 days, are automatically demoted to experimental. No human decision required. The system self-corrects.
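The demotion rule is simple enough to state as code. A minimal sketch, with field names assumed for illustration; only the two thresholds (2% wild FP rate, three unresolved reports in 30 days) come from the text above:

```typescript
// Automatic demotion check for a stable rule.
// Field names are illustrative; the 0.02 and 3 thresholds are from the standard.
interface StableRuleStats {
  wildFpRate: number;           // measured wild false-positive rate, 0..1
  unresolvedFpReports30d: number; // FP reports left unresolved within 30 days
}

function shouldDemoteToExperimental(stats: StableRuleStats): boolean {
  // Either trigger demotes the rule -- no human decision required.
  return stats.wildFpRate > 0.02 || stats.unresolvedFpReports30d >= 3;
}
```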
Six stages before a rule reaches production
An LLM-drafted rule passes through six independent stages before it ever protects a user. Each stage has mechanical, public criteria.
1. LLM Drafter: Claude Sonnet generates a YAML rule against a strict prompt requiring 3+ conditions, 5+ TP/TN, 3+ evasion tests, and OWASP + MITRE mapping.
2. Syntax Gate: regex extraction, invalid-pattern rejection, PCRE-to-JS normalization. Broken rules are dropped with logged reasons.
3. Quality Gate: the RFC-001 formula scores detection depth, test coverage, reference mapping, and documentation completeness. Below the bar, the rule is rejected.
4. Canary Observation: accepted rules enter a canary window. Independent confirmations and wild FP measurements gate further promotion.
5. Human Review: provenance starts as llm-generated. Human review upgrades it to human-reviewed before the rule can reach stable.
6. Upstream PR: promoted rules open pull requests against the public ATR repository for open review and merge.
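Four of the six stages act as go/no-go gates (the drafter and the upstream PR bracket them). A minimal sketch of that gate chain; the predicates and the quality threshold of 60 are assumptions, not the real checks:

```typescript
// Sketch of the gate chain between drafting and the upstream PR.
// Stage names follow the text; the predicates are illustrative placeholders.
interface DraftRule {
  syntaxValid: boolean;     // survived regex extraction and normalization
  qualityScore: number;     // RFC-001 formula output, 0..100
  canaryConfirmed: boolean; // independent confirmations in the canary window
  humanReviewed: boolean;   // provenance upgraded from llm-generated
}

const gates: Array<{ name: string; pass: (r: DraftRule) => boolean }> = [
  { name: "Syntax Gate", pass: r => r.syntaxValid },
  { name: "Quality Gate", pass: r => r.qualityScore >= 60 }, // assumed threshold
  { name: "Canary Observation", pass: r => r.canaryConfirmed },
  { name: "Human Review", pass: r => r.humanReviewed },
];

// Returns the first failing gate, or null if the rule clears all of them.
function firstFailure(rule: DraftRule): string | null {
  for (const gate of gates) {
    if (!gate.pass(rule)) return gate.name;
  }
  return null;
}
```

Because each gate is a pure predicate, a rejection always names the exact stage and reason, which is what makes the "logged reasons" behavior mechanical rather than editorial.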
ATR-2026-DRAFT-8f3c9a72: Hidden Credential Exfiltration with Silent Execution Override

llm-generated: tagged honestly as LLM-generated. Confidence capped at 70 until human review upgrades it.
How ATR compares to existing rule standards
Sigma, YARA, OWASP CRS, and Suricata solved this for malware, SIEM, WAF, and IDS. Nobody had solved it for AI agents — until now.
| Feature | ATR | Sigma | YARA | OWASP CRS | Suricata |
|---|---|---|---|---|---|
| Maturity ladder with explicit gates | ✓ | ✓ | — | ✓ | ✓ |
| Formula-based confidence score (0–100) | ✓ | — | — | — | ~ |
| Wild validation required for production | ✓ | — | — | — | — |
| Per-field provenance tracking | ✓ | — | — | — | — |
| Automatic demotion on quality regression | ✓ | — | — | — | — |
| Open-source reference implementation | ✓ | ✓ | ✓ | ✓ | ✓ |
ATR is the only standard requiring wild-scan validation with measured FP rates and automatic demotion on quality regression.
Don't trust us — run the validator
Every function is pure, open-source, and documented. Score your own rules — or ours — in under a minute.
```shell
npm install agent-threat-rules
```

```typescript
import {
  parseATRRule,
  computeConfidence,
  validateRuleMeetsStandard,
} from 'agent-threat-rules/quality';

const rule = parseATRRule(yamlContent);
const score = computeConfidence(rule);
const gate = validateRuleMeetsStandard(rule, 'stable');

console.log('Confidence:', score.total); // 0-100
console.log('Passes stable:', gate.passed);
console.log('Issues:', gate.issues);
```

Measurable. Auditable. Open.
The ATR Quality Standard is live, in production, and ready to adopt. Any scanner — ATR, Cisco, Snyk, Microsoft AGT, or yours — can score rules on the same axes with the same library.