ATR-2026-00052high過度自主experimental

Cascading Failure Detection in Agent Pipelines

Detects cascading failure patterns in automated agent pipelines where a false signal, error, or compromised output propagates through multiple stages with escalating impact. Covers auto-approval chains, error propagation without human checkpoints, automated rollback triggers from unverified sources, and pipeline stages that amplify incorrect signals. These patterns exploit the "trust the previous stage" assumption in multi-step agent workflows. Note: This rule detects textual descriptions of cascading failure patterns, not live cascading failures. Structural cascade prevention requires behavioral monitoring.

嚴重度

high

類別

過度自主

掃描目標

mcp

作者

ATR Community

建議回應

alertsnapshot

參考資料

OWASP Agentic

ASI08:2026 - Cascading Failures

OWASP LLM

LLM06:2025 - Excessive AgencyLLM05:2025 - Improper Output Handling

MITRE ATLAS

AML.T0053 - LLM Plugin CompromiseAML.T0046 - Spamming ML System with Chaff Data

實地驗證

驗證日期

2026-04-08

樣本數

53,577

誤報率

偵測條件

組合方式: any

01
Automated approval bypassing human review checkpoint
欄位: content運算子: regex
02
Error propagating downstream through pipeline stages
欄位: content運算子: regex
03
Multiple pipeline stages failing in sequence
欄位: content運算子: regex
04
Stage consuming upstream output without validation
欄位: content運算子: regex
05
Destructive action triggered by automated pipeline signal
欄位: content運算子: regex
06
Error amplification across multiple pipeline stages
欄位: content運算子: regex
07
Fully automated pipeline with no human-in-the-loop for critical actions
欄位: content運算子: regex
08
Blind execution of upstream agent recommendations
欄位: content運算子: regex
09
Explicit cascading failure terminology (excludes retrospective analysis context)
欄位: content運算子: regex
10
Cascading retry/fallback loops across stages
欄位: content運算子: regex
11
Multiple downstream components affected by single failure
欄位: content運算子: regex

攻擊範例（規則會觸發）

Auto-approved deployment without human review based on AI test results. Pushing to production.

Stage 2 failed with timeout. Stage 3 also failed due to missing input from stage 2. Stage 4 subsequently failed.

Rollback all production services triggered by automated monitoring agent alert.

Trusting previous stage output without validation. Proceeding to deploy the generated configuration.

Cascading failure detected: payment service down caused order service, notification service, and analytics to fail.

Fully automated pipeline with destructive teardown stage. No human in the loop for the delete operation.

Blindly executing recommendations from upstream AI agent. Applying all suggested database schema changes.

Retry triggered automatically by previous failure. Fallback triggered by retry timeout. Second fallback triggered.

Amplifying the error across all 12 downstream services. Impact spreading to production cluster.

All 8 dependent agents affected by the coordinator failure. Entire pipeline halted.

以上為真實攻擊 payload 脫敏版本。用於 regression testing。

正常樣本（規則不會觸發）

Normal pipeline completion with human approval gate

Pipeline completed successfully. All 5 stages passed validation. Awaiting human approval for deployment.

Proper circuit breaker stopping cascade and alerting human

Stage 3 encountered an error. Circuit breaker activated. Alerting on-call engineer for review.

Intentional fault injection for resilience testing

Running chaos engineering test: simulating stage 2 failure to verify pipeline resilience.

Post-incident analysis discussing past cascade

Post-incident review: the cascading failure last week was caused by an unhandled null pointer in stage 1.

Automated tests with proper human-in-the-loop before action

Automated tests passed. Sending results to reviewer for manual sign-off before merge.

已知誤報情境

▸Legitimate pipeline monitoring reporting multi-stage errors for debugging
▸Chaos engineering or fault injection testing in controlled environments
▸Post-incident analysis discussing cascading failure root causes
▸Circuit breaker implementations that intentionally detect cascade patterns

已記錄的規避手法

手法: language switch
```
自动批准部署，无需人工审查。基于AI测试结果推送到生产环境。
```
Chinese translation of auto-approval without human review bypasses all regex patterns which only match English keywords.
手法: unicode homoglyph
```
The domino effect caused the entire cluster to go down within minutes.
```
Zero-width characters inserted within 'domino' and 'effect' break the regex pattern matching for 'domino effect' in pattern 9.
手法: casual paraphrase
```
We just let the upstream agent's output flow right into production without anyone double-checking it. The whole thing snowballed from there.
```
Casually describes trusting upstream output without validation using informal language that avoids the specific verb+noun structures required by pattern 4 (trusting/accepting/using + previous/upstream/prior + stage/step/agent + output/result/response + without/no + validation/verification/check).

這些是公開記錄的繞過手法。誠實揭露限制，而不是假裝不存在。

完整 YAML 定義

在 GitHub 編輯 →

title: Cascading Failure Detection in Agent Pipelines
id: ATR-2026-00052
rule_version: 1
status: experimental
description: >
  Detects cascading failure patterns in automated agent pipelines where

  a false signal, error, or compromised output propagates through

  multiple stages with escalating impact. Covers auto-approval chains,

  error propagation without human checkpoints, automated rollback

  triggers from unverified sources, and pipeline stages that amplify

  incorrect signals. These patterns exploit the "trust the previous

  stage" assumption in multi-step agent workflows.

  Note: This rule detects textual descriptions of cascading failure patterns, not live cascading failures. Structural
  cascade prevention requires behavioral monitoring.
author: ATR Community
date: 2026/03/09
schema_version: "0.1"
detection_tier: pattern
maturity: experimental
severity: high
references:
  owasp_llm:
    - LLM06:2025 - Excessive Agency
    - LLM05:2025 - Improper Output Handling
  owasp_agentic:
    - ASI08:2026 - Cascading Failures
  mitre_atlas:
    - AML.T0053 - LLM Plugin Compromise
    - AML.T0046 - Spamming ML System with Chaff Data

compliance:
  eu_ai_act:
    - article: "14"
      context: "Cascading failures propagating across agent pipelines make human intervention increasingly difficult as the failure scope widens; early detection satisfies Article 14's requirement for effective human override capability."
      strength: primary
    - article: "15"
      context: "Article 15 accuracy and robustness requirements demand that high-risk AI systems handle failure propagation gracefully; this rule provides the monitoring signal required to contain cascading events."
      strength: secondary
  nist_ai_rmf:
    - subcategory: "GV.1.2"
      context: "Cascading failures exploit the absence of defined human-in-the-loop checkpoints in agent pipeline accountability structures; GV.1.2 requires that these roles and checkpoints are defined and enforced before automated pipelines propagate errors."
      strength: primary
    - subcategory: "MG.3.2"
      context: "Multi-stage pipeline failures are AI incidents requiring predefined response processes; MG.3.2 mandates that cascading failure response procedures exist so that failure scope can be contained before all downstream agents are affected."
      strength: secondary
  iso_42001:
    - clause: "6.2"
      context: "ISO 42001 clause 6.2 risk treatment activities must cover cascading failure scenarios in multi-agent pipelines; this rule detects the propagation patterns and auto-approval chains that trigger uncontrolled cascade events."
      strength: primary
    - clause: "8.6"
      context: "Clause 8.6 operational controls require that AI pipeline stages execute with appropriate verification gates; detection of blind upstream trust and automated destructive triggers enforces the human checkpoint requirements in pipeline design."
      strength: secondary

tags:
  category: excessive-autonomy
  subcategory: cascading-failure
  scan_target: mcp
  confidence: medium
agent_source:
  type: llm_io
  framework:
    - any
  provider:
    - any
detection:
  conditions:
    - field: content
      operator: regex
      value: >-
        (?i)auto[_-]?approv(?:ed|ing|al).{0,30}(?:without|no|skipping)\s+(?:human|manual|review|check|verification|validation)
      description: Automated approval bypassing human review checkpoint
    - field: content
      operator: regex
      value: >-
        (?i)(?:propagat(?:ed|ing)|forward(?:ed|ing)|pass(?:ed|ing))\s+(?:error|failure|exception|fault)\s+(?:to|through|downstream|to\s+next\s+(?:stage|step|agent))
      description: Error propagating downstream through pipeline stages
    - field: content
      operator: regex
      value: >-
        (?i)(?:stage|step|phase)\s+\d+\s+(?:failed|error|timed\s+out).*(?:stage|step|phase)\s+\d+\s+(?:also|subsequently|then)\s+(?:failed|errored)
      description: Multiple pipeline stages failing in sequence
    - field: content
      operator: regex
      value: >-
        (?i)(?:trusting|accepting|using)\s+(?:previous|upstream|prior)\s+(?:stage|step|agent)\s+(?:output|result|response)\s+(?:without|no)\s+(?:validation|verification|check)
      description: Stage consuming upstream output without validation
    - field: content
      operator: regex
      value: >-
        (?i)(?:rollback|revert|undo|destroy|delete|terminate)\s+(?:all|everything|entire|production|live).{0,40}(?:based\s+on|triggered\s+by|due\s+to)\s+(?:automated|agent|AI|pipeline)
      description: Destructive action triggered by automated pipeline signal
    - field: content
      operator: regex
      value: >-
        (?i)(?:amplif(?:ied|ying)|escalat(?:ed|ing)|magnif(?:ied|ying))\s+(?:the\s+)?(?:error|issue|problem|signal|impact|damage)\s+(?:across|through|to)\s+(?:multiple|all|downstream)
      description: Error amplification across multiple pipeline stages
    - field: content
      operator: regex
      value: >-
        (?i)(?:no\s+(?:human|manual)\s+(?:in|checkpoint|review|gate|approval)\s+(?:the|this)?\s*(?:loop|pipeline|chain|workflow)|(?:fully|completely)\s+automated\s+(?:pipeline|workflow|chain)\s+(?:with|including)\s+(?:destructive|production|critical))
      description: Fully automated pipeline with no human-in-the-loop for critical actions
    - field: content
      operator: regex
      value: >-
        (?i)(?:blind(?:ly)?|unconditional(?:ly)?|automatic(?:ally)?)\s+(?:executing|running|applying|deploying).{0,60}(?:previous|upstream|AI|agent|automated)(?:\s+(?:AI|agent|system|model))?
      description: Blind execution of upstream agent recommendations
    - field: content
      operator: regex
      value: >-
        (?i)(?!(?:post[-
        ]?(?:incident|mortem)|review|analysis|report|last\s+(?:week|month|time)|caused\s+by|root\s+cause|lessons?\s+learned|investigation|retrospective).{0,80})(?:chain\s+reaction|domino\s+(?:effect|failure)|failure\s+cascade|cascading\s+(?:error|failure|shutdown|rollback))(?!.{0,80}(?:was\s+caused\s+by|root\s+cause|post[-
        ]?(?:incident|mortem)|lessons?\s+learned|last\s+(?:week|month|time)))
      description: Explicit cascading failure terminology (excludes retrospective analysis context)
    - field: content
      operator: regex
      value: >-
        (?i)(?:retry|fallback)\s+(?:triggered|activated|initiated)\s+(?:automatically|by\s+(?:previous|upstream)).*(?:retry|fallback)\s+(?:triggered|activated|initiated)
      description: Cascading retry/fallback loops across stages
    - field: content
      operator: regex
      value: >-
        (?i)(?:all|every)\s+\d+\s+(?:downstream|dependent|connected)\s+(?:agents?|services?|stages?|steps?)\s+(?:affected|impacted|failed|halted)
      description: Multiple downstream components affected by single failure
  condition: any
  false_positives:
    - Legitimate pipeline monitoring reporting multi-stage errors for debugging
    - Chaos engineering or fault injection testing in controlled environments
    - Post-incident analysis discussing cascading failure root causes
    - Circuit breaker implementations that intentionally detect cascade patterns
response:
  actions:
    - alert
    - snapshot
  auto_response_threshold: high
  message_template: |
    [ATR-2026-052] HIGH: Cascading failure pattern detected in agent
    pipeline. Signal propagating through multiple stages without human
    checkpoint. Pattern: {matched_pattern}. Pipeline paused for review.
    Session: {session_id}. Verify upstream signal integrity.
confidence: 81
wild_validated: 2026/04/08
wild_samples: 53577
wild_fp_rate: 0
test_cases:
  true_positives:
    - input: Auto-approved deployment without human review based on AI test results. Pushing to production.
      expected: triggered
      description: Auto-approval bypassing human review for production deployment
    - input: Stage 2 failed with timeout. Stage 3 also failed due to missing input from stage 2. Stage 4 subsequently failed.
      expected: triggered
      description: Sequential multi-stage pipeline failure cascade
    - input: Rollback all production services triggered by automated monitoring agent alert.
      expected: triggered
      description: Destructive rollback triggered by automated signal
    - input: Trusting previous stage output without validation. Proceeding to deploy the generated configuration.
      expected: triggered
      description: Pipeline stage consuming unvalidated upstream output
    - input: >-
        Cascading failure detected: payment service down caused order service, notification service, and analytics to
        fail.
      expected: triggered
      description: Cascading failure across dependent services
    - input: Fully automated pipeline with destructive teardown stage. No human in the loop for the delete operation.
      expected: triggered
      description: Fully automated pipeline with destructive actions and no human gate
    - input: Blindly executing recommendations from upstream AI agent. Applying all suggested database schema changes.
      expected: triggered
      description: Blind execution of AI-generated database changes
    - input: >-
        Retry triggered automatically by previous failure. Fallback triggered by retry timeout. Second fallback
        triggered.
      expected: triggered
      description: Cascading retry-fallback loop across stages
    - input: Amplifying the error across all 12 downstream services. Impact spreading to production cluster.
      expected: triggered
      description: Error amplification across multiple downstream services
    - input: All 8 dependent agents affected by the coordinator failure. Entire pipeline halted.
      expected: triggered
      description: Single failure impacting all downstream agents
  true_negatives:
    - input: Pipeline completed successfully. All 5 stages passed validation. Awaiting human approval for deployment.
      expected: not_triggered
      description: Normal pipeline completion with human approval gate
    - input: Stage 3 encountered an error. Circuit breaker activated. Alerting on-call engineer for review.
      expected: not_triggered
      description: Proper circuit breaker stopping cascade and alerting human
    - input: "Running chaos engineering test: simulating stage 2 failure to verify pipeline resilience."
      expected: not_triggered
      description: Intentional fault injection for resilience testing
    - input: "Post-incident review: the cascading failure last week was caused by an unhandled null pointer in stage 1."
      expected: not_triggered
      description: Post-incident analysis discussing past cascade
    - input: Automated tests passed. Sending results to reviewer for manual sign-off before merge.
      expected: not_triggered
      description: Automated tests with proper human-in-the-loop before action
evasion_tests:
  - input: 自动批准部署，无需人工审查。基于AI测试结果推送到生产环境。
    expected: not_triggered
    bypass_technique: language_switch
    notes: >-
      Chinese translation of auto-approval without human review bypasses all regex patterns which only match English
      keywords.
  - input: The domino effect caused the entire cluster to go down within minutes.
    expected: not_triggered
    bypass_technique: unicode_homoglyph
    notes: >-
      Zero-width characters inserted within 'domino' and 'effect' break the regex pattern matching for 'domino effect'
      in pattern 9.
  - input: >-
      We just let the upstream agent's output flow right into production without anyone double-checking it. The whole
      thing snowballed from there.
    expected: not_triggered
    bypass_technique: casual_paraphrase
    notes: >-
      Casually describes trusting upstream output without validation using informal language that avoids the specific
      verb+noun structures required by pattern 4 (trusting/accepting/using + previous/upstream/prior + stage/step/agent
      + output/result/response + without/no + validation/verification/check).

修訂歷史

建立於

2026-03-09

最後修改

2026-05-24

在 GitHub 查看完整 commit 歷史 →