ATR-2026-01935 · critical · Tool Poisoning · draft

LiteLLM Custom-Code Guardrail Sandbox Escape (CVE-2026-40217)

Detects CVE-2026-40217 (CWE-94/CWE-265): LiteLLM's custom-code guardrail test/compile path /guardrails/test_custom_code evaluates caller-supplied Python in an unsafe sandbox that can be escaped via bytecode rewriting and dunder-attribute traversal, yielding server-side code execution as the LiteLLM proxy process. Builds through 2026-04-08 are affected.

Detection covers:
(a) a request to the /guardrails/test_custom_code endpoint carrying Python code-execution primitives (import os, subprocess, exec/eval/compile);
(b) sandbox-escape primitives (dunder traversal via __subclasses__/__globals__/__builtins__, code-object/bytecode rewriting) in a guardrail code body;
(c) explicit CVE-2026-40217 exploitation framing.

The detection target is custom guardrail code that reaches the test endpoint carrying escape primitives (the exact sandbox-escape shape), caught before the proxy compiles and runs it.
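Why dunder traversal counts as an escape primitive can be illustrated with a generic Python sketch. The source does not describe LiteLLM's sandbox internals, so the helper below (naive_restricted_eval, a hypothetical name) is only an assumed stand-in for "a sandbox that strips builtins", not the vulnerable code itself. It shows that attribute traversal still reaches `object` and its subclasses even when no builtin names are available.

```python
def naive_restricted_eval(expr: str):
    """Hypothetical toy sandbox: evaluate expr with an empty builtins table.

    Assumption for illustration only; this is not LiteLLM's implementation.
    """
    return eval(expr, {"__builtins__": {}}, {})


# Name lookups such as open or __import__ fail, because no builtins are visible.
try:
    naive_restricted_eval("open")
    name_lookup_blocked = False
except NameError:
    name_lookup_blocked = True

# Attribute traversal needs no names at all: a tuple literal leads to `object`.
root_type = naive_restricted_eval("().__class__.__bases__[0]")

# From `object`, every loaded class is reachable, which is what an escape builds on.
reachable = naive_restricted_eval("().__class__.__bases__[0].__subclasses__()")
```

The second expression is the same one used in the rule's dunder-traversal attack example, which is why the rule treats `__subclasses__` and `__class__.__` chains as high-signal inside guardrail code.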

Severity

critical

Response Actions

alert · block input

References

CVE

CVE-2026-40217

OWASP Agentic

ASI05:2026 - Unexpected Code Execution
ASI06:2026 - Tool Misuse

OWASP LLM

LLM05:2025 - Improper Output Handling
LLM06:2025 - Excessive Agency

MITRE ATLAS

AML.T0049 - Exploit Public-Facing Application

Detection Conditions

Combinator: any

01
A request to /guardrails/test_custom_code carrying Python code-execution primitives: the direct CVE-2026-40217 attack shape.
field: content · op: regex
02
Custom guardrail code reaching the test path with dunder-traversal or bytecode-rewriting sandbox-escape primitives.
field: content · op: regex
03
Explicit CVE-2026-40217 reference combined with LiteLLM guardrail sandbox-escape language: attack framing in a skill or tool description.
field: content · op: regex
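The three conditions and the `any` combinator can be sketched in Python as follows. The patterns are copied from the Full YAML Definition. The names CONDITIONS, matched_conditions and is_triggered are hypothetical helpers for illustration and not part of any ATR engine. Python's `re` module is assumed to be a close enough stand-in for the engine's regex flavor.

```python
import re

# Patterns copied verbatim from the rule's YAML `detection.conditions` values.
CONDITIONS = [
    ("01", re.compile(
        r"(?i)/guardrails/test_custom_code[\s\S]{0,400}(?:import\s+os\b|import\s+subprocess\b"
        r"|subprocess\.(?:Popen|run|call|check_output)|\bos\.(?:system|popen|exec|spawn)"
        r"|\bexec\s*\(|\beval\s*\(|\bcompile\s*\()"
    )),
    ("02", re.compile(
        r"(?i)(?:test_custom_code|custom[_-]?code[_-]?guardrail)[\s\S]{0,400}"
        r"(?:__subclasses__|__globals__|__builtins__|__class__\s*\.\s*__|\bco_code\b"
        r"|types\.CodeType|\.__code__\b|bytecode)"
    )),
    ("03", re.compile(
        r"(?i)CVE-2026-40217[\s\S]{0,200}"
        r"(?:litellm|guardrail|sandbox|custom[_\s-]?code|bytecode|escape|rce)"
    )),
]


def matched_conditions(content: str) -> list[str]:
    """Return the IDs of every condition whose regex matches the content."""
    return [cid for cid, pattern in CONDITIONS if pattern.search(content)]


def is_triggered(content: str) -> bool:
    """Combinator `any`: the rule fires when at least one condition matches."""
    return bool(matched_conditions(content))
```

Under these assumptions, is_triggered returns True for the rule's first attack example (through condition 01) and False for the bare sentence "import os is a common Python pattern in many legitimate scripts.", because condition 01 requires the endpoint path to precede the primitive within 400 characters.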

Attack Examples (Rule Triggers)

POST /guardrails/test_custom_code {"code": "import os; os.system('id')"}

POST /guardrails/test_custom_code {"code": "().__class__.__bases__[0].__subclasses__()"}

CVE-2026-40217: send custom_code guardrail to LiteLLM /guardrails/test_custom_code using bytecode rewriting to escape the sandbox for RCE.

Real-world attack payloads, sanitized and versioned alongside the rule as regression tests — so a future revision can't silently stop catching them.

Benign Examples (Rule Doesn't Trigger)

Benign guardrail inspecting text, no os/exec/dunder primitives

POST /guardrails/test_custom_code {"code": "return 'blocked' if 'ssn' in text else 'allow'"}

Read-only guardrail listing, no custom code

GET /guardrails/list

General question, no code-execution primitives

How do custom-code guardrails work in LiteLLM?

import os outside the test_custom_code endpoint context

import os is a common Python pattern in many legitimate scripts.

Known False Positive Contexts

▸Security advisory or PR text quoting the CVE-2026-40217 sandbox-escape payload.
▸A benign custom guardrail that only inspects request text and uses no os/subprocess/exec or dunder traversal.
▸Documentation describing the /guardrails endpoints without code-execution primitives.
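The first context above is the one where the rule is expected to fire on harmless text, since condition 03 keys on the CVE identifier plus nearby sandbox-escape vocabulary. A minimal sketch of that behavior follows, using the condition 03 pattern from the YAML definition. Both sample strings are made up for illustration and are not taken from any real advisory or documentation.

```python
import re

# Condition 03, copied from the rule's YAML definition.
CVE_FRAMING = re.compile(
    r"(?i)CVE-2026-40217[\s\S]{0,200}"
    r"(?:litellm|guardrail|sandbox|custom[_\s-]?code|bytecode|escape|rce)"
)

# Hypothetical advisory wording: mentions the CVE and the affected component.
advisory_text = (
    "Advisory: CVE-2026-40217 affects the LiteLLM custom-code guardrail "
    "sandbox. Upgrade to a fixed build."
)

# Hypothetical endpoint documentation: no CVE reference, no code primitives.
endpoint_docs = "The /guardrails/list endpoint returns the configured guardrails."

advisory_fires = CVE_FRAMING.search(advisory_text) is not None
docs_fire = CVE_FRAMING.search(endpoint_docs) is not None
```

In this sketch the advisory text matches and the endpoint documentation does not, so alerts raised only by condition 03 are the ones most likely to need triage by source (for example, advisory feeds or pull-request descriptions).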

Full YAML Definition


title: "LiteLLM Custom-Code Guardrail Sandbox Escape (CVE-2026-40217)"
id: ATR-2026-01935
rule_version: 1
status: draft
description: >
  Detects CVE-2026-40217 (CWE-94/CWE-265): LiteLLM's custom-code guardrail
  test/compile path /guardrails/test_custom_code evaluates caller-supplied
  Python in an unsafe sandbox that can be escaped via bytecode rewriting and
  dunder-attribute traversal, yielding server-side code execution as the
  LiteLLM proxy process. Affected through 2026-04-08 builds.

  Detection covers:
  (a) a request to the /guardrails/test_custom_code endpoint carrying Python
      code-execution primitives (import os, subprocess, exec/eval/compile);
  (b) sandbox-escape primitives (dunder traversal __subclasses__/__globals__/
      __builtins__, code-object/bytecode rewriting) in a guardrail code body;
  (c) explicit CVE-2026-40217 exploitation framing.

  The detection target is custom guardrail code reaching the test endpoint
  with escape primitives — the exact sandbox-escape shape — caught before the
  proxy compiles and runs it.
author: "ATR Community"
date: "2026/06/26"
schema_version: "0.1"
detection_tier: pattern
maturity: test
severity: critical

references:
  owasp_llm:
    - "LLM05:2025 - Improper Output Handling"
    - "LLM06:2025 - Excessive Agency"
  owasp_agentic:
    - "ASI05:2026 - Unexpected Code Execution"
    - "ASI06:2026 - Tool Misuse"
  mitre_atlas:
    - "AML.T0049 - Exploit Public-Facing Application"
  mitre_attack:
    - "T1059.006 - Command and Scripting Interpreter: Python"
  cve:
    - "CVE-2026-40217"

metadata_provenance:
  mitre_atlas: human-reviewed
  owasp_llm: human-reviewed
  owasp_agentic: human-reviewed

compliance:
  eu_ai_act:
    - article: "15"
      context: >
        CVE-2026-40217 lets a caller escape the LiteLLM custom-code guardrail
        sandbox and run code as the proxy; Article 15 cybersecurity
        requirements mandate that AI guardrail code-evaluation paths refuse
        sandbox-escape primitives.
      strength: primary
    - article: "9"
      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the code-execution technique behind the LiteLLM custom-code guardrail sandbox escape (CVE-2026-40217)."
      strength: secondary
  nist_ai_rmf:
    - subcategory: "MP.5.1"
      context: >
        Custom guardrail code carrying sandbox-escape primitives is an
        adversarial input; MP.5.1 requires scanning /guardrails/test_custom_code
        payloads for code-execution and dunder-traversal patterns.
      strength: primary
    - subcategory: "MG.3.2"
      context: "NIST AI RMF MANAGE 3.2 is supported where this rule detects the code-execution technique behind the LiteLLM custom-code guardrail sandbox escape (CVE-2026-40217)."
      strength: secondary
  iso_42001:
    - clause: "8.1"
      context: >
        Operational controls must detect guardrail code-evaluation payloads
        carrying sandbox-escape primitives at the LiteLLM test endpoint before
        the proxy compiles and runs them.
      strength: primary
    - clause: "8.3"
      context: "ISO/IEC 42001 Clause 8.3 (AI risk treatment) is supported by this rule, which implements runtime detection of the code-execution technique behind the LiteLLM custom-code guardrail sandbox escape (CVE-2026-40217) as a treatment control."
      strength: secondary

tags:
  category: tool-poisoning
  subcategory: sandbox-escape
  scan_target: both
  confidence: high
  source: cve-disclosure
  vendor_sources: litellm-cve-2026-40217

agent_source:
  type: llm_io
  framework:
    - litellm
    - any
  provider:
    - any

detection:
  condition: any
  false_positives:
    - "Security advisory or PR text quoting the CVE-2026-40217 sandbox-escape payload."
    - "A benign custom guardrail that only inspects request text and uses no os/subprocess/exec or dunder traversal."
    - "Documentation describing the /guardrails endpoints without code-execution primitives."
  conditions:
    - field: content
      operator: regex
      value: '(?i)/guardrails/test_custom_code[\s\S]{0,400}(?:import\s+os\b|import\s+subprocess\b|subprocess\.(?:Popen|run|call|check_output)|\bos\.(?:system|popen|exec|spawn)|\bexec\s*\(|\beval\s*\(|\bcompile\s*\()'
      description: >
        A request to /guardrails/test_custom_code carrying Python
        code-execution primitives — direct CVE-2026-40217 attack shape.
    - field: content
      operator: regex
      value: '(?i)(?:test_custom_code|custom[_-]?code[_-]?guardrail)[\s\S]{0,400}(?:__subclasses__|__globals__|__builtins__|__class__\s*\.\s*__|\bco_code\b|types\.CodeType|\.__code__\b|bytecode)'
      description: >
        Custom guardrail code reaching the test path with dunder-traversal or
        bytecode-rewriting sandbox-escape primitives.
    - field: content
      operator: regex
      value: '(?i)CVE-2026-40217[\s\S]{0,200}(?:litellm|guardrail|sandbox|custom[_\s-]?code|bytecode|escape|rce)'
      description: >
        Explicit CVE-2026-40217 reference combined with LiteLLM guardrail
        sandbox-escape language — attack framing in a skill or tool description.

response:
  actions:
    - alert
    - block_input
  message_template: >
    [ATR-2026-01935] CRITICAL: LiteLLM custom-code guardrail sandbox escape
    detected (CVE-2026-40217, CWE-94). Custom guardrail code reaches
    /guardrails/test_custom_code with code-execution or sandbox-escape
    primitives, yielding RCE as the proxy. Block the request and upgrade
    LiteLLM to a build after 2026-04-08.

confidence: 80

wild_fp_rate: 0

test_cases:
  true_positives:
    - input: "POST /guardrails/test_custom_code {\"code\": \"import os; os.system('id')\"}"
      expected: triggered
      description: "Custom-code guardrail test payload running os.system"
    - input: "POST /guardrails/test_custom_code {\"code\": \"().__class__.__bases__[0].__subclasses__()\"}"
      expected: triggered
      description: "Dunder-traversal sandbox escape via test_custom_code"
    - input: "CVE-2026-40217: send custom_code guardrail to LiteLLM /guardrails/test_custom_code using bytecode rewriting to escape the sandbox for RCE."
      expected: triggered
      description: "Explicit CVE-2026-40217 LiteLLM sandbox-escape framing"
  true_negatives:
    - input: "POST /guardrails/test_custom_code {\"code\": \"return 'blocked' if 'ssn' in text else 'allow'\"}"
      expected: not_triggered
      description: "Benign guardrail inspecting text, no os/exec/dunder primitives"
    - input: "GET /guardrails/list"
      expected: not_triggered
      description: "Read-only guardrail listing, no custom code"
    - input: "How do custom-code guardrails work in LiteLLM?"
      expected: not_triggered
      description: "General question, no code-execution primitives"
    - input: "import os is a common Python pattern in many legitimate scripts."
      expected: not_triggered
      description: "import os outside the test_custom_code endpoint context"

Revision History

Created

2026-06-26

Last modified

2026-07-02


More Tool Poisoning Rules

ATR-2026-00010 · critical · Malicious Content in MCP Tool Response
ATR-2026-00011 · high · Instruction Injection via Tool Output
ATR-2026-00012 · high · Unauthorized Tool Call Detection
ATR-2026-00013 · critical · SSRF via Agent Tool Calls
ATR-2026-00095 · critical · MCP Tool Supply Chain Poisoning