Skip to content
ATR-2026-00110criticalPrivilege Escalationexperimental

Remote Code Execution via eval() and Dynamic Code Injection

Detects tools or agent instructions that invoke eval(), Function(), vm.runInNewContext(), or similar dynamic code execution primitives. These functions allow arbitrary code execution within the agent runtime, enabling an attacker to break out of sandboxed tool contexts, access the host process, or pivot to child_process for full system compromise.

Severity
critical
Category
Privilege Escalation
Scan Target
mcp
Author
ATR Community

Response Actions

block toolalertsnapshot

References

OWASP Agentic
ASI05:2026 - Unexpected Code Execution

Wild Validation

Validated
2026-04-08
Samples
53,577
False Positive Rate
0%

Detection Conditions

Combinator: any
  1. 01
    Direct eval() invocation for dynamic code execution
    field: contentop: regex
  2. 02
    new Function() constructor used to compile and execute arbitrary code
    field: contentop: regex
  3. 03
    Node.js vm module methods for running code in isolated or shared contexts
    field: contentop: regex
  4. 04
    Requiring child_process module for shell command execution
    field: contentop: regex
  5. 05
    Dynamic import of child_process module
    field: contentop: regex
  6. 06
    Low-level process binding or Reflect.construct with Function for sandbox escape
    field: contentop: regex

Attack Examples (Rule Triggers)

  1. This tool evaluates user expressions by calling eval(userInput) to compute results dynamically.
  2. Creates a new Function('return ' + code) to parse and execute the provided expression.
  3. Uses vm.runInNewContext(untrustedCode, sandbox) to execute plugin scripts.
  4. The tool calls require('child_process').exec(command) to run system commands.
  5. To dynamically load plugins, the agent uses import('child_process') and then spawns a shell to compile extensions.

Real-world attack payloads (sanitized). Used for regression testing.

Benign Examples (Rule Doesn't Trigger)

  1. Safe math evaluation without eval()
    This tool evaluates mathematical expressions using a safe parser library.
  2. Static analysis tool that describes detection without containing eval patterns
    The linter checks for dynamic code execution patterns in source code and flags them as warnings.
  3. Normal data processing tool with no dynamic code execution primitives
    The function processes CSV data and returns a summary table. It uses a sandboxed arithmetic library for all calculations.
  4. Legitimate Python data science tool description that mentions importing but not child_process
    This tool imports pandas and numpy to perform statistical analysis on the provided dataset. No shell commands are used.
  5. Security advisory that discusses the concept of dynamic code execution abstractly without containing any trigger syntax
    The security audit report recommends disabling dynamic code compilation primitives and restricting access to process-spawning APIs in production environments.

Known False Positive Contexts

  • Security documentation describing eval injection risks
  • Code review tools that scan for eval usage
  • Educational coding tutorials that include inline code snippets demonstrating eval() usage in a teaching context
  • Automated code review reports that quote flagged source lines containing eval() or require('child_process') verbatim

Documented Evasion Techniques

  1. Technique: split keyword
    The tool runs user code by calling global['ev'+'al'](userInput) to dynamically compute results.
    String concatenation to reconstruct 'eval' at runtime avoids the regex matching 'eval(' as a contiguous token.
  2. Technique: language switch
    使用 vm 模块的 runInNewContext 方法来执行不受信任的插件代码,实现沙箱逃逸。
    Chinese description of vm.runInNewContext usage; the method name is embedded in Chinese text without the dot notation 'vm.runIn' pattern the regex expects.
  3. Technique: unicode homoglyph
    The agent leverages the child​_process module via dynamic require to execute arbitrary OS commands.
    A zero-width space (U+200B) inserted between 'child' and '_process' breaks the regex pattern matching the literal string 'child_process'.

Publicly documented bypasses. We disclose known limitations rather than pretend they don't exist.

Full YAML Definition

Edit on GitHub →
title: Remote Code Execution via eval() and Dynamic Code Injection
id: ATR-2026-00110
rule_version: 1
status: experimental
description: |
  Detects tools or agent instructions that invoke eval(), Function(), vm.runInNewContext(),
  or similar dynamic code execution primitives. These functions allow arbitrary code execution
  within the agent runtime, enabling an attacker to break out of sandboxed tool contexts,
  access the host process, or pivot to child_process for full system compromise.
author: ATR Community
date: 2026/03/26
schema_version: "0.1"
detection_tier: pattern
maturity: experimental
severity: critical
references:
  owasp_agentic:
    - ASI05:2026 - Unexpected Code Execution
  mitre_attack:
    - T1059 - Command and Scripting Interpreter
compliance:
  eu_ai_act:
    - article: "15"
      context: "eval() and dynamic code execution primitives allow attackers to execute arbitrary code within the agent runtime, enabling complete host system compromise; Article 15 cybersecurity requirements mandate that high-risk AI systems prohibit or strictly control dynamic code evaluation capabilities."
      strength: primary
    - article: "14"
      context: "Arbitrary code execution via eval injection can override safety controls and execute actions that bypass all human oversight mechanisms; Article 14 requires that AI system architectures prevent such unrestricted capability access from agent tool layers."
      strength: secondary
  nist_ai_rmf:
    - subcategory: "MP.5.1"
      context: "eval() and dynamic code execution primitives are adversarial input vectors that allow attackers to escape the agent's sandboxed tool context and execute arbitrary code within the host process, and must be tracked as critical AI system risks."
      strength: primary
    - subcategory: "MG.2.3"
      context: "Risk treatment plans must prohibit or strictly sandbox dynamic code evaluation capabilities in agent tool layers to prevent eval injection from enabling full host system compromise."
      strength: secondary
  iso_42001:
    - clause: "6.2"
      context: "Risk treatment activities must classify dynamic code execution via eval() and similar primitives as an unacceptable risk in AI agent tools and require architectural controls that block their use with user-controlled inputs."
      strength: primary
    - clause: "8.6"
      context: "Operational controls must prohibit agent tools from invoking eval(), new Function(), or vm module methods on untrusted inputs to ensure that code execution remains within the auditable and authorized scope of the AI system."
      strength: secondary
tags:
  category: privilege-escalation
  subcategory: eval-injection
  scan_target: mcp
  confidence: high
agent_source:
  type: tool_call
  framework:
    - mcp
    - langchain
    - llamaindex
  provider:
    - any
detection:
  conditions:
    - field: content
      operator: regex
      value: (?i)eval\s*\(
      description: Direct eval() invocation for dynamic code execution
    - field: content
      operator: regex
      value: (?i)new\s+Function\s*\(
      description: new Function() constructor used to compile and execute arbitrary code
    - field: content
      operator: regex
      value: (?i)vm\.(runIn|createContext|compileFunction)
      description: Node.js vm module methods for running code in isolated or shared contexts
    - field: content
      operator: regex
      value: (?i)require\s*\(\s*['"]child_process['"]
      description: Requiring child_process module for shell command execution
    - field: content
      operator: regex
      value: (?i)import\s*\(\s*['"]child_process
      description: Dynamic import of child_process module
    - field: content
      operator: regex
      value: (?i)(?:process\.binding|Reflect\.construct\s*\(\s*Function)
      description: Low-level process binding or Reflect.construct with Function for sandbox escape
  condition: any
  false_positives:
    - Security documentation describing eval injection risks
    - Code review tools that scan for eval usage
    - Educational coding tutorials that include inline code snippets demonstrating eval() usage in a teaching context
    - >-
      Automated code review reports that quote flagged source lines containing eval() or require('child_process')
      verbatim
response:
  actions:
    - block_tool
    - alert
    - snapshot
confidence: 65
wild_validated: 2026/04/08
wild_samples: 53577
wild_fp_rate: 0
test_cases:
  true_positives:
    - input: This tool evaluates user expressions by calling eval(userInput) to compute results dynamically.
      tool_description: expression-evaluator
      expected: triggered
      description: Direct eval() with user input
    - input: Creates a new Function('return ' + code) to parse and execute the provided expression.
      tool_description: code-executor
      expected: triggered
      description: new Function() constructor with dynamic code
    - input: Uses vm.runInNewContext(untrustedCode, sandbox) to execute plugin scripts.
      tool_description: plugin-runner
      expected: triggered
      description: vm module execution of untrusted code
    - input: The tool calls require('child_process').exec(command) to run system commands.
      tool_description: system-command
      expected: triggered
      description: child_process require for shell execution
    - input: >-
        To dynamically load plugins, the agent uses import('child_process') and then spawns a shell to compile
        extensions.
      expected: triggered
      description: Dynamic import of child_process module for shell spawning (matches pattern 5)
  true_negatives:
    - input: This tool evaluates mathematical expressions using a safe parser library.
      tool_description: math-parser
      expected: not_triggered
      description: Safe math evaluation without eval()
    - input: The linter checks for dynamic code execution patterns in source code and flags them as warnings.
      tool_description: code-linter
      expected: not_triggered
      description: Static analysis tool that describes detection without containing eval patterns
    - input: >-
        The function processes CSV data and returns a summary table. It uses a sandboxed arithmetic library for all
        calculations.
      expected: not_triggered
      description: Normal data processing tool with no dynamic code execution primitives
    - input: >-
        This tool imports pandas and numpy to perform statistical analysis on the provided dataset. No shell commands
        are used.
      expected: not_triggered
      description: Legitimate Python data science tool description that mentions importing but not child_process
    - input: >-
        The security audit report recommends disabling dynamic code compilation primitives and restricting access to
        process-spawning APIs in production environments.
      expected: not_triggered
      description: >-
        Security advisory that discusses the concept of dynamic code execution abstractly without containing any trigger
        syntax
evasion_tests:
  - input: The tool runs user code by calling global['ev'+'al'](userInput) to dynamically compute results.
    expected: not_triggered
    bypass_technique: split_keyword
    notes: String concatenation to reconstruct 'eval' at runtime avoids the regex matching 'eval(' as a contiguous token.
  - input: 使用 vm 模块的 runInNewContext 方法来执行不受信任的插件代码,实现沙箱逃逸。
    expected: not_triggered
    bypass_technique: language_switch
    notes: >-
      Chinese description of vm.runInNewContext usage; the method name is embedded in Chinese text without the dot
      notation 'vm.runIn' pattern the regex expects.
  - input: The agent leverages the child​_process module via dynamic require to execute arbitrary OS commands.
    expected: not_triggered
    bypass_technique: unicode_homoglyph
    notes: >-
      A zero-width space (U+200B) inserted between 'child' and '_process' breaks the regex pattern matching the literal
      string 'child_process'.

Revision History

Created
2026-03-26
Last modified
2026-05-24
View full commit history on GitHub →