Introduction
An unsafe eval() call buried in a math grading utility within ByteDance's verl framework means that anyone who can influence a training dataset can achieve arbitrary code execution on the machine running model evaluation. For organizations training large language models with reinforcement learning, this vulnerability turns a routine reward scoring step into a remote code execution vector.
ByteDance's verl is an open source reinforcement learning training framework for LLMs, supporting algorithms like PPO and GRPO. With over 19,000 GitHub stars, it has become a widely adopted tool in the ML research community. Its role in the training pipeline makes it a high value target: compromise here can affect model integrity, training infrastructure, and downstream deployments.
Technical Information
Root Cause: Unsanitized eval() on Model Output
The vulnerability resides in the `math_equal()` function in verl/utils/reward_score/prime_math/grader.py, specifically at lines 298 through 301. This function compares a model's generated answer against a ground-truth reference during reward scoring. When the ground-truth answer is a matrix-type problem (indicated by the presence of `\begin{pmatrix}` in the reference string) and the model's extracted prediction starts with `[` and ends with `]`, the code passes the prediction string directly into Python's built-in `eval()` function.
Here is the vulnerable code:
```python
elif r"\begin{pmatrix}" in reference and prediction.startswith("[") and prediction.endswith("]"):
    if isinstance(eval(prediction), list):  # ← SINK: eval() on untrusted model output
        pred_matrix = eval(prediction)      # ← second eval of the same string
```
There is no input sanitization, no allowlisting of permitted characters, and no sandbox isolation. The eval() call will execute any valid Python expression contained in the prediction string.
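To make the danger concrete, here is a minimal toy illustration (not verl's code) of why `eval()` on a string that merely *looks* like a list literal is unsafe, contrasted with `ast.literal_eval()`, which accepts only genuine literals:

```python
import ast

# A string shaped like a list literal, but carrying an expression with
# side effects. eval() executes it without complaint.
side_effects = []
prediction = "[side_effects.append('executed'), 1]"

result = eval(prediction)  # the append() call runs as a side effect
print(result)        # [None, 1] -- append() returns None
print(side_effects)  # ['executed'] -- the side effect fired

# ast.literal_eval() parses literals only and rejects the same input:
try:
    ast.literal_eval(prediction)
except ValueError:
    print("literal_eval rejected the non-literal input")
```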
Attack Flow: Indirect Prompt Injection to RCE
The exploitation path is indirect prompt injection, a technique where an attacker does not interact with the vulnerable system directly but instead poisons the data that flows into it. The attack proceeds as follows:
1. **Dataset Poisoning:** The attacker embeds malicious instructions into a public math training dataset. These instructions are designed to coerce the language model into outputting a specific payload string when it encounters a matrix-type math problem.

2. **Model Output Manipulation:** During training or evaluation, the model processes the poisoned data and produces an output containing the attacker's payload, formatted to look like a list literal (starting with `[` and ending with `]`).

3. **Trigger Conditions Met:** The reward scoring pipeline calls `match_answer()`, which extracts the prediction. For the vulnerable branch to execute, three conditions must be satisfied simultaneously:
   - The ground-truth reference must contain `\begin{pmatrix}` (a matrix-type problem).
   - The extracted prediction must start with `[` and end with `]`.
   - The prediction must not contain an underscore `_`, because `handle_base()` would truncate the string and raise a `ValueError`.

4. **Code Execution:** When `math_equal()` reaches the matrix comparison branch, it calls `eval(prediction)`. The attacker's payload, embedded within the list brackets, executes as arbitrary Python code on the training server.
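The attack flow above can be reduced to a toy reproduction of the vulnerable branch. `toy_math_equal` below is an illustrative stand-in, not verl's actual implementation, but it mirrors the three trigger conditions and the double `eval()`:

```python
def toy_math_equal(prediction: str, reference: str):
    if (r"\begin{pmatrix}" in reference          # condition 1: matrix-type reference
            and prediction.startswith("[")       # condition 2: bracketed prediction
            and prediction.endswith("]")):       # (no-underscore check omitted for brevity)
        if isinstance(eval(prediction), list):   # SINK: first eval
            return eval(prediction)              # second eval, as in the original
    return None

reference = r"\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}"

# A benign matrix answer parses as expected...
print(toy_math_equal("[1, 2, 3, 4]", reference))  # [1, 2, 3, 4]

# ...but any Python expression in brackets executes too. Because the
# branch evaluates the string twice, the side effect fires twice.
captured = []
toy_math_equal("[captured.append('pwned')]", reference)
print(captured)  # ['pwned', 'pwned']
```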
Bypassing Restrictions
The underscore restriction is notable because it prevents the use of Python dunder names like `__import__()`. Attackers bypass this by using `exec()` instead, which does not require underscores. The `match_answer()` function also requires at least one digit character in the response, a trivial constraint to satisfy within a math problem context.
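The bypass is easy to demonstrate in isolation (toy code, not the PoC payload): `exec()` plus a plain `import` statement contains no underscores at all, yet still reaches arbitrary module functionality.

```python
# __import__ is ruled out by the no-underscore constraint, but this
# payload satisfies it while still importing a module inside eval().
marker = {}
payload = '[exec("import math; marker[0] = math.pi")]'
assert "_" not in payload  # the constraint holds

result = eval(payload)  # the embedded exec() runs; the import succeeds
print(result)  # [None] -- exec() returns None
print(marker)  # {0: 3.141592653589793}
```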
Additional eval() Instances
The vulnerability report also identifies another instance of unsafe dynamic code evaluation in the handle_pi() function at line 82 of the same grader.py file, suggesting a broader pattern of unsafe coding practices in this module.
Proof of Concept
A fully verified, end-to-end public PoC exploit exists for CVE-2026-6878. It was published by ZAST.AI in their vulnerability reports repository on GitHub and is referenced by both NVD and VulDB. A companion video demonstration is also available. The vendor issue (verl-project/verl#5331) remains open.
The PoC script (verl_rce.py) performs the following steps:
- Launches Ollama with the `qwen2.5:14b-instruct` model.
- Sends a sequence of prompt injection payloads (system override, fake math problem, few-shot guidance) to induce the model to output a crafted answer string. The crafted RCE payload is:
  `RCE_PAYLOAD = 'exec("import os; os.system(\'echo PWNED1 by ZAST.AI > /tmp/poc/verl-rce-proof.txt\')")'`
- The model's output (e.g., `The answer is [exec("import os; os.system('echo PWNED1 by ZAST.AI > /tmp/poc/verl-rce-proof.txt')")]`) is fed into verl's reward scoring pipeline via `match_answer()` then `math_equal()`.
- Inside `math_equal()`, the matrix comparison branch calls `eval(prediction)`, executing the embedded `exec()` call, which runs `os.system()` and writes a proof file to /tmp/poc/verl-rce-proof.txt.
- The script checks for the existence and content of the proof file to confirm successful RCE.
Reproduction steps:
```bash
# 1. Install Ollama and pull the model
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5:14b-instruct

# 2. Clone verl and install
git clone https://github.com/verl-project/verl.git
cd verl
pip install -e .

# 3. Run the PoC
python3 verl_rce.py
```
If the prompt injection strategies against the local LLM all fail, the script includes a deterministic fallback that manually constructs the malicious model output string (`The answer is [<payload>]`) and feeds it into `math_equal()` directly, confirming the `eval()` sink is exploitable regardless of the injection step. The PoC was verified on macOS with Ollama and verl 0.7.0.
Affected Systems and Versions
The vulnerability affects the following versions of ByteDance's verl framework:
| Vendor | Product | Affected Versions |
|---|---|---|
| ByteDance | verl | 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7.0 |
The main branch was also confirmed vulnerable at the time of the initial analysis. Any deployment using verl up to and including version 0.7.0 that processes matrix-type math problems through the reward scoring pipeline is affected.
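Until a patched release is available, one hardening approach is to replace the `eval()` calls with `ast.literal_eval()`, which parses Python literals only and raises on anything executable. The sketch below is an assumption on our part, not the project's official patch; `parse_matrix_answer` is a hypothetical helper name:

```python
import ast

def parse_matrix_answer(prediction: str):
    """Safely parse a bracketed answer string without executing code.

    Sketch only: ast.literal_eval() accepts literals (lists, numbers,
    strings, ...) and raises ValueError or SyntaxError on expressions
    containing calls or names, so exec()/eval() payloads never run.
    """
    try:
        value = ast.literal_eval(prediction)
    except (ValueError, SyntaxError):
        return None
    return value if isinstance(value, list) else None

print(parse_matrix_answer("[1, 2, 3, 4]"))        # [1, 2, 3, 4]
print(parse_matrix_answer('[exec("import os")]')) # None -- payload rejected
```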
Vendor Security History
ByteDance was added as a CVE Numbering Authority on January 13, 2026, for vulnerabilities in their own products. Despite this, the vendor was contacted early about CVE-2026-6878 but did not respond. VulDB acted as the CNA for this vulnerability in the absence of vendor engagement. The associated GitHub issue (verl-project/verl#5331) remains unresolved.
References
- NVD Entry for CVE-2026-6878
- CVE Record: CVE-2026-6878
- ZAST.AI Vulnerability Report (verl RCE)
- ZAST.AI PoC Exploit Script (verl_rce.py)
- verl GitHub Issue #5331
- VulDB Entry: CVE-2026-6878
- VulDB Submission Details
- VulDB CTI Data for CVE-2026-6878
- ZAST.AI Vulnerability Reports Repository
- ByteDance Added as CVE Numbering Authority



