Introduction
A flaw in how Keras extracts tar archives allows attackers to write files to arbitrary locations on systems running vulnerable versions. The issue affects any machine learning workflow that uses Keras utilities to download and unpack datasets or models from external sources.
Keras is a widely adopted deep learning framework maintained by Google and the open-source community. It is used by researchers, enterprises, and cloud services globally, with millions of downloads and integration into major platforms. The keras.utils.get_file() function is a common utility for dataset and model management, making any vulnerability in this area highly impactful for the machine learning ecosystem.
Technical Information
The vulnerability, tracked as CVE-2025-12060, affects the keras.utils.get_file() function when it is used to download and extract tar archives. When the extract parameter is set to True, get_file() delegates extraction to the extractall() method of Python's tarfile module. In affected versions, this call does not pass filter='data', the extraction filter that rejects archive members which would be written outside the destination directory, including through symlinks.
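To make the difference concrete, the sketch below contrasts the unfiltered call with the filtered one. It is a minimal illustration and not Keras's actual code: the helper name extract_tar_with_data_filter is hypothetical, and it assumes a Python build whose tarfile module supports extraction filters.

```python
import os
import tarfile


def extract_tar_with_data_filter(archive_path, dest_dir):
    """Extract a tar archive while asking tarfile to apply its 'data' filter.

    Hypothetical helper for illustration; not part of Keras.
    """
    # Extraction filters are missing on older Python builds, so fail loudly
    # instead of silently falling back to unfiltered extraction.
    if not hasattr(tarfile, "data_filter"):
        raise RuntimeError("this Python's tarfile has no extraction filters")

    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path) as tar:
        # Unfiltered form, as described above for affected versions:
        #     tar.extractall(dest_dir)
        # Filtered form: members that would land outside dest_dir,
        # including through symlinks, are rejected.
        tar.extractall(dest_dir, filter="data")
    return dest_dir
```

One way to use a helper like this as a stopgap is to call keras.utils.get_file() without extract=True, so that Keras only downloads the archive, and then pass the returned path to the helper.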
Attackers can create tar archives containing symbolic links that point outside the intended extraction directory. When such an archive is processed by a vulnerable Keras installation, files can be written to arbitrary locations on the filesystem, limited only by the permissions of the running process. This can lead to system compromise or arbitrary code execution if sensitive locations are targeted, for example shell startup files or Python modules on the import path.
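From the defender's side, an archive can be inspected for such links before anything is extracted. The sketch below is one possible approach, with a hypothetical function name: it resolves each member's name and link target lexically against the destination and reports those that leave it. Because it never touches the disk, it does not catch chains in which a later member is written through a symlink created by an earlier one, so it complements filtered extraction rather than replacing it.

```python
import os
import tarfile


def find_escaping_members(tar, dest_dir):
    """Return names of members whose path or link target leaves dest_dir.

    Purely lexical: nothing is extracted and no symlinks on disk are followed.
    Hypothetical helper for illustration; not part of Keras or tarfile.
    """
    dest = os.path.abspath(dest_dir)

    def is_inside(path):
        resolved = os.path.normpath(os.path.join(dest, path))
        try:
            return os.path.commonpath([dest, resolved]) == dest
        except ValueError:  # for example, different drives on Windows
            return False

    flagged = []
    for member in tar.getmembers():
        if not is_inside(member.name):
            flagged.append(member.name)
        elif member.issym():
            # Symlink targets are relative to the link's own directory.
            target = os.path.join(os.path.dirname(member.name), member.linkname)
            if not is_inside(target):
                flagged.append(member.name)
        elif member.islnk():
            # Hard-link targets are relative to the archive root.
            if not is_inside(member.linkname):
                flagged.append(member.name)
    return flagged
```

A caller would open the archive with tarfile.open(), pass the open TarFile object together with the planned destination, and refuse to extract if the returned list is not empty.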
The vulnerability is further complicated by a bug in symlink resolution related to path length limits. Keras attempts to screen out unsafe members with a function called filter_safe_paths(), but that check runs before extraction begins. During extraction, the tarfile module resolves symlinks itself, and that resolution can fail when path length limits are hit, so a member that passed the earlier check can still be written outside the target directory.
The patch for this vulnerability introduces a new extraction function that uses filter='data' on Python 3.12 and later, which provides robust protection against path traversal. For earlier Python versions, a custom filter is implemented to validate symlink targets more effectively. The vulnerability is closely related to Python's CVE-2025-4517, which affects tarfile's handling of symlinks and path traversal.
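The two-branch approach described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the Keras patch: extract_tar, _checked_members, and _is_within are hypothetical names, the fallback check is a simplified stand-in for the custom filter, and both branches are made to raise ValueError so that callers see one error type.

```python
import os
import sys
import tarfile


def _is_within(dest, path):
    """True if path, resolved against dest, stays inside dest."""
    resolved = os.path.realpath(os.path.join(dest, path))
    try:
        return os.path.commonpath([dest, resolved]) == dest
    except ValueError:  # for example, different drives on Windows
        return False


def _checked_members(tar, dest_dir):
    """Yield members one by one, raising ValueError on an unsafe one."""
    dest = os.path.realpath(dest_dir)
    for member in tar.getmembers():
        paths = [member.name]
        if member.issym():
            # Symlink targets are relative to the link's own directory.
            paths.append(
                os.path.join(os.path.dirname(member.name), member.linkname)
            )
        elif member.islnk():
            # Hard-link targets are relative to the archive root.
            paths.append(member.linkname)
        if not all(_is_within(dest, p) for p in paths):
            raise ValueError(f"unsafe member in archive: {member.name}")
        yield member


def extract_tar(archive_path, dest_dir):
    """Extract with filter='data' where the text says it is used, else check."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path) as tar:
        if sys.version_info >= (3, 12):
            try:
                tar.extractall(dest_dir, filter="data")
            except tarfile.FilterError as exc:
                raise ValueError(f"unsafe member in archive: {exc}") from exc
        else:
            # Assumed fallback: validate each member just before extraction.
            tar.extractall(dest_dir, members=_checked_members(tar, dest_dir))
```

In either branch, members extracted before the offending one remain on disk, so a caller that needs all-or-nothing behavior would extract into a fresh temporary directory and move the result into place only on success.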
No public proof of concept code is available at this time, but the technical details are sufficient for attackers to construct malicious archives capable of exploiting the flaw.
Affected Systems and Versions
- Keras versions prior to 3.12.0 are affected.
- The vulnerability impacts any use of keras.utils.get_file() with extract=True on tar or tar.gz archives.
- Systems running Python versions vulnerable to CVE-2025-4517 are at additional risk if Keras is not updated.
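A quick way to check an environment against the version boundary above is to compare the installed Keras version with 3.12.0. The sketch below uses only the standard library; the function names are hypothetical, and the parsing is deliberately simple: it looks only at the leading numeric parts, so a pre-release such as 3.12.0rc1 is treated like 3.12.0 and should be verified by hand.

```python
import re
from importlib import metadata

# First fixed release according to the list above.
FIXED_VERSION = (3, 12, 0)


def parse_release(version):
    """Turn '3.11.3' or '3.12.0rc1' into a tuple of three ints."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(version):
    """True if the given Keras version string is older than the fix."""
    return parse_release(version) < FIXED_VERSION


def installed_keras_is_affected():
    """Return True or False for the installed Keras, None if it is absent."""
    try:
        return is_affected(metadata.version("keras"))
    except metadata.PackageNotFoundError:
        return None
```

Calling installed_keras_is_affected() in a deployment check or CI step gives a simple signal for whether an upgrade is needed; it says nothing about whether the Python interpreter itself is exposed to CVE-2025-4517.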
Vendor Security History
Keras has previously experienced vulnerabilities related to file handling and model deserialization. Notable examples include issues with Lambda layer deserialization and unsafe model loading, which have been addressed in past releases. The Keras team participates in coordinated disclosure programs such as huntr and has demonstrated rapid response to security reports. The fix for CVE-2025-12060 was released promptly in version 3.12.0, with improvements to extraction logic for both new and legacy Python environments.



