Warning added to Python documentation was deemed preferable to a patch

Python path traversal bug from 2007 still present in 350k open source repos

An estimated 350,000 open source repositories are affected by a 15-year-old path traversal vulnerability in Python’s tarfile module, according to security researchers.

Having “stumbled across” the unpatched issue while investigating an unrelated vulnerability, they initially thought the flaw was a new zero-day bug before realizing it actually dated back to 2007.

Originally assigned a CVSS severity score of 6.8, CVE-2007-4559 allows attackers to escalate the arbitrary file write to code execution “in most cases”, said Trellix researcher Kasimir Schulz in a blog post published yesterday (September 21).




Schulz explained how Trellix researchers successfully achieved this outcome against wireless protocol analyzer Universal Radio Hacker, IT infrastructure management service Polemarch, and Spyder IDE, an open-source development environment for scientific programming.

Patching marathon

In a separate Trellix blog post, Douglas McKee said 61% of a sample of around 300,000 files that use the tarfile module, which reads and writes tar archives, were vulnerable to the bug.

Trellix has created patches for around 11,000 projects and anticipates that a further 70,000 projects will receive a fix in the coming weeks.

The vulnerability arose “from two or three lines of code using unsanitized tarfile.extract() or the built-in defaults of tarfile.extractall(),” explains a third Trellix blog post by Charles McFarland. “Failure to write any safety code to sanitize the members files before calling tarfile.extract() or tarfile.extractall() results in a directory traversal vulnerability, enabling a bad actor access to the file system.”

If an attacker adds ‘..’ with the separator for the operating system (‘/’ or ‘\’) into the file name, they can escape the directory the file is supposed to be extracted into.
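To illustrate the kind of check that is missing from vulnerable code, here is a minimal sketch in Python. The helper names `is_within_directory` and `unsafe_members` are hypothetical and do not come from Trellix or from the tarfile module. The sketch only inspects member names; it does not examine symlink or hard link targets, which a thorough check would also need to cover.

```python
import os
import tarfile
from typing import List


def is_within_directory(base_dir: str, target: str) -> bool:
    """Return True if target resolves to a path inside base_dir."""
    base = os.path.realpath(base_dir)
    resolved = os.path.realpath(target)
    # commonpath equals base only when resolved is base or sits below it.
    return os.path.commonpath([base, resolved]) == base


def unsafe_members(archive: tarfile.TarFile, dest: str) -> List[str]:
    """List member names that would land outside dest if extracted.

    Hypothetical helper for illustration: it catches '..' components
    and absolute paths, but not malicious link targets.
    """
    return [
        member.name
        for member in archive.getmembers()
        if not is_within_directory(dest, os.path.join(dest, member.name))
    ]
```

A caller could refuse to extract any archive for which `unsafe_members()` returns a non-empty list, rather than passing it straight to `extractall()`.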

Just a warning

However, a Python bug thread from 2007, which has been reopened in recent days, concluded with a maintainer asserting that “tarfile.py does nothing wrong, its behaviour conforms to the pax definition and pathname resolution guidelines in POSIX. There is no known or possible practical exploit”.

The maintainer did add a warning to the Python documentation, which remains in place, that urges developers to “never extract archives from untrusted sources without prior inspection”.

Corner cases

Trellix’s McKee acknowledged that there are legitimate use cases for preserving behavior that could be abused for malicious purposes. However, he said, “in this instance I believe the risk outweighs the reward for accommodating a few corner cases”, especially given that “most” third-party online tutorials seemed to “incorrectly demonstrate the insecure use of the tarfile module”.

Trellix created and open-sourced a tool to aid the research and patching process. It automatically detects the vulnerability in source code by analyzing an abstract syntax tree (AST) intermediate representation.

Called Creosote, the utility provides a means to scan closed source repositories too.
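The general idea of AST-based detection can be sketched with Python’s standard `ast` module. This is not Creosote’s actual logic, which the article does not describe in detail; the function name `find_extract_calls` is hypothetical, and the heuristic is deliberately crude. It flags any method call named `extract` or `extractall`, so it would also match unrelated objects (such as zip archives) and cannot tell whether the members were sanitized first.

```python
import ast
from typing import List, Tuple

# Method names worth a manual review (an assumption made for this sketch).
SUSPECT_METHODS = {"extract", "extractall"}


def find_extract_calls(source: str) -> List[Tuple[int, str]]:
    """Return (line number, method name) for each suspect method call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in SUSPECT_METHODS
        ):
            hits.append((node.lineno, node.func.attr))
    return sorted(hits)
```

Each hit is only a candidate for review: a real scanner would also need to confirm that the object is a tarfile archive and check whether the surrounding code validates member paths.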

“This vulnerability is incredibly easy to exploit” and prevalent in the wild, making Python’s tarfile module “a massive supply chain issue threatening infrastructure around the world”, concluded Schulz.

