10.3. Python

As with any language, beware of any functions which allow data to be executed as parts of a program, to make sure an untrusted user can’t affect their input. This includes exec(), eval(), and execfile() (and frankly, you should check carefully any call to compile()). The input() statement is also surprisingly dangerous. [Watters 1996, 150].

Python programs with privileges that can be invoked by unprivileged users (e.g., setuid/setgid programs) must not import the “user” module. The user module causes the pythonrc.py file to be read and executed. Since this file would be under the control of an untrusted user, importing the user module allows an attacker to force the trusted program to run arbitrary code.

Python does very little compile-time checking -- it has essentially no compile-time type information, for example. This is unfortunate, resulting in a lot of latent bugs (both John Viega and I have experienced this problem). Hopefully someday Python will implement optional static typing and type-checking, an idea that’s been discussed for some time. A partial solution for now is PyChecker, a lint-like program that checks for common bugs in Python source code. You can get PyChecker from http://pychecker.sourceforge.net

Before Python version 2.3, Python included support for “Restricted Execution” through its RExec and Bastion classes. The RExec class was primarily intended for executing applets and mobile code, but it could also be used to try to limit privilege in a program even when the code has not been provided externally. The Bastion module was intended to supports restricted access to another object. For more information, see Kuchling [2000]. Earlier versions of this book identified these functions but noted them as "programmer beware", and I was right to be concerned. More recent analysis has found that RExec and Bastion are fundamentally flawed, and have unfixable exploitable security flaws. Thus, these classes have been removed from Python 2.3, and should not be used to enforce security in any version of Python. There is ongoing work to develop alternative approaches to running untrusted Python code, such as the experimental Sandbox.py module. Do not use this experimental Sandbox.py module for serious purposes yet.

Supporting secure execution of untrusted code in Python turns out to be a rather difficult problem. For example, allowing a user to unrestrictedly add attributes to a class permits all sorts of ways to subvert the environment because Python’s implementation calls many “hidden” methods. By default, most Python objects are passed by reference; if you insert a reference to a mutable value into a restricted program’s environment, the restricted program can change the object in a way that’s visible outside the restricted environment. Thus, if you want to give access to a mutable value, in many cases you should copy the mutable value. Fundamentally, Python is designed to be a clean and highly reflexive language, which is good for a general-purpose language but makes handling malicious code more difficult.

Python supports operations called "pickle" and "unpickling" to conveniently store and retrieve sets of objects. NEVER unpickle data from an untrusted source. Python 2.2 did a half-hearted job of trying to support unpickling from untrusted sources (the __safe_for_unpickling__ attribute), but it was never audited and probably never really worked. Python 2.3 has removed all of this, and made explicitly clear that unpickling is not a safe operation. For more information, see PEP 307.