I like Python; code written in Python tends to be very easy to read, and the massive number of libraries makes it easy to make useful programs.
Python 3 is in many ways an improvement over Python 2, but it was a mistake to have not implemented both Python 2 and Python 3 in a single executable. This makes it very difficult to transition to Python 3. If your program doesn't use any libraries, transitioning tends to be easy; just use the "2to3" conversion program, fix up some problems manually, and you are off and running. But that's a weird special case, because Python is popular and has spawned a massive number of libraries. Most Python programs depend on libraries, which depend on other libraries, which depend on other libraries (you get the point). If any library anywhere doesn't support Python 3, then nothing else that depends on it can use Python 3 either. Linux distributors find it painful to support Python 3 because of this (see this Fedora thread and the Fedora Python3 F13 page). Because of the library bottleneck, and the basic incompatibilities between Python 2 and Python 3, Python 3 uptake has been slow.
I think it was a terrible mistake to have combined changing the programming language with switching to a different C implementation. It is perfectly possible to have a single "python" executable that implements both version 2 and version 3 semantics to allow mixing of the different notations. Then, any program could be written with the version 3 semantics, yet call on libraries that were written with the version 2 semantics. There is a reason that people don't like backwards-incompatible changes; they make transition ridiculously hard.
In a lot of cases it would be better if we wrote Python programs that worked, without change, on both Python 2 and Python 3. Then we don't have to muck with 2to3 (or 3to2) and other nonsense. Python 2.6 includes many capabilities that make it easier to write code that works on both 2.6 and 3. As a result, you can program in Python 2 but use certain Python 3 extensions... and the resulting code works on both. Python 2.6 has been out for a while, so for many people, requiring 2.6 is a reasonable precondition. Where writing code that runs unchanged on both is too difficult, write in Python 2.6+, but add some of the syntactic and semantic niceties of Python 3. (Mark Summerfield has a nice summary of Python 2 and 3 idiom differences.)
A simple way to do this is to use Python 2.6 for development and begin each of your Python .py files with the following:
    from __future__ import print_function, unicode_literals
    from __future__ import absolute_import, division
These switch to Python 3 meanings for key constructs. Now you use print(...) instead of a print statement, string literals are Unicode strings, imports are always absolute, and division creates floating-point values as needed (i.e., 1/2 now returns 0.5).
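Here is a small sketch of what those imports change. The variable names are just for illustration; the point is that the same file should behave identically on Python 2.6+ and Python 3 (where the imports are accepted but have no effect):

```python
from __future__ import print_function, unicode_literals
from __future__ import absolute_import, division

half = 1 / 2        # true division, even between two integers
floored = 1 // 2    # use // when you really want floor division
text = "caf\u00e9"  # a Unicode string: 4 characters, not 5 bytes

# print is a function, so this is an ordinary call with several arguments
print("half =", half, "floored =", floored, "length =", len(text))
```

Note that the "from __future__" lines have to come before any other statements in the file (only comments and the module docstring may precede them).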
Python 2.6 includes a number of Python 3 features by default (like support for "bytes"), so you can just use them directly. In some cases, you should avoid using certain constructs and replace them with others (my thanks to "Running the same code on Python 2.x and 3.x", which points out some of these). For example:
| Instead of | Use |
|---|---|
| d.has_key(k) | k in d |
| d.itervalues() | d.values() |
| callable(o) | hasattr(o, '__call__') |
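As a quick sketch of those replacements in use (the dictionary and the helper name is_callable are made up for this example), the following should run unchanged on both Python 2.6 and Python 3:

```python
inventory = {"apples": 3, "pears": 5}

# Instead of inventory.has_key("apples"):
has_apples = "apples" in inventory

# Instead of inventory.itervalues(): values() builds a list on Python 2
# and returns a view on Python 3; either one can be iterated over.
total = sum(inventory.values())

# Instead of callable(obj):
def is_callable(obj):
    """Return True if obj can be called like a function."""
    return hasattr(obj, "__call__")
```

The one cost is that d.values() creates a full list on Python 2, so for very large dictionaries it uses more memory there than itervalues() did.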
Some code constructs require a little extra work to make them work the same way in Python 2 and Python 3. For example, Python 3's "range" is the same as Python 2's "xrange", and Python 3 has no "xrange" at all. We can make the name "xrange" work on both by inserting the following after the "from __future__" statements:
    try:
        xrange = xrange
        # We have Python 2
    except NameError:
        xrange = range
        # We have Python 3
Now we can use "xrange(...)" in the rest of the file, and it will work correctly. (You could use "range()" directly, but in Python 2 this can be very inefficient.)
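To show how the rest of a file might look with that in place, here is a minimal sketch; the function sum_of_squares is just an invented example:

```python
try:
    xrange = xrange      # Python 2: the builtin already exists
except NameError:
    xrange = range       # Python 3: range is already lazy

def sum_of_squares(n):
    """Add up i*i for i in 0..n-1 without building a list of n items."""
    total = 0
    for i in xrange(n):
        total += i * i
    return total

result = sum_of_squares(4)   # 0 + 1 + 4 + 9
```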
You can also import packages and rename them.
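For instance, several standard library modules were renamed or moved in Python 3. Here is one way this might look, assuming you want to use the Python 3 names throughout the file (you could just as easily standardize on the Python 2 names instead); the URL is only a placeholder:

```python
try:
    import configparser                    # Python 3 name
except ImportError:
    import ConfigParser as configparser    # Python 2 name

try:
    from urllib.parse import urlparse      # Python 3 location
except ImportError:
    from urlparse import urlparse          # Python 2 location

# From here on, the code is the same for both versions.
parser = configparser.RawConfigParser()
parser.add_section("demo")
parser.set("demo", "answer", "42")
answer = parser.get("demo", "answer")

host = urlparse("http://example.com/some/page").netloc
```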
One of the advantages of Python is that it's a clean language to read; too much of this stuff makes it too complicated. There's a philosophical question as to whether you write in Python 2 (with some modifications), or in a Python 3 that happens to work in Python 2. For example, do you choose to use "xrange" or "range" as the name in the code? I prefer working in "Python 2 with specific modifications" right now, for the following reasons:
Python 2 and Python 3 have some different library interfaces, and trying to deal with all of that in each file can be awful.
Writing code that works in both 2 and 3 can become a serious pain, so when it gets too difficult, I abandon it, and make the code work on a Python 2 that adds some features of 3. Over time, I can modify the code to be more 3-like, presuming that future versions of Python 2 add notation from Python 3. This gives me a practical way to transition to Python 3, gradually, and then use 2to3 when all libraries have made that final step. By that point, the code differences will be trivial instead of the current chasm.
I hope that the Python 2 C implementation will continue to be upgraded until it supports nearly all of the Python 3 features. In particular, I'd love to see "import __future__ python3", which would try to make the interpreter as Python 3-like as possible, including the new Python 3 names and interfaces. Then programs and libraries could easily switch to version 3 features at their own pace, instead of requiring a "flag day". It would also mean that code could be quite clean.
Lots of other pages have similar info on making code work on both 2 and 3 directly. A lot of them include doing this for versions of Python before 2.6, which tends to be more work. These include:
Feel free to see my home page at http://www.dwheeler.com. You may also want to look at my paper Why OSS/FS? Look at the Numbers! and my book on how to develop secure programs.
(C) Copyright 2009 David A. Wheeler.