

The Python 2.5 Goodie Bag: Language Enhancements and Modules

The 2.5 version of Python offered lots of useful enhancements. In this article, you'll learn about some specific modules, as well as performance improvements, that are likely to bring big smiles to the faces of many Python developers.

Optimizations and Internal Changes
Python is not the fastest language out there, but most of the time it's fast enough. If you need more performance, you can write performance-critical code in C or C++ extensions and call it from Python. Nevertheless, sometimes you will wish Python were a little faster so you could develop more of your application in pure Python. Your wish is the Python developers' command: Python 2.5 introduces multiple performance enhancements.

Py_ssize_t as index
Python used to store various counts and indices in variables of the C type int. On most platforms this is a 32-bit type, which meant that lists or tuples couldn't have more than 2,147,483,647 items. On 32-bit systems that was no real limitation, because you couldn't fit more items than that in the entire 32-bit addressable memory anyway. On 64-bit systems you have much more addressable memory, so this number isn't so big anymore.

Python 2.5 uses the Py_ssize_t typedef for indices and counts. This type is 64 bits wide on 64-bit systems, which allows you to fully utilize their memory. This change affects mostly C extension writers. Read PEP 353 if you want all the gory details.
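You can inspect the width of Py_ssize_t from pure Python. The following is a minimal sketch, assuming a newer interpreter than the one this article covers: sys.maxsize was added in Python 2.6 and the struct format code "n" in Python 3.3, so neither is available in Python 2.5 itself.

```python
import struct
import sys

# sys.maxsize is the largest value a Py_ssize_t can hold on this build.
# (Assumption: Python 2.6 or later; Python 2.5 did not expose it.)
max_index = sys.maxsize

# The struct format code "n" is the C ssize_t type (Python 3.3+). Its size
# in bytes tells us how wide container indices are on this interpreter.
index_bytes = struct.calcsize("n")

# The old limit imposed by a 32-bit C int.
OLD_INT_LIMIT = 2**31 - 1

print("Py_ssize_t width: %d bits" % (index_bytes * 8))
print("largest container length: %d" % max_index)
print("exceeds the old int limit: %s" % (max_index > OLD_INT_LIMIT))
```

On a 64-bit build the last line reports True; on a 32-bit build the two limits are the same and it reports False.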

Memory Functions
Python, as a runtime virtual machine, does a lot of memory management on behalf of your code. Small objects are allocated in 256KB arenas. When you allocate a small object, the memory comes either from an existing arena with available space or from a new 256KB arena. This arrangement amortizes the cost of frequent memory allocations at the price of inconsistent allocation time, which is a reasonable trade-off for a language like Python. However, Python 2.4 never released empty arenas. If your program allocated lots of small objects at the beginning and then switched to a state that used only a small number of objects, all the arenas that were allocated initially just sat there and hogged memory. Python 2.5 addresses this issue: empty arenas are de-allocated and the memory is returned to the operating system.
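The difference between the two behaviors can be illustrated with a toy model. This is only a sketch and not CPython's allocator: the class ToyArenaPool, its methods, and the helper peak_then_idle are hypothetical names invented for illustration, and the model tracks byte counts rather than real memory.

```python
ARENA_SIZE = 256 * 1024  # 256KB arenas, as described in the text


class ToyArenaPool(object):
    """Toy model of an arena allocator; it only counts bytes."""

    def __init__(self, release_empty):
        # release_empty=False models Python 2.4, True models Python 2.5.
        self.release_empty = release_empty
        self.arenas = {}  # arena id -> bytes currently in use
        self.next_id = 0

    def allocate(self, size):
        """Reserve size bytes and return the id of the arena used."""
        for arena_id, in_use in self.arenas.items():
            if in_use + size <= ARENA_SIZE:
                self.arenas[arena_id] = in_use + size
                return arena_id
        # No existing arena has room, so open a new one.
        arena_id = self.next_id
        self.next_id += 1
        self.arenas[arena_id] = size
        return arena_id

    def free(self, arena_id, size):
        """Give size bytes back; drop the arena if it is now empty."""
        self.arenas[arena_id] -= size
        if self.arenas[arena_id] == 0 and self.release_empty:
            del self.arenas[arena_id]

    def reserved_bytes(self):
        """Memory the pool is still holding on to."""
        return len(self.arenas) * ARENA_SIZE


def peak_then_idle(pool):
    # Allocate 10,000 objects of 256 bytes, then free all but the first ten.
    handles = [pool.allocate(256) for _ in range(10000)]
    for arena_id in handles[10:]:
        pool.free(arena_id, 256)
    return pool.reserved_bytes()


old_style = peak_then_idle(ToyArenaPool(release_empty=False))
new_style = peak_then_idle(ToyArenaPool(release_empty=True))
print("2.4-style pool still holds %d bytes" % old_style)
print("2.5-style pool still holds %d bytes" % new_style)
```

In this model the 2.4-style pool keeps all ten arenas it opened at the peak, while the 2.5-style pool shrinks back to the single arena that still contains live objects.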

This change resulted in different types of memory functions in the Python C API. Prior to Python 2.5, the various memory function families all boiled down to the system malloc. Now some functions use obmalloc and some use the plain malloc, so memory allocated by one family must be freed with the matching function from the same family. This should concern only extension writers.

The Need for Speed Sprint
The NeedForSpeed sprint was a privately sponsored event that took place from May 21 to 28, 2006, in Reykjavik, Iceland. Several prominent Python hackers were flown in and spent a week improving Python's performance. The results were integrated into Python 2.5. The major successes were significant improvements to repeated function calls (by caching the associated frame object), huge gains in string performance and string-to-int conversions, reduced interpreter startup time, and faster exceptions. The event produced performance improvements of several orders of magnitude!

Author's Note: The order-of-magnitude improvements apply to Psyco only. Psyco is a dynamic just-in-time compiler; it is not part of standard Python and it doesn't work on Mac.
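If you want to see how your own interpreter fares in the areas the sprint targeted, the standard timeit module is the usual tool. The following is a rough sketch, not the benchmark suite used at the sprint: the statements in BENCHMARKS and the helper run_benchmarks are illustrative choices, and the numbers only mean something when you compare runs of the same script on different interpreters.

```python
import timeit

# One small statement per area mentioned above (illustrative choices).
BENCHMARKS = {
    "function call": "f()",
    "string search": "'needle' in haystack",
    "string to int": "int('1234567')",
    "raise and catch": (
        "try:\n"
        "    raise ValueError\n"
        "except ValueError:\n"
        "    pass"
    ),
}

# Names used by the statements above.
SETUP = "haystack = 'hay' * 1000 + 'needle'\ndef f(): pass"


def run_benchmarks(number=10000):
    """Return a dict mapping benchmark name to total seconds for number runs."""
    results = {}
    for name, stmt in BENCHMARKS.items():
        results[name] = timeit.timeit(stmt, setup=SETUP, number=number)
    return results


for name, seconds in sorted(run_benchmarks().items()):
    print("%-16s %.6f s" % (name, seconds))
```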

Metadata for Python packages
The only chink in Python's armor is its relatively weak support for installation, deployment, and updates of large systems with many dependencies. It might not be important for the typical utility or administration script, but Python is used more and more for developing large-scale systems. The distutils module is the official way of creating and distributing Python packages. It is based on a setup script that can create source and binary distributions, including metadata, for different platforms. Until Python 2.5 it lacked any notion of dependency between packages.

Python 2.5 added a few metadata fields (based on PEP 314): 'requires', 'obsoletes', and 'download_url'. Python also has an online repository called the cheeseshop, which contains an index of downloadable packages. Unfortunately, it seems the new metadata fields don't really solve the dependency issues, because there are no semantics attached to these fields and no tool support.
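A setup script using these fields might look like the sketch below. The package names "spam", "eggs", and "oldspam" and the example.com URL are hypothetical, and the main() wrapper exists only so the snippet can be read and imported without running a build; a real setup.py would call setup() at module level. Note that distutils shipped with Python 2.5 but was removed from the standard library in Python 3.12.

```python
# Metadata for a hypothetical package called "spam".
SETUP_KWARGS = dict(
    name="spam",
    version="1.0",
    py_modules=["spam"],
    # Fields described above. They are recorded in the package metadata,
    # but nothing installs or removes packages because of them.
    requires=["eggs (>=2.0)"],
    obsoletes=["oldspam"],
    download_url="http://example.com/spam-1.0.tar.gz",
)


def main():
    # Imported here because distutils is missing from Python 3.12 and later.
    from distutils.core import setup
    setup(**SETUP_KWARGS)
```

Running such a script with a command like sdist records the extra fields in the generated metadata, which is as far as the support goes.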

Python's salvation may be the setuptools project, again by the prolific Phillip J. Eby. This project aims to enhance the distutils module while remaining compatible with it. It is the de facto standard for distributing and installing Python packages. At the time of writing it is at version 0.6c3 and quite usable, but it's not perfect yet.

Balance and Traction
Python 2.5 is a mostly backward compatible and balanced release. It introduced multiple language enhancements, several new and improved modules in the standard library, and lots of performance enhancements. Best of all, it created healthy traction and innovation without disrupting its growing user base.

Python is well poised to target larger and more complicated systems, while preserving its essential simplicity and the friendliness that attracted so many developers in the first place.

Gigi Sayfan specializes in cross-platform object-oriented programming in C/C++/C#/Python/Java with an emphasis on large-scale distributed systems. He is currently trying to build brain-inspired intelligent machines at Numenta.