A Brief Rundown of Changes and Additions in Python 3.1

Changes to the core language, the standard library, and some welcome performance improvements make Python 3.1 a balanced and worthwhile release.


The previous articles in this series (see the Related Resources section in the left column) covered the releases of Python 3.0 and Python 2.6. Despite the relative youth of these versions, the Python core developers have already created Python 3.1, which was released on June 27, 2009, less than seven months after the release of Python 3.0. While the 3.1 release has a much smaller scope than Python 3.0, it still brings several interesting features, additions, and everybody's favorite: performance improvements!

Core Language Changes

I'll cover changes to the core language first, and then move on to changes in the standard library and performance improvements.

String Formatting

One welcome feature is the ability to auto-number format fields. Formatting strings is a very common operation in many programs. Python 2.x has the [s]printf-like percent operation:



>>> '%s,  %s!' % ('Hello', 'World')
'Hello,  World!'

Python 3.0 added advanced string formatting capabilities (PEP-3101) modeled after C#'s format syntax:

>>> '{0},  {1}!'.format('Hello', 'World')
'Hello,  World!'

This is better for many reasons (see the Advanced String Formatting topic in this earlier article), but Python 3.1 improves it further. In Python 3.0 you had to specify the index of each positional argument whenever you referred to it in the format string. In Python 3.1 you can drop the index, and Python fills the fields from the positional arguments in order:

>>> '{},  {}!'.format('Hello', 'World')
'Hello,  World!'
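
As a small sketch of how auto-numbering behaves beyond the simple case above, the following shows that empty fields still accept alignment, precision, and conversion flags, and that automatic and manual numbering cannot be mixed in one format string. The names row and mixed_error are only illustrative:

```python
# Auto-numbered fields still accept conversion flags and format specs.
row = '{:<8}|{:>6.2f}|{!r}'.format('total', 3.14159, 'ok')
print(row)   # total   |  3.14|'ok'

# Automatic and manual numbering cannot be mixed in one format string;
# Python raises ValueError when it sees both styles.
mixed_error = None
try:
    '{} {0}'.format('a', 'b')
except ValueError as exc:
    mixed_error = str(exc)
    print(mixed_error)
```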

PEP-378: Format Specifier for Thousands Separator

In financial applications, a thousands separator is the norm. Bankers and accountants don't write "You owe me $12345678," but rather "You owe me $12,345,678," with commas (or another character) as separators. Here's how you achieve that in Python:

>>> format(12345678, ',')
'12,345,678'

You can combine it with other specifiers. The width specifier (8 in the example below) includes the commas and decimal point:

>>> format(12345.678, '8,.1f')
'12,345.7'

A comma is the only separator character the specifier supports; if you want a different separator, substitute the character you prefer using replace:

>>> format(1234, ',').replace(',', '_')
'1_234'
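
The replace trick gets awkward when you also want a different decimal mark, because the two substitutions can collide. Here is a minimal sketch of one way around that, assuming a hypothetical helper named group_digits (it is not part of Python) that swaps the characters through a placeholder:

```python
def group_digits(value, thousands_sep='.', decimal_sep=',', places=2):
    """Format value with custom separators (hypothetical helper)."""
    # Start from the built-in comma grouping, e.g. '1,234,567.89'.
    text = format(value, ',.{}f'.format(places))
    # Swap through a placeholder so the two replacements do not collide.
    return (text.replace(',', '\0')
                .replace('.', decimal_sep)
                .replace('\0', thousands_sep))

print(group_digits(1234567.891))                # 1.234.567,89
print(group_digits(1234567.891, ' ', '.', 1))   # 1 234 567.9
```

This only rearranges characters; it knows nothing about locale rules such as grouping sizes, so treat it as a display convenience rather than real localization.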

Of course, you can also use the same format specifier with the string format method:

>>> '{0:8,.1f}'.format(123.456)
'   123.5'

This seems like a minor addition to me. It adds one more display specifier to the format function but still doesn't handle the more difficult cases of formatting content for different locales; that remains your responsibility. Still, the addition got its own PEP, which prompted a lively discussion and at least two proposals.

The maketrans Function

Together, the maketrans() and translate() methods let you replace a set of characters with a different set. Although I have never used maketrans()/translate() in a real application, I assume that they're highly efficient. Using them is a little cumbersome, because you first build a translation table with maketrans() that maps input characters to output characters, and then pass that table to translate(). The string module still has its own maketrans() function, but that has been deprecated in Python 3.1 in favor of separate maketrans() methods on bytes, bytearray, and str.

Here's an example that demonstrates how to use maketrans() and translate() with a bytes object. Note that the translation table for bytes has 256 entries (one for each possible byte), and this example maps most bytes to themselves—the exceptions are 1, 2, and 3, which the table maps to 4, 5 and 6 respectively:

>>> tt = bytes.maketrans(b'123', b'456')
>>> len(tt)
256
>>> tt
b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0456456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'

After you have the translation table you just pass it to the translate() function:

>>> b'123456'.translate(tt)
b'456456'

You may also pass an additional argument that simply deletes characters:

>>> b'123456'.translate(tt, b'5')
b'45646'

It's interesting to see that the original 5 from 123456 was deleted, but the translated 5 (remember, the table translates 2s to 5s) wasn't. That implies that translate first deletes the characters from the original string and then applies the translation.
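
To check that delete-then-translate ordering, here is a short sketch that compares the single call against doing the two steps by hand. The variable names are mine; passing None as the table is a documented way to delete bytes without translating anything:

```python
tt = bytes.maketrans(b'123', b'456')

# Deleting b'2' removes the original 2 before any translation happens,
# so the result matches deleting by hand and then translating.
step_by_step = b'123456'.replace(b'2', b'').translate(tt)
one_call = b'123456'.translate(tt, b'2')
print(one_call, one_call == step_by_step)   # b'46456' True

# Passing None as the table deletes bytes and leaves the rest untouched.
deleted_only = b'123456'.translate(None, b'5')
print(deleted_only)                         # b'12346'
```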

Translating strings is a little different. Rather than a fixed translation table of all possible characters (remember, strings are Unicode now) the string version of maketrans returns a dictionary.

>>> tt = str.maketrans('123', '456')
>>> tt
{49: 52, 50: 53, 51: 54}
>>> '123456'.translate(tt)
'456456'
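
Because the string table is just a dictionary keyed by code point, str.maketrans() can express more than one-to-one swaps. The sketch below goes a little beyond the example above and shows two other call forms: a third argument listing characters to delete, and a single dictionary whose values may be longer strings or None. The variable names are illustrative:

```python
# Three-argument form: the third string lists characters to delete.
tt = str.maketrans('123', '456', '9')
print(tt)                            # deleted characters map to None
swapped = '123456789'.translate(tt)
print(swapped)                       # 45645678

# Dictionary form: keys are single characters (or code points); values may
# be replacement strings of any length, or None to delete the character.
tt2 = str.maketrans({'a': 'xyz', 'b': None})
expanded = 'abc'.translate(tt2)
print(expanded)                      # xyzc
```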

