A Brief Rundown of Changes and Additions in Python 3.1 : Page 2

Changes to the core language, the standard library, and some welcome performance improvements make Python 3.1 a balanced and worthwhile release.


Math-Related Changes

The 3.1 release also includes some math-related changes.

Int Gets a bit_length Method

The venerable int gained a bit_length method that returns the number of bits required to represent the int in binary form. For example, the number 19 is 10011 in binary form, which requires 5 bits:

>>> int.bit_length(19)
5
>>> bin(19)
'0b10011'

I'm not sure what it's useful for, but maybe you can figure something out.
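One possible use, offered only as a sketch, is sizing a buffer: bit_length() tells you how many whole bytes a non-negative int needs. The helper name bytes_needed is hypothetical and not part of Python.

```python
def bytes_needed(n):
    """Return the minimum number of whole bytes that can hold non-negative int n."""
    # Round the bit count up to a multiple of 8; treat 0 as needing one byte.
    return max(1, (n.bit_length() + 7) // 8)

for n in (19, 255, 256):
    # For positive ints, bit_length() equals the length of bin(n) minus the '0b' prefix.
    assert n.bit_length() == len(bin(n)) - 2
    print(n, n.bit_length(), bytes_needed(n))
```

Calling the method on the instance, as in n.bit_length(), is equivalent to the int.bit_length(n) form used in the example above.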

Rounding Floats

In Python 3.0 the round() function was a little inconsistent. If you provided no precision digits it always returned an int, but if you provided precision digits it returned a float, even when the number passed in was an int:

>>> round(1000)
1000
>>> round(1000.0)
1000
>>> round(1000, 2)
1000.0
>>> round(1000.0, 2)
1000.0

In Python 3.1 round() with precision digits returns an int if the input number is an int. A float input still yields a float:

>>> round(1000)
1000
>>> round(1000.0)
1000
>>> round(1000, 2)
1000
>>> round(1000.0, 2)
1000.0
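A quick way to check the return types on whatever interpreter is at hand is to inspect them directly. This is a minimal sketch that assumes Python 3.1 or later; the helper name round_types is hypothetical.

```python
def round_types(value, ndigits=2):
    """Return the type names of round(value) and round(value, ndigits)."""
    return type(round(value)).__name__, type(round(value, ndigits)).__name__

print(round_types(1000))    # int input
print(round_types(1000.0))  # float input
```

With an int argument both calls produce an int. With a float argument, only the call without precision digits does.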

Floating Point Number Representation

Real numbers are represented in most of today's hardware and operating systems in either 32 bits (single precision) or 64 bits (double precision) according to IEEE-754. However, that means some real numbers can't be represented precisely. Due to the binary nature of computer storage, some numbers with a concise decimal representation have no exact, concise representation in the floating point scheme (see the Wikipedia Floating Point entry). For example, Python's float uses 64 bits (double precision), and in that format the number 0.6 is stored as a value that Python 3.0 displays as 0.59999999999999998:

>>> 0.6
0.59999999999999998

This is as accurate as possible, given the representation scheme, but isn't user-friendly. Python 3.1 employs a new algorithm that looks for the most concise representation that keeps the original value intact. So in Python 3.1 the same input results in:

>>> 0.6
0.6
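The two displays describe the same stored value. As a rough illustration, assuming Python 3.1 or later, the sketch below compares the concise repr() with a 17-significant-digit rendering and confirms that both strings convert back to the identical float.

```python
x = 0.6

short = repr(x)            # concise form produced by Python 3.1 and later
full = format(x, '.17g')   # 17 significant digits, like the older display

print(short)
print(full)
# Both strings parse back to exactly the same float.
print(float(short) == float(full) == x)
```

Nothing about the stored number changed in 3.1. Only the string chosen to display it did.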

That's fine until you hit another "gotcha" of floating point representation: arithmetic operations. For example, what is the value of the expression 0.7 + 0.1 in 64-bit floating point representation? If you thought it was 0.79999999999999993 you were spot on. Now, what is the value of the number 0.8? That's right, 0.80000000000000004. Those results mean that 0.7 + 0.1 is not equal to 0.8, which can lead to some pretty nasty bugs. As an example, this innocent-looking while loop will never stop, because x steps from 0.99999999999999989 straight to 1.0999999999999999 without ever equaling 1.0:

>>> x = 0.0
>>> while x != 1.0:
...     print(repr(x))
...     x += 0.1

Output:

0.0
0.10000000000000001
0.20000000000000001
0.30000000000000004
0.40000000000000002
0.5
0.59999999999999998
0.69999999999999996
0.79999999999999993
0.89999999999999991
0.99999999999999989
1.0999999999999999
1.2
1.3
1.4000000000000001
1.5000000000000002
1.6000000000000003
...
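One common way around this trap is to compare floats with a tolerance rather than with != and to cap the number of iterations. The sketch below is only an illustration: it relies on math.isclose(), which arrived in Python 3.5 (well after 3.1), and the function name count_steps and its parameters are hypothetical.

```python
import math

def count_steps(step=0.1, target=1.0, max_steps=1000):
    """Add `step` repeatedly until the total is within a tolerance of `target`."""
    x = 0.0
    steps = 0
    # Compare with a tolerance instead of !=, and cap the loop as a safety net.
    while not math.isclose(x, target, rel_tol=1e-9) and steps < max_steps:
        x += step
        steps += 1
    return steps, x

steps, x = count_steps()
print(steps, x == 1.0)
```

The loop ends after ten additions even though x never becomes exactly 1.0. Counting with an integer and computing x = i * step is another option when the number of steps is known in advance.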

In Python 3.0 the repr() function returns the actual representation. In Python 3.1 it returns the concise representation. In both Python 3.0 and Python 3.1 the print() function prints the concise representation:

>>> print(0.1)
0.1
>>> print(0.10000000000000001)
0.1

Author's Note: For cross-platform compatibility, the text pickle protocol still uses the actual representation.

Python also has a module called decimal for precise real number representation. It's slower than floating point numbers and uses a different representation scheme, but it stores decimal fractions such as 0.1 exactly, and its precision is configurable (28 significant digits by default), so it avoids the binary representation errors described earlier. The Decimal constructor accepts a string; Python 3.1 adds a new class method, from_float(), that accepts a float. Note that from_float() converts the float's exact 64-bit binary value, so the result shows every digit of the stored approximation rather than the literal you typed:

>>> from decimal import Decimal
>>> Decimal.from_float(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
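To see why the way a Decimal is constructed matters, the following sketch repeats the 0.7 + 0.1 comparison twice: once with Decimals built from strings and once with Decimals built from floats. It assumes the default decimal context (28 significant digits).

```python
from decimal import Decimal

# Built from strings, the operands are exact decimal values.
exact_sum = Decimal('0.7') + Decimal('0.1')
print(exact_sum == Decimal('0.8'))

# Built from floats, the operands carry the binary approximation error.
approx_sum = Decimal.from_float(0.7) + Decimal.from_float(0.1)
print(approx_sum == Decimal('0.8'))
```

Only the string-based sum equals Decimal('0.8'). from_float() is useful for inspecting what a float really holds, not for repairing it.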

Improved with Statement

The with statement, which helps guarantee timely release of resources, was introduced in Python 2.5 as a __future__ feature and has been enabled by default since Python 2.6. Python 3.1 extends its reach to support multiple resources in the same statement. The most common case is probably opening input and output files and closing both when the processing completes. In Python 3.0 you either had to use nested with statements or explicitly close at least one of the files. Here's a Python 3.0 example that opens an input file, reads its contents as a string, title-cases the contents (using the string's title() method), and writes the result to an output file.

The example contains two nested with statements. Note the last line of the nested with block. When the code tries to read from out.txt at that point the result is empty, because the output is buffered and nothing has been flushed to the file yet. When the with block completes, Python closes the files, so the last line (after the nested with block) asserts that the contents of out.txt are indeed the title-cased text:

open('in.txt', 'w').write('abc def')

with open('in.txt') as in_file:
    with open('out.txt', 'w') as out_file:
        text = in_file.read()
        assert text == 'abc def'

        text = text.title()
        assert text == 'Abc Def'

        out_file.write(text)
        assert open('out.txt').read() == ''

assert open('out.txt').read() == 'Abc Def'

While not bad, the nested with statements are a little annoying. The intention here is to open two files and close them when the processing is done. (If you needed to open three files (e.g. for a three-way merge) you would need three nested with statements.) Python 3.1 lets you open both files using a single with statement:

with open('in.txt') as in_file, open('out.txt', 'w') as out_file:
    text = in_file.read()
    assert text == 'abc def'

    text = text.title()
    assert text == 'Abc Def'

    out_file.write(text)
    assert open('out.txt').read() == ''

assert open('out.txt').read() == 'Abc Def'
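The three-file case mentioned earlier works the same way. The sketch below is a simple concatenation rather than a real three-way merge, and the file names and temporary directory are hypothetical, chosen only so the example can run anywhere.

```python
import os
import tempfile

# Hypothetical file names, created in a throwaway directory.
workdir = tempfile.mkdtemp()
paths = [os.path.join(workdir, name) for name in ('a.txt', 'b.txt', 'merged.txt')]

# Prepare the two input files.
for path, content in zip(paths[:2], ('abc\n', 'def\n')):
    with open(path, 'w') as f:
        f.write(content)

# One with statement manages all three files; each is closed when the block exits.
with open(paths[0]) as first, open(paths[1]) as second, open(paths[2], 'w') as merged:
    merged.write(first.read() + second.read())

with open(paths[2]) as result:
    merged_text = result.read()

print(repr(merged_text))
```

The files are opened left to right and closed in reverse order, just as if the with statements had been nested.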

Another Python 3.1 improvement is that gzip.GzipFile and bz2.BZ2File now support the context manager protocol and can be used in a with statement. Both classes read and write compressed files. Here's a code sample that stores 5000 bytes in both a gzip file and a bz2 file and prints the sizes. It takes advantage of the stat result's named attributes and of advanced string formatting.

from bz2 import BZ2File
from gzip import GzipFile
import os

with GzipFile('1.gz', 'wb') as g, BZ2File('1.bz2', 'wb') as b:
    g.write(b'X' * 5000)
    b.write(b'X' * 5000)

for ext in ('.gz', '.bz2'):
    filename = '1' + ext
    print('The size of the {0} file is {1.st_size} bytes'.format(ext, os.stat(filename)))

Output:

The size of the .gz file is 43 bytes
The size of the .bz2 file is 45 bytes
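Reading the compressed data back can use the same pattern. This is a hedged sketch of a round trip rather than part of the original example; it writes to a temporary directory, and the variable names are hypothetical.

```python
import bz2
import gzip
import os
import tempfile

payload = b'X' * 5000
workdir = tempfile.mkdtemp()
gz_path = os.path.join(workdir, '1.gz')
bz_path = os.path.join(workdir, '1.bz2')

# Write the same payload to both compressed files in one with statement.
with gzip.GzipFile(gz_path, 'wb') as g, bz2.BZ2File(bz_path, 'wb') as b:
    g.write(payload)
    b.write(payload)

# Reopen both for reading; read() returns the decompressed bytes.
with gzip.GzipFile(gz_path, 'rb') as g, bz2.BZ2File(bz_path, 'rb') as b:
    gz_data = g.read()
    bz_data = b.read()

print(gz_data == payload, bz_data == payload)
```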
