RSS Feed
Download our iPhone app
Browse DevX
Sign up for e-mail newsletters from DevX


Dig Deep into Python Internals  : Page 3

Python, the open source scripting language, has grown tremendously popular in the last five years—and with good reason. Python boasts a sophisticated object model that wise developers can exploit in ways that Java, C++, and C# developers can only dream of.

Day In The Life of a Python Object
To get a feel for all the dynamics involved in using Python objects let's track a plain object (no tricks) starting from its class definition, through its class instantiation, access its attributes, and see it to its demise. Later on I'll introduce the hooks that allow you to control and modify this workflow.

The best way to go about it is with a monstrous simulation. Listing 2 contains a simulation of a bunch of monsters chasing and eating some poor person. There are three classes involved: a base Monster class, a MurderousHorror class that inherits from the Monster base class, and a Person class that gets to be the victim. I will concentrate on the MurderousHorror class and its instances.

Class Definition
MurderousHorror inherits the 'frighten' and 'eat' methods from Monster and adds a 'chase' method and a 'speed' field. The 'hungry_monsters' class field stores a list of all the hungry monsters and is always available through the class, base class, or instance (Monster.hungry_monsters, MurderousHorror.hungry_monsters, or m1.hungry_monsters). In the code below you can see (via the handy 'dir' function) the MurderousHorror class and its m1 instance. Note that methods such as 'eat,' 'frighten,' and 'chase' appear in both, but instance fields such as 'hungry' and 'speed' appear only in m1. The reason is that instance methods can be accessed through the class as unbound methods, but instance fields can be accessed only through an instance.

class NoInit(object):
    def foo(self):
        self.x = 5
    def bar(self):
        print self.x
if __name__ == '__main__':            
    ni = NoInit()
    assert(not ni.__dict__.has_key('x'))
    except AttributeError, e:
        print e

'NoInit' object has no attribute 'x'
Object Instantiation and Initialization
Instantiation in Python is a two-phase process. First, __new__ is called with the class as a first argument, and later as the rest of the arguments, and should return an uninitialized instance of the class. Afterward, __init__ is called with the instance as first argument. (You can read more about __new__ in the Python reference manual.)

When a MurderousHorror is instantiated __init__ is the first method called. __init__ is similar to a constructor in C++/Java/C#. The instance calls the Monster base class's __init__ and initializes its speed field. The difference between Python and C++/Java/C# is that in Python there is no notion of a parameter-less default constructor, which, in other languages, is automatically generated for every class that doesn't have one. Also, there is no automatic call to the base class' default __init__ if the derived class doesn't call it explicitly. This is quite understandable since no default __init__ is generated.

In C++/Java/C# you declare instance variables in the class body. In Python you define them inside a method by explicitly specifying 'self.SomeAttribute'. So, if there is no __init__ method to a class it means its instances have no instance fields initially. That's right. It doesn't HAVE any instance fields. Not even uninitialized instance fields.

The previous code sample (above) is a perfect example of this phenomenon. The NoInit class has no __init__ method. The x field is created (put into its __dict__) only when foo() is called. When the program calls ni.bar() immediately after instantiation the 'x' attribute is not there yet, so I get an 'AttributeError' exception. Because my code is robust, fault tolerant, and self healing (in carefully staged toy programs), it bravely recovers and continues to the horizon by calling foo(), thus creating the 'x' attribute, and ni.bar() can print 5 successfully.

Note that in Python __init__ is not much more then a regular method. It is called indeed on instantiation, but you are free to call it again after initialization and you may call other __init__ methods on the same object from the original __init__. This last capability is also available in C#, where it is called constructor chaining. It is useful when you have multiple constructors that share common initialization, which is also one of the constructors/initializers. In this case you don't need to define another special method that contains the common code and call it from all the constructors/initializers; you can just call the shared constructor/initializer directly from all of them.

Attribute Access
An attribute is an object that can be accessed from its host using the dot notation. There is no difference at the attribute access level between methods and fields. Methods are first-class citizens in Python. When you invoke a method of an object, the method object is looked up first using the same mechanism as a non-callable field. Then the () operator is applied to the returned object. This example demonstrates this two-step process:

class A(object):
    def foo(self):
        print 3
if __name__ == '__main__':            
    a = A()
    f = a.foo
    print f
    print f.im_self

<bound method A.foo of <__main__.A object at 0x00A03EB0>>
<__main__.A object at 0x00A03EB0>
The code retrieves the a.foo bound method object and assigns it to a local variable 'f'. 'f' is a bound method object, which means its im_self attribute points to the instance to which it is bound. Finally, a.foo is invoked through the instance (a.foo()) and by calling f directly with identical results. Assigning bound methods to local variables is a well known optimization technique due to the high cost of attribute lookup. If you have a piece of Python code that seems to perform under the weather there is a good chance you can find a tight loop that does a lot of redundant lookups. I will talk later about all the ways you can customize the attribute access process and why it is so costly.

The __del__ method is called when an instance is about to be destroyed (its reference count reaches 0). It is not guaranteed that the method will ever be called in situations such as circular references between objects or references to the object in an exception. Also the implementation of __del__ may create a new reference to its instance so it will not be destroyed after all. Even when everything is simple and __del__ is called, there is no telling when it will actually be called due to the nature of the garbage collector. The bottom line is if you need to free some scarce resource attached to an object do it explicitly when you are done using it and don't wait for __del__.

A try-finally block is a popular choice for garbage collection since it guarantees the resource will be released even in the face of exceptions. The last reason not to use is __del__ is that its interaction with the 'del' built-in function may confuse programmers. 'del' simply decrements the reference count by 1 and doesn't call '__del__' or cause the object to be magically destroyed. In the next code sample I use the sys.getrefcount() function to determine the reference count to an object before and after calling 'del'. Note that I subtract 1 from sys.getrefcount() result because it also counts the temporary reference to its own argument.

import sys

class A(object):
    def __del__(self):
        print "That's it for me"
if __name__ == '__main__':            
    a = A()
    b = a
    print sys.getrefcount(a)-1
    del b
    print sys.getrefcount(a)-1


That's it for me

Close Icon
Thanks for your registration, follow us on our social networks to keep up-to-date