Login | Register   
RSS Feed
Download our iPhone app
Browse DevX
Sign up for e-mail newsletters from DevX


Dig Deep into Python Internals  : Page 4

Python, the open source scripting language, has grown tremendously popular in the last five years—and with good reason. Python boasts a sophisticated object model that wise developers can exploit in ways that Java, C++, and C# developers can only dream of.

Hacking Python
Let the games begin. In this section I will explore different ways to customize attribute access. The topics include the __getattribute__ hook, descriptors, and properties.

- __getattr__, __setattr__ and __getattribute__

These special methods control attribute access to class instances. The standard algorithm for attribute lookup returns an attribute from the instance dictionary or one of its base class's dictionaries (descriptors will be described in the next section). They are supposed to return an attribute object or raise AttributeError exception. If you define some of these methods in your class they will be called upon during attribute access under some conditions.
  • __getattr__ and __setattr__ work with old-style and new-style classes.
  • __getattr__ is called to get attributes that cannot be found using the standard attribute lookup algorithm.
  • __setattr__ is called for setting the value of any attribute. This asymmetry is necessary to allow adding new attributes to instances.
  • __getattribute__ works for new-style classes only. It is called to get any attribute (existing or non-existing). __getattribute__ has precedence over __getattr__, so if you define both, __getattr__ will not be called (unless __getattribute__ raises AttributeError exception).
Listing 3 is an interactive example. It is designed to allow you to play around with it and comment out various functions to see the effect. It introduces the class A with a single 'x' attribute. It has __getattr__, __setattr__, and __getattribute__ methods. __getattribute__ and __setattr__ simply forward any attribute access to the default (lookup or set value in dictionary). __getattr__ always returns 7. The main program starts by assigning 6 to the non-existing attribute 'y' (happens via __setattr__) and then prints the preexisting 'x', the newly created 'y', and the still non-existent 'z'. 'x' and 'y' exist now, so they are accessible via __getattribute__. 'z' doesn't exist so __getattribute__ fails and __getattr__ gets called and returns 7. (Author's Note: This is contrary to the documentation. The documentation claims if __getattribute__ is defined, __getattr__ will never be called, but this is not the actual behavior.)

A descriptor is an object that implements three methods __get__, __set__, and __delete__. If you put such a descriptor in the __dict__ of some object then whenever the attribute with the name of the descriptor is accessed one of the special methods is executed according to the access type (__get__ for read, __set__ for write, and __delete__ for delete).This simple enough indirection scheme allows total control on attribute access.

The following code sample shows a silly write-only descriptor used to store passwords. Its value may not be read nor deleted (it throws AttributeError exception). Of course the descriptor object itself and the password can be accessed directly through A.__dict__['password'].

class WriteOnlyDescriptor(object): def __init__(self): self.store = {} def __get__(self, obj, objtype=None): raise AttributeError def __set__(self, obj, val): self.store[obj] = val def __delete__(self, obj) raise AttributeError class A(object): password = WriteOnlyDescriptor() if __name__ == '__main__': a = A() try: print a.password except AttributeError, e: print e.__doc__ a.password = 'secret' print A.__dict__['password'].store[a]

Descriptors with both __get__ and __set__ methods are called data descriptors. In general, data descriptors take lookup precedence over instance dictionaries, which take precedence over non-data descriptors. If you try to assign a value to a non-data descriptor attribute the new value will simply replace the descriptor. However, if you try to assign a value to a data descriptor the __set__ method of the descriptor will be called.

Properties are managed attributes. When you define a property you can provide get, set, and del functions as well as a doc string. When the attribute is accessed the corresponding functions are called. This sounds a lot like descriptors and indeed it is mostly a syntactic sugar for a common case.

This final code sample is another version of the silly password store using properties. The __password field is "private." Class A has a 'password' property that, when accessed as in 'a.password,' invokes the getPassword or setPassword methods. Because the getPassword method raises the AttributeError exception, the only way to get to the actual value of the __password attribute is by circumventing the Python fake privacy mechanism. This is done by prefixing the attribute name with an underscore and the class name a._A__password. How is it different from descriptors? It is less powerful and flexible but more pleasing to the eye. You must define an external descriptor class with descriptors. This means you can use the same descriptor for different classes and also that you can replace regular attributes with descriptors at runtime.

class A(object): def __init__(self): self.__password = None def getPassword(self): raise AttributeError def setPassword(self, password): self.__password = password password = property(getPassword, setPassword) if __name__ == '__main__': a = A() try: print a.password except AttributeError, e: print e.__doc__ a.password = 'secret' print a._A__password Output: Attribute not found. secret

Properties are more cohesive. The get, set functions are usually methods of the same class that contain the property definition. For programmers coming from languages such as C# or Delphi, Properties will make them feel right at home (too bad Java is still sticking to its verbose java beans).

Python's Richness a Mixed Blessing
There are many mechanisms to control attribute access at runtime starting with just dynamic replacement of attribute in the __dict__ at runtime. Other methods include the __getattr__/__setattr, descriptors, and finally properties. This richness is a mixed blessing. It gives you a lot of choice, which is good because you can choose whatever is appropriate to your case. But, it is also bad because you HAVE to choose even if you just choose to ignore it. The assumption, for better or worse, is that people who work at this level should be able to handle the mental load.

In my next article, I will pick up where I've left off. I'll begin by contrasting metaclasses with decorators, then explore the Python execution model, and explain how to examine stack frames at runtime. Finally, I'll demonstrate how to augment the Python language itself using these techniques. I'll introduce a private access checking feature that can be enforced at runtime.

Gigi Sayfan specializes in cross-platform object-oriented programming in C/C++/C#/Python/Java with an emphasis on large-scale distributed systems. He is currently trying to build brain-inspired intelligent machines at Numenta.
Comment and Contribute






(Maximum characters: 1200). You have 1200 characters left.