Function Annotations
Function annotations are my favorite new feature in Python 3000 (along with ABCs). You can attach arbitrary Python expressions to function arguments and/or return values. You decide how you want to annotate your functions and what the annotations mean. Type checking is a classic example: you can annotate each argument and the return value of a function with a type. Doing that helps users figure out how to call the function. For example, the function calc_circumference() expects an integral radius and returns a float. Note the annotation syntax for the radius argument and for the return value.
import math

def calc_circumference(radius: int) -> float:
    return 2 * math.pi * radius
This is not so impressive by itself, because you could have just documented it in the docstring (although it is more readable as an annotation). Function annotations are stored as a dict in a function attribute called __annotations__.
>>> calc_circumference.__annotations__
{'radius': <class 'int'>, 'return': <class 'float'>}
You can access the annotations inside the function too, of course, and verify that the type of each argument matches its annotation. This turns out to be a little more complicated than you might expect, because Python stores the annotations dict as an unordered collection. The following simple function accepts three int arguments and prints them:

def print_3(a: int, b: int, c: int):
    print(a, b, c)
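Because the annotations dict doesn't preserve declaration order, one way to recover the argument names in their declared order is from the function's code object rather than from the annotations dict:

>>> print_3.__code__.co_varnames[:print_3.__code__.co_argcount]
('a', 'b', 'c')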
The following example shows one way to access the function annotations to validate that the radius is actually an integer:
def calc_circumference(radius: int) -> float:
    assert isinstance(radius, calc_circumference.__annotations__['radius'])
    return 2 * math.pi * radius
Unfortunately, that's pretty silly code. It's cumbersome and error-prone, and if you need to verify multiple arguments it seriously hinders readability. Moreover, if you write such function-specific validations you don't need annotations, because as the function author you already know that radius is supposed to be an int and can just write:
assert isinstance(radius, int)
The real power comes when you access the function annotations via a general-purpose decorator. You can use the call_check decorator (see Listing 2) to decorate any function or method that has type annotations for its arguments. The decorator extracts the annotations and asserts that the type of each argument matches its type annotation on every call. It also verifies the type of the return value, if the return annotation is present. You need a little finicky code to extract the names of non-keyword arguments from the function's code object, because the annotations in the dictionary are unordered; if you just iterate over the items, the order may not match the order of the function arguments. This is a serious weakness of function annotations that I hope will be fixed in Python 3.1.
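Listing 2 contains the full implementation. A minimal sketch of such a decorator, handling positional arguments only (the full version may differ), could look like this:

import functools

def call_check(f):
    # Recover the declared argument order from the code object,
    # since the annotations dict itself is unordered
    arg_names = f.__code__.co_varnames[:f.__code__.co_argcount]
    annotations = f.__annotations__
    @functools.wraps(f)
    def decorated(*args, **keywds):
        for name, value in zip(arg_names, args):
            if name in annotations and not isinstance(value, annotations[name]):
                raise Exception('The type of %s=%s should be %s'
                                % (name, value, annotations[name].__name__))
        result = f(*args, **keywds)
        if 'return' in annotations and not isinstance(result, annotations['return']):
            raise Exception('The return value %s should be %s'
                            % (result, annotations['return'].__name__))
        return result
    return decorated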
To test the decorator, the code below applies it to the calc_circumference function:
@call_check
def calc_circumference(radius: int) -> float:
    return 2 * math.pi * radius
Now, if you try calling calc_circumference with both valid and invalid arguments, you'll find that a call with radius 10 returns the correct answer, while a second call with the invalid argument 10.5 (not an int) results in an exception that explains exactly what the problem is and tells the caller that the argument radius with value 10.5 must be an int.
>>> calc_circumference(10)
62.8318530718
>>> calc_circumference(10.5)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "article_1.py", line 117, in decorated
raise Exception(s)
Exception: The type of radius=10.5 should be int
The next step (which I'll leave to readers) is to use a class decorator or metaclass to decorate all the methods of a class with the call_check decorator automatically.
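As a starting point, a class decorator version might look something like this (a sketch; the name check_all_calls is mine, not from the article):

def check_all_calls(cls):
    # Wrap every public, annotated callable attribute with call_check
    for name, attr in list(vars(cls).items()):
        if (hasattr(attr, '__call__') and not name.startswith('_')
                and getattr(attr, '__annotations__', None)):
            setattr(cls, name, call_check(attr))
    return cls

Applied with @check_all_calls above a class definition, it decorates every annotated method in one shot.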
There are many more use cases for function annotations, such as help strings, rich type information (e.g., allowed ranges of values), direct validation functions, hints that let IDEs provide a better experience, mappings, adapters, and corresponding C types.
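For instance, because annotations are arbitrary expressions, nothing stops you from attaching help strings or an allowed range instead of types (a contrived sketch):

def schedule(day: 'day of the month', hour: range(24)) -> 'a confirmation string':
    # The annotations are stored untouched; it's up to your tooling
    # to interpret the help string and the allowed range
    return 'scheduled for day %d at %d:00' % (day, hour)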
One concern with function annotations is that excessive use might create a "language within a language," where you can't really tell what happens when you call a function until you see the code that processes the annotations.
PEP 3115: Metaclasses in Python 3000
Metaclasses are classes whose instances are also classes. I like to think of them as class customizers. They play the same role as class decorators, but with different (and more complicated) mechanics. The features that distinguish metaclasses from class decorators are:
- Metaclasses are inherited.
- Each class can have exactly one metaclass.
Even these simple features already present a problem. For example, what happens if you inherit from two classes with different metaclasses? The answer is: it depends. If the two metaclasses are unrelated, you are not allowed to inherit from both:
class M1(type): pass
class M2(type): pass
class A(metaclass=M1): pass
class B(metaclass=M2): pass
class C(A, B): pass
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: metaclass conflict: the metaclass of a derived class must be
a (non-strict) subclass of the metaclasses of all its bases
To make this work, one of the metaclasses must be a subclass (possibly indirect) of the other. The most derived metaclass will be used as the metaclass for the new class.
class M1(type): pass
class M2(M1): pass
class A(metaclass=M1): pass
class B(metaclass=M2): pass
class C(A, B): pass
>>> C.__class__
<class 'article_1.M2'>
Metaclasses in Python 3.0 fix a thorny issue in Python 2.x: the order of class attributes couldn't be determined just from the class's __dict__. In many cases the order is important when interfacing with external systems (ORMs, foreign language bridges, COM objects, etc.). As a result, if you wanted to use metaclasses for these use cases you had to require each class to provide the order using some contrived convention, such as a sequenced list of all the relevant fields:
>>> class A(object):
... x = 3
... y = 4
... z = 5
... order = ['x', 'y', 'z']
...
>>> A.__dict__.keys()
['__module__', '__doc__', '__dict__', 'y', 'x', 'z', '__weakref__', 'order']
>>> A.order
['x', 'y', 'z']
In contrast, Python 3.0 metaclasses can implement a method called __prepare__() that returns an object that serves in lieu of the standard dict() to store the members. The trick is that __prepare__() may return an object that either stores the order of the fields or ensures that iteration will always return the items in the order they were inserted.
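Here is a minimal sketch of the mechanism, using a namespace class that merely logs key insertion order (the names LoggingDict and OrderTrackingMeta are mine, not from the standard library):

class LoggingDict(dict):
    # A dict that records the order in which new keys are added
    def __init__(self):
        super().__init__()
        self.key_order = []
    def __setitem__(self, key, value):
        if key not in self:
            self.key_order.append(key)
        super().__setitem__(key, value)

class OrderTrackingMeta(type):
    @classmethod
    def __prepare__(metacls, name, bases):
        return LoggingDict()
    def __new__(metacls, name, bases, namespace):
        cls = type.__new__(metacls, name, bases, dict(namespace))
        cls._declaration_order = namespace.key_order
        return cls

class Point(metaclass=OrderTrackingMeta):
    x = 1
    y = 2

# Point._declaration_order ends with ['x', 'y'] (Python also inserts
# bookkeeping keys such as '__module__' before running the class body)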
For example, suppose you want to create a linear workflow class whose methods must be called in a certain sequence. If you have multiple workflows and you keep adding and removing steps, it can get quite tricky to do the bookkeeping and make sure that users call the workflow methods in the right order. Here's a metaclass-based solution for that scenario, in which the final result is a class whose methods you must call in the order of their declaration in the class. The following code shows the class along with demonstrations of both proper and improper use.
The LinearWorkflow class is very simple and has four methods: start, step_1, step_2, and step_3, declared in that order. It also has a metaclass (you'll see more about that later) that ensures the methods are called in the proper sequence. For this example, the methods just print their names. Here's the class:
class LinearWorkflow(metaclass=PedanticMetaclass):
    def __init__(self):
        pass

    def start(self):
        print('start')

    def step_1(self):
        print('step_1')

    def step_2(self):
        print('step_2')

    def step_3(self):
        print('step_3')
The following interactive session instantiates a LinearWorkflow object and starts calling methods. It calls start() and step_1(), which execute. Then it tries to call step_1() again and gets an exception saying that step_1 was called out of order. It calls step_2() successfully and then stubbornly tries to call start() one more time (again, out of order) and gets another exception.
>>> x = LinearWorkflow()
>>> x.start()
start
>>> x.step_1()
step_1
>>> x.step_1()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "article_1.py", line 192, in decorated
raise Exception('Method %s called out of order' % f.name)
Exception: Method step_1 called out of order
>>> x.step_2()
step_2
>>> x.start()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "article_1.py", line 192, in decorated
raise Exception('Method %s called out of order' % f.name)
Exception: Method start called out of order
How is this magic accomplished? The PedanticMetaclass keeps tabs on the order of the methods by providing an OrderedDict (a dictionary-like class that stores items in insertion order as a list of pairs) as the class namespace. Its __new__() method then records, in declaration order, each method of the new class (LinearWorkflow in this case) whose name does not start with an underscore, and wraps each of those methods so that every call is checked, raising an exception if a method is called out of order. Here are the OrderedDict and PedanticMetaclass classes:
import collections

class OrderedDict(collections.Mapping):
    def __init__(self):
        self._items = []

    def __getitem__(self, key):
        for k, v in self._items:
            if k == key:
                return v
        raise KeyError(key)

    def __setitem__(self, key, value):
        for i, (k, v) in enumerate(self._items):
            if k == key:
                self._items[i] = (k, value)
                return
        self._items.append((key, value))

    def __iter__(self):
        return iter([k for k, v in self._items])

    def __len__(self):
        return len(self._items)

    def __repr__(self):
        s = '{'
        for k, v in self._items:
            s += '%s: %s, ' % (str(k), str(v))
        s += '}'
        return s
Author's Note: OrderedDict is intended for demonstration purposes only. I don't recommend using it in production code because it is very inefficient: it performs a linear search for every item access.
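For production code, note that Python 3.1 added an efficient collections.OrderedDict to the standard library, which does the same job:

>>> from collections import OrderedDict
>>> d = OrderedDict()
>>> d['x'] = 1
>>> d['y'] = 2
>>> list(d)
['x', 'y']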
The PedanticMetaclass is not completely trivial, but you should be able to follow the code. The __prepare__() method must be a classmethod; it simply returns an empty OrderedDict. The decorate() function performs the order-checking logic by wrapping each call to the methods in the _order list. The decorator is applied manually in the __new__() method, and the decorated method is then assigned to the new_class using setattr() (replacing the original method).
class PedanticMetaclass(type):
    @classmethod
    def __prepare__(metacls, name, bases):
        return OrderedDict()

    def decorate(self, f):
        """Wrap the method f so that every call is checked against
        the declaration order recorded in _order.
        """
        def decorated(*args, **keywds):
            assert f.name in self._order
            if self._called + [f.name] != self._order[:len(self._called) + 1]:
                raise Exception('Method %s called out of order' % f.name)
            self._called.append(f.name)
            if len(self._called) == len(self._order):
                self._called = []
            return f(*args, **keywds)
        return decorated

    def __new__(metacls, name, bases, ordered_dict):
        # Must convert the OrderedDict back to a regular dict
        new_class = type.__new__(metacls, name, bases, dict(ordered_dict.items()))
        # Create the ordered list of public methods automatically
        # (ignore non-callables and names that start with an underscore)
        order = [k for k, v in ordered_dict.items()
                 if hasattr(v, '__call__') and not k.startswith('_')]
        new_class._order = order
        # Keep track of the methods called so far
        new_class._called = []
        # Decorate each method to record and check call order
        for x in order:
            m = getattr(new_class, x)
            m.name = x
            m = PedanticMetaclass.decorate(new_class, m)
            setattr(new_class, x, m)
        return new_class