Introducing IronPython

Back before version 1.0 of the CLR shipped, Microsoft engaged a variety of commercial and academic organizations to produce languages that ran on .NET, an effort code-named “Project 7.” One of those languages was Python for .NET, developed by ActiveState. The implementation worked, but Project 7 discovered that the “speed of the current system is so low as to render the current implementation useless for anything beyond demonstration purposes.”[1] Furthermore, while they blamed some of the performance problems on “the simple implementation of the Python for .NET compiler”, they also claimed that “[s]ome of the blame for this slow performance lies in the domain of .NET internals and Reflection::Emit”.

Largely due to ActiveState’s experience, it became conventional wisdom that “[t]he CLI is, by design, not friendly to dynamic languages.”[2] This conclusion caught the attention of Jim Hugunin, the original creator of Jython, an implementation of Python for the Java VM. Given that Jython runs reasonably well on the JVM, Jim wondered why the CLR ran Python so poorly. He decided to take a couple of weeks and build a simple implementation of Python on .NET in order to determine what Microsoft had done wrong. His plan was to use the findings to write a short paper titled “Why .NET is a Terrible Platform for Dynamic Languages.”

Jim was shocked to discover that his two-week effort resulted in a Python implementation that ran faster than Jython. Intrigued, he kept working on IronPython, eventually joining Microsoft in the summer of 2004. The 1.0 version of IronPython shipped two years later, in September 2006. Today, Jim is an architect in the Dynamic Language Runtime team while the IronPython team is driving towards a fall 2008 release of IronPython v2.0.

Getting IronPython
The IronPython project lives on CodePlex. Unlike most other Microsoft language products, IronPython runs as a transparent, open source project. The IronPython team releases a new version approximately every six weeks. Furthermore, Microsoft makes all the source code available under the OSI-approved Microsoft Public License. On the CodePlex site you’ll find a mailing list for community discussion as well as a public issue tracker where anyone can submit bugs they find.

The currently released version of IronPython is v1.1.1; however, the IronPython team is hard at work on v2.0. As I write this, we have just released Beta 3 of IronPython v2.0. IronPython v2.0 implements the 2.5 version of the Python language (IronPython v1.x implemented Python v2.4). Furthermore, IronPython v2.0 is built on the new Dynamic Language Runtime.

Regardless of which release you download, IronPython releases are distributed as a simple zip file containing the compiled binaries, some tutorials, documentation, and the usual readmes and license files. You can simply unzip that archive to any location on your hard drive. Inside the unzipped files you’ll find the main IronPython executable, ipy.exe. If you run that with no arguments, you get an interactive console window where you can start typing Python code directly. Alternatively, you can pass the name of a Python file as the first argument to ipy, and IronPython will execute that code.

Despite the availability of a robust version of IronPython, there’s no current production-quality development experience for IronPython in Visual Studio. However, the VS Extensibility SDK includes a sample IronPython language service, project system and console window, which has now been packaged as “IronPython Studio,” a free download from CodePlex. Improving the Visual Studio experience for IronPython developers is one of the IronPython development team’s areas of investment for the future.

Significant Whitespace
When you first start working with Python, the most notable difference from other languages is the use of significant whitespace. In C#, whitespace is entirely insignificant: statements end with a semicolon and scope blocks are delineated with curly braces. VB, on the other hand, uses the end of line to indicate the end of a statement. However, scope blocks are still explicitly terminated in VB, typically with the keyword End (e.g., End Sub, End Function, End Class), although some blocks close with a keyword specific to the block being closed (Do/Loop, For/Next, etc.).

In contrast, Python uses significant whitespace both to terminate statements and to delineate scope blocks. Like VB, Python terminates statements at the end of the line. Python uses a colon to explicitly mark the start of a scope block, similar to the way C# uses the open curly brace; however, unlike VB or C#, Python uses indentation to implicitly determine the end of a code block. All code with the same level of indentation is part of the same code block. To demonstrate, here’s an implementation of the bubble sort algorithm written in Python:

   def bubble_sort(ar):
       l = len(ar) - 1
       for i in range(l):
           for j in range(l-i):
               if ar[j] > ar[j+1]:
                   temp = ar[j]
                   ar[j] = ar[j+1]
                   ar[j+1] = temp
       return len(ar)

This code has four scope blocks: the bubble_sort function itself, the two for loops (Python’s for loop is the equivalent of C#’s foreach loop and the range function is the equivalent of Enumerable.Range), and the if block. Each block has a different level of indentation. For example, you can see that the return statement is at the same level of indentation as the first for loop, indicating that it’s not part of either the for or if scope blocks, but belongs to the function scope block. This example uses four spaces for indentation, but the specific amount of indentation is irrelevant, as long as it’s consistent.
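
To make the effect of indentation concrete, here is a minimal sketch with two functions that differ only in how far the return statement is indented. The function names are invented for this illustration. Each print call passes a single argument, a form that is valid in both Python 2 and Python 3.

```python
def total_inside_loop(values):
    total = 0
    for v in values:
        total += v
        return total      # indented under the for: runs during the first pass

def total_after_loop(values):
    total = 0
    for v in values:
        total += v
    return total          # aligned with the for: runs after the loop finishes

print(total_inside_loop([1, 2, 3]))   # prints 1
print(total_after_loop([1, 2, 3]))    # prints 6
```

Moving one line four spaces to the left changes which block it belongs to, and therefore what the function computes.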

Significant whitespace enables both consistency and readability. In C-syntax languages such as C#, programmers have differing opinions as to where to put braces, using tabs vs. spaces, when to indent and how much, etc. Because C# ignores whitespace, there are numerous ways to format your code. In Python, there’s only one legal way to format the code. This means that if you walk up to a random bit of Python code written by some person you’ve never met, it’ll be formatted in the same style as the Python code you write. This typically makes Python code more readable. The fact that you don’t need extraneous syntax (semicolons and curly braces) to indicate the ends of statements and scope blocks also improves readability.

Python’s significant whitespace is one of those things developers tend to either love or hate. However, the use of significant whitespace does appear to be becoming more popular. Both Boo and Cobra, two open source .NET languages, use a Python-inspired syntax, including significant whitespace. Furthermore, Microsoft’s new F# language includes an optional lightweight syntax mode that makes indentation significant, just like Python.

Editor’s Note: This article was first published in the September/October 2008 issue of CoDe Magazine, and is reprinted here by permission.

Dynamic Typing
While significant whitespace is the most obvious difference between Python and C#/VB, the biggest difference between them lies in the type system. Python shares many high-level concepts with these .NET languages, such as imperative code, functions, classes, and objects. However, C# and VB are statically typed, which means that these languages fix the capabilities of individual types at compile time. After you determine the fields, properties, and methods of your type and run the code through the compiler, those capabilities can never change, ever. If you need to change them, your only option is to throw away your original compiled binary and recompile a new one.

Dynamic typing, on the other hand, does not fix type capabilities at compile time. Rather, types are completely mutable and you can manipulate them at run time in much the same way that you have manipulated type instances in the past. For example, at runtime, you can create new types, add fields or methods to a type, remove fields or methods from a type, add fields or methods to a specific instance of a type, or even change the inheritance hierarchy of a type.
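
As a rough sketch of what that looks like in practice, the snippet below adds a method to a type, adds a field and a method to a single instance, removes a method, and creates a new type, all at run time. The duck and mallard names and the quack and swim functions are hypothetical, chosen only for this illustration.

```python
import types

class duck(object):
    pass

def quack(self):
    return "Quack!"

def swim(self):
    return "Paddling"

donald = duck()
daisy = duck()

# Add a method to the type: every instance, existing or new, gains it.
duck.quack = quack
print(donald.quack())            # prints Quack!

# Add a field and a method to one specific instance only.
donald.name = "Donald"
donald.swim = types.MethodType(swim, donald)
print(donald.swim())             # prints Paddling
print(hasattr(daisy, "swim"))    # prints False

# Remove the method from the type again.
del duck.quack
print(hasattr(donald, "quack"))  # prints False

# Create a new type at run time with the built-in type() function.
mallard = type("mallard", (duck,), {"color": "green"})
print(mallard().color)           # prints green
```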

If you come from a static type language background, the idea of being able to manipulate types at run time may seem strange, or even dangerous. Static advocates typically highlight type safety as one of the primary benefits of using static typing. When using static types, the compiler can do quite a bit of validation to ensure that the types you create and the methods you call all exist, that the fields and parameters are of the right types, etc. We’ve all had to fix compiler errors because we misspelled a method or type name. But that’s not possible in dynamic languages; instead, in Python, such errors become run-time exceptions.
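
A small sketch of that behavior: the hypothetical shout function below misspells the string method upper as uper. Python accepts the definition without complaint, and the mistake surfaces as an AttributeError only when the line actually runs.

```python
def shout(text):
    # Deliberate typo: str has upper(), not uper().
    return text.uper()

# Defining shout succeeded; nothing has checked the method name yet.
try:
    shout("hello, world")
    failed = False
except AttributeError:
    failed = True

print(failed)   # prints True
```

A unit test that calls shout even once would expose the typo, which is why automated tests carry more weight in a dynamically typed code base.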

However, I’d like to put this type safety benefit in perspective. In any system of significant scale, there are essentially an infinite number of ways the application can be wrong. Type safe languages eliminate just one of this infinite number of ways. But that still means that a static language compiler won’t catch the overwhelming majority of errors and bugs you might make when developing your system. If it could, then all successfully-compiled applications would automatically be bug free! The fact is, you need some other mechanism to catch these other errors, typically automated unit tests. So while it’s true that static types do provide a safety net, it’s just not a very big one.

This isn’t to say type safe languages are bad. Rather, I look at it as a tradeoff of static type safety vs. dynamic type flexibility. There is no universally right decision, only the right decision for you and your project. The good news is that with the addition of dynamic languages like IronPython to Microsoft’s language stable, you can make that tradeoff decision while still staying in the .NET realm.

Refreshing Simplicity
One somewhat frustrating aspect of C# is the amount of structure that gets forced on the developer. Some of this structure is related to static typing, such as specifying variable types and casting operators. However, there are also no stand-alone functions in C#, much less just a bunch of code in a simple script. All functions have to be attached to an object, meaning the simplest possible implementation of “Hello, World!” in C# looks like this:

   class Program
   {
       static void Main()
       {
           System.Console.WriteLine("Hello, World");
       }
   }

Contrast that with the Python version:

   print "Hello, World"

Obviously, you don’t judge a language completely by its implementation of “Hello, World,” but the real point is that Python scales well with complexity. If you need only a simple script, you don’t have to add unneeded class and function constructs for the code to live inside. If you want to make a function, it can stand alone (such as the bubble_sort example above); you don’t need to find a class to attach it to. If you really need classes and objects, you can build them, but you don’t have to take on that complexity unless you really need it.
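
The three levels can sit side by side in one file. The following is only a sketch; the greet function and greeter class are made-up names, and each step adds structure only because the code at that step needs it.

```python
# Level 1: a bare script statement.
print("Hello, World")

# Level 2: a stand-alone function, with no class to attach it to.
def greet(name):
    return "Hello, %s" % name

print(greet("World"))

# Level 3: a class, introduced once state has to travel with behavior.
class greeter(object):
    def __init__(self, greeting):
        self.greeting = greeting

    def greet(self, name):
        return "%s, %s" % (self.greeting, name)

print(greeter("Hello").greet("World"))
```

All three print statements produce the same line of output.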

Polymorphic Mallards
Polymorphism, the ability to treat different types the same way, is a staple of object-oriented programming. Statically typed languages have a variety of mechanisms designed to enable polymorphism, including inheritance, interfaces and generics. Because Python is dynamically typed, it doesn’t need any of these mechanisms to provide polymorphism. (Python does support inheritance, but not to enable polymorphism.) Instead, Python determines type compatibility at run time based on the type’s capabilities rather than on the declared types (a process commonly known as “duck typing”). This feature enables code that’s more flexible and reusable in a wider variety of situations.

To see this in action, revisit the bubble_sort function shown earlier. It takes a single parameter, ar, which represents the collection to sort. A closer look shows that the object instance ar must have three capabilities to be compatible with the bubble_sort function:

  1. You must be able to ask the ar object for its length.
  2. You must be able to get items in the ar object by numeric index.
  3. You must be able to set items in the ar object by numeric index.

To write the equivalent bubble sort method in C#, you would have had to declare the ar parameter as a specific type (either an interface or a base class) that implements the capabilities listed above. The C# compiler would validate that any code that called bubble sort would pass a parameter that implements that parameter type. Passing any other type as a parameter would result in a compile error, even if the type in question implemented all the required capabilities.

Python does all the type checking at run time, so if the type instance passed in as the ar parameter is missing an implementation of one of the three capabilities listed above, Python will throw a TypeError when it executes the bubble sort code. For the bubble sort code, the three required methods have special names: __len__ returns the length of the collection, while __getitem__ and __setitem__ get and set values by index, respectively. Python defines quite a few special names useful in a variety of circumstances.
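
The failure case is easy to reproduce. In the sketch below, bubble_sort is repeated from earlier so the snippet stands alone, and read_only_sequence is a hypothetical class that implements __len__ and __getitem__ but deliberately leaves out __setitem__.

```python
def bubble_sort(ar):
    l = len(ar) - 1
    for i in range(l):
        for j in range(l - i):
            if ar[j] > ar[j + 1]:
                temp = ar[j]
                ar[j] = ar[j + 1]
                ar[j + 1] = temp
    return len(ar)

class read_only_sequence(object):
    # Hypothetical type: length and indexed reads, but no indexed writes.
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        return self.items[index]

# A built-in list has all three capabilities, so the sort succeeds.
numbers = [3, 1, 2]
bubble_sort(numbers)
print(numbers)   # prints [1, 2, 3]

# The first swap needs __setitem__, which this type does not provide.
try:
    bubble_sort(read_only_sequence([3, 1, 2]))
    error = None
except TypeError:
    error = "TypeError"
print(error)     # prints TypeError
```

Note that the error appears only when the sort first tries to swap two items. A collection that is already in order would pass through without ever needing __setitem__.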

Here’s an example of a custom linked list class that implements these methods and is thus compatible with the bubble sort code above.

   class linked_list(object):
       class node(object):
           def __init__(self, value):
               self.data = value
               self.next = None
       def __init__(self):
           self.head = None
       def insert(self, value):
           n = linked_list.node(value)
           n.next = self.head
           self.head = n
       def __iter__(self):
           cur = self.head
           while cur != None:
               yield cur
               cur = cur.next
       def __len__(self):
           count = 0
           for n in self:
               count += 1
           return count
       def find_node(self, key):
           cur = self.head
           for x in range(key):
               cur = cur.next
           return cur
       def __getitem__(self, key):
           return self.find_node(key).data
       def __setitem__(self, key, value):
           self.find_node(key).data = value

Of course, linked lists don’t usually provide indexed access to their members. But the point is that because this custom class implements __len__, __getitem__ and __setitem__, you can pass an instance of this class to the bubble_sort function without any changes at all.

.NET Interoperability
One of the challenges in designing IronPython lies in attempting to satisfy the two separate audiences the language is intended to serve. On the one hand, it needs to work as much like the standard C-based Python implementation as possible. On the other, it has to have high fidelity interop with the .NET Framework. Sometimes, the needs of those two priorities clash. For example, what should be the result of this code?

   s = 'hello, world!'
   s.ToUpper()

In Python, the string type does not have a ToUpper method, so this code should throw an exception. However, in .NET, calling the ToUpper function on this string should return “HELLO, WORLD!” These are obviously contradictory requirements.

IronPython handles this by being a good Python implementation by default, but allowing developers to indicate they want high fidelity .NET interop. In Python, code is organized into modules and namespaces, similar to .NET. IronPython includes a special module called clr. By importing that module, developers indicate they want to use .NET interop. Here’s an interactive IronPython session that demonstrates the use of import clr:

   >>> s = 'hello, world'
   >>> s.ToUpper()
   Traceback (most recent call last):
     File "", line 1, in
   AttributeError: 'str' object has no attribute 'ToUpper'
   >>> import clr
   >>> s.ToUpper()
   'HELLO, WORLD'

Before the call to import clr, you can’t call the ToUpper method on a string. String objects in Python don’t have a ToUpper method so IronPython hides that method, among others. However, after the call to import clr, all of the String object’s methods (both Python and native .NET) are available.

You also use the clr module to load external assemblies. For example, to use .NET XML processing classes such as XmlReader and XmlDocument, you need to add a reference to System.Xml.dll. In C#, you add that reference declaratively at compile time. But because there is no compile time in IronPython, you need to add the reference imperatively via code. Subsequently, you can import classes into your current scope and use them just like any other Python object.

   import clr
   clr.AddReference('System.Xml')
   from System.Xml import XmlDocument
   xml = XmlDocument()
   xml.Load('http://devhawk.net/rss.aspx')

It should be pretty obvious what this code does, but there are a few things I want to note. First off, there is no new statement in Python; you create type instances by calling the type like a function. Second, in C#, importing classes from namespaces (via the using statement) is optional, designed to save typing. In Python, it’s mandatory. In other words, writing the code this way wouldn’t work:

   import clr
   clr.AddReference('System.Xml')
   # the following line doesn't work
   xml = System.Xml.XmlDocument()

The from…import form of the statement imports the short name. To use the fully namespace-scoped name, write import System.Xml.XmlDocument. Import also supports renaming types to avoid collisions, using the as keyword.
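
Python’s import rules can be tried without any .NET assemblies. The sketch below is an assumption-laden stand-in: it uses the standard library module xml.dom.minidom and its Document class in place of System.Xml and XmlDocument, and XmlDoc is an alias invented for the example. It covers an unimported name, a module import with a fully qualified name, a short-name import, and a renamed import.

```python
# Without an import, the dotted name is simply undefined.
try:
    doc = xml.dom.minidom.Document()
    unimported_ok = True
except NameError:
    unimported_ok = False
print(unimported_ok)          # prints False

# Import the module, then use the fully qualified name.
import xml.dom.minidom
doc = xml.dom.minidom.Document()

# Import just the short name.
from xml.dom.minidom import Document
doc = Document()

# Import under a different name to avoid a collision.
from xml.dom.minidom import Document as XmlDoc
doc = XmlDoc()
print(XmlDoc is Document)     # prints True
```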

You can make nearly the entire .NET Framework available to IronPython simply by adding references to the relevant assemblies and importing the needed types. Besides being able to create instances of .NET types, you can also consume .NET events, implement .NET interfaces, and inherit from .NET types. For example, here’s some code from the IronPython tutorial that uses Windows Forms.

   # the winforms module in the tutorial directory
   import winforms
   from System.Windows.Forms import *
   from System.Drawing import *
   def click(*args):
       print args
   f = Form()
   f.Text = "My First Interactive Application"
   f.Click += click
   f.Show()

However, one thing you can’t do from IronPython is adorn your code with attributes. Any part of the .NET framework that requires attributes, such as WCF contracts or XML serialization, won’t work with IronPython. Those libraries depend on custom attributes that act like custom metadata to extend the existing static type. Python objects don’t have a static type, so there’s nothing to attach the custom attribute to.

The other thing about .NET interop is that it’s essentially one way. It’s easy for Python to call into .NET classes written in statically typed languages. However, there’s no easy way (yet) for statically typed languages to call into dynamically typed objects. Static languages depend on the compile-time type metadata to dispatch method calls, but that metadata just doesn’t exist in dynamically typed languages such as Python. Obviously, given the multi-language nature of the CLR, enabling statically typed languages to call into dynamically typed code is a scenario we would like to enable in the future.

Embedding IronPython
Python has significant traction in the industry as an easily-embeddable language. Likewise, IronPython can be easily embedded inside your .NET applications to provide a scripting or macro development experience.

IronPython 1.x handled embedding entirely through the PythonEngine type. Here’s an example that accesses the PythonEngine from the interactive console.

   >>> import clr
   >>> clr.AddReference("IronPython.dll")
   >>> from IronPython.Hosting import PythonEngine
   >>> pe = PythonEngine()
   >>> pe.Evaluate('2+2')
   4

This example doesn’t give the hosted Python environment any hooks into the host, so the code it can execute is fairly limited. However, PythonEngine provides a Globals collection that you can use to expose your application object model to the Python environment.
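
The idea of handing the script a set of globals can be illustrated without the IronPython hosting API at all. The sketch below is not PythonEngine code. It uses Python’s built-in eval with a dictionary of globals as a stand-in, and the application class is a hypothetical host object model.

```python
class application(object):
    # Hypothetical host object model that scripts are allowed to drive.
    def __init__(self):
        self.title = "Untitled"

    def rename(self, title):
        self.title = title
        return self.title

app = application()

# The dictionary plays the role of the globals the host hands to the script.
script_globals = {"app": app}

print(eval("2+2", script_globals))                    # prints 4
print(eval("app.rename('Report')", script_globals))   # prints Report
print(app.title)                                      # prints Report
```

The script text sees only the names the host chose to expose, and any changes it makes through those names are visible to the host afterward.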

In IronPython 2.0, the hosting code gets a bit more complicated:

   import clr
   clr.AddReference("IronPython.dll")
   from IronPython.Hosting import PythonEngine
   pe = PythonEngine.CurrentEngine
   scope = pe.CreateScope()
   source = pe.CreateScriptSourceFromString('2+2')
   result = source.Execute(scope)

The Dynamic Language Runtime (DLR) is one reason the IronPython 2.0 code is more complicated. The DLR is an extension to the CLR that provides common capabilities needed for dynamic languages. But it also provides a common hosting API that allows any application hosting the DLR to support any language that targets the DLR. In other words, if your application uses the DLR-hosting API, it can support not only IronPython, but also IronRuby and Managed JavaScript, or any third-party language that gets built on the DLR (see the hosting API).

Between significant whitespace and dynamic typing, there’s no question Python is a wholly different development experience from C# or VB. I’m a recent convert to Python from C#, so I know exactly how strange it can feel. But once you get past that feeling of unfamiliarity, you’ll start to see just how productive Python can be.
