Polyglot Programming: Building Solutions by Composing Languages

This article delves into the motivation, benefits, and challenges of writing applications using polyglot programming—leveraging the multi-language nature of the CLR to create simpler solutions to vexing problems.

Polyglot Programming Today
Actually, developers already do this all the time without even realizing it. Do you write applications that talk to a database? Do you write Web applications? Chances are good that you do, which means that you are already polyglot programming: C# + SQL + JavaScript > 1! Developers do this without even thinking about it now; it's a natural part of the development landscape. But why not write all data access code in C# and skip the relational database entirely? You could write an entire application using a flat file or XML document. (Oh, wait. That's yet another language!) But it turns out that relational databases are handy things to have around because they use a different abstraction mechanism to handle large quantities of data. Set-based operations on data sets have appealing characteristics, so we've created special purpose software (database servers) with their own language (SQL) to handle this chore. The same is true for JavaScript. Love it or hate it, but you can't avoid it because it is the lingua franca of Web browsers. It has special features that facilitate writing interactive Web pages.
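To make the point concrete, here is a minimal sketch of the everyday host-language-plus-SQL mix. It uses Python and an in-memory SQLite database purely for brevity, not the C# and database server discussed above, and the `orders` table, its columns, and both function names are hypothetical. The first function hands the grouping to SQL as a single set-based statement; the second spells out the same result step by step in the host language.

```python
import sqlite3


def totals_with_sql(rows):
    """Sum amounts per customer by delegating the set-based work to SQL."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Hypothetical schema, invented for this illustration only.
        conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


def totals_by_hand(rows):
    """The same aggregation written imperatively in the host language."""
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0) + amount
    return totals
```

Both functions return the same dictionary for the same input. The SQL version states what result is wanted and leaves the how to the database engine, which is the different abstraction mechanism the paragraph above describes.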

In fact, I would say that multi-language solutions are even more pervasive than the preceding paragraph suggests. Every XML configuration file is its own language. These configuration files typically have document type definitions (DTDs) or schemas, which define the grammar of the language. They all just happen to share the same syntax (XML), just as English and French mostly share the same alphabet but have different words and grammars. Viewed in this light, our development environments today are already awash with multiple languages. But most of these languages are just palliatives for the mistaken notion that we can work most effectively by writing in one language, our one true language. That's not always the case.

Polyglot Programming for Real
Let's say you have a desktop application that needs a sophisticated multi-threaded scheduling portion. You could build the entire thing in C#. But the strong typing in C# doesn't really help you much when building a user interface. That part might be better done in VB with looser typing enabled (Option Strict Off).

The scheduling part, though, provides the biggest challenge. Building good thread-safe code in C# is hard. This isn't particularly C#'s fault; building good multi-threaded code in any imperative language is hard. But functional languages have much better support for those kinds of applications.

An imperative language belongs in the family of languages that are algorithmic in nature; the lines of code execute more or less top down, and you specify each part of the operation to be performed. Imperative languages generally have shared state in variables. Obviously, these are the types of languages most used today.

Functional languages, on the other hand, model themselves on mathematics. The functions in a functional language work more like mathematical functions (in fact, the really strict functional languages give you the ability to create formal proofs that a function works correctly). Generally, functional languages don't have mutable state, or have it in a way that highlights the differences between mutable and immutable state. Encouraging you to use immutable state makes it easier to write multi-threaded applications. You don't have to worry about synchronizing code blocks because you don't use the shared state that requires synchronization.
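The following is a minimal sketch of that idea, written in Python only as a neutral stand-in for a functional language such as F#. The `Job` record, the scoring formula, and the `score` and `schedule` functions are all hypothetical, invented for illustration. Each job is an immutable value and `score` is a pure function, so the worker threads never write to shared state and the code needs no locks.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import NamedTuple


class Job(NamedTuple):
    """An immutable job description; its fields cannot be reassigned."""
    name: str
    duration: int
    priority: int


def score(job: Job) -> int:
    """Pure function: the result depends only on the argument.

    The formula is an arbitrary assumption made for this sketch.
    """
    return job.priority * 10 - job.duration


def schedule(jobs: tuple) -> tuple:
    """Score jobs on several threads, then return them best-first."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # No thread mutates shared data, so no synchronization is needed.
        scores = list(pool.map(score, jobs))
    ranked = sorted(zip(scores, jobs), key=lambda pair: pair[0], reverse=True)
    return tuple(job for _, job in ranked)
```

Calling `schedule` leaves its input untouched and returns a new tuple, so any number of callers can share the same job list safely.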

Why talk about functional languages here? F# is a new entrant into the .NET language world. It was spawned by Microsoft Research as a derivative of the OCaml functional language. F# borrows much of OCaml's syntax, adding features to make it work well within the CLR. You can call CLR methods, pass parameters, and generally interact with the rest of the .NET universe from your F# code.

However, building entire applications in functional languages is difficult for several reasons. First, the default style of development eschews variables with shared state. It's difficult to build applications that do common things such as I/O when you can't change the value of a variable. Of course, F# has facilities for such things, but typically, what's easy to build in C# tends to be more difficult to build in F#. The converse is also true: things that are very difficult to build in C# are often easy in F#. Which brings up the second reason why you tend not to build entire applications in F#: It's hard for developers weaned on imperative languages to wrap their heads fully around functional languages.

That's where polyglot programming shines. In this view of the world, you don't try to build applications entirely in F#. Instead, for the sophisticated multi-threaded scheduling example cited above, you'll have a solution that contains three projects, each hosting a different language. Use C# for the workflow part of the application (the Controller in Model-View-Controller parlance). Most of the model also resides in C# (all but the scheduling part). Implement the nasty multi-threaded scheduling part in F#, taking advantage of the greater ease of writing multi-threaded code because the language has better support for it. Finally, implement the view in VB with strong typing relaxed, allowing for faster development of the lightweight user interface of the application.

Practical Polyglot Programming
The benefits of writing in this style include using languages better suited to particular types of problems. Just as developers use SQL today to handle data chores, I can see a time when certain parts of the application are written in functional languages. At least one financial trading firm on Wall Street writes all its applications in OCaml now, believing that it gives the firm a competitive advantage over similar firms. It is, in fact, building entire applications in one true language (OCaml, in this case), so it is paying a penalty for trying to write things like user interfaces that are easier in imperative languages. Once developers become accustomed to writing polyglot programs, it'll seem as natural as writing database applications does today.

Of course, one of the things that make writing database applications difficult today is the nasty impedance mismatch between object-oriented languages and set-based SQL. Literally billions of dollars have been spent trying to solve this problem, and we still have mediocre solutions at best. My friend Ted Neward has a great quote related to this very topic: "O/R mapping is the Vietnam of Computer Science. First, you send in a few advisors, then more advisors. Before you know it, you have troops on the ground and no end in sight!" This quote nicely encapsulates the difficulty of this problem. The latest attempt to make this problem go away is the Entity Framework (notice the use of framework as container for reusable code).

But O/R mapping suffers from two unrelated problems. The first problem is passing information across machine boundaries. To do that, you must have special formats (generally either binary through a database adapter or XML). Passing information across machine boundaries is always expensive and hard to get right; fortunately, that problem is largely solved. The second problem in O/R mapping is that the two domains use different conceptual models; object-oriented languages use object hierarchies while SQL uses sets. The latest attempt to solve this problem uses a different flavor of polyglot programming, a domain-specific language called LINQ, which eases the translation boundary between these two fundamentally different abstraction styles.
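The second problem, the gap between sets and object hierarchies, can be seen in miniature in the sketch below. It is written in Python with an in-memory SQLite database for brevity, and it is not how LINQ or the Entity Framework work internally. The `customers` and `orders` tables, the `Customer` and `Order` classes, and both functions are hypothetical. The query returns one flat set of rows, and the host language has to regroup those rows by hand into a nested object graph.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Order:
    item: str


@dataclass
class Customer:
    name: str
    orders: list = field(default_factory=list)


def demo_connection():
    """Build a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
        INSERT INTO customers VALUES (1, 'ann'), (2, 'bob');
        INSERT INTO orders VALUES (1, 1, 'book'), (2, 1, 'pen');
    """)
    return conn


def load_customers(conn):
    """Turn the flat row set from a join into Customer objects."""
    rows = conn.execute(
        "SELECT c.id, c.name, o.item FROM customers c "
        "LEFT JOIN orders o ON o.customer_id = c.id "
        "ORDER BY c.id, o.id"
    )
    customers = {}  # customer id -> Customer, in first-seen order
    for customer_id, name, item in rows:
        customer = customers.setdefault(customer_id, Customer(name))
        if item is not None:  # LEFT JOIN yields NULL when there are no orders
            customer.orders.append(Order(item))
    return list(customers.values())
```

The loop in `load_customers` is the translation layer in its simplest form: the database thinks in rows, the application thinks in objects that own other objects, and something has to convert between the two.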

The CLR diminishes both problems. The language designers at Microsoft have paved over many of the abstraction distractions between the functional F# and other CLR languages. They can do this because all of these languages compile to the same IL. And that's the other reason why this is an easier problem than O/R mapping. Polyglot programming implies that all the code compiles to a common intermediate representation (like IL). Thus, you don't have to pass data across machine boundaries, and you can take advantage of shared types defined in IL.

One of the problems with polyglot programming lies with debugging multi-language solutions. This is where Visual Studio as the common container for all .NET languages comes in handy. Because it's all just IL once it's been compiled, you can step through F# code and end up stepping into C# code, and vice versa. In fact, smart tools enable this style of development (the lack of such tools is one of the reasons this style of building applications hasn't really taken hold in the past). Now, developers have a sophisticated environment that readily handles multiple languages.
