

Meet the Future of Data Head-on with Comega

Take a peek inside the research labs of Microsoft to see the future of data integration in programming languages.





I first saw Comega about a year and a half ago, when the lab that was developing it gave me a sneak preview and asked me for some feedback. The only feedback I could give at the time was "Wow!" Now everybody can get access to this add-on to C#, as Microsoft Research has released the preview for public consumption. What is it, you might ask? Comega is an experimental add-on to C# intended to make data a first-class citizen of the language. And it succeeds.

In this article you'll get a whistle-stop tour of some of the features that Comega adds to C#, and build an example application that uses some of these features!

Comega is a strongly typed, data-oriented programming language intended to bridge the gap between semi-structured hierarchical data (XML), relational data (SQL), and the common type system (CTS, the foundation for variable and type declarations in all .NET languages). It spans XML, SQL, and the CTS by generalizing how they are used. This is difficult to get your head around just by reading about it, and is best learned through practical, hands-on examples. A little later you will build an application that uses some of these features and gives you a taste for what you can do in Comega.

Getting Started
You can download Comega from Microsoft Research. You'll need Visual Studio .NET 2003 to use it. Once it's installed, fire up Visual Studio and open the Create New Project dialog: you'll see a new Comega item in the Project Types list, and selecting it shows the Comega project templates. (See Figure 1.)

Go ahead and create a new 'Windows Application' with Comega. This creates a new project of type 'cwproj' containing 'form.cw' which is a simple form written with Comega.

Open up this form and take a look at its code. You can see some of this code in Listing 1.

The first thing that you'll notice is that you have XML inline in your code! It's not hidden within a string where syntax coloring cannot be applied; it's right there in your code and is used to set the properties of the controls. It's a little weird at first, because as a developer you are so used to having the XML outside of your code and having to parse it to do something meaningful. With Comega, data is a first-class citizen, so XML is used like any other language construct.

Figure 1. Creating a New Comega Project: Once you've installed Comega, you create a new Comega project like you would any project in Visual Studio, by selecting a project template from the New Project dialog.

Run this application and you'll get a basic white window with a salmon-colored button that closes the window if you press it.

Using Data in Comega
The core data features of Comega covered here are:
  • Streams:
    These are central to how Comega works. A stream (not to be confused with file streams or the like) is a homogeneous collection of a particular type that is constructed when it is needed and, as such, is much more flexible than an array. You use a stream much like a data structure, but what is special about it is that it can contain programming logic. So, for example, if you want to create a stream that contains even numbers, you can declare it like this:

    int* staticEvenNumbers() { yield return 2; yield return 4; yield return 6; }

    The declaration looks like an ordinary method whose return type is int*, a stream of integers. C++ programmers will think pointers when they see the '*', but don't worry, you're not going back into pointer trouble! The neat thing about the Comega stream is that you don't need to list the numbers statically; you can also generate them programmatically, such as:

    int* dynamicEvenNumbers(int nCount) { for (int i=1; i<=nCount; i++) yield return i*2; }

    What you get is the best of both worlds between a function call and a data structure! dynamicEvenNumbers is now an unusual and very flexible beast. It behaves like a function call (returning nCount even numbers), like a data structure that contains those numbers, and like a dynamic array that sizes itself without you having to worry about it. Cool, huh?

  • Content Classes:
    A content class is a class that is used to specify the schema or structure of a document type. It is analogous to an XML DTD or an XSD sequence. In practical use, this means that if you want to generate XML documents in your Comega program, you have the additional flexibility of automatic validation of their types. Traditionally you would use a DOM to build the document manually, twiddling with nodes and attributes, and then hope for the best. In Comega, life is a lot easier.

    For example, if you want to create a content class for an e-mail message, it needs to contain a header and a body. The header must have a 'From' and a 'To' field, and optionally a 'Subject' field. The body holds the text as one or more paragraphs ('P' elements), each of which is a string. You would declare the content class like this:

    public class Email {
        // content of Email message
        struct{
            struct{
                string From;
                string To;
                string? Subject;
            } Header;
            struct{ string P; }+ Body;
        }
    }

    The usefulness of this isn't immediately apparent, but, when you combine it with a function for generating an e-mail message that uses inline XML, it'll reveal itself. Here's an example of such a function:

    public static Email Vacation(DateTime d, TimeSpan s, string to) {
        return <Email>
                 <Header>
                   <From>John Doe</From>
                   <To>{to}</To>
                   <Subject>OOF</Subject>
                 </Header>
                 <Body><P>I am OOF from {d} until {d+s}.</P></Body>
               </Email>;
    }

    Now you are generating an XML document of type 'Email', which will be validated against the content class. This is much easier than twiddling around with XmlDocument and XmlNode objects and such. In addition, you enter the custom data for the e-mail message directly inline in the XML by passing parameters such as 'to' using the {} syntax, so <To>{to}</To> will create the correct node.

  • SQL-like Selection:
    You are almost certainly accustomed to using SQL for getting data from tables in a database, and have found that there is some wonderful logic that can be applied in SQL to filter and sort information. With Comega, you can now do this with your in-memory data structures. You'll start with a simple stream, for example:

    struct{ string Title; string Actor; string Genre; }* DVDs = new{
        new{Title="The Matrix", Actor="Keanu", Genre="SciFi"},
        new{Title="Battlestar Galactica", Actor="Edward James Olmos", Genre="TVSciFi"},
        new{Title="Blade Runner", Actor="Edward James Olmos", Genre="SciFi"},
        new{Title="Pride and Prejudice", Actor="Colin Firth", Genre="Mushy"}
    };

    And next, you can query this stream:

    results = select * from DVDs where Actor=="Keanu";
    results.{Console.WriteLine("Title = {0}, Genre={1}", it.Title, it.Genre);};

    So, you can now start using your SQL skills to pull out data instead of writing your own iterators!
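
If you don't have the Comega preview installed, you can still get a feel for the three ideas above with a rough Python analogy. This is not Comega and makes no claim about how Comega is implemented: a generator stands in for a stream, a dataclass with a hand-written run-time check stands in for the Email content class (Comega expresses these rules in the type itself), and a generator expression stands in for select. All Python names below (dynamic_even_numbers, Email, dvds_by_actor) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


def dynamic_even_numbers(n_count: int) -> Iterator[int]:
    """Rough analogue of an int* stream: values are produced lazily, on demand."""
    for i in range(1, n_count + 1):
        yield i * 2


@dataclass
class Email:
    """Hypothetical stand-in for the Email content class."""
    sender: str                    # From (required)
    to: str                        # To (required)
    body: List[str]                # one or more paragraphs (the P elements)
    subject: Optional[str] = None  # Subject (optional, like string?)

    def __post_init__(self) -> None:
        # Assumption: mimic the '+' (one or more) rule with a run-time check.
        if not self.body:
            raise ValueError("Body needs at least one paragraph")


# Same sample data as the DVDs stream in the article.
dvds = [
    {"Title": "The Matrix", "Actor": "Keanu", "Genre": "SciFi"},
    {"Title": "Battlestar Galactica", "Actor": "Edward James Olmos", "Genre": "TVSciFi"},
    {"Title": "Blade Runner", "Actor": "Edward James Olmos", "Genre": "SciFi"},
    {"Title": "Pride and Prejudice", "Actor": "Colin Firth", "Genre": "Mushy"},
]


def dvds_by_actor(actor: str) -> Iterator[dict]:
    """Analogue of: select * from DVDs where Actor == actor."""
    return (dvd for dvd in dvds if dvd["Actor"] == actor)


if __name__ == "__main__":
    print(list(dynamic_even_numbers(3)))  # [2, 4, 6]
    for dvd in dvds_by_actor("Keanu"):
        print("Title = {0}, Genre={1}".format(dvd["Title"], dvd["Genre"]))
```

The main difference to keep in mind is where the checking happens: in this sketch a malformed Email is only caught when the object is constructed at run time, whereas the point of a content class is that the structure is part of the type.
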
These are but a few of the features the language offers, and are here to whet your appetite. Comega offers a lot more, including, but not limited to, a new concurrency model for multithreaded applications (formerly known as Polyphonic C#), transaction support in SQL, choice types, nullable value types, and anonymous types. You can learn a lot more in the Comega documentation that comes with the install.
