

Down to the Metal: Managed Code Under the Hood (Part I) : Page 3

No matter which high-level languages you use to create .NET applications, they all compile to Intermediate Language (IL). In this series, you’ll dig down inside the .NET framework and find out how IL works. Learn to write IL directly to create .NET assemblies without the need for a high-level language or sophisticated IDEs.





Hello World: Tools
Ok, that's enough IL theory. Here are the tools you'll need to complete the first Hello World sample:

  • Ilasm.exe is the IL code compiler. You can find it in your Windows directory under
  • Notepad.exe is the default plain text editor from Microsoft; however, you can use any text editor you're comfortable with.
  • Peverify.exe is the tool used to verify the resulting assembly. You will need this to track down errors in your generated IL byte code, for example, type safety errors. It is located in the bin directory of your SDK installation.
  • Ildasm.exe isn't really required for programming Microsoft Intermediate Language (MSIL) but it's very useful if you want to inspect IL code from already compiled assemblies. You can find this tool in the bin directory of your SDK installation.
To make your life easier you should add the directories of these tools to your path environment variable. Alternatively, if you have Visual Studio .NET installed you can use the command line prompt available from Visual Studio's Tools menu.

Hello World: Code
Here's how to code the typical "Hello World" application with ILAsm. I'll omit everything that can be left out to make this sample as simple as possible. Start up your favourite text editor and type in the following code. Save the contents to the file helloworld.il and then compile it with ilasm.exe using a command line such as c:\>ilasm.exe helloworld.il. This will produce the executable file helloworld.exe.

.assembly HelloWorld {}

.method static void Main() cil managed
{
    .entrypoint
    .maxstack 1
    ldstr "Hello World"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}

To check whether your code and metadata follow all the rules of the CLI you need to verify the executable, which you do by running the command peverify.exe helloworld.exe. When PEVerify states that all your classes and methods have been verified you can run the executable.

Hello World: An Explanation
Everything that starts with a dot is a directive to the compiler. The code begins by telling the compiler to create an assembly called HelloWorld. The two curly brackets following the assembly can contain metadata about this assembly, such as a version number. The code shown omits the metadata to keep the code as simple as possible. The compiler creates a default version number of 0:0:0:0 unless you provide something else, but that's fine for now.

The compiler also adds the necessary reference to the mscorlib assembly. You can declare this reference explicitly in code if you need to use a specific mscorlib version, but for now, accept the defaults wherever possible.

Any metadata entered at the top of the file is called the assembly manifest. You can include other metadata information such as referenced assemblies, the subsystem (Windows executable, console executable, library), module name, etc.
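As a rough sketch of what a more explicit manifest could look like, the following variant of the sample spells out the mscorlib reference, a version number, the module name, and the console subsystem. The version numbers are assumptions chosen for illustration; the mscorlib version in particular depends on the framework installed on your machine.

```il
// Explicit reference to mscorlib. The .ver value is an assumption;
// use the version that matches your installed framework.
.assembly extern mscorlib
{
    .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
    .ver 4:0:0:0
}

// The assembly itself, with an illustrative version number
// instead of the default 0:0:0:0.
.assembly HelloWorld
{
    .ver 1:0:0:0
}

.module HelloWorld.exe
.subsystem 0x0003           // 3 = console executable

// Same method as in the minimal sample.
.method static void Main() cil managed
{
    .entrypoint
    .maxstack 1
    ldstr "Hello World"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}
```

If you open the minimal helloworld.exe with Ildasm.exe, you should see similar directives that the compiler generated for you.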

The next line you've typed in is a method directive:

.method static void Main() cil managed

The method directive tells the compiler to create a new method state. It's followed by the signature of that method. In this sample there are six parts:

  • The static modifier specifies that this method can be called without an instance of a class
  • void tells the compiler that there's no return value
  • Main is the name of the method. I've chosen this name because it's also the name for the entry point of C# applications, but you can use any name you like.
  • The empty parentheses () show that this method needs no arguments.
  • cil defines the method as a Common Intermediate Language method
  • managed is simply the keyword for managed code, and means that the method doesn't access unmanaged data.
The method body is embedded between curly brackets just as in C#. The brackets define the scope of the method.

In the method body you'll find two more directives:

  • .entrypoint tells the compiler to mark this method as the entry point of the module. When the module is started execution will begin by calling this method. The application will terminate when this method returns.
  • .maxstack 1 tells the compiler the maximum depth of the evaluation stack that this method needs. The size is counted in abstract items, not in bytes, because the value isn't needed at runtime; it's there only to provide verifiability for the CIL-to-native compiler.
Now that you know all the directives used, you can examine the code in the method body. It's quite simple and straightforward:

The line ldstr "Hello World" pushes the string on the evaluation stack. The token ldstr stands for "Load String."

The line call void [mscorlib]System.Console::WriteLine(string) invokes the WriteLine method from the System.Console class, which resides in the mscorlib assembly. The syntax for calling methods is always the same: the return type, then the assembly name in square brackets followed by the fully qualified class name (you are not allowed to use a short version as in high-level languages), then two colons, the method name, and the argument type list. This line pops the previously loaded string from the stack and passes it to the WriteLine method.
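To illustrate how the argument type list selects the method, here is a small sketch that calls a different WriteLine overload. The assembly name CallDemo and the value 42 are made up for this example.

```il
.assembly extern mscorlib {}
.assembly CallDemo {}           // hypothetical assembly name

.method static void Main() cil managed
{
    .entrypoint
    .maxstack 1
    ldc.i4.s 42                 // push the 32-bit integer 42
    // The (int32) argument list picks the integer overload of WriteLine.
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
}
```

Only the load instruction and the argument type list differ from the Hello World sample; the return type, assembly name, class name, and method name follow the same pattern.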

The ret line simply returns from the method to the caller. At this point the evaluation stack must be empty because the method returns void.

This is a very simple method. There should be no problem calculating the items on the evaluation stack. If you have a larger method it's good practice to use stack transition diagrams. Such diagrams are simple to read and provide helpful information about the current state of the evaluation stack. A simple stack transition diagram looks like this:

..., old state --> ..., new state

The ellipses (...) are used to represent the state of the stack from other operations when such information is not relevant for the actual diagram. For example, three ldstr commands in a row would look like this in a stack transition diagram:

<empty> --> value1
value1 --> value1, value2
..., value2 --> ..., value2, value3

The diagram for ldstr looks like this:

<empty> --> "Hello World"

Finally, the method call has the following diagram:

"Hello World" --> <empty>

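One way to put these diagrams to work is to write them as comments next to each instruction. The sketch below is a variation on the sample, not part of the original listing: it loads two strings and joins them with System.String::Concat, so the stack holds two items at its deepest point and .maxstack must be 2. The assembly name HelloConcat is an assumption.

```il
.assembly extern mscorlib {}
.assembly HelloConcat {}        // hypothetical assembly name

.method static void Main() cil managed
{
    .entrypoint
    .maxstack 2                 // deepest point: two strings on the stack

    ldstr "Hello "              // <empty> --> "Hello "
    ldstr "World"               // "Hello " --> "Hello ", "World"

    // "Hello ", "World" --> "Hello World"
    call string [mscorlib]System.String::Concat(string, string)

    // "Hello World" --> <empty>
    call void [mscorlib]System.Console::WriteLine(string)

    ret                         // stack is empty, as required for void
}
```

Reading the comments from top to bottom gives you the largest number of items ever on the stack, which is the value to use for .maxstack.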