
Down to the Metal: Managed Code Under the Hood (Part I)

There are many reasons for using Intermediate Language Assembler (ILAsm for short) rather than a high-level language like C# or VB.NET. For example, you can do things with ILAsm that your chosen high-level language does not support, or fix bugs in an assembly without having the source (provided it’s not signed, of course). Or perhaps you’re just a freak like me, who always wants to know how things really work under the hood.

Before getting into IL itself, it’s important that you know a bit about the technologies that lie behind the .NET execution engine. I’ll explain the basics briefly; for a more detailed explanation see the related resources.

To get the most benefit from this article, you should have a good understanding of .NET programming in general. No matter what language you use to write .NET applications, the contents of this article apply to any .NET assembly. You will need the Microsoft .NET SDK and a plain text editor (Notepad will do fine) to create the samples.

CLI: Common Language Infrastructure
CLI is a specification that defines how executable code (Common Intermediate Language or CIL in this case) is run in an execution environment (.NET’s Virtual Execution System, or VES). The CLI consists of four parts:

  • Common Type System (CTS)
  • Metadata
  • Common Language Specification (CLS)
  • Virtual Execution System

Here’s a brief description of each of these four essential parts.

Common Type System: This is the center of the CLI. It describes what types are available for all CLI compliant high-level languages, that is, the types that can be used by a compiler and the CLI itself. A very important role of the CTS is to guarantee type safety. Type safety basically states three axioms that must apply to every type in the CLI:

  • References are what they say they are. Every reference to an object has to include sufficient type information to ensure that it is what it appears to be. References are not allowed to point to character data in one code line and then be interpreted as integer data on another code line. Every reference is typed and also contains information about the possible conversions for that type. Type safety makes it impossible to produce bugs or security issues by misinterpreting referenced data.
  • Identities are who they say they are. This rule prevents malicious use of an object by spoofing the compiler to use another type or security domain. The object can be accessed only by functions and fields that are identified by its type information. But be careful, it’s still possible to design a class in such a way that it compromises security.
  • Only appropriate operations can be invoked. Accessible functions and fields are defined by the type of the reference pointing to the object. This already takes into account the visibility that stems from access modifiers, for example, private fields are only visible in the class itself and not outside.

Metadata
Metadata is used to store information about types in a way such that any CLI compliant tool (compilers, debuggers, or even the execution system) can read this data, regardless of which programming language was used to create it. Metadata is primarily used for declarative programming. For example, you could store information such as debugger visibility in the type metadata rather than deriving a special interface for that purpose. Metadata is both more convenient and easier for programmers to use. The execution environment can use metadata to support different execution models, for better control of optimization rules, and to provide automated support for built in services. Metadata can be also very useful for application deployment, particularly when supporting different possible target environments.

Common Language Specification
This is a kind of contract between the CLI and the language designers. Imagine if there were no common specification: Every high-level language would have to be shipped with a different .NET Base Class Library. But with the CLS all languages capable of compiling code for the CLI agree on a set of functions and types that form a common base of services.

Virtual Execution System
The VES enforces the CTS and executes managed code. You can say that managed data is nothing more than data that the CLI automatically allocates and releases via a mechanism called garbage collection. Managed code is code that can access this data. Because the data is managed by the CLI, it can’t be accessed from code that isn’t CLI aware. However, managed code can also access unmanaged data, because no special mechanism protects unmanaged data from being accessed by anything. The CLI also takes care of exception handling, and of storing and retrieving security information in managed code.

How the VES Works
The VES needs to be able to execute managed code on many possible platforms. Therefore, it’s impossible to build the VES like an execution environment intended for one specific processor architecture, such as the Intel architecture. That means the CLI cannot make assumptions about the registers, cache, or operations available on the target system. All those considerations must be abstracted to a form common to all platforms, or to a theoretical form that the Just In Time (JIT) compiler compiles for the real target platform when the code is executed.

Method State
The .NET VES implements some interesting mechanisms for this purpose. The execution engine runs all the control threads of managed code applications, along with the memory management, in a shared memory space. Every control thread executes methods that put their information on a managed heap, which is maintained by the garbage collector. For each method call the VES creates a new method state memory block. When a method finishes, the VES hands control back to the method state that invoked the just-completed method. When the last method state returns (the entry point of an application), the application terminates. The diagram in Figure 1 (based on a diagram in Microsoft’s Tool Developers Guide) illustrates the flow of methods in the VES:

Figure 1. State Model: The figure shows how control threads create method states on the managed heap. As each method finishes, the execution engine hands control back to the method state that invoked the completing method.

A method state always consists of several parts:

  • Instruction Pointer: points to the next instruction to be executed
  • Method Info Handle: stores read-only information about the method’s signature
  • Evaluation Stack: more on this later
  • Incoming Arguments: the arguments used to call this method
  • Local Variables: an array of all local objects starting at index 0
  • Local Allocations: used for dynamic allocation of local objects
  • Security Descriptor: not accessible by managed code, but used by the CLI to record security overrides (assert, permit and deny)
  • Return State Handle: restores the method state to the one from the caller (this is often referred to as a dynamic link)

Evaluation Stack
Each method state has an associated evaluation stack, which most CLI instructions use to retrieve their arguments and to store their results. The evaluation stack consists only of objects; it doesn’t matter whether you push an integer, a string, or a custom object onto the stack, because the virtual environment only counts how many things are stored on the stack and in what order. Of course, the VES is concerned about type safety as it calls methods, but for now you just need to know that if you call a method that takes three arguments, you must place those three arguments on the evaluation stack before calling the method. That pattern holds not only for this particular three-argument method, but for any CLI function that requires arguments. Another important rule is that the evaluation stack must contain only the return value at each possible exit point of your method (or no value at all when the method returns void, which corresponds to a Sub method in VB.NET).
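To make the three-argument rule concrete, here is a minimal sketch in ILAsm. The assembly name StackDemo and the choice of System.String::Concat as the three-argument method are assumptions made for illustration; the directives used here are explained in the Hello World sections of this article. The comments show the evaluation stack after each instruction.

```il
// Hypothetical sample: the assembly name StackDemo is an assumption.
.assembly StackDemo {}

.method static void Main() cil managed
{
   .entrypoint
   // At most three items are on the stack at any one time.
   .maxstack 3

   // Stack: "Hello"
   ldstr "Hello"
   // Stack: "Hello", ", "
   ldstr ", "
   // Stack: "Hello", ", ", "World"
   ldstr "World"
   // Pops the three arguments and pushes the single result.
   call string [mscorlib]System.String::Concat(string, string, string)
   // Pops the result; the stack is now empty.
   call void [mscorlib]System.Console::WriteLine(string)
   // The method returns void, so the stack must be empty here.
   ret
}
```

The arguments are pushed in the same left-to-right order in which they appear in the method signature.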

Hello World: Tools
Ok, that’s enough IL theory. Here are the tools you’ll need to complete the first Hello World sample:

  • Ilasm.exe is the IL code compiler. You can find it in your Windows directory under
    %windir%/Microsoft.NET/Framework//ilasm.exe.
  • Notepad.exe is the default plain text editor from Microsoft; however, you can use any text editor you’re comfortable with.
  • Peverify.exe is the tool used to verify the resulting assembly. You will need this to track down errors in your generated IL byte code, for example, type safety errors. It is located in the bin directory of your SDK installation.
  • Ildasm.exe isn’t really required for programming Microsoft Intermediate Language (MSIL) but it’s very useful if you want to inspect IL code from already compiled assemblies. You can find this tool in the bin directory of your SDK installation.

To make your life easier you should add the directories of these tools to your path environment variable. Alternatively, if you have Visual Studio .NET installed you can use the command line prompt available from Visual Studio’s Tools menu.

Hello World: Code
Here’s how to code the typical “Hello World” application with ILAsm. I’ll omit everything that can be left out to make this sample as simple as possible. Start up your favourite text editor and type in the following code. Save the contents to the file helloworld.il and then compile it with ilasm.exe using a command line such as c:\>ilasm.exe helloworld.il. This will produce the executable file helloworld.exe.

   .assembly HelloWorld {}

   .method static void Main() cil managed
   {
      .entrypoint
      .maxstack 1

      ldstr    "Hello World"
      call     void [mscorlib]System.Console::WriteLine(string)
      ret
   }

To check whether your code and metadata follow all the rules of the CLI you need to verify the executable, which you do by running the command peverify.exe helloworld.exe. When PEVerify states that all your classes and methods have been verified you can run the executable.

HelloWorld: An Explanation
Everything that starts with a dot is a directive to the compiler. The code begins by telling the compiler to create an assembly called HelloWorld. The two curly brackets following the assembly can contain metadata about this assembly, such as a version number. The code shown omits the metadata to keep the code as simple as possible. The compiler creates a default version number of 0:0:0:0 unless you provide something else, but that’s fine for now.

The compiler also adds the necessary reference to mscorlib. You can declare this reference explicitly in code if you need to use a specific mscorlib version, but for now, accept the defaults wherever possible.

Any metadata entered at the top of the file is called the assembly manifest. You can include other metadata information such as referenced assemblies, the subsystem (Windows executable, console executable, library), module name, etc.
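As a rough illustration, the sketch below writes out a manifest explicitly instead of relying on the defaults. The version numbers and the module name are assumptions chosen for the example, so adjust them to match your installed framework; the subsystem value 0x0003 selects a console executable.

```il
// Explicit reference to mscorlib. The version shown is an assumption;
// use the version of the framework you are targeting.
.assembly extern mscorlib
{
   .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
   .ver 2:0:0:0
}

// The assembly itself, with an explicit (assumed) version number.
.assembly HelloWorld
{
   .ver 1:0:0:0
}

// Module name (assumed) and subsystem: 0x0003 is a console executable.
.module helloworld.exe
.subsystem 0x0003

.method static void Main() cil managed
{
   .entrypoint
   .maxstack 1
   ldstr "Hello World"
   call void [mscorlib]System.Console::WriteLine(string)
   ret
}
```

Running ildasm.exe on the simpler helloworld.exe shows the corresponding declarations that the compiler filled in for you.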

The next line you’ve typed in is a method directive:

   .method static void Main() cil managed

The method directive tells the compiler to define a new method; each call to it gets its own method state at run time. The directive is followed by the signature of that method. In this sample there are six parts:

  • The static modifier specifies that this method can be called without an instance of a class
  • void tells the compiler that there’s no return value
  • Main is the name of the method. I’ve chosen this name because it’s also the name for the entry point of C# applications, but you can use any name you like.
  • The empty parentheses () show that this method needs no arguments.
  • cil defines the method as a Common Intermediate Language method
  • managed is simply the keyword for managed code, and means that the method doesn’t access unmanaged data.
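For contrast, here is a sketch of a signature that has both arguments and a return value. The assembly name AddDemo and the method Add are hypothetical names introduced only for this example. Because Add returns int32, exactly one value remains on the evaluation stack when ret executes.

```il
// Hypothetical sample: AddDemo and Add are illustrative names.
.assembly AddDemo {}

.method static int32 Add(int32 a, int32 b) cil managed
{
   .maxstack 2
   // Stack: a
   ldarg.0
   // Stack: a, b
   ldarg.1
   // Pops both values and pushes their sum.
   add
   // Exactly one value (the return value) is left on the stack.
   ret
}

.method static void Main() cil managed
{
   .entrypoint
   .maxstack 2
   // Push the two int32 arguments in signature order.
   ldc.i4.2
   ldc.i4.3
   // Pops 2 and 3, pushes the int32 result.
   call int32 Add(int32, int32)
   // Pops the result and prints it (5).
   call void [mscorlib]System.Console::WriteLine(int32)
   ret
}
```

Add is declared at module level, outside any class, so the call needs no class name or assembly reference in front of the method name.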

The method body is embedded between curly brackets just as in C#. The brackets define the scope of the method.

In the method body you’ll find two more directives:

  • .entrypoint tells the compiler to mark this method as the entry point of the module. When the module is started execution will begin by calling this method. The application will terminate when this method returns.
  • .maxstack 1 tells the compiler how large an evaluation stack this method needs. The size is counted in abstract items, not in bytes. The value isn’t needed at runtime; it’s just there to provide verifiability for the CIL-to-native compiler.

Now that you know all the directives used, you can examine the code in the method body. It’s quite simple and straightforward:

The line ldstr "Hello World" pushes the string onto the evaluation stack. The token ldstr stands for “Load String.”

The line call void [mscorlib]System.Console::WriteLine(string) invokes the WriteLine method from the System.Console class, which resides in the mscorlib assembly. The syntax for calling methods is always the same: the return type, then the assembly name in square brackets followed by the fully qualified class name (you are not allowed to use a short version as in high-level languages), then two colons, the method name, and the argument type list. This line pops the previously loaded string from the stack and passes it to the WriteLine method.

The ret line simply returns from the method to the caller. At this point the evaluation stack must be empty because the method returns void.

This is a very simple method. There should be no problem calculating the items on the evaluation stack. If you have a larger method it’s good practice to use stack transition diagrams. Such diagrams are simple to read and provide helpful information about the current state of the evaluation stack. A simple stack transition diagram looks like this:

   ..., old state --> ..., new state

The ellipses (…) are used to represent the state of the stack from other operations when such information is not relevant for the actual diagram. For example, three ldstr commands in a row would look like this in a stack transition diagram:

    --> value1
    value1 --> value1, value2
    ..., value2 --> ..., value2, value3

The diagram for ldstr looks like this:

    --> "Hello World"

Finally, the method call has the following diagram:

    "Hello World" --> 

Isn’t Something Missing?
If you are a C# programmer, or use any other purely object-oriented language, you may have noticed that there is no class directive in this sample. IL doesn’t need a class. It can be used like a procedural programming language. For example, although C# enforces object-oriented programming (requiring a class), VB.NET programmers don’t need to explicitly create a class with a Main method as the entry point. Those restrictions are rules set by each high-level language. The important point to remember is that ILAsm just needs a method that specifies the entry point.
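For comparison, the sketch below wraps the same Hello World method in a class, which is roughly the shape that a compiler for a class-based language produces. The assembly name HelloClass and the class name Program are assumptions made for illustration.

```il
// Hypothetical sample: HelloClass and Program are illustrative names.
.assembly HelloClass {}

.class public auto ansi Program extends [mscorlib]System.Object
{
   // Same method as before, now a static member of a class.
   .method public static void Main() cil managed
   {
      .entrypoint
      .maxstack 1
      ldstr "Hello World"
      call void [mscorlib]System.Console::WriteLine(string)
      ret
   }
}
```

Both versions behave the same when run; the class only changes where the method lives in the metadata.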

In this article you’ve learned the basics of the structure of the Common Language Infrastructure and the layout of a simple Intermediate Language Assembler program. You will still need more information to find your way through the jungle of IL code. Take a look at the resource section of this article or check out the next parts of the ILAsm article series.

In addition to the online resources listed in the left column of this article, you may find these print resources useful as well.

Electronic Documents

  • Tool Developers Guide: a subdirectory of your SDK installation. See the two files Partition I Architecture.doc and Partition III CIL.doc.