Teaching Myself IL, Day One - An Interface, A Class, A Console.WriteLine.
As I wrote yesterday, I've decided to sit down and teach myself enough IL to understand the inner workings of .NET a little better.
I sat down tonight wondering what 3 lines of IL I was going to be able to figure out and write.
First I made sure I was prepared. Since Visual Studio doesn't actually support proper editing of IL files (that is, no syntax highlighting or IntelliSense... hmm, now there's an idea...), I fell back to my 'other' text editor of choice, UltraEdit. This little fella can do syntax highlighting, and a quick check of the website showed that someone had kindly donated an MSIL syntax highlight definition.
So authoring was sorted. Now for learning. There are three things that have been beneficial for me so far.
The CLI standards documents. I've got two ways of looking these up. At work I've got a copy of The Common Language Infrastructure Annotated Standard, which I keep forgetting to bring home. So at night, I've been getting the actual ECMA standard documents.
Kenny Kerr's Introduction to MSIL series.
An MSIL tutorial on CodeGuru.
These have been enough to push me in the right direction. If I get really stuck, I write a little test code in VB and open the compiled exe in Reflector. I know, it's cheating, but I guess I can't help it - I've always done best as a beginner by using the 'learn by example' method.
Tonight I got more done than I thought I would. I'm quite impressed with myself actually :) Rather than go through my actual learning experience, I thought I'd walk through the code I wrote tonight from top to bottom.
.assembly ILConsoleApp
{
  .ver 1:0:0:1
}
The .assembly directive lets you specify things like the assembly name, version, and any custom attributes you want to set at the assembly level. I've called this assembly 'ILConsoleApp', with a version of 1.0.0.1. I made an attempt at specifying a custom attribute (AssemblyTrademark, chosen at random from the sample VB code that I eventually want to fully emulate). The problem is that while I was able to specify the attribute, declaring the actual string value for it seems to need a proper manifest.
I'm going to leave manifests for another day :)
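For reference, here is a slightly fuller header, as a hedged sketch rather than what I actually assembled. Two things in it are not in my file: the '.assembly extern' block, which names mscorlib explicitly (ilasm accepted my file without one, so treat it as optional), and the '.module' line. The version and public key token are the values ildasm shows for the .NET 1.1 mscorlib and are an assumption about your setup; on another framework version they will differ.

```il
// Explicit reference to the core library.
// Assumption: .NET Framework 1.1 values; adjust for other versions.
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
  .ver 1:0:5000:0
}

// The assembly being defined: same name and version as above.
.assembly ILConsoleApp
{
  .ver 1:0:0:1
}

// Module name; by convention it matches the output file name.
.module ILConsoleApp.exe
```

This is only a header fragment; it still needs the types and an entry point from the rest of the file before it will assemble to an exe.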
.class interface ISimpleInterface
{
  .method public abstract virtual void SimpleSub() {}
  .method public abstract virtual int32 SimpleFunction() {}
}
Next, I worked out how to declare an interface. It took me a while, until I discovered that even interfaces are declared as a '.class'. So this declares my ISimpleInterface interface. It contains two '.method' declarations - I learned by trial and error that every method on an interface declaration must be public, abstract, and virtual.
.class SimpleClass
       extends [mscorlib]System.Object
       implements ISimpleInterface
{
  .method public void .ctor()
  {
    .maxstack 1
    ldarg.0
    call instance void object::.ctor()
    ret
  }
  .method public virtual void SimpleSub()
  {
    .override ISimpleInterface::SimpleSub
    .maxstack 1
    ldstr "This is a console application"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
  }
  .method public virtual int32 SimpleFunction()
  {
    .override ISimpleInterface::SimpleFunction
    .maxstack 1
    ldc.i4.1
    ret
  }
}
I also managed to write a class that implements an interface. The first three lines describe the class (SimpleClass), the inheritance model (I'm being good - if I've got nothing to inherit from, then dammit, I'm going to inherit from Object!), and what interfaces are involved (ISimpleInterface).
Inside the class definition, there are three methods. First is the constructor, followed by the two methods required for the interface I'm implementing. What's interesting is that if you leave out one of the interface methods, it still compiles - it only dies at runtime. Methinks I'm going to need to be careful :)
In each method is a line that has a '.maxstack'. IL is based on a simple stack machine, and it really helps the runtime out if you let it know how many items on the stack you're going to use in total (that is, in the course of this method, what's the highest number of items you push on the stack at once?). If you leave it out (which I did in the constructor at first) it seems to default to a maxstack size of 8.
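To make the counting concrete, here is a small sketch that is not part of tonight's file (the assembly name MaxStackDemo is made up for illustration). The comments track how many items are on the stack after each instruction; the largest number reached is what goes in '.maxstack'.

```il
.assembly extern mscorlib {}
.assembly MaxStackDemo {}

.method static void main()
{
  .entrypoint
  .maxstack 2      // highest depth reached below is 2
  ldc.i4.2         // push 2            -> depth 1
  ldc.i4.3         // push 3            -> depth 2
  add              // pop two, push sum -> depth 1
  call void [mscorlib]System.Console::WriteLine(int32)  // pops the argument -> depth 0
  ret              // returns with an empty stack
}
```

Every method in my own file only ever has one item on the stack at a time, which is why they all get away with '.maxstack 1'.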
The first method defined is the '.ctor'. This is IL speak for constructor :) In it, all I'm doing is forwarding on the call to MyBase.New.
The second two methods are the interface methods. Each of them contains an '.override' directive to say which interface method is being implemented. I learned here (again by leaving stuff out) that you don't have to specify this. It seems the runtime can figure it all out *shrug*, presumably by matching on name and signature. The first method writes 'This is a console application' to the console, and the second returns an integer value of 1 (the 4 in 'ldc.i4.1' is the size of the integer in bytes; the trailing 1 is the value pushed).
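Where '.override' does seem to earn its keep is when the implementing method has a different name from the interface method, so there is nothing for name matching to latch on to. A minimal sketch of that case, separate from tonight's file and with made-up names throughout (OverrideDemo, IGreeter, Greeter, SayHello):

```il
.assembly extern mscorlib {}
.assembly OverrideDemo {}

.class interface public abstract IGreeter
{
  .method public abstract virtual instance void Greet() {}
}

.class public Greeter
       extends [mscorlib]System.Object
       implements IGreeter
{
  .method public specialname rtspecialname instance void .ctor()
  {
    .maxstack 1
    ldarg.0
    call instance void [mscorlib]System.Object::.ctor()
    ret
  }

  // The name differs from IGreeter::Greet, so the mapping is stated explicitly.
  .method public virtual instance void SayHello()
  {
    .override IGreeter::Greet
    .maxstack 1
    ldstr "Hello from SayHello"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
  }
}

.method static void main()
{
  .entrypoint
  .maxstack 1
  newobj instance void Greeter::.ctor()
  // Call through the interface; the .override above routes it to SayHello.
  callvirt instance void IGreeter::Greet()
  ret
}
```

As far as I can tell this is the same mechanism VB uses when an Implements clause points a differently named method at an interface member.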
And now we have the way to get it all to work...
.method static void main()
{
  .entrypoint
  .maxstack 1
  .locals
  (
    class SimpleClass oTest
  )
  newobj instance void SimpleClass::.ctor()
  stloc oTest
  ldloc oTest
  callvirt instance void SimpleClass::SimpleSub()
  ret
}
This is my main method. The .entrypoint directive tells the runtime where to start running the application.
.locals is where you specify your local variables. In this case, I declared an oTest variable of type SimpleClass.
The remainder of the method does the actual work. In this case, I'm creating a new instance of SimpleClass, storing it in the oTest variable, loading it back onto the stack, and then invoking the SimpleSub() method on it.
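Local variable names are a convenience for the person writing the IL; underneath, locals are numbered slots, starting at 0. A small sketch to show the two forms side by side (LocalsDemo and the local 'count' are made up, and I've added 'init', which zero-initialises the locals and which I understand verifiable code is expected to have):

```il
.assembly extern mscorlib {}
.assembly LocalsDemo {}

.method static void main()
{
  .entrypoint
  .maxstack 1
  .locals init (int32 count)   // 'count' occupies slot 0
  ldc.i4.7                     // push the constant 7
  stloc count                  // store by name...
  ldloc.0                      // ...and load the same slot by index
  call void [mscorlib]System.Console::WriteLine(int32)
  ret
}
```

This should print 7, since 'count' and slot 0 are the same storage.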
So, now what?
String them all together into one file, and jump to the Visual Studio command prompt.
>ilasm /exe /output=ILConsoleApp.exe ILconsoleApp.il
Microsoft (R) .NET Framework IL Assembler. Version 1.1.4322.2032
Copyright (C) Microsoft Corporation 1998-2002. All rights reserved.
Assembling 'ILconsoleApp.il' , no listing file, to EXE --> 'ILConsoleApp.exe'
Source file is ANSI
Assembled method ISimpleInterface::SimpleSub
Assembled method ISimpleInterface::SimpleFunction
Assembled method SimpleClass::.ctor
Assembled method SimpleClass::SimpleSub
Assembled method SimpleClass::SimpleFunction
Assembled global method main
Creating PE file
Emitting members:
Global Methods: 1;
Class 1 Methods: 2;
Class 2 Methods: 3;
Method Implementations (total): 1
Resolving member refs: 2 -> 2 defs, 0 refs
Writing PE file
Operation completed successfully
>
And does it run?
>ILConsoleApp.exe
This is a console application
>
w00t!!!
I'll let you in on a little secret. I managed to get this to work. In the end it was quite easy, once you learn the conventions involved. But I don't think I really _know_ what I'm doing yet. When do I use virtual? When do I use abstract? What about the other metadata tokens that I haven't touched yet? Never mind all the actual instructions for actually doing stuff - I'll cover them later, once I've made it past the basics.
There's still a long way to go. Based on what I've read so far, this has been child's play. Once I have to deal with branching and Try Catch blocks, then I'm in for the real fun.