Sunday, December 9, 2012

Book Review: Professional .NET Framework 2.0

So, I'm a bit behind on commenting on the tech books that I've read recently.  The first is Professional .NET Framework 2.0 by Joe Duffy (Amazon link).  Note: this book is from 2006 and is currently out-of-print; fortunately there are several sellers in the Amazon marketplace who have new or used copies.

Now, it may seem strange to be reading a .NET 2.0 book from 2006 in 2012, and I would normally agree with that thought.  But this book has a few key advantages over some others.  It was recommended to me as a way to "fill in some gaps", and it has done just that.  I'm not really going to review the entire book here, just provide some comments and samples on what made this valuable to me.

Big Plus - Joe Duffy
One of the big advantages to this book is its author.  Joe Duffy was a program manager on the CLR (Common Language Runtime) team at Microsoft.  This means that he knows the internals of the CLR and not just the external APIs.  And this book provides plenty of insight into the internals, as we'll soon see.

Looking into IL
The biggest thing that I took away from this book is a better understanding of IL (Intermediate Language).  This is what the compiler generates from your C# or VB.NET code (or any other .NET language).  It is not machine code, but it has more in common with assembly language than with the 3GL or 4GL languages that business programmers generally use.  The IL will get Just-In-Time compiled (jitted) to the instructions that the machine will actually run.

Now, I've known about IL for a long time, and I've seen people who look at it.  Historically, I haven't been one to dig into it myself.  But after having Joe Duffy explain IL and how language components get transformed into IL, I've been more interested in digging into the IL to see how things are generated.

Setting up ILDASM
The .NET SDK ships with a way to look at the IL -- this is ildasm.exe (for IL Disassembler).  Here's where it's found on my 64-bit machine:

At this location, you'll find a file called "ildasm.exe".  You can run this directly, and then use the File -> Open dialog to locate a .NET assembly.  But what's even better is to add this as an External Tool in Visual Studio. Then you can just run it against whatever project you're currently working on.

To configure this in Visual Studio (I'm using VS2012, but it's the same in earlier versions), just go to the "Tools" menu and select "External Tools...".  From here, you can Add a new tool.  When you're done, it should look something like this:


For the Title, you can use whatever you'd like; this is just what will appear on the Tools menu.  The command should be the fully-qualified location of the "ildasm.exe" file.

The "Arguments" and "Initial directory" settings are a little more interesting.  Notice the arrow to the right of the text box.  This gives you a number of variables that you can choose from.  For the "Arguments" property, I chose 2 variables.  The first "$(TargetName)" is the name of the assembly that is currently active.  Note that this does not include the file extension, just the name.  Because of this, we also need "$(TargetExt)".  This will give us ".exe" or ".dll" depending on what type of assembly we are building.

The "Initial directory" setting tells us where the assembly (from the argument) is located.  In this case, "$(TargetDir)" means the output location of the file.  So, if we are in the Debug build, it will be "<project path>\bin\Debug\".

After clicking "OK", we now have a new item on our tools menu.

Using ILDASM
So, now that we have Visual Studio configured so that it's easy to get to ildasm, let's see what this looks like.  I created a very simple console application that just outputs "Hello, World!" (using Console.WriteLine).  I don't think I need to reproduce the code here ;-)

With this project open, I just need to go to Tools -> ILDasm, and it opens up the compiled assembly.  Drilling down a bit, we get to the Main method:


The name of my application is "ILDamsTest.exe", and we traverse the tree to get to the Main method.  If we double-click on the Main method, we get to the disassembled IL:


Here, we see the IL for the most basic console application.  Going through the instructions, we have the following.  First, "nop" is a "no op", meaning no operation is performed.  The compiler emits this so that the debugger has an instruction to attach a breakpoint to for a part of the code that doesn't perform an operation (for example, attaching a breakpoint to the opening curly-brace of the method).

Next, we have "ldstr" which loads the string "Hello, World!" onto the stack.  Next, we have a "call" that runs the Console.WriteLine method.  Note that the IL has the fully-qualified name for this, including the assembly (mscorlib) and the namespace (System).  We can see that this method takes a string argument, and it gets this from the top of the stack.  Since we just pushed "Hello, World!" onto the stack, this string will be used for the parameter.

Next we have another "nop", for the closing curly-brace, and finally the "ret" that returns from the method.
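To tie the walkthrough back to source code, here is a minimal sketch of the kind of program being disassembled, with the Debug-build instructions described above noted in comments. The class name HelloWorld is hypothetical, and the exact IL you see may vary with compiler version and build settings.

```csharp
using System;

public static class HelloWorld
{
    // Debug-build IL for this method, in the order described in the post:
    //   nop                                   (breakpoint target)
    //   ldstr "Hello, World!"                 (push the string onto the stack)
    //   call  void [mscorlib]System.Console::WriteLine(string)
    //   nop                                   (breakpoint target)
    //   ret                                   (return from the method)
    public static void Main()
    {
        Console.WriteLine("Hello, World!");
    }
}
```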

Pretty simple.  Now let's look at the same thing compiled in "Release" mode:


Notice that we've gone from 5 instructions down to 3; the "nop" instructions are gone.  They exist only to support debugging, so the compiler leaves them out of an optimized Release build.

More Fun with IL
So, after I started looking just a little bit at IL, I've become much more curious about how the compiler generates IL.  For example, I like to see what code is generated by the "shortcuts" that the C# language and compiler give us.

Here are 2 equivalent blocks of code:

Now, the "foreach" loop will work on any object that implements the IEnumerable interface.  That interface has a single method ("GetEnumerator").  (Note: if you want more information on IEnumerable, you can reference the series of articles at Next, Please! A Closer Look at IEnumerable.)

So, the first block of code shows how to use the Enumerator directly (with MoveNext and Current), and the second uses the "foreach" construct.  We would expect that these two blocks would compile to the same IL.
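As a rough sketch of what such a pair of blocks might look like (the class and method names here are hypothetical, and summing a sequence of integers is just an assumed stand-in for the loop body):

```csharp
using System;
using System.Collections.Generic;

public static class EnumerationSketch
{
    // Block 1: drive the enumerator by hand with MoveNext and Current.
    public static int SumWithEnumerator(IEnumerable<int> items)
    {
        int total = 0;
        IEnumerator<int> enumerator = items.GetEnumerator();
        while (enumerator.MoveNext())
        {
            total += enumerator.Current;
        }
        return total;
    }

    // Block 2: let the foreach construct do the same work.
    public static int SumWithForeach(IEnumerable<int> items)
    {
        int total = 0;
        foreach (int item in items)
        {
            total += item;
        }
        return total;
    }
}
```

Both methods return the same result for the same input, which is why it is tempting to assume they produce identical IL.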

But if we look at the IL, we find that they are not compiled to the same code.  They are close, but when "foreach" is compiled, the loop is automatically wrapped in a try/finally block so that the enumerator can be disposed even if the loop body throws an exception.  Try this yourself to see.
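Written back out as C#, the shape the compiler produces for a "foreach" over an IEnumerable<int> is roughly the following. This is a hand-written approximation under that assumption, not actual compiler output; the class and method names are hypothetical, and the generated code differs in its details for arrays and for non-generic collections.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansionSketch
{
    // Approximates what "foreach (int item in items) total += item;" becomes.
    public static int SumLikeForeach(IEnumerable<int> items)
    {
        int total = 0;
        IEnumerator<int> enumerator = items.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int item = enumerator.Current;
                total += item;
            }
        }
        finally
        {
            // The cleanup step that the hand-written enumerator loop lacks.
            if (enumerator != null)
            {
                enumerator.Dispose();
            }
        }
        return total;
    }
}
```

The finally block is the difference to look for in ildasm: the manual MoveNext/Current loop never calls Dispose unless you write that call yourself.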

And so started my journey into IL and the fun behind it.  Some other interesting places to look include anonymous types (to see that the compiler does create a type with an internal name), delegates (to see what a delegate is behind the scenes), lambda expressions, and many others.  You will probably come across some compiler optimizations that surprise you as well.

Note: I won't include any links to IL references because I haven't found any good ones.  If you want the specification, you can search for "ECMA 335", with a caveat that this is the 500+ page specification document (not overly readable).  Professional .NET Framework 2.0 provides a compact reference for IL (along with a number of examples).

[Update: I found an online reference. It's not very descriptive, but it has the CIL Instructions: http://www.en.csharp-online.net/CIL_Instruction_Set.]

Wrap Up
So, even though Professional .NET Framework 2.0 is 6 years old, I found many useful things in it.  There are plenty of good topics (including a look at the Garbage Collector and also multi-threading).  As you can probably tell, I've become comfortable looking at the IL -- something that I never would have done before.  An appendix is dedicated to the IL operations so that you can parse the output of ildasm yourself.  Once you start looking at the IL, you come across all sorts of interesting things that will give you a better understanding of how the code you write gets transformed into something closer to what the machine will actually run.

This is not a resource for a beginner or someone new to C#, but it is a great way to take a deeper dive into some of the fundamentals of the .NET Framework.

Happy Coding!
