Quick and Dirty Is Better Than Slow and Proper
Binary serialization in .NET is another example where I discovered unexpected performance results. I generally prefer custom serialization because it gives you finer control over what goes into the serialization stream and how it is versioned. The issue I raise in this section is a simplification of a real-life example I encountered while doing some .NET development.
Say you use a class to store data points for a graph and you want to store those points to disk. The "proper" object-oriented, .NET-friendly way of doing this is to mark the class with the [Serializable] attribute and implement the ISerializable interface. You can then create a BinaryFormatter object and simply stream the object's state to disk. A messier, "dirty" alternative would be to open a file for writing, loop through the data points, and write them to disk.
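The actual code is in Listing 2, but the two approaches can be sketched roughly as follows. This is an illustrative reconstruction, not the listing itself; the class and method names here (DataPoint, SaveProper, SaveDirty) are hypothetical stand-ins:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class DataPoint : ISerializable
{
    public double X, Y;

    public DataPoint(double x, double y) { X = x; Y = y; }

    // Special constructor required by ISerializable for deserialization.
    protected DataPoint(SerializationInfo info, StreamingContext context)
    {
        X = info.GetDouble("X");
        Y = info.GetDouble("Y");
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("X", X);
        info.AddValue("Y", Y);
    }
}

public static class GraphStorage
{
    // The "proper" way: one call, but BinaryFormatter writes type metadata
    // and drives GetObjectData through reflection for every object.
    public static void SaveProper(DataPoint[] points, string path)
    {
        using (FileStream fs = File.Create(path))
            new BinaryFormatter().Serialize(fs, points);
    }

    // The "dirty" way: open the file yourself and write raw values in a loop.
    public static void SaveDirty(DataPoint[] points, string path)
    {
        using (BinaryWriter bw = new BinaryWriter(File.Create(path)))
        {
            bw.Write(points.Length);
            foreach (DataPoint p in points)
            {
                bw.Write(p.X);
                bw.Write(p.Y);
            }
        }
    }
}
```

Loading mirrors the two save paths: BinaryFormatter.Deserialize on one side, a BinaryReader loop reading the count and then each pair of doubles on the other.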
You might think the first option is better. Doing it "properly" might carry a performance hit, but it can't be that much, right? Wrong.
The code sample in Listing 2 shows the two approaches. After some common code, it presents two functions: RunUsingISerializable (proper) and RunUsingDirectStreams (dirty). When you run the sample, you might be surprised to find that the proper way is 50 times slower than the dirty way: it takes about 10 seconds to save the array to disk and load it back up, while the dirty way takes 0.2 seconds. That gap can be the difference between a frustrated user and a happy one.
As the sample runs, .NET calls the function System.Runtime.Serialization.Formatters.Binary.__BinaryParser.get_prs(). I don't know what that function does, but isn't it interesting that serializing 100,000 items to disk involves 3.1 million calls to it?
I encountered this issue while loading a file from disk and displaying it to the end user in HTML format. The process involved deserializing a series of classes from disk, creating an XML file, doing an XSLT transform to produce the HTML output, creating and saving about 100 images to disk, and then showing the HTML file in a browser. The overall process was slow for the end user, but it wasn't at all obvious to me that the bottlenecks were the loading from disk and the XSLT transform (which, incidentally, also seems to have a very poor implementation in .NET). Without a code profiler to identify the specific bottlenecks, I would have wasted a lot of time optimizing the creation and saving of images to disk, which actually wasn't slow.
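For context, the XSLT step in such a pipeline might look like the sketch below. The file names are hypothetical, and whether this step is fast or slow depends heavily on the .NET version: the original XslTransform class was notoriously slow, while XslCompiledTransform (introduced in .NET 2.0) compiles the stylesheet and usually performs much better.

```csharp
using System.Xml.Xsl;

public static class ReportRenderer
{
    // Transform the serialized data (as XML) into HTML for display.
    // "report.xsl", "data.xml", and "report.html" are placeholder names.
    public static void RenderHtml()
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        xslt.Load("report.xsl");                 // compiled once; reuse the instance
        xslt.Transform("data.xml", "report.html");
    }
}
```

If the transform runs repeatedly, loading the stylesheet once and reusing the XslCompiledTransform instance avoids paying the compilation cost on every call.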
Keep Your Eye on the Code
By learning to profile your code, you can identify slow areas and remove bottlenecks. Code profiling tools (see Figure 3 for an example) generally can show you exactly how your application is behaving and can pinpoint where to concentrate your optimization efforts.
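Even without a full profiler, you can confirm a suspected bottleneck with coarse timing. The sketch below assumes hypothetical stage methods (LoadFromDisk, TransformToHtml, SaveImages) standing in for the pipeline described earlier; it uses the standard System.Diagnostics.Stopwatch:

```csharp
using System;
using System.Diagnostics;

public static class StageTimer
{
    // Time a single stage and report its wall-clock cost.
    public static void Time(string name, Action stage)
    {
        Stopwatch sw = Stopwatch.StartNew();
        stage();
        sw.Stop();
        Console.WriteLine("{0}: {1} ms", name, sw.ElapsedMilliseconds);
    }
}

// Usage (stage methods are hypothetical):
//   StageTimer.Time("Load",      LoadFromDisk);
//   StageTimer.Time("XSLT",      TransformToHtml);
//   StageTimer.Time("Images",    SaveImages);
```

This kind of manual instrumentation only tells you which stage is slow; a real profiler goes further and shows you which calls inside that stage are responsible, as with the get_prs() discovery above.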
It's great that Microsoft and other companies provide pre-packaged, easy-to-use, object-oriented components that implement and hide complex bits of functionality. But because these routines are black boxes, understanding what is going on inside them is very hard. This makes predicting how they will behave nearly impossible, and you can very easily be stuck with slow code.