Sorting in the .NET Framework

n .NET, sorting arrays or collections of primitive types is a relatively simple task. Most developers understand that for a collection of numbers or strings, the runtime will know how to compare these values. However, some developers are left wondering how to sort complex objects. For example, what if you wanted to sort an array of FileInfo objects by size rather than name?

After witnessing one fellow developer attempt to code his own BubbleSort algorithm because he could not find another way to sort an array of his custom objects, I felt this article would help spread the word about the rich sorting capabilities supported by .NET.

Sorting Arrays and ArrayLists
Sorting an array composed of primitive data types such as strings and integers is straightforward. The Array class contains a static method called Sort. This method takes in an existing array object and sorts it. The following code demonstrates how simple it can be:

Dim aryBeatles() As String = {"John", "Paul", "George", "Ringo"}Array.Sort(aryBeatles)' New order of the aryBeatles array:' ' aryBeatles' ----------' George' John' Paul' Ringo

Sorting an ArrayList is equally easy but somewhat different. The ArrayList is based on an Array. It internally manages an array while exposing properties and methods to make working with them easier. Taking from the example above, here is a demonstration of sorting an ArrayList:

'Base the arraylist on the aryBeatles array.Dim lstBeatles As New ArrayList(aryBeatles)lstBeatles.Sort()

Aside from the syntactical differences, the ArrayList still internally relies on the Array.Sort method to sort its contents. The Array.Sort method is the core of all sorting in. NET. Because of this, for the rest of the article I will use arrays to demonstrate the sorting capabilities of the .NET Framework.

Key-value Pair Sorting
One of the features of sorting in the .NET framework is the ability to sort one array based on the values of another array of equal length. This is called key-value sorting. Basically, you have an array of keys and an array of values. The key value in the key array at a given index is used to look up the value in the value array at the same index. Consider the example in Listing 1.

As you can see, it sorts the key array and changes the index in the value array so that their mapping stays consistent. This type of sorting is useful if you implement your own hashtable or dictionary collection, and the underlying collections are a pair of arrays corresponding to the keys and values.

Figure 1. Partial Sorting:This illustration gives a visual picture of how a portion of an array’s content can be sorted.

Sorting a Portion of a Collection
Sometimes you may want to sort only a portion of a collection. Figure 1 illustrates partial sorting. Typically, you could accomplish this by copying the range of values you want sorted into a temporary array, sort it, and copy them back. However, in .NET you can accomplish this in one line of code:

Dim aryNumbers() As Integer = {13, 4, 7, 10, 2, 8, 17, 5, 1, 11}' Using the Array.Sort(array, index, length) method call to sort' only elements 2-6.Array.Sort(aryStateKeys, 2, 5)' New order of the aryNumbers after partial sorting:' ' aryNumbers' ----------' 0:      13' 1:       4' 2:       2' 3:       7' 4:       8' 5:      10' 6:      17' 7:       5' 8:       1' 9:      11

Automatic Sorting Using a SortedList
The .NET Framework contains a collection called a SortedList. Instead of calling sort when needed to sort the collection, the collection stores all its values already presorted. As you add new values to the collection, it internally finds the appropriate spot to insert the object. This class is useful when you need the objects contained within the collection to be pre-sorted at all times.

The SortedList acts more like a Hashtable rather than an ArrayList. The reason for this is that the collection stores key-value pairs rather than the values themselves. Because the Hashtable and DictionaryBase classes do not have sorting built-in, it may be more suitable to use SortedList when you require the keys to be sorted.

This code shows how to use a SortedList:

 Dim slsStates As New SortedListslsStates.Add("CA", "California")slsStates.Add("AZ", "Arizona")slsStates.Add("NV", "Nevada")' slsStates order:'' Keys | Values' ---------------' AZ   | Arizona' CA   | California' NV   | Nevada

Curiously, Microsoft left out sorting in Hashtables and other dictionary objects. At the time of writing, I was unable to confirm whether or not that sorting capability will be added in .NET 2.0. Neglecting to add this capability prolongs developer inconvenience and will force us to create an ad hoc workaround.

Customized Sorting with IComparer
Another way to sort objects is to use a comparer class. A comparer class is one that implements the IComparer interface. This is not to be confused with IComparable, which has a different use.

When you use a comparer class to sort objects, you are overriding the default comparing method defined by the object’s CompareTo() method. As a matter of fact, if you use a comparer class to sort a list of objects, the class definition for the object doesn’t even need to implement IComparable. Thus, the Sort() methods defined in Array and ArrayList requires that all objects being sorted either implement the IComparable interface, or a class implementing IComparer be passed. The .NET Framework defines several comparer classes for you to use. One of them, the CaseInsensitiveComparer class, allows you to sort strings by ignoring their casing. So in this case, the .NET framework ignores the String class’ CompareTo method, and instead uses the rules defined in the CaseInsensitiveComparer class. The following code illustrates this feature:

Dim aryLastNames() As String = {"sMiTh, ZULU", "smith, john&", "SMITH, TerrY"}Array.Sort(aryLastNames, New CaseInsensitiveComparer)' aryLastNames order:'' smith, john' SMITH, TerrY' sMiTh, ZULU

In the opening paragraphs I talked about sorting objects like System.IO.FileInfo. IComparable, so using a comparer class is absolutely necessary. In any case, defining your own comparer class for FileInfo objects allows for maxmium flexibility when determining how they should be sorted. For example, a program that displays files on disk to the user and allows the user to sort these files by name, extension, length, and whatever else (just like Windows Explorer) would take advantage of a comparer class to implement this functionality. The download for this article provides a demonstration of using IComparer sort an array of FileInfo objects.

IComparable vs. IComparer
Because of the way they’re named, these two interfaces are oftentimes confused. But after exploring their uses, the differences should be clear. The following table shows a side-by-side comparison between the two.

IComparable

IComparer

Comparing method must be defined within the class for which it is used.

Comparing method is defined in a separate class.

Can provide only one method for which to compare objects of the class.

Many classes can be defined, each classthat can contain a distinct comparing method.

Because it is defined within the class, it can access private variables on which to sort.

Can only sort on public properties and fields.

If both interfaces are defined for sorting, this one is not used.

If both interfaces are defined for sorting, this one is used.

Provides a default comparing method for the class in which it is defined.

Provides an alternative sorting method for the class for which it is used.

Table 1. IComparable vs. IComparer: This table breaks down the differences between these two interfaces.

Basically, IComparable is used to define a default comparing method, and IComparer is used to override or use in the absence of the default comparing method.

Comparing Objects with Operator Overloading
Operator overloading is the practice of defining a use for an operator, such as the plus sign + or equality sign =, for the purpose of comparing objects. Although only loosely related to sorting, operator overloading is a nice, powerful feature traditionally available in languages such as C++. Currently in .NET, operator overloading is a feature only supported in C#. However, in .NET 2.0 (code-named Whidbey), operator overloading will be introduced in VB.NET.

A good example of an object that uses overloaded operators is the String class. When you compare two strings using the equal sign = (== in C#), you are implicitly using an overloaded operator.

Considering the Recipe class explained before, you can easily extend what has already been defined to includes operators for equality, less than, and greater than operations. Listing 2 shows the Recipe class modified from above to include these operator definitions.

Making Objects Comparable Using IComparable
Implicitly, all primitive types implement an interface called IComparable. This interface defines one method, CompareTo(), that allows a comparison between itself and another instance of the same type. This means that all basic data types, such as integers, strings, and even Booleans have a CompareTo method on it.

In most cases, your class will be sorted based upon one of its properties, which will most likely be of a primitive type. This makes implementing the IComparable interface even easier. Consider the class Recipe shown in Listing 3.

Listing 3 shows that when you implement IComparable for the Recipe class, you are required to implement the CompareTo() method. In this definition, you simply return the result of comparing the private variable that holds the value for RecipeID to the RecipeID property of the object being passed. Although this is a simple example, if I had to, I could have made the comparison more complex by involving more than one property of the class.

Figure 2. Multidimensional Sorting Diagram: This diagram gives a visual representation of the different ways that a multidimensional array can be sorted.

Sorting Multi-dimensional Arrays
The developer community is abuzz with discussion of sorting multidimensional arrays. Strangely enough, the .NET Framework doesn’t support the sorting of multidimensional arrays. As a matter of fact, if you try to pass a multidimensional array into Array.Sort method you will get an exception. This leaves many developers confused on how to deal with sorting arrays of more than one dimension. Fortunately, there is a workaround.

The way to approach this problem is to turn a multidimensional array into a jagged array. A jagged array is basically an array of arrays. In other words, the array’s element type is also of type array. The key advantage of jagged arrays over multidimensional arrays is that jagged arrays can dynamically scale to infinite dimensions, whereas multidimensional arrays are fixed in their rank, or number of dimensions. And as previously stated, jagged arrays can be sorted.

Different methods of sorting a multidimensional array produce different resuts. The first method is called full sort. This method sorts all of the dimensions. The second method performs a surface sort, or sort on only the first dimension, using the second and subsequent dimensions as tie breakers. The third method sorts the entire array as if it were a one-dimensional array. This method is called flat sorting.

Flat Sorting
When sorting the array in either a structured or flattened manner, the final thing to consider is its dimensional direction. The previous discussion assumed top-down sorting?as if you were starting from the first dimension and working your way down to the last one. However, it’s feasible to sort from the bottom-up, or working from the last dimension to the first. Figure 2 illustrates the six possible ways that a multidimensional array can be sorted. These ways are described in Table 2.

Sort Type

Description

Full Sort

From the starting dimension to the end, each dimension is sorted and the restructuring of the elements of that dimension affects the sorting of the subsequent dimensions.

Surface Sort

From the starting dimension to the end, each element is compares, and in the event of a tie, the next element in the dimension serves as the tiebreaker.

Flat Sort

The array is treated as a one-dimensional array for sorting.

Table 2. Sorting Types For Multidimensional Arrays: Sorting multidimensional arrays can produce different results based on the manner in which it was sorted.

The source code for this article includes a class library to facilitate multidimensional sorting. A demo application is also included to interactively show these sorting methods in action.

Don’t Reinvent the Wheel!
Sadly, many developers, especially those from a VB6 background, are slow to gain a fundamental understanding of the richness of the class library that the .NET library boasts. Hopefully, this article can help save developers from reinventing the wheel by coding their own sorting routines.

Share the Post:
Share on facebook
Share on twitter
Share on linkedin

Overview

Recent Articles: