Building Truly Useful Extension Methods

Visual Studio 2008’s new extension methods feature stands out from among Visual Studio’s new offerings because you’ve been clamoring for it for years, right? What’s that? You weren’t clamoring for extension methods? Me neither.

Actually, Microsoft implemented extension methods to make it easier (perhaps even possible) to provide LINQ. And as long as they had to build extension methods anyway, they decided to give developers the option to use them, and then gave themselves credit for an amazing new tool!

Extension methods let you add new features to existing classes, even classes that you didn’t write yourself. You can use them to add new methods to your own classes, classes written by other developers, and even sealed (NotInheritable) types built by Microsoft such as Integer, Graphics, and String. You can even add extension methods to types built from simpler types, for example, integer or byte arrays.

This article explains how to create and use extension methods. It also shows how to add several useful methods to the String class (and one corresponding method to byte arrays). But first, the following section provides a word of warning.

Extension Methods and Encapsulation
Microsoft invented extension methods as a sneaky way to provide LINQ features. Microsoft’s developers used them to add new methods to existing classes without needing to rebuild those classes. LINQ hides all the details so you don’t really need to worry about what those extension methods are or where they are defined.

If you work at the LINQ level, everything’s fine, but when you pry back the little plastic door that says “Warranty void if removed” and poke a screwdriver inside, things can get a bit confusing.

That’s because extension methods have great potential to violate a class’s encapsulation. Ideally, a class should embody a single, well-defined concept and wrap up all the data and methods associated with that idea in a nice neat little package. Using those hallowed precepts, you’ve constructed the ultimate Customer class.

Now, however, Belligerent Bob (an old FORTRAN programmer who thinks a class is something you take at school) can graft all sorts of extra baggage onto your elegant design. He can add methods that don’t make sense for the class (he adds a ReorderInventory method to your Customer class). He can overload methods that you already created and give the new versions completely different and undocumented meanings (your version of the Sell method takes an Order as a parameter and invoices the Customer; his new version of Sell takes a URL as a parameter and spams the customer with whatever file is at that location). He can generally clutter the class’s namespace with junk so it’s hard to find what you want. Worst of all, he can put all this sordid code just about anywhere so it can be hard to find. (Then he’ll check it out of Source Safe and keep it checked out forever so you can’t fix any bugs in it!)

And if you were thinking that you might be able to thwart Bob’s nefarious scheme by marking your Customer class NotInheritable, forget it. You can use extension methods to add new features to NotInheritable classes, which is a clear violation of intent.

Now, in real life this gloom-and-doom scenario probably won’t play out in all its glory. After all, there are some interesting parallels between the Partial keyword (which I think Microsoft added largely so they could hide all the designer-generated code that used to clutter up form files) and extension methods. The Partial keyword promises the same potential for confusion, and that doesn’t seem to have happened. Most people stick to good old tried-and-true methods for confusing themselves and don’t bother using Partial.

But I’d like to offer some preemptive advice to hopefully prevent Belligerent Bob from doing too much damage.

First, before you add an extension method to a class, ask yourself whether it will be generally useful in many places within the project. If this is a one-shot deal, just write a subroutine near the code that uses it and let it go at that. In other words, if your program needs to check whether a file begins with a salutation (“Dear Mr. Phipps”), don’t add a new method to the String class or the Stream class to do it.

Second, don’t add extension methods to a class if the method doesn’t really have much to do with that class. The Stream class is for reading and writing streams, not for examining their contents.

Third, don’t overload an existing method with a new version that does something different. For example, the String class has a Trim method that removes specified characters from the ends of a string. Don’t use extension methods to make a Trim method that takes different parameters (perhaps a string containing the target characters) and removes characters from the middle of the string. That would be confusing. Besides, you can already use the Replace method to do that.

Finally, document your extension methods so other programmers have at least a slim chance of figuring out what they do, and how to use them. Put them in modules with meaningful names such as CustomerExtensions, so they are easy to find.

I’m not saying that extension methods are evil, just that you should show some restraint when using them. Don’t gratuitously use extension methods to add new features to a class just because you can.

Enough moralizing; it’s time to explore extension methods in depth.

Simple String Extensions
To create an extension method, you really only need to perform two steps. What those steps are depends on whether you’re using Visual Basic or C#.

Creating Extensions in VB

  1. Flag a subroutine or function with the Extension attribute. Extension is defined in the System.Runtime.CompilerServices namespace so you may want to import that namespace to make using the attribute easier.
  2. Write the subroutine or function. The only trick here is that the first parameter to the method is the object on which it is acting. For example, if you’re making an extension to the Customer class then this method’s first parameter must be a Customer object.

How about a useful example? Do you remember Visual Basic 6’s Right function? It returned a certain number of characters from the right end of a string. To do the same thing with the modern String class you need to use the Substring method and the string’s Length property to calculate where to start extracting characters. It’s not terribly hard but getting it right does require some thought.

The Right function is useful in a wide variety of circumstances, fits well with other String methods such as Substring, and doesn’t overload an existing method in a confusing way, so it makes a nice addition:

   Imports System.Runtime.CompilerServices

   Module StringExtensions
      ''' <summary>
      ''' Return the rightmost number_of_characters characters
      ''' from the string.
      ''' </summary>
      ''' <param name="str">The string being extended</param>
      ''' <param name="number_of_characters">The number of
      ''' characters to return.</param>
      ''' <returns>A string containing the rightmost characters
      ''' in this string.</returns>
      ''' <remarks></remarks>
      <Extension()> _
      Public Function Right(ByVal str As String, _
      ByVal number_of_characters As Integer) As String
         If str.Length <= number_of_characters Then Return str
         Return str.Substring(str.Length - number_of_characters)
      End Function
   End Module ' StringExtensions.

This code satisfies the requirements for an extension method.

  • It begins by importing System.Runtime.CompilerServices so it can easily use the Extension attribute.
  • The code incorporates the newfangled "triple-tick" (''') comments to describe the function's purpose and parameters. That syntax lets IntelliSense read the comments and display them, showing what the code does and what its parameters are, so others trying to use this method get the information they need. Using comments effectively goes a long way toward making hidden extension methods easier to understand and use.
  • The extension method is decorated by the Extension attribute and declared as a public function. Its first parameter is a String, so that is the class that it extends. The code that uses this method should pass the remaining parameter into the method call.

The body of the routine ensures the string is long enough, and then performs the simple (but annoying) calculation to return the rightmost characters.

Here's how a program might use the new String.Right extension method:

   Dim txt As String = _
      "What kind of noise annoys a noisy oyster?"
   Dim word As String = txt.Right(7)

This code initializes the string txt and then calls its Right extension method. Behind the scenes the compiler passes the string itself (txt) as the first parameter to the method and the value 7 as the second. The method uses the string's Length property and Substring method to peel off the specified number of rightmost characters and returns them, filling the variable word with the value "oyster?".
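To make the "passes the string as the first parameter" idea concrete, here is a minimal sketch that calls the same method both ways. The module name RightCallDemo and the Sub Main test harness are assumptions added for illustration; the Right function repeats the logic shown earlier so the sketch stands alone.

```vb
Imports System
Imports System.Runtime.CompilerServices

Module RightCallDemo
    ' Same logic as the Right method shown earlier.
    <Extension()> _
    Public Function Right(ByVal str As String, _
    ByVal number_of_characters As Integer) As String
        If str.Length <= number_of_characters Then Return str
        Return str.Substring(str.Length - number_of_characters)
    End Function

    Sub Main()
        Dim txt As String = "What kind of noise annoys a noisy oyster?"

        ' Extension-method syntax: txt becomes the first argument.
        Dim word1 As String = txt.Right(7)

        ' Ordinary module-function syntax: txt is passed explicitly.
        Dim word2 As String = RightCallDemo.Right(txt, 7)

        Console.WriteLine(word1)           ' oyster?
        Console.WriteLine(word1 = word2)   ' True
    End Sub
End Module
```

Because an extension method is really just a module function with nicer calling syntax, it sees only the public members of the type it extends. It also means that calling it on a variable that holds Nothing reaches the method body, where str.Length then fails.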

Creating Extensions in C#
In C#, things are a little different. C# doesn't allow you to write code that isn't part of a class and (let's face it) sometimes code just doesn't belong in a class. (In this case, the code really belongs in someone else's class that you didn't write and whose code is hidden from you.) So in C# you do the next best thing:

  1. Create a static class in which to put your static extension.
  2. Just as in Visual Basic, you then write the extension method. The difference is that C# doesn't require the Extension attribute. However, without the attribute, how is C# supposed to know that this is an extension method? The answer is that you give it the needed hint by placing the keyword this in front of the first parameter's declaration. That parameter specifies the type of the class you are extending.

The following code shows the C# version of the Right extension method. You'll find it in the downloadable program example in the CStringExtensionsDemo namespace, inside the static class StringExtensions. Note that the method is also static and that it uses the this keyword to indicate that it is an extension method.

   using System;

   namespace CStringExtensionsDemo
   {
      static class StringExtensions
      {
         // Return the rightmost number_of_characters characters.
         public static String Right(this string str,
            int number_of_characters)
         {
            if (str.Length <= number_of_characters) return str;
            return
               str.Substring(str.Length - number_of_characters);
         }
      }
   }

Author's Note: This all seems kind of kludgy to me. You have to read the code pretty carefully to notice that this is an extension method. I think I prefer the Extension attribute.

More Useful Extensions
Now that you've seen how to write the Right method, it's easy enough to add similar methods. The following VB code shows a Left extension method that returns the characters on the left end of a string. (I'm leaving out the triple-tick comments to make the code easier to read.) The code also shows RemoveLeft and RemoveRight methods that strip off the leftmost and rightmost characters from a string. (Those two are a bit gratuitous, but I still maintain that they're more useful than a palindrome checker! Plus this is all just warm-up for the excitement yet to come.)

   ' Return the leftmost number_of_characters characters
   ' from the string.
   <Extension()> _
   Public Function Left(ByVal str As String, _
   ByVal number_of_characters As Integer) As String
      If str.Length <= number_of_characters Then Return str
      Return str.Substring(0, number_of_characters)
   End Function

   ' Return the string with the leftmost
   ' number_of_characters characters removed.
   <Extension()> _
   Public Function RemoveLeft(ByVal str As String, _
   ByVal number_of_characters As Integer) As String
      If str.Length <= number_of_characters Then Return ""
      Return str.Substring(number_of_characters)
   End Function

   ' Return the string with the rightmost
   ' number_of_characters characters removed.
   <Extension()> _
   Public Function RemoveRight(ByVal str As String, _
   ByVal number_of_characters As Integer) As String
      If str.Length <= number_of_characters Then Return ""
      Return str.Substring(0, str.Length - number_of_characters)
   End Function
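A quick worked example may help show what each of these methods returns, including the case where the string is shorter than the requested count. This is only a sketch: the module name TrimmingDemo, the shortened parameter name n, and the sample string are assumptions, and the three functions are compact copies of the ones above so the code stands alone.

```vb
Imports System
Imports System.Runtime.CompilerServices

Module TrimmingDemo
    <Extension()> _
    Public Function Left(ByVal str As String, ByVal n As Integer) As String
        If str.Length <= n Then Return str
        Return str.Substring(0, n)
    End Function

    <Extension()> _
    Public Function RemoveLeft(ByVal str As String, ByVal n As Integer) As String
        If str.Length <= n Then Return ""
        Return str.Substring(n)
    End Function

    <Extension()> _
    Public Function RemoveRight(ByVal str As String, ByVal n As Integer) As String
        If str.Length <= n Then Return ""
        Return str.Substring(0, str.Length - n)
    End Function

    Sub Main()
        Dim txt As String = "Extension"

        Console.WriteLine(txt.Left(3))          ' Ext
        Console.WriteLine(txt.RemoveLeft(3))    ' ension
        Console.WriteLine(txt.RemoveRight(3))   ' Extens

        ' Asking for more characters than the string holds.
        Console.WriteLine(txt.Left(20))               ' Extension
        Console.WriteLine(txt.RemoveLeft(20).Length)  ' 0
    End Sub
End Module
```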

Such simple extensions are useful, but the real power of extension methods lies in encapsulating more complex code and making it available through IntelliSense at a developer's fingertips.

High-Level Encryption
If you've read my other articles on DevX and elsewhere you probably know that I like tricky code: complex algorithms, three-dimensional graphics, stuff like that. Something as simple as the Left, Right, RemoveLeft, and RemoveRight extension methods just isn't enough. I couldn't finish this article without doing something a bit more interesting.

I've actually wished for a simpler method for encrypting and decrypting strings. The System.Security.Cryptography namespace provides some amazing cryptographic tools, but they're pretty hard to use, so I decided that implementing them as simple string extensions would be useful.

At a fairly high level, here's what the extension methods should look like to someone who wants to use them:

  • To make the encryption tools as flexible as possible, Microsoft made them encrypt and decrypt bytes rather than text. Whether the bytes represent text, images, video, or spreadsheets doesn't matter to the encryption routines. They just chop up the bytes and reassemble them later.
  • The new String class extension method Encrypt will take a password as a parameter and return the string encrypted with that password. The encryption methods return the encrypted string as a byte array, so Encrypt also returns a byte array.

The following code shows the Encrypt extension method. The parameter plain_text is the string that the method should encrypt. The other parameter is the password. The method uses an ASCIIEncoding object to convert the string into a byte array and then calls helper function CryptBytes (which I'll get to a bit later) to encrypt the string and return the resulting byte array.

   ' Encrypt a string into a byte array.
   <Extension()> _
   Public Function Encrypt(ByVal plain_text As String, _
   ByVal password As String) As Byte()
      Dim ascii_encoder As New System.Text.ASCIIEncoding()
      Dim plain_bytes As Byte() = ascii_encoder.GetBytes(plain_text)
      Return CryptBytes(password, plain_bytes, True)
   End Function

Here's how you might use this method to encrypt a string:

   Dim txt As String = "Message"
   Dim bytes() As Byte = txt.Encrypt("SecretPassword")

Encrypting a string is useful only if you can decrypt it later. The following code shows the Decrypt extension method, which reassembles the original string. Notice that this method's first parameter is a byte array, so rather than being a String extension, this method extends the byte array data type. The method calls the CryptBytes helper function (discussed later) to decrypt the bytes, converts the bytes back into a string using an ASCIIEncoding object, and returns the result:

   ' Decrypt a byte array into a string.
   <Extension()> _
   Public Function Decrypt(ByVal encrypted_bytes() As Byte, _
   ByVal password As String) As String
      Dim decrypted_bytes() As Byte = _
         CryptBytes(password, encrypted_bytes, False)
      Dim ascii_encoder As New System.Text.ASCIIEncoding()
      Return ascii_encoder.GetChars(decrypted_bytes)
   End Function

Now, storing an encrypted string in a byte array is okay if you want to store the bytes in a file but you cannot display the bytes on the screen. If you want to visualize the encrypted data, you need a reasonable way to display them in a string. One way to do that is to convert the encrypted byte array into a series of hexadecimal values.

The following BytesToHexString extension method does just that. Its first parameter is a byte array, so that's the data type that it extends. The code loops through the bytes in the array, converts each to hexadecimal, and pads the result to two characters so that small numbers such as A come out as 0A. If the result is non-empty (in other words, the array was not empty), the method removes the leading space character and returns the result:

   ' Return a hexadecimal representation of the bytes.
   <Extension()> _
   Public Function BytesToHexString(ByVal bytes() As Byte) As String
      Dim result As String = ""
      For Each b As Byte In bytes
         result &= " " & Hex(b).PadLeft(2, "0"c)
      Next b
      If result.Length > 0 Then result = result.Substring(1)
      Return result
   End Function

Now you can write code similar to the following to encrypt a string, convert the encrypted bytes into a string, and save the result in the variable cipher_text.

   Dim txt As String = "Message"
   Dim bytes() As Byte = txt.Encrypt("SecretPassword")
   Dim cipher_text As String = bytes.BytesToHexString()

After storing the encrypted text as a string, you need a way to recover the original message. The first step in reversing the encryption is to turn the hexadecimal string back into a byte array. Then you can use the Decrypt extension method to decode the array.

The following code shows the HexStringToBytes extension method. This method's first parameter is a string so it extends the String class. It first removes all spaces from the hexadecimal string, then adds the text &H to the beginning of each pair of letters in the string (each pair represents a byte). Finally, it converts the result into a byte, and stores the byte in a byte array. After processing all the character pairs, it returns the byte array:

   ' Return a byte array initialized with the given
   ' hexadecimal codes.
   <Extension()> _
   Public Function HexStringToBytes(ByVal str As String) As Byte()
      str = str.Replace(" ", "")
      Dim max_byte As Integer = str.Length \ 2 - 1
      Dim bytes(max_byte) As Byte
      For i As Integer = 0 To max_byte
         bytes(i) = CByte("&H" & str.Substring(2 * i, 2))
      Next i

      Return bytes
   End Function

Author's Note: There's a nice symmetry between the BytesToHexString and HexStringToBytes methods. The first extends the byte array data type and returns a string. The second extends the String class and returns a byte array.
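If the byte arrays get large, building the hex string by repeated concatenation can become slow. The following is a sketch of one alternative, not the article's implementation: it uses a StringBuilder with the "X2" format string to write the hex text, and Convert.ToByte with base 16 to read it back. The names HexSketch, ToHexString, and FromHexString are made up for this illustration.

```vb
Imports System
Imports System.Runtime.CompilerServices
Imports System.Text

Module HexSketch
    ' Hypothetical alternative to BytesToHexString.
    <Extension()> _
    Public Function ToHexString(ByVal bytes() As Byte) As String
        Dim sb As New StringBuilder()
        For Each b As Byte In bytes
            ' Separate the two-digit codes with single spaces.
            If sb.Length > 0 Then sb.Append(" "c)
            sb.Append(b.ToString("X2"))
        Next b
        Return sb.ToString()
    End Function

    ' Hypothetical alternative to HexStringToBytes.
    <Extension()> _
    Public Function FromHexString(ByVal str As String) As Byte()
        str = str.Replace(" ", "")
        Dim bytes(str.Length \ 2 - 1) As Byte
        For i As Integer = 0 To bytes.Length - 1
            ' Parse each pair of hex digits as one byte.
            bytes(i) = Convert.ToByte(str.Substring(2 * i, 2), 16)
        Next i
        Return bytes
    End Function

    Sub Main()
        Dim original() As Byte = {10, 255, 0, 171}
        Dim hex_text As String = original.ToHexString()
        Console.WriteLine(hex_text)   ' 0A FF 00 AB

        Dim round_trip() As Byte = hex_text.FromHexString()
        Console.WriteLine(round_trip.Length)   ' 4
    End Sub
End Module
```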

By using the Encrypt and BytesToHexString extension methods together, you can now easily write a method that both encrypts a string and returns an encrypted hex string suitable for display. The following code shows the EncryptToString extension method, which does just that. This method simply calls the Encrypt method to get a byte array and then calls the byte array's BytesToHexString method to get a string result:

   ' Encrypt a string into a string of byte values.
   <Extension()> _
   Public Function EncryptToString(ByVal plain_text As String, _
   ByVal password As String) As String
      Return plain_text.Encrypt(password).BytesToHexString()
   End Function

Similarly, by combining the HexStringToBytes and Decrypt methods, you can create a single useful extension method that decrypts a string holding a hexadecimal encryption. The DecryptFromString method shown below calls the encrypted string's HexStringToBytes method to get a byte array, calls the array's Decrypt method, and returns the result:

   ' Decrypt a string of byte values into a string.
   <Extension()> _
   Public Function DecryptFromString( _
   ByVal encrypted_bytes_string As String, _
   ByVal password As String) As String
      Return _
        encrypted_bytes_string.HexStringToBytes().Decrypt(password)
   End Function

Low-Level Encryption
Now you know the story at a high level. The Encrypt and EncryptToString extension methods encrypt a string. The Decrypt and DecryptFromString extension methods decrypt a byte array or string to recover the original message. But I haven't explained how the actual encryption works.

All these extension methods depend directly or indirectly on the CryptBytes helper function shown in Listing 1.

CryptBytes starts by making a triple DES crypto provider. This is an object that uses the DES encryption algorithm three times to encrypt or decrypt data. You can pick from among other crypto providers if you like.

Next the function tries to find a key size that is supported by the operating system. Different versions of the operating systems in different countries support different key sizes so the code must find one that works on this system. (If you're going to send an encrypted message to someone else, make sure they use the same key size, particularly if you're in different countries.) The code starts with a big value (1024) and works downwards until it finds one that works.

The crypto provider encrypts and decrypts data in blocks. The code records the provider's block size for later use.

Next the routine builds a "salt" array. (Don't blame me; "salt" is a cryptographic term that's been around for a while so we're stuck with it.) The salt array is an array of "random" bytes that you pick to make it harder for cyber-desperados to use a dictionary attack that guesses every possible password for your message. The salt guarantees that they'll also have to guess the salt.

Author's Note: You must use the same salt when you encrypt and decrypt or the decryption won't work. For increased security, use different salt values than the ones shown here.

The code then calls helper routine MakeKeyAndIV. I'll explain that one in a minute but for now just accept that this routine creates a key and initialization vector (IV) to initialize the crypto provider.

Next the code makes a cryptographic transformation object to either encrypt or decrypt the byte array. It makes a memory stream to hold the result and builds a CryptoStream object attached to the stream and the transformation object.

Finally the code is ready to perform the encryption or decryption. The routine writes the byte array into the CryptoStream object. That object transforms the data and writes the result into the output memory stream.

The code finishes by converting the memory stream into a new array of bytes and returning it.
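To tie those steps together, here is a hedged sketch of what a function like CryptBytes could look like. It is a reconstruction from the description above, not the author's Listing 1, which may differ in its details. The salt values, the module name CryptBytesSketch, the parameter name encrypting, and the use of TripleDES.Create() to obtain the provider are all assumptions; the key derivation is inlined from MakeKeyAndIV so the sketch stands alone.

```vb
Imports System
Imports System.IO
Imports System.Security.Cryptography

Module CryptBytesSketch
    ' Hypothetical stand-in for Listing 1; the author's version may differ.
    Public Function CryptBytes(ByVal password As String, _
    ByVal in_bytes() As Byte, ByVal encrypting As Boolean) As Byte()
        ' Make a triple DES provider.
        Dim des_provider As TripleDES = TripleDES.Create()

        ' Find the largest key size that this system supports.
        Dim key_size_bits As Integer = 0
        For i As Integer = 1024 To 1 Step -1
            If des_provider.ValidKeySize(i) Then
                key_size_bits = i
                Exit For
            End If
        Next i
        Dim block_size_bits As Integer = des_provider.BlockSize

        ' Example salt only (an assumption); pick your own values.
        Dim salt() As Byte = {&H10, &H20, &H30, &H40, &H50, &H60, &H70, &H80}

        ' Derive the key and IV from the password, as MakeKeyAndIV does.
        Dim derive_bytes As New Rfc2898DeriveBytes(password, salt, 1234)
        Dim key() As Byte = derive_bytes.GetBytes(key_size_bits \ 8)
        Dim iv() As Byte = derive_bytes.GetBytes(block_size_bits \ 8)

        ' Make the transformation object for the requested direction.
        Dim transform As ICryptoTransform
        If encrypting Then
            transform = des_provider.CreateEncryptor(key, iv)
        Else
            transform = des_provider.CreateDecryptor(key, iv)
        End If

        ' Push the input through a CryptoStream into a memory stream.
        Using out_stream As New MemoryStream()
            Using crypto_stream As New CryptoStream( _
            out_stream, transform, CryptoStreamMode.Write)
                crypto_stream.Write(in_bytes, 0, in_bytes.Length)
                crypto_stream.FlushFinalBlock()
            End Using
            Return out_stream.ToArray()
        End Using
    End Function

    Sub Main()
        Dim plain() As Byte = System.Text.Encoding.ASCII.GetBytes("Message")
        Dim cipher() As Byte = CryptBytes("SecretPassword", plain, True)
        Dim recovered() As Byte = CryptBytes("SecretPassword", cipher, False)
        Console.WriteLine(System.Text.Encoding.ASCII.GetString(recovered))   ' Message
    End Sub
End Module
```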

(Are you starting to see why I thought it would be nice to wrap this mess in a nice, simple extension method?)

The following code shows the final piece to the puzzle: the helper subroutine MakeKeyAndIV:

   ' Use the password to generate key bytes.
   Private Sub MakeKeyAndIV(ByVal password As String, _
   ByVal salt() As Byte, ByVal key_size_bits As Integer, _
   ByVal block_size_bits As Integer, ByRef key() As Byte, _
   ByRef iv() As Byte)
      Dim derive_bytes As New Rfc2898DeriveBytes( _
         password, salt, 1234)

      ' Sizes are given in bits; GetBytes wants a byte count.
      key = derive_bytes.GetBytes(key_size_bits \ 8)
      iv = derive_bytes.GetBytes(block_size_bits \ 8)
   End Sub

This subroutine makes an Rfc2898DeriveBytes object. This object uses the HMACSHA1 algorithm (don't worry about that!) to generate a series of pseudo-random bytes. There are other classes that you probably won't recognize either that you can use if you have some sort of grudge against the HMACSHA1 algorithm. The program calls the object's GetBytes function to grab some random bytes that it can use for the key and initialization vector.
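The bytes that come out of Rfc2898DeriveBytes are "random" only in appearance: the same password, salt, and iteration count always produce the same bytes, which is why encryption and decryption must share the same salt. The sketch below illustrates that under assumed values; the module name SaltDemo, the helper DeriveKey, and both salt arrays are made up for this example.

```vb
Imports System
Imports System.Security.Cryptography

Module SaltDemo
    ' Derive key_length bytes from a password and salt,
    ' using the same iteration count as MakeKeyAndIV.
    Public Function DeriveKey(ByVal password As String, _
    ByVal salt() As Byte, ByVal key_length As Integer) As Byte()
        Dim derive_bytes As New Rfc2898DeriveBytes(password, salt, 1234)
        Return derive_bytes.GetBytes(key_length)
    End Function

    Sub Main()
        ' Two salts that differ only in the last byte.
        Dim salt1() As Byte = {1, 2, 3, 4, 5, 6, 7, 8}
        Dim salt2() As Byte = {1, 2, 3, 4, 5, 6, 7, 9}

        Dim key_a As String = _
            Convert.ToBase64String(DeriveKey("SecretPassword", salt1, 24))
        Dim key_b As String = _
            Convert.ToBase64String(DeriveKey("SecretPassword", salt1, 24))
        Dim key_c As String = _
            Convert.ToBase64String(DeriveKey("SecretPassword", salt2, 24))

        Console.WriteLine(key_a = key_b)   ' True: same password and salt
        Console.WriteLine(key_a = key_c)   ' False: different salt
    End Sub
End Module
```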

Figure 1. Cryptic Code: Program StringExtensionsDemo encrypts and decrypts strings.

Testing Encryption Extension Methods
The CryptBytes and MakeKeyAndIV helper routines are swathed in cryptographic mysticism but the main extension methods Encrypt, Decrypt, EncryptToString, and DecryptFromString are relatively simple and easy to use. The StringExtensionsDemo project (see Figure 1) in the downloadable code demonstrates their use in addition to the simpler Left, Right, RemoveLeft, and RemoveRight methods.

When you modify the string in the topmost text box, the program uses the Left, Right, RemoveLeft, and RemoveRight methods to display eight characters from the string.

When you enter a message and a password and then click the Encrypt button, the following code executes:

   Private Sub btnEncrypt_Click(ByVal sender As System.Object, _
   ByVal e As System.EventArgs) Handles btnEncrypt.Click
      Dim txt As String = txtText.Text
      Dim password As String = txtPassword.Text

      ' Encrypt the string into bytes.
      m_EncryptedBytes = txt.Encrypt(password)

      ' Display a textual representation.
      Dim ascii_encoder As New System.Text.ASCIIEncoding()
      lblCiphertext1.Text = ascii_encoder.GetChars(m_EncryptedBytes)

      ' Encrypt and display as a hex byte string.
      lblCiphertext2.Text = txt.EncryptToString(password)

      btnDecrypt.Enabled = True
   End Sub

The code copies the message and password into string variables. It calls the message's Encrypt extension method to get an encrypted byte array and saves that in the form-level variable m_EncryptedBytes. The code then uses an ASCIIEncoding object to convert the array into a string, and displays the result, which looks like gibberish. Finally the code calls the message's EncryptToString method and displays the encrypted text as a hexadecimal string.

After you encrypt a string, if you click the Decrypt button, the following code executes:

   ' Decrypt the previously encrypted bytes using the current
   ' password.
   Private Sub btnDecrypt_Click(ByVal sender As System.Object, _
   ByVal e As System.EventArgs) Handles btnDecrypt.Click
      Dim password As String = txtPassword.Text

      ' Decrypt the previously saved byte array.
      lblPlaintext1.Text = m_EncryptedBytes.Decrypt(password)

      ' Decrypt the hex string representation.
      Dim hex_str As String = lblCiphertext2.Text
      lblPlaintext2.Text = hex_str.DecryptFromString(password)
   End Sub

This code calls the saved byte array's Decrypt extension method to recover the original message and displays the result. Then, to prove it can, the code retrieves the hexadecimal representation of the encrypted message, calls its DecryptFromString method, and displays the result.

Figure 2. Picky Passwords: After changing the leading "I" in the Password field to an "H" and pressing the Decrypt button, the Plaintext result shows that when the password used to decrypt a message does not match the password used to encrypt it, the result is complete gibberish.

The cryptographic routines are quite strong and produce a pretty secure result as long as your key size is big enough. An important consequence of this is that you must get the password exactly correct to decode a message. If the password is off by a single character, the CryptoStream decoder throws an error and the result is complete garbage.

Figure 2 shows the example program trying to decode a message with a password that is off by a single character. In fact, it's off by a single bit (the "H" at the beginning of the password in Figure 2 differs by only one bit from the "I" used to encrypt the message). The result is complete gibberish.

This is important because it means an attacker cannot easily learn about your password. It would be bad if a partially correct password got the attacker a partly correct message. The cryptographic routines ensure that attackers get nothing useful unless they guess the password exactly.

Extension methods can be a powerful tool. As long as you use them judiciously, extension methods can add new features to existing classes, even ones that you didn't write. This article showed how to add some simple text processing methods and some much more complex cryptographic methods to the String class.

So give extension methods a try and see what interesting ideas you can come up with. Then email your results to me. I'll post the methods that seem like they might be most useful to others.
