Making Sausages





Whenever you gather multiple travel veterans in the same location for long enough, sooner or later the discussion turns to travel horror stories. I'll admit that it's just August as I write this, and yet I've hit almost 60,000 miles for this year already. (Yes, some of those miles have been accrued by parking my rear on a plane headed for a vacation condo I recently purchased on the other side of the country, but that's a different story.) Recently, I spent the coldest summer week I've ever encountered in Minneapolis, which is, I've discovered, just as likely to be somewhere near the second ring of hell in August as it is to be cool, gray, and dreary, as it was during my visit. Of course, I had to actually get to Minneapolis, and although I normally travel on American Airlines, most everyone I know in Minneapolis lives and swears by (or at) Northwest Airlines. Because flying NWA saved me $200 and two hours each direction, I bit the bullet and flew the competition. You know how it is: you fly on some strange carrier, and everything's different. Maybe I'm spoiled by AA, but NWA manages to stuff what must be twenty percent more warm, sweaty bodies into their tubes than I'm used to, and the seat in front, when leaning back, hit my nose for the entire trip. Unable to read or sleep, I felt the three hours stretch out into an eternity of thoughtful regret, along with quiet (and not so quiet) suffering. My complaints to my cohorts on the ground fell on deaf ears, mostly because they're all in the highest echelon with Northwest, and haven't flown coach in recent memory.

At one point in the trip I had some free time with Marty Schaeferle, who handles content management for AppDev, the training company for whom I've been writing and training for years. Marty did a round-the-world training gig as a contractor for Microsoft a few years back, and after I finished whining about my three hours in the NWA compactor, he regaled me with some of his stories about flying around the world in steerage. He did have kind words for Hong Kong Air (not to be confused with Singapore Air, undoubtedly the most gracious and comfortable carrier either of us has ever flown). It seems that on his trip from Europe to India, somewhere in the middle of the 15 hours of coach-class torture the airline screened, for the benefit of all those who could stomach it, a documentary on making sausages. No kidding. I guffawed at the thought. Perhaps second only to showing movies of airplane disasters, watching various stages of sausage creation has got to be enough to cause even the least squeamish flyer to pray for release. I'm so sorry I missed it. And what could be more like making sausage in the programming world than creating hashed versions of data? (Yes, that wins the award for worst segue in the history of technical writing.) Two columns ago, I promised a second discussion of how hashing is used in the real world, and I skirted the topic last time. My editor, Erik Ruthruff, reminded me rather strongly during a huge Greek dinner during the trip to Minneapolis that it was time to revisit the topic.

In my first column on this topic, I discussed using simple hashing to store data in a hash table for easy retrieval later on. Imagine you want to store a user's password in a database, so you can later authenticate the user on your Web site. You could, of course, simply store the data in clear text. This isn't, for obvious reasons, very secure. A common technique is to store the password only in a protected, unreadable form, and hashing provides a good mechanism for this task. The basics of hashing, to rehash from the previous column (I couldn't resist), are simple. Rather than rehashing the text, I'll quote from the .NET Framework online documentation for the System.Security.Cryptography.HashAlgorithm class: "Hash functions are fundamental to modern cryptography. These functions map binary strings of an arbitrary length to small binary strings of a fixed length, known as hash values. A cryptographic hash function has the property that it is computationally infeasible to find two distinct inputs that hash to the same value. Hash functions are commonly used with digital signatures and for data integrity.

The hash is used as a unique value of fixed size representing a large amount of data. Hashes of two sets of data should match if the corresponding data also matches. Small changes to the data result in large unpredictable changes in the hash." A cryptographic hash is also one-way: it's computationally infeasible to turn the hashed value back into its original value, so neither you nor an intruder can retrieve an existing password from what you've stored. In other words, if you employ this technique and a user forgets the current password, you'll need to send that user a newly generated password so they can log in and then choose a new password of their own.
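
The properties in that quotation are easy to check for yourself. The following is a minimal sketch in Python, using the standard hashlib module rather than the .NET classes this column covers; the helper name sha1_hex and the sample strings are my own, chosen only for illustration.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 hash of a string as 40 hexadecimal characters."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


short_hash = sha1_hex("a")              # one character of input
long_hash = sha1_hex("a" * 100_000)     # 100,000 characters of input
original = sha1_hex("password")
changed = sha1_hex("passwore")          # a single character altered

# Fixed size: input length has no effect on the length of the hash.
print(len(short_hash), len(long_hash))

# Matching data produces matching hashes.
print(sha1_hex("password") == original)

# A small change to the data scrambles the hash: count the hex digits
# that differ, position by position.
differing = sum(1 for x, y in zip(original, changed) if x != y)
print(differing, "of", len(original), "hex digits differ")
```

Nothing in the output hints at how long the input was or how similar two inputs were, which is exactly what you want from a stored password.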

In addition, storing just the password, even hashed, isn't good enough. Identical passwords produce identical hashes, so if an interloper can get access to your database, and if many users share the same password (no matter how hard you try, users still use stupid words like "password" as their password), a determined hacker can spot the repeated values and mount a dictionary attack, hashing common words and combinations of letters and numbers until one matches a stored hash. To work around this potential security hole, most systems employ a second level of misdirection in the form of a random "salt" value hashed along with the actual password. To store the password, the system must:

  • Retrieve the clear text password from the user.
  • Generate a random "salt" value (in the example shown here, this value is eight bytes long).
  • Use a hashing algorithm to hash the salt and clear text password together, creating a single hashed value.
  • Store the hashed salt+password, along with the original salt value, in the database.
To authenticate a user, the system must:

  • Retrieve the proposed password from the user, in clear text.
  • Retrieve the existing salt and hashed salt+password value from the database.
  • Combine the existing salt value with the proposed password, and apply the same hashing algorithm as was originally used to generate a hashed value.
  • Compare the newly hashed salt+password value with the stored salt+password value. If they match, the user authenticates. If not, you have an incorrect password.
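
The two lists above can be sketched in a few lines. This is an illustration in Python's standard library, not the .NET code that accompanies the column: the function names, the in-memory dictionary standing in for a database, and the choice to place the salt in front of the password are all assumptions of mine. The eight-byte salt and SHA1 follow the text.

```python
import hashlib
import hmac
import os
from typing import Dict, Tuple

SALT_BYTES = 8  # the column's example uses an eight-byte salt


def hash_salted(salt: bytes, password: str) -> bytes:
    """Hash the salt and the clear text password together into one value."""
    return hashlib.sha1(salt + password.encode("utf-8")).digest()


def create_record(password: str) -> Tuple[bytes, bytes]:
    """Storing: generate a random salt, hash, and return (salt, hash)."""
    salt = os.urandom(SALT_BYTES)
    return salt, hash_salted(salt, password)


def authenticate(proposed: str, salt: bytes, stored_hash: bytes) -> bool:
    """Authenticating: re-hash the proposed password with the stored salt."""
    candidate = hash_salted(salt, proposed)
    # compare_digest is a constant-time comparison; a plain == would
    # implement the comparison step just as well for this sketch.
    return hmac.compare_digest(candidate, stored_hash)


# Hypothetical stand-in for the database table: user name -> (salt, hash).
users: Dict[str, Tuple[bytes, bytes]] = {}
users["alice"] = create_record("password")
users["bob"] = create_record("password")

print(authenticate("password", *users["alice"]))   # True
print(authenticate("Password", *users["alice"]))   # False

# Same password, different salts: the stored hashes no longer match,
# so the repeated-value pattern described earlier disappears.
print(users["alice"][1] != users["bob"][1])
```

Note that the salt is stored as is, next to the hash. It doesn't need to be secret; its job is to make each user's stored value different even when the passwords are the same.
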
Your job, then, is to create all this code. Luckily, many folks have already done it, including me, and I've provided along with this article all the code you'll need to get started. The .NET Framework provides several hashing classes, all of which inherit from the System.Security.Cryptography.HashAlgorithm class. Each of these classes applies a well-known hash algorithm to your data and creates a hashed value for you. You can select from MD5, SHA1, SHA256, SHA384, and SHA512. Each does the job, producing progressively longer hash values that are harder to attack, generally at some cost in speed. For the purposes of this demonstration, I'm using the SHA1 class, but you could use any of these to accomplish the goal.
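
To get a feel for how the five algorithms differ, you can compare the size of the hash each one produces. The sketch below uses Python's hashlib, where the same algorithms are available under lowercase names; the helper digest_bits and the sample input are assumptions for illustration and have nothing to do with the .NET class names.

```python
import hashlib

# The five algorithms named in the text, by their hashlib names.
ALGORITHMS = ["md5", "sha1", "sha256", "sha384", "sha512"]


def digest_bits(name: str, data: bytes = b"salt+password") -> int:
    """Return the size, in bits, of the hash the named algorithm produces."""
    return len(hashlib.new(name, data).digest()) * 8


for name in ALGORITHMS:
    print(name, digest_bits(name), "bits")
```

On a standard Python build this reports 128, 160, 256, 384, and 512 bits, in that order. Whichever algorithm you pick, size the database column for the stored hash to match.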
