advertisement
Premier Club Log In/Registration
  Include Code  Search Tips
TODAY'S HEADLINES  |   ARTICLE ARCHIVE  |   SKILLBUILDING  |   TIP BANK  |   SOURCEBANK  |   FORUMS  |   NEWSLETTERS
Browse DevX
collectionsnewp2.zip
Partners & Affiliates
advertisement
advertisement
advertisement
Rate this item | 0 users have rated this item.
 

Back to Basics: Manage Collections with a Custom Dictionary

Dictionaries are an essential data structure in many of today's programming environments. Use this one in your next Brew application. 


advertisement
n Part 1 of this series, I showed you an elementary vector implementation for Brew as a stand-alone component you can use in your applications. This time, I give you a dictionary with a similar interface you can use in your Brew applications. You can download both the vector and hash here.


Introducing the Dictionary
The dictionary, also known as the associative array or hash, is a data structure that provides an interface for storing key-value associations. Typically, the name is a string, and the value may be anything, although in languages like C, you're hard-pressed to store multiple data types in a single dictionary unless you use unions with a type identifier so you know the type of a specific entry.

Most dictionaries are based on a vector and a hash function that is responsible for helping determine where each item should be inserted in the vector by operating on the item's key. For performance, my implementation does not reuse the vector implementation I presented previously, because the storage characteristics of the dictionary (more on this later) are different than your garden-variety vector. This implementation uses a collection of nodes each containing the name and item being stored, along with its hash for fast comparison:


typedef struct Node
{
  uint32 hash;
  char *szKey;
  void *pValue;
} Node;
Choosing a good hash function is tricky, and there's a lot of fact as well as folklore on the Internet. Theoretically, the problem is simple: you want a function that distributes the keys across the entire range of the collection's indexes in relatively few operations. In practice, this is quite difficult, and much has been written on the topic. I began with a naïve implementation that worked for testing, and after some research, selected the Jenkins one-at-a-time hash. Of course, even good hash functions occasionally return the same value for two different inputs (that is, the inputs collide). When that happens, you need a tie-breaker to decide which item should be stored where. For simplicity, I use linear probing (walking to the next empty slot) to find the next available spot; the alternative, keeping items with the same hash in the same location chained via a linked list, just seemed more work than it was worth to implement.

Author's Note: If you're interested in learning more about selecting hash functions, I encourage you to read the resources cited in Related Resources.

I should remark that Brew provides a similar component, the IRamCache component, based on the IRecordStore implementation. However, it uses a singly-linked list for its implementation, meaning that the search time for different elements can be grossly different. Moreover, it copies keys and values, making it impossible to store entries such as pointers to other Brew interfaces.

  Next Page: The Dictionary Interface


Page 1: IntroductionPage 4: Accessing an Item
Page 2: The Dictionary InterfacePage 5: Deleting an Item
Page 3: Inserting Items in the Dictionary 
Please rate this item (5=best)
 1  2  3  4  5
advertisement
Advertising Info  |   Member Services  |   Permissions  |   Contact Us  |   Help  |   Feedback  |   Site Map  |   Network Map  |   About

internet.commediabistro.comJusttechjobs.comGraphics.com

Search:

WebMediaBrands Corporate Info

Legal Notices, Licensing, Reprints, Permissions, Privacy Policy.
Advertise | Newsletters | Shopping | E-mail Offers | Freelance Jobs