
SQL Simplicity for Java Value Mapping : Page 2

When you have a cascading map in Java, which index goes first and how do you scan the whole collection? Use Java maps that have two sets of keys.


Introducing Two Keys
Often, a plain mapping from values of one type to values of another does not match how a domain is actually understood. Suppose you have Hunters who from time to time bring home Mammoths, and you want to record who brought one, when, and how big it was. If you were using SQL, the solution would be obvious. You would use something like this:

CREATE TABLE Log (Hunter VARCHAR(255), Time TIMESTAMP, Weight NUMERIC);

In Java, however, this is hard to express. Even if you have a class Hunter, how do you link these three data items together? Solution one would be to create a special class:



class Event {
  Hunter hunter;
  Timestamp time;
  double mammothWeight; // won't fit in float
}

The disadvantage of this is that it does not help you to find facts about a certain hunter or a certain date. Even if you have a Collection<Event>, you will have to scan it all.
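To make the cost concrete, here is a minimal sketch of what "scan it all" looks like. The records Hunter and Event are stand-ins for the article's classes (records assume Java 16 or later), and EventScan and eventsOf are hypothetical names invented for this illustration.

```java
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-ins for the article's types (records need Java 16+).
record Hunter(String name) {}
record Event(Hunter hunter, Timestamp time, double mammothWeight) {}

class EventScan {
    // Without an index, every query walks the whole collection: O(n) per lookup.
    static List<Event> eventsOf(Collection<Event> log, Hunter who) {
        List<Event> found = new ArrayList<>();
        for (Event e : log) {
            if (e.hunter().equals(who)) {
                found.add(e);
            }
        }
        return found;
    }
}
```

A query by date would need a second, nearly identical loop, which is what pushes people toward indexes.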

A natural, database-like solution would be to index such events. But what would be the key? J2EE suggests having a special class for keys, like this:

class EventKey { Hunter hunter; Timestamp time; }

Such a key, while useful for retrieving Entity Beans, does not make any practical sense. There is no such thing as "hunter-timestamp"—hunters are hunters and time is time. Moreover, this kind of "key" does not help you trace the history of successes (or failures) for any given hunter, nor the history of the tribe's feasts and troubles. This means that you need to introduce separate indexes for hunters and for time. Depending on the problem you think you are solving, you can have one index or two:

Map<Hunter, Map<Timestamp, Event>> hunterIndex;
Map<Timestamp, Map<Hunter, Event>> timeIndex;

Now imagine that in addition to a Collection<Event> you have to maintain two maps. Every time you add an event to the collection, you have to look up hunterIndex and check whether the entry exists. If it doesn't, you create one with an empty map and then insert a new fact into that map. The same is true with deletion; only now you also have to ask a colleague whether it would be wise to remove empty secondary maps or if keeping them there is okay. Or maybe you know the answer, but your colleague who does your code review knows a different answer. Et cetera, et cetera, et cetera. I don't know about you, but I create such cascading maps several times a year.
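The bookkeeping described above might look roughly like the following sketch. It assumes Java 8 or later for computeIfAbsent and computeIfPresent (plus records from Java 16 for brevity); EventLog is a hypothetical wrapper class, and removing empty secondary maps is just one of the possible policies, picked here for illustration.

```java
import java.sql.Timestamp;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the article's types (records need Java 16+).
record Hunter(String name) {}
record Event(Hunter hunter, Timestamp time, double mammothWeight) {}

class EventLog {
    final Map<Hunter, Map<Timestamp, Event>> hunterIndex = new HashMap<>();
    final Map<Timestamp, Map<Hunter, Event>> timeIndex = new HashMap<>();

    void add(Event e) {
        // computeIfAbsent creates the secondary map the first time a key is seen.
        hunterIndex.computeIfAbsent(e.hunter(), k -> new HashMap<>()).put(e.time(), e);
        timeIndex.computeIfAbsent(e.time(), k -> new HashMap<>()).put(e.hunter(), e);
    }

    void remove(Hunter hunter, Timestamp time) {
        // Assumed policy: drop a secondary map once it becomes empty.
        // Returning null from the remapping function removes the outer entry.
        hunterIndex.computeIfPresent(hunter, (k, byTime) -> {
            byTime.remove(time);
            return byTime.isEmpty() ? null : byTime;
        });
        timeIndex.computeIfPresent(time, (k, byHunter) -> {
            byHunter.remove(hunter);
            return byHunter.isEmpty() ? null : byHunter;
        });
    }
}
```

Even in this compact form, every operation has to touch both maps, and the two must never drift apart.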

In practice, when people have such cascading maps, they rarely bother to keep a separate Collection<Event> because it seems to be a waste of time and space—except maybe when the collection is passed down from above or they have to recount the size of the collection. In that case, practical programmers employ one of two very different strategies:

  1. When requested, scan through hunterIndex, adding up the sizes of secondary maps.
  2. Keep a separate, "cached" counter and update it on each addition or deletion. In this case, the programmer must take care of threads and exceptions, and must picture the application running for months without ever recounting its mammoths.
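The two counting strategies can be sketched side by side. CountingIndex and its method names are hypothetical, the sketch assumes values are never null, and it handles threads in the crudest way available, by synchronizing every method.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cascading map that supports both counting strategies.
class CountingIndex<K1, K2, V> {
    private final Map<K1, Map<K2, V>> index = new HashMap<>();
    private int counter; // strategy 2: the "cached" size

    synchronized void put(K1 k1, K2 k2, V value) {
        V previous = index.computeIfAbsent(k1, k -> new HashMap<>()).put(k2, value);
        if (previous == null) {
            counter++; // count only genuinely new entries, not overwrites
        }
    }

    synchronized void remove(K1 k1, K2 k2) {
        Map<K2, V> inner = index.get(k1);
        if (inner != null && inner.remove(k2) != null) {
            counter--; // decrement only if something was actually removed
        }
    }

    // Strategy 1: recount on demand by summing the secondary map sizes.
    synchronized int scannedSize() {
        int total = 0;
        for (Map<K2, V> inner : index.values()) {
            total += inner.size();
        }
        return total;
    }

    // Strategy 2: return the cached counter without scanning.
    synchronized int cachedSize() {
        return counter;
    }
}
```

The cached counter stays correct only as long as every mutation goes through put and remove; any code that reaches the inner maps directly would silently break it.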

As I see it, all this happens because Java programmers are used to thinking in terms of existing classes, and they just pick up whatever they find in java.util or java.lang. Python programmers do not even encounter such problems, and JavaScript programmers do not have a choice: their only option is an associative array with strings as keys.


