
Make Data Islands Work in All Browsers


Robust support for Web standards is now the norm rather than the exception. If you develop applications for a purely Microsoft-centric environment then you probably haven’t had to care much about standards, but everywhere else, attention to and mastery of modern standards (CSS, JavaScript, and MIME types) has become a minimum requirement for any professional Web engineer. Clients increasingly realize the importance of standards, and are less and less willing to accept applications that don’t adhere to them. Amidst this new rigor, however, many non-standard features of Web browsers continue to be very useful. XML data islands (XML content embedded inside HTML content) are just one example. In this article you’ll see how to bring your skills up to standard without losing that useful feature.

About Data Bindings
The ideas of data sources and data bindings are as old as Methuselah. You start with some kind of static content, whether that’s a Web page, a Java window, or a 4GL application screen, and then you connect it to data stored somewhere else, in a process called binding. The aim is to complete or augment the original content with data that may change over time, but to do so in a general way that doesn’t rely on having that data in advance. Each time a user makes a request for the content, the content updates itself and displays the most recent set of data. ODBC, JDBC, XUL templates, and XBL are all examples of binding systems, as is Microsoft’s plainly-named data binding services. Binding differs from templated content-generation systems such as PHP, because there’s an assumption that incomplete content is delivered first and is then completed by data later. By comparison, PHP mashes together content and data and then delivers the final content all at once. In a bound system, the required data is usually remote to the original content and referred to by a descriptive address. That address is known as the data source.

XML data islands bring a new twist to this old story because the data accompanies the content as it’s delivered. In that case, the content and the data are in the same spot, and the only binding required is a trivial connection between the two. It’s a happy situation for simple cases, because all the binding logic can be specified with a few shared bits of information (in this case tags, or elements, and tag attributes inside an HTML page) without requiring any additional programming. Here’s an example of how an IE XML data island might work, if IE weren’t strange and different:

   Nigel McFarlane
   Somewhere in Cyberspace
   This article contributed by: Nobody

In this code, the data island starts and ends with the <xml> tag. Note that the example is a Tag Soup (non-standard) HTML document. The island is used twice: once in the HTML content, which uses the proprietary datasrc and datafld attributes to bind one tag’s content to the content of a tag inside the island, and once in the click handler, which treats the island’s contents as plain data, digging them out of the document’s DOM for display. The bound tag initially contains the content “Nobody,” but that’s replaced with “Nigel McFarlane” as soon as the page loads the datasource.

Internet Explorer performs the binding action using a chunk of Microsoft code that’s either part of IE or part of Windows, depending on your perspective. That’s clearly overkill for the simple case of XML data islands, because the content and the data source are both available within the same browser page. It’s therefore overly complex, not to mention non-portable. Instead, you can use a bit of JavaScript to achieve the same result, and that’s the solution you’ll see here. If you add the now-widely available XMLHttpRequest object you can also use JavaScript to bind remote data sources, but that’s another article.

Note that I said that the preceding code might work in IE. In fact, it won’t work. The first binding use is fine, where the data is bound to tags, but the second use, which accesses the data island using a script, is a bust. That’s because IE does some strange and non-standard things to the island. You’ll see how to get around that restriction.

Editor’s Note: The editorial staff of DevX would like to add our condolences to the voices of so many around the Web. Our longtime author and friend, Nigel McFarlane, passed away in June 2005. This article is one of two that Nigel wrote for us before his death; they are published with the permission of his family. Nigel McFarlane was the author of two books and a frequent contributor to Mozilla and the open source movement, as well as to DevX. (Lori Piquet)

Fitting in with Standards
To be standards compliant, old-fashioned “Tag Soup” HTML won’t do. Here are some constraints you must meet to include data islands in a standards-aware page:

  • Use a proper HTML standard DOCTYPE, such as HTML 4.01 Strict.
  • The HTML standard doesn’t prevent addition of strange new elements (tags) to an HTML document, so it’s perfectly OK to add an <xml> tag (or a similarly named tag) and its children; however, the standard does prevent you from adding those elements to the document’s DTD, so such elements have to remain anonymous. You can’t, for example, substitute just any tag name for <xml>, because some names already have HTML semantics.
  • Don’t (as is sometimes done with IE) embed an <?xml ... ?> processing instruction in the data island. In the first place, HTML isn’t XML, and in the second place, the client browser may ignore any such processing instruction, because it’s a user-agent hint only.
  • Don’t place the <xml> tag in the <head> section of the document, because it’s not about the document.
  • Don’t use IE’s innerHTML technique. Although it’s handy, it’s also non-standard.
  • Any standards-compliant Web browser will try to render the contents of the XML data island as plain text: In other words, you’ll see “Nigel McFarlane Somewhere in Cyberspace.” That’s not wanted, so use a CSS “display: none” style rule to prevent the XML content from rendering.
  • Finally, the solution relies on JavaScript and CSS support. If that’s not present, then you should use a fallback technique that avoids datasources in the first place. I’ll leave an explanation of graceful fallback for another day; but it’s not hard to add.

You want to do two things with data islands: display data-driven reports, and fill in data-driven forms.

Preparing The Test Data
Here’s a simple HTML page to experiment with. It contains one data source and two binding locations.

                            

Nigel McFarlane Somewhere in Cyberspace
A. Russell Jones DevX Editorial
J. Random Hacker Everywhere

Reviewers of this article

  • , Location:

In the document head there’s a single style that hides the island from the user and an included script that does all the work. The body of the document contains three important chunks of code. First, the data island itself, which contains three records, each with two child fields (name and location). The example wraps the island in an ordinary HTML tag instead of an <xml> tag because of IE’s poor DOM support: anything inside an <xml> tag, or inside any tag that’s not an HTML tag, isn’t reflected into the DOM by IE. That means you can’t access any such data using a script. Never fear, a humble HTML container tag (any of several would serve) will do instead.

The other two interesting chunks of code are at the bottom; there are two different tags holding a datasource attribute, and thus two places where the data island content will be folded into the main page content. The datafield attribute says which piece of data to add in. Both datasource and datafield are custom attributes, specified neither in HTML nor in the Microsoft data island spec. I’ve made them up, which is legal in strict HTML 4.01.

Even this small amount of markup exceeds the limitations of the Microsoft solution. Here we’re folding data into an unordered list and into form elements embedded in a table. Both of those examples are difficult or impossible to do with the Microsoft solution.

By itself, this HTML content does very little. You need to add a little script to manipulate the data.

Planning the Script
To implement data binding, you can use a DHTML technique: a script that manipulates the document content. The days of incomprehensible spaghetti-like DHTML are gone now; standards are firmly in place, and DHTML code is expected to be very clean. As a minimum, all the code should be neatly encapsulated in a JavaScript object where it’s kept tidily separate from other code. Here’s the skeleton of the data-binding object, which will be nearly the entire content of islands.js:

    var islands = {
      isMSIE : ...,
      getElementsByAttribute : function (node, att) { ... },
      getEBArecursive : function (list, node, att) { ... },
      getFieldDataFromRecord : function (rec, field) { ... },
      makeIEtree : function (island) { ... },
      merge : function (target, template, record) { ... },
      bind : function () { ... }
    };

This is a literal object (similar to a Perl hash) that consists of seven properties: a simple value, isMSIE, and six anonymously defined functions. Each function is assigned to a property, so each property acts as an object method. Anonymous functions specified like this are useful because they appear only in the object where they’re defined. That means this object can be included in any Web page without fear of clashing with someone else’s function names. Only the islands variable name needs to be unique.

To complete this object requires only replacing the ellipses with code. But before doing that, here’s an explanation of what each property does.

  • isMSIE is simple; it’s just a browser check flag.
  • getElementsByAttribute() scans a document for nodes that have a specific attribute. Because the DOM standards don’t offer such a function, one’s been made up.
  • getEBArecursive() is a recursive version of getElementsByAttribute() that reduces the amount of code required.
  • getFieldDataFromRecord() extracts a datafield item from the data island.
  • makeIEtree() fixes some broken behavior in IE; you’ll see how shortly.
  • merge() takes the page content, a special template, and the data island data, and creates the final content.
  • bind() is the first thing called; it’s what makes everything happen.

Now, you have to make the binding happen. You do that by adding an onload event handler to the end of islands.js:

   window.onload = function () { islands.bind(); };

There’s another JavaScript subtlety at work in this line: the way the special this property gets its value. It’s best not to use this simpler syntax:

   window.onload = islands.bind;

If you use the simple syntax, when the bind() method runs, the current object is the window object. (Writing islands.bind() with parentheses would be worse still: it runs the method immediately, before the page has loaded, and assigns its return value as the handler.) But calling the bind() method from its own object, inside an anonymous function, makes the islands object the current object. That way, it’s easy to refer to other methods in the islands object using the special this property, which saves you from having to pollute the object methods with references to the islands variable.
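The difference can be seen outside a browser as well. The following is a minimal sketch, assuming only standard JavaScript call semantics; the names demo, whoAmI, viaWrapper, and detached are made up for illustration and are not part of islands.js.

```javascript
// Hypothetical object standing in for islands; not part of islands.js.
var demo = {
  // Reports whether `this` is the demo object at call time.
  whoAmI: function () { return this === demo; }
};

// Called as a method from inside a wrapper function: `this` is demo.
var viaWrapper = function () { return demo.whoAmI(); };

// Copied out as a bare function reference: `this` now depends on the caller.
var detached = demo.whoAmI;

var wrapperSeesDemo = viaWrapper();   // the method still sees its own object
var detachedSeesDemo = detached();    // the link to demo has been lost
```

In a browser, a detached function installed as window.onload would see the window object as this, which is the situation the wrapper avoids.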

Planning the Object Methods
Here’s how binding is implemented. First, the unfortunate browser check is trivial, so let’s get that out of the way:

   isMSIE:(window.navigator.userAgent.search('MSIE')!=-1)

The bind() method does all the work. For planning purposes, here’s a list of the tasks the method must accomplish:

  1. Find all the datasource targets. For each one:
  2.   Extract the target’s existing content into a separate template
  3.   If it’s IE, repair the matching data island
  4.   Find each record in the matching data island, and for each one:
  5.     Merge the record and the template
  6.     Copy the merged template back into the page.

Some of the methods required are quite general; you can find such things scattered all over the Web in existing pages and in the blogs of leading Web thinkers. Here’s an implementation. First, the getElementsByAttribute() method, which is just a wrapper around the recursive version:

    getElementsByAttribute : function (node, att) {
      var rv = [];
      this.getEBArecursive(rv, node, att);
      return rv;
    },

Given any DOM node and an attribute name, it returns all matching DOM nodes underneath (“inside”) that node in an array. Here’s the part that does the work:

    getEBArecursive : function (list, node, att) {
      for (var i = node.childNodes.length - 1; i >= 0; i--)
      {
        var child = node.childNodes.item(i);
        if ( child.nodeType == 1 ) {
          if ( child.getAttribute(att) ) {
            list.push(child);
          }
          this.getEBArecursive(list, child, att);
        }
      }
    },

The recursive version passes the list of found nodes and the current node to itself. For each child of a node that’s an element node (a tag), it checks for the attribute and records the node if it’s found; either way, it then calls itself again on that child. When there’s nothing left to search, the for loop does nothing and the recursion ends with a fully built list.
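To watch the search run without a browser, here is a rough sketch that applies the same logic to a hand-built tree. The makeNode helper, the findByAttribute name, the tag names, and the island1 id are all assumptions made for this illustration; only the traversal itself mirrors the method above.

```javascript
// Tiny stand-in for DOM element nodes so the sketch runs without a browser.
// makeNode is a hypothetical helper, not a DOM API.
function makeNode(name, attrs, children) {
  var kids = children || [];
  return {
    nodeName: name,
    nodeType: 1,                                   // 1 = element node
    getAttribute: function (att) {
      return Object.prototype.hasOwnProperty.call(attrs, att) ? attrs[att] : null;
    },
    childNodes: {
      length: kids.length,
      item: function (i) { return kids[i]; }
    }
  };
}

// Same traversal as getEBArecursive, written as a free function.
function findByAttribute(node, att, list) {
  list = list || [];
  for (var i = node.childNodes.length - 1; i >= 0; i--) {
    var child = node.childNodes.item(i);
    if (child.nodeType === 1) {
      if (child.getAttribute(att)) { list.push(child); }
      findByAttribute(child, att, list);           // always descend
    }
  }
  return list;
}

// Illustrative page: a list and a table bound to the same (made-up) island id.
var tree = makeNode('body', {}, [
  makeNode('ul', { datasource: 'island1' }, [
    makeNode('li', {}, [ makeNode('span', { datafield: 'name' }, []) ])
  ]),
  makeNode('table', { datasource: 'island1' }, [])
]);

var sources = findByAttribute(tree, 'datasource');
var sourceNames = sources.map(function (n) { return n.nodeName; });
var fieldCount = findByAttribute(tree, 'datafield').length;
```

Because the loop walks the children from last to first, the table is found before the list here. The order doesn't matter for binding, since every target is processed independently.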

The other rather general function is getFieldDataFromRecord(). Given a node and a tag name, it digs into the node’s subtree and extracts the content of that named tag:

    getFieldDataFromRecord : function (rec, field) {
      var found;
      for (var i = 0; i < rec.childNodes.length; i++) {
        // nodeName is lower-cased here, so field must be lower case too
        if (rec.childNodes.item(i).nodeName.toLowerCase() == field) {
          found = rec.childNodes.item(i).firstChild;
          if (found == null)
            return "";
          else
            return found.nodeValue;
        }
      }
      return "";  // no matching field in this record
    },

This function assumes that the records are arranged in a three-level hierarchy: the record's tag, each data item's tag, and the data held between start and end tags of the data item.
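The three-level shape is perhaps easiest to see with a record built by hand. This is only a sketch: nodeList, textNode, elementNode, and fieldText are hypothetical helpers, the RECORD and EMAIL tag names are invented, and the upper-case names simply mimic the way HTML DOMs commonly report nodeName.

```javascript
// Hypothetical helpers that mimic just enough of the DOM for this sketch.
function nodeList(kids) {
  return { length: kids.length, item: function (i) { return kids[i]; } };
}
function textNode(value) {
  return { nodeName: '#text', nodeType: 3, nodeValue: value,
           firstChild: null, childNodes: nodeList([]) };
}
function elementNode(name, kids) {
  return { nodeName: name, nodeType: 1, nodeValue: null,
           firstChild: kids.length ? kids[0] : null,
           childNodes: nodeList(kids) };
}

// Level 1: the record. Level 2: one tag per field. Level 3: the text.
var record = elementNode('RECORD', [
  elementNode('NAME', [ textNode('Nigel McFarlane') ]),
  elementNode('LOCATION', [ textNode('Somewhere in Cyberspace') ]),
  elementNode('EMAIL', [])                 // invented empty field
]);

// Free-function version of the lookup described above.
function fieldText(rec, field) {
  for (var i = 0; i < rec.childNodes.length; i++) {
    var item = rec.childNodes.item(i);
    if (item.nodeName.toLowerCase() === field) {
      return item.firstChild === null ? '' : item.firstChild.nodeValue;
    }
  }
  return '';                               // assumption: missing field yields ''
}

var who = fieldText(record, 'name');
var where = fieldText(record, 'location');
var empty = fieldText(record, 'email');
```

Note that the lookup reads only the first child of the field tag, so a field holding nested markup rather than plain text would need different handling.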

With that preparation out of the way, here's the meaty part.

Implementing the Binding
I'm going to jump forward a little bit and show you the final binding logic. Here's the bind() method.

The numbered comments refer to the planning list earlier in this article.

    bind : function () {
      // 1. Find all the datasources in the page.
      var targets = this.getElementsByAttribute(document, 'datasource');
      if (!targets || targets.length == 0) return;

      // Do it for each data binding 'target' in the page
      for (var i = 0; i < targets.length; i++) {
        var iid = targets[i].getAttribute('datasource');
        var island = document.getElementById(iid);
        if (!island) return;

        // 2. Extract a copy of the current content for this target
        var template = targets[i].cloneNode(true);

        // ... and delete the real copy
        for (var j = targets[i].childNodes.length - 1; j >= 0; j--) {
          targets[i].removeChild(targets[i].childNodes.item(j));
        }

        // 3. If it's IE, repair the matching data island
        if ( this.isMSIE ) { island = this.makeIEtree(island); }

        // 4. Apply the template once for each XML record
        for (j = 0; j < island.childNodes.length; j++)
        {
          // children that are text nodes aren't real records
          var record = island.childNodes.item(j);
          if ( record.nodeName == '#text' )
            continue;

          // 5, 6. Combine page, template and data
          this.merge(targets[i], template, record);
        }

        // let go of the temporary tree built for IE
        // (delete has no effect on a var, so clear the reference instead)
        if ( this.isMSIE ) { island = null; }
      }
    }

There's an outside loop for each data source, and an inside loop for each record. When the code finds a datasource tag, you want to replace its content. That entails extracting its content from the page, working on it, and putting it back. The final content combines the DOM subtrees pointed to by the template and islands. Because there might be more than one record per data source, you might have to put it back several times.

The one wrinkle in this is (from a standards point of view) IE's vile behavior. When extracting the data island, IE returns a node, but you can't walk the subtree underneath that node. Instead, IE gives back a flat list of start tags, text and end tags. The makeIEtree() function takes this mess and creates a proper DOM tree out of it. You can see it created and then destroyed on either side of steps 4-6. Here's the code:

    makeIEtree : function (island) {
      var subtree = document.createElement(island.nodeName);
      var current = subtree;
      var next;

      for (var j = 0; j < island.childNodes.length; j++)
      {
        var record = island.childNodes.item(j);
        if ( record.nodeName == '#text' )
        {
          current.appendChild(
            document.createTextNode(record.nodeValue));
        }
        else if ( record.nodeName.charAt(0) == '/' )
        {
          current = current.parentNode;
        }
        else
        {
          next = document.createElement(record.nodeName);
          current.appendChild(next);
          current = next;
        }
      }
      return subtree;
    },

This method has a simple loop that runs through the mess that IE returns. The subtree variable is the top of the DOM tree being built; current points to the sub-part of the subtree that's being created; and next is used temporarily to create nodes. Each time the code identifies a new tag, it creates a node and current steps down into it. Each time it identifies a closing tag, current steps back up one node. Otherwise, it just adds whatever is found to the set of child nodes for the current node.
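The same cursor technique can be tried on plain objects. In this sketch the flat array imitates the kind of list the article describes (start tags, text, and end tags whose names begin with a slash); the tag names, the ISLAND root name, and the buildTree function are assumptions for illustration, and a parent property stands in for the DOM's parentNode.

```javascript
// Flat token list: start tags, text, and end tags whose names begin with '/'.
// The tag names are illustrative only.
var flat = [
  { nodeName: 'RECORD' },
  { nodeName: 'NAME' },
  { nodeName: '#text', nodeValue: 'Nigel McFarlane' },
  { nodeName: '/NAME' },
  { nodeName: 'LOCATION' },
  { nodeName: '#text', nodeValue: 'Somewhere in Cyberspace' },
  { nodeName: '/LOCATION' },
  { nodeName: '/RECORD' }
];

// Rebuild a nested structure using a "current node" cursor, as makeIEtree does.
function buildTree(rootName, tokens) {
  var root = { nodeName: rootName, parent: null, children: [] };
  var current = root;
  for (var i = 0; i < tokens.length; i++) {
    var tok = tokens[i];
    if (tok.nodeName === '#text') {
      // text belongs to whichever tag is currently open
      current.children.push({ nodeName: '#text', nodeValue: tok.nodeValue });
    } else if (tok.nodeName.charAt(0) === '/') {
      current = current.parent;            // closing tag: step back up
    } else {
      var next = { nodeName: tok.nodeName, parent: current, children: [] };
      current.children.push(next);
      current = next;                      // opening tag: step down
    }
  }
  return root;
}

var island = buildTree('ISLAND', flat);
var rec = island.children[0];
var firstField = rec.children[0];
```

The sketch, like the method it imitates, trusts the token list to be balanced; an unmatched closing tag at the top level would leave the cursor with nowhere to go.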

With that obstacle out of the way, all that's left to do is process the found information each time. That's the province of the merge() method:

    merge : function (target, template, record) {
      // dig out the fields requiring update
      var fields = this.getElementsByAttribute(template, 'datafield');

      // nothing to fill in ('next' is not a JavaScript statement)
      if (!fields || fields.length == 0) return;

      // update text for target fields in the template
      for (var k = fields.length - 1; k >= 0; k--) {
        var thetag   = fields[k];
        var thefield = thetag.getAttribute('datafield');
        var newtext  = this.getFieldDataFromRecord(record, thefield);

        if (thetag.firstChild)  // replace existing text
        {
          thetag.firstChild.nodeValue = newtext;
        }
        else if ( thetag.value == null ) // not form tag
        {
          thetag.appendChild(document.createTextNode(newtext));
        }
        else                           // a form element
        {
          thetag.value = newtext;
        }
      }

      // put the updated content back into the page.
      for (k = 0; k < template.childNodes.length; k++) {
        target.appendChild(
          template.childNodes.item(k).cloneNode(true));
      }
    },

First, the merge method finds all the datafield-laden tags in the template and puts them into an array. Then, for each one, it finds any matching content in the supplied record. There are three cases to consider when putting that matched content into the template (the original page content): the content tag already contains text (replace), it doesn't (add), or it's a form element, in which case the data should go into the value property. Finally, it copies everything in the updated template back into the page itself. All done: all you have to do to make it run is load the page. You can download the sample code and try it in your preferred browser.
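The three cases can be exercised in isolation. What follows is a sketch on plain objects rather than real DOM nodes; fillTag and the three sample objects are hypothetical, and the order of the tests follows the branch order used in merge().

```javascript
// Fill one template tag with text, following the three cases described above.
// Plain objects stand in for DOM nodes; fillTag is a hypothetical name.
function fillTag(tag, newtext) {
  if (tag.firstChild) {
    tag.firstChild.nodeValue = newtext;              // case 1: replace text
    return 'replaced';
  } else if (tag.value == null) {
    // case 2: no text yet and no value property, so add a text child
    tag.firstChild = { nodeName: '#text', nodeValue: newtext };
    return 'added';
  } else {
    tag.value = newtext;                             // case 3: form element
    return 'value';
  }
}

var tagWithText = { firstChild: { nodeName: '#text', nodeValue: 'Nobody' } };
var emptyTag = { firstChild: null };
var formField = { firstChild: null, value: '' };

var r1 = fillTag(tagWithText, 'Nigel McFarlane');
var r2 = fillTag(emptyTag, 'Nigel McFarlane');
var r3 = fillTag(formField, 'Nigel McFarlane');
```

Because the first-child test comes first, a form element that has children of its own would be treated as case 1 rather than case 3, which is worth keeping in mind when choosing which tags carry a datafield attribute.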

Notice that this more professional DHTML code is free of many of the older worries: there are no event handlers embedded in the page; there's a reduced emphasis on browser-specific tests (only do that when there's no choice); and the effect that's being achieved is serious and meaningful, not a gimmick or a distraction. With the explosive influence of AJAX and other DHTML-related techniques lately, expect to see more of this kind of thing in practitioners' work and in client demands.

Data islands are easy to implement cross-browser using a little generic code. This article shows how to display data-island driven content, including integrating that content into forms, tables and non-tabular markup like lists. With a little more effort you could extend this into a sophisticated data management system, and indeed, a number of experiments have been performed in that area. With the demise of ancient browsers such as IE 4.0 and Netscape 4.x, and the rise of Mozilla and Firefox, DHTML is a more powerful and generally applicable technique than it ever was before. Unlock your IE-specific applications with modern DHTML.

