XHTML: HTML Merges With XML

The W3C's recently approved XHTML standard combines HTML and XML and makes it possible for your Web pages to be viewed on a wider variety of devices.


Web developers awake! XHTML (Extensible HyperText Markup Language) is coming to a server near you. It'll change everything you ever knew about Web design, give you untold power on the client and the server, and solve one of the great nagging problems: how to create a Web site without spending billions of dollars on versions for Internet Explorer, Mozilla, AOL, Palm Pilot, your telephone...well, you get the idea. On January 26th, the World Wide Web Consortium (W3C) released the first upgrade of the HTML 4.0 standards in more than a year. Surprisingly, this upgrade wasn't intended to add a few more tags or incorporate a couple of CSS extensions into the language. Instead, with the XHTML 1.0 standard (located at www.w3.org/TR/xhtml1), the language ceased being HTML in the old sense and became an application of XML (see the sidebar, "History of HTML").

An XHTML document, in the main, doesn't look radically different from a "normal" HTML document (see Listing 1). The root of such a document is still an <html> node, the document is divided into <head> and <body> sections, and the tag usage is consistent with what has been produced in HTML editors or by hand for the last decade.



However, you will notice some differences. The first follows from the fact that this is an XML document. It begins with the XML declaration <?xml version="1.0" encoding="UTF-8"?>, which tells the parser both that this is an XML document and that it uses UTF-8, a character encoding in which the plain ASCII text of most typical English documents takes 8 bits per character.
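To make the "this is an XML document" point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree parser. The page content is illustrative only and is not the article's Listing 1. It assumes nothing beyond a stock Python 3 install, and the parser checks well-formedness only; it does not validate against the DTD.

```python
import xml.etree.ElementTree as ET

# A minimal XHTML 1.0 Strict page (illustrative content, an assumption of this sketch).
XHTML_PAGE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Hello, XHTML</title></head>
  <body><p>First line<br />second line</p></body>
</html>"""

# The same paragraph written HTML-style: <br> is never closed,
# which is acceptable in HTML 4.0 but not in XML.
HTML_STYLE = b"<p>First line<br>second line</p>"


def is_well_formed(markup: bytes) -> bool:
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(XHTML_PAGE)
print(root.tag)                     # root element name, qualified by its namespace
print(is_well_formed(XHTML_PAGE))   # the XHTML page parses as XML
print(is_well_formed(HTML_STYLE))   # the unclosed <br> is rejected
```

The practical consequence is that any generic XML tool can read an XHTML page, while HTML habits such as unclosed tags make the page unreadable to those same tools.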

The document's DOCTYPE declaration is likewise a little different from the norm; it points to the XHTML DTD rather than the HTML 4.0 DTD. One of the big controversies surrounding the XHTML specification was a dispute between two factions in the W3C. One group wanted to define only one DTD for the specification, arguing that it would help keep the language simple. The other group felt that there should be three distinct DTDs for three different types of XHTML:

  • Strict. The core HTML within the document follows clearly delineated constraints, and any non-HTML markup added to it must be placed under a separate namespace.
  • Transitional. While the HTML contained in the document has to be XML conformant, the requirements about which elements can be contained where are much less strict; you don't need a namespace to declare specific non-HTML-based tags. This is primarily a way to start moving other tag-based formatting standards, such as ColdFusion or ASP, into the domain of XML. As its name implies, it is generally considered a transitional state, and should be used principally for older HTML documents being converted into XHTML.
  • Frameset. Frames are for the most part independent of the content that they contain. Because they're essentially meta-structures, the W3C decided to pull frames out of the base XHTML format and create a distinct DTD for them.

The XHTML recommendation approved in January 2000 took this latter approach, with three distinct DTDs that you can specify. In practice, unless you work heavily with frames, you will probably only need to worry about the strict DTD.
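As a quick reference for the three DTDs, the sketch below maps each flavor to the public and system identifiers published with the XHTML 1.0 recommendation and builds the matching DOCTYPE line. The names XHTML1_DTDS and doctype_for are hypothetical helpers made up for this illustration.

```python
# Public and system identifiers for the three XHTML 1.0 DTDs.
# XHTML1_DTDS and doctype_for are hypothetical names used only in this sketch.
XHTML1_DTDS = {
    "strict": (
        "-//W3C//DTD XHTML 1.0 Strict//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
    ),
    "transitional": (
        "-//W3C//DTD XHTML 1.0 Transitional//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd",
    ),
    "frameset": (
        "-//W3C//DTD XHTML 1.0 Frameset//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd",
    ),
}


def doctype_for(flavor: str) -> str:
    """Build the DOCTYPE declaration for 'strict', 'transitional', or 'frameset'."""
    public_id, system_id = XHTML1_DTDS[flavor.lower()]
    return f'<!DOCTYPE html PUBLIC "{public_id}" "{system_id}">'


for flavor in XHTML1_DTDS:
    print(doctype_for(flavor))
```

Whichever flavor you pick, the declaration goes after the XML declaration and before the <html> root element.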

Namespaces have become fairly common in XML circles lately, but if you're working in a strictly HTML environment, chances are you've not encountered them before. Namespaces serve a simple purpose: they identify a set of tags as belonging to one particular object description. It's entirely possible that two XML structures will be used together (in XHTML, it's almost certain), so you need some way of distinguishing between a <title> that describes the title of an HTML document in the title bar and a <title> that indicates someone's job title.

A namespace associates a specific prefix, a short name or even a single letter, with a URI (Uniform Resource Identifier) that identifies the namespace uniquely. The URI is not required to point to anything (indeed, most of the common ones don't); it only has to distinguish the namespace from the other namespaces in the document. For example, this declaration identifies the default namespace (xmlns="http://www.w3.org...") for the document, which specifies that unprefixed tags will use the XHTML standard for display:

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:emp="http://www.yourCompany.com/employee">

The default namespace is one where the tags don't require a prefix to identify them. The declaration then defines a second namespace (xmlns:emp="http://www.yourCompany.com/.."), which indicates that any element that begins with the prefix "emp:" should be considered part of the employee namespace for your company. Thus, you may have an XML structure much like Listing 2, where an XHTML document contains an embedded XML island.
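The two kinds of <title> can be told apart mechanically once each is bound to its namespace. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the two namespace URIs come from the declaration above, while the employee element names (emp:employee, emp:name, emp:title) and the page content are assumptions invented for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
EMP_NS = "http://www.yourCompany.com/employee"

# An XHTML page with an embedded employee "XML island".
# The emp:* element names are hypothetical, chosen for this sketch.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:emp="http://www.yourCompany.com/employee">
  <head><title>Staff Directory</title></head>
  <body>
    <emp:employee>
      <emp:name>Jane Doe</emp:name>
      <emp:title>Senior Editor</emp:title>
    </emp:employee>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# Prefixes used in the queries; "h" is a local alias for the default namespace.
ns = {"h": XHTML_NS, "emp": EMP_NS}

page_title = root.find("h:head/h:title", ns)   # the document's title
job_title = root.find(".//emp:title", ns)      # the employee's job title

print(page_title.text)
print(job_title.text)
print(job_title.tag)   # the parser stores the name qualified by the URI, not the prefix
```

Note that the prefix itself carries no meaning: the parser keys every element on its namespace URI, so the query alias "h" works even though the document never uses that prefix.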

This ability to separate namespaces is an important aspect of XHTML. To appreciate its significance, it helps to stop thinking of HTML as a markup language and to think of it instead as the definition of a document object that is in turn made up of paragraph objects, list objects, header objects, form objects, and so forth. The XHTML namespace describes a collection of document objects. A different namespace describes a different object model: a different view of reality that's focused on objects such as employees and addresses. When you combine two such namespaces, you define relationships between the two object collections. For example, this section of the document focuses on employees, that HTML table is linked to this other site of financial information, and so forth. This benefits both sophisticated server-side code for displaying such information and modular output that contains subsets of HTML for different platforms.


