Get a Jump on XML Development: Page 2

Taking advantage of XML's power requires a mind shift. Get used to thinking differently about markup languages.

XML Constructs
Every valid XML document has two parts: the DTD (document type definition) and the document element. The DTD lists the tags, and their associated attributes, that may be used in the document. It also specifies which tags may be nested within other tags, which tags and attributes are required, and which are optional, as well as what entities, such as graphics and non-ASCII characters, may be used in documents created using the vocabulary. The document element is a single markup tag that contains the document's entire content and other markup. Document structures are declared in the DTD and are used in the document element to describe content.
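As a minimal sketch of the two parts, the document below carries its DTD inside the DOCTYPE declaration (an internal subset) and follows it with the document element. The NOTE element and its text are hypothetical, chosen only for illustration; a DTD can also be kept in a separate file and referenced from the DOCTYPE declaration.

```xml
<?xml version="1.0"?>
<!-- Part 1: the DTD, written here as an internal subset -->
<!DOCTYPE NOTE [
  <!ELEMENT NOTE (#PCDATA)>
]>
<!-- Part 2: the document element, which wraps all content -->
<NOTE>Renew the site license in March.</NOTE>
```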

To create XML documents, you need to be familiar with several basic constructs: elements, attributes, and entities. XML also allows comments, so you can document your code. Below are the basic structures and syntax you'll need.

Elements are the labels used to describe your content. They're described in the DTD by element declarations and invoked in the document element as tags. Element declarations by default define tag pairs, like the heading Level 1 tag pair (<H1>...</H1>) used in HTML. Tag pairs contain text as well as other elements and their content. An element declaration may also define an empty element, one that isn't designed to contain any text or other elements, such as the image tag (<IMG>) in HTML.

By way of example, imagine that you're creating a simple document to describe the various software packages your organization owns. The purpose of this document is to catalog and keep track of the software you have so you can upgrade to newer versions or don't accidentally buy duplicate copies of a package. A simple DTD would need elements for the software title, version, vendor, platform and operating system requirements, a brief description of the package, and the number of copies you own. Listing 1 shows the element declarations in the document's DTD.
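As a rough sketch of what such element declarations could look like (a reconstruction from the description in this article, not necessarily the original listing), assuming uppercase element names to match the markup used here:

```xml
<!-- The package element must contain these seven elements, once each, in this order -->
<!ELEMENT PACKAGE (TITLE, VERSION, VENDOR, PLATFORM, OS, DESCRIPTION, COPIES)>

<!-- Elements that may contain only regular text -->
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT VERSION (#PCDATA)>
<!ELEMENT VENDOR (#PCDATA)>
<!ELEMENT DESCRIPTION (#PCDATA)>
<!ELEMENT COPIES (#PCDATA)>

<!-- Empty elements: no text and no nested elements -->
<!ELEMENT PLATFORM EMPTY>
<!ELEMENT OS EMPTY>
```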

These element declarations accomplish several things at once. They describe the eight tags that can be used within the document element to describe a software package. The individual element declarations also specify what content can be included within each element. The declaration for the package element indicates that one instance each of the title, version, vendor, platform, OS, description, and copies tags must be nested, in that order, within the package element. Both the platform and OS elements are specified as empty tags, and all of the other tags may contain only regular text, as defined by the (#PCDATA) statement after the tag name in each declaration. Here's the markup created from these element declarations and used to describe a software package:

<PACKAGE>
  <TITLE>Norton Utilities</TITLE>
  <VERSION>3.5</VERSION>
  <VENDOR>Symantec</VENDOR>
  <PLATFORM />
  <OS />
  <DESCRIPTION>A hard disk utility program</DESCRIPTION>
  <COPIES>1</COPIES>
</PACKAGE>

The package element contains all of the other tags, as specified in its element declaration. Also, the empty platform and OS elements have a slash before the greater-than sign that closes the tag. This is XML's syntax for specifying empty elements. Since these two elements are empty, there has to be another way to provide information about the platform and operating system the package supports. That other way is with attributes.

Attributes provide extra information about an element. Specific attributes are defined for individual elements on a case-by-case basis. XML attributes work just like HTML attributes, so this will be familiar territory for web builders. Attributes are defined in a DTD by an attribute list declaration. In the software description document, attributes provide the platform and operating system information. The attribute list for these two elements might look like this:
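A hedged sketch of the two attribute list declarations, based on the values named in this article. It assumes the default-value form: in XML 1.0 an attribute declaration takes either a default value or the #REQUIRED keyword, not both, so to make an attribute required you would replace the quoted default with #REQUIRED.

```xml
<!-- TYPE on PLATFORM: one of two allowed values, defaulting to PPC -->
<!ATTLIST PLATFORM TYPE (PPC | Mac) "PPC">

<!-- TYPE on OS: one of five allowed values, defaulting to WIN95 -->
<!ATTLIST OS TYPE (Mac7x | Mac8 | WIN95 | WINNT | WIN98) "WIN95">
```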


The attribute list declaration for the platform element indicates that the TYPE attribute can have a value of either PPC or Mac and that the default value is PPC. The attribute list declaration for the OS element indicates that the TYPE attribute can have a value of Mac7x, Mac8, WIN95, WINNT, or WIN98 and that the default is WIN95. Note that an XML attribute declaration supplies either a default value or the #REQUIRED keyword, not both: an attribute with a default may be omitted from the tag, whereas a required attribute must always be written out. You add the two attributes to their elements like this:
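For illustration, the two empty elements with their TYPE attributes filled in might read as follows. The particular values are an assumption, picked from the allowed lists described above.

```xml
<PLATFORM TYPE="Mac" />
<OS TYPE="Mac8" />
```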


Notice that both attribute values are in quotation marks. One of the rules of XML is that all attribute values, regardless of type, must be enclosed in quotation marks. XML provides for other types of attribute values, including text strings and unique identifiers, and not every attribute must be labeled as required.

An entity is a storage unit that can hold strings or blocks of text (a text entity) or non-XML data like graphics, audio files, and video files (binary entities). Every entity used in a document must first be defined with an entity declaration, which assigns the entity a name that is then used to reference it in the document. Entities are actually one of the most powerful DTD and content management structures available in XML, but they're also a more advanced topic than I have room to address in this article. Any of the XML resources listed will have complete information on creating and implementing XML entities.
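As a brief taste only, here is a minimal sketch of a text entity. The entity name vendor-name is hypothetical; the declaration belongs in the DTD, and the reference (an ampersand, the name, and a semicolon) goes in the document element.

```xml
<!-- In the DTD: declare a text entity and give it a name -->
<!ENTITY vendor-name "Symantec">

<!-- In the document element: the reference is replaced by the entity's text -->
<VENDOR>&vendor-name;</VENDOR>
```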

XML employs the same comment syntax as HTML. Any text or markup located between <!-- and --> is invisible to the application processing the document but is visible to any person working on the document. Use comments to leave notes to yourself or others, or to temporarily disable sections of markup and content, as you would in HTML.
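A short sketch of a comment used as a note in the software description document; the note text is made up for illustration. Keep in mind that a pair of hyphens may not appear inside the body of an XML comment.

```xml
<!-- Confirm the copy count at the next software audit -->
<COPIES>1</COPIES>
```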
