Reading a DTD
Even if you don't plan to build a DTD from scratch, it is helpful to know how to read one and to understand the document it is describing.
From reading a DTD you should be able to compile a list of elements and their attributes, and learn how and when to use them. You should also be able to compile a list of entities that you can use within the document.
Some people find it helpful to actually sketch out a document tree as they go through the DTD, to visualize the structure of the document.
Here's a list of things to look for as you go through a DTD:
Read the Comments
Read the comments! Comments can tell you a lot about the DTD, how to use it, and what to be aware of when using it.
Most DTD authors will include information that you should know before using the DTD. This might range from use restrictions to how-to information.
Comments look like this:
<!-- Here's a comment -->
Note the Basic Elements
Look through the DTD and identify the element names that make up the document. Note how they are capitalized, because XML names are case-sensitive. You might want to develop a reference sheet of elements that you can make notes on as you work your way through the DTD.
Element declarations begin like this:
<!ELEMENT
The text immediately after the <!ELEMENT keyword is the element's name.
Read the Element Declaration
Each element declaration provides the name of the element and the content that it contains. Sometimes the content is text. Other times it is other elements, arranged in a certain order or used a certain number of times.
The following declarations say that EMPLOYEE contains FIRST, MI, and LAST, in that order, and that each of those three elements contains text (#PCDATA):
<!ELEMENT EMPLOYEE (FIRST, MI, LAST)>
<!ELEMENT FIRST (#PCDATA)>
<!ELEMENT MI (#PCDATA)>
<!ELEMENT LAST (#PCDATA)>
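To see what these declarations allow, it can help to write out a small document fragment that follows them. The sketch below is one such fragment; the name values are made-up sample data, not part of the DTD.

```xml
<!-- EMPLOYEE must contain FIRST, MI, and LAST, in exactly this order -->
<EMPLOYEE>
  <!-- Each child holds plain text, as (#PCDATA) requires -->
  <FIRST>Maria</FIRST>
  <MI>L</MI>
  <LAST>Ortiz</LAST>
</EMPLOYEE>
```

Swapping the order of the children, or leaving one of them out, would make the fragment invalid against these declarations.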
Look for Parent/Child Relationships
The element rules build a hierarchy of elements, describing how one element is related to another. An element that is contained within another is considered a child of the element in which it is contained. Use these relationships to sketch out your document tree.
The parent/child relationship is defined in the content portion of the element declaration. If the content consists of other elements, then those elements are children of the element whose declaration you are reading. For example, FIRST, MI, and LAST are children of EMPLOYEE:
<!ELEMENT EMPLOYEE (FIRST, MI, LAST)>
The DTD can require that the child elements be used in a certain order, or that they be used once, not at all, or many times. It can also group elements to create more detailed rules.
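These ordering, repetition, and grouping rules are written with a handful of symbols in the content model. The sketch below shows them together in one declaration. CONTACT and all of its child elements are hypothetical names chosen for illustration; they do not come from the DTD discussed above.

```xml
<!-- Hypothetical declarations, for illustration only.
     ,   sequence: children must appear in this order
     ?   optional: zero or one time
     +   one or more times
     *   zero or more times
     |   choice: exactly one of the alternatives
     ( ) grouping
-->
<!ELEMENT CONTACT  (NAME, NICKNAME?, PHONE+, (EMAIL | ADDRESS), NOTE*)>
<!ELEMENT NAME     (#PCDATA)>
<!ELEMENT NICKNAME (#PCDATA)>
<!ELEMENT PHONE    (#PCDATA)>
<!ELEMENT EMAIL    (#PCDATA)>
<!ELEMENT ADDRESS  (#PCDATA)>
<!ELEMENT NOTE     (#PCDATA)>
```

Under these rules, a CONTACT with one NAME, no NICKNAME, two PHONE elements, one EMAIL, and no NOTE would be valid. A CONTACT with no PHONE, or with both an EMAIL and an ADDRESS, would not.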
Read Attribute Lists
After element definitions, you may see attribute lists. An attribute list begins like this:
<!ATTLIST
Each attribute list defines the attributes for an element. Many attributes may be defined in one ATTLIST.
The ATTLIST is structured like this:
<!ATTLIST element-name attribute-name attribute-type default-data>
See Which Element the Attribute List Applies To
Right after the ATTLIST keyword is the name of an element. This is the element whose attributes the list defines. For example, this ATTLIST defines attributes for the COMMENT element:
<!ATTLIST COMMENT attribute-name attribute-type default-data>
Find Attribute Names for Each Element
Following the element name is the name of the first attribute declared in this list. This is the attribute name you type into the element tag in the XML file. For example, this ATTLIST defines the attribute "category" for the element COMMENT:
<!ATTLIST COMMENT category attribute-type default-data>
Add the attribute information to the element reference list you are building.
Determine Attribute Value Types
Attributes can be one of several different types. The attribute-type describes the type of value that the attribute may contain. For example, this ATTLIST says that the "category" attribute for the element COMMENT contains one of four values: red, green, blue, or other.
<!ATTLIST COMMENT category (red | green | blue | other) default-data>
See the Attribute's Default
The final part of the ATTLIST is the default value of the attribute. The default has a strong effect on how the attribute is used and on what value it has if you leave it out of the XML tag. The DTD can make the attribute required (#REQUIRED) or optional (#IMPLIED), or it can provide a default value that is used automatically when the attribute is omitted.
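The three kinds of default can be compared side by side in a single attribute list. The sketch below is a small self-contained document with an internal DTD. Only the "category" attribute and its four values come from the text above; the "author" and "status" attributes, the choice of #REQUIRED for category, and the comment text are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE COMMENT [
  <!ELEMENT COMMENT (#PCDATA)>
  <!-- One ATTLIST may declare several attributes for the same element.
       category: must be given in every COMMENT tag (#REQUIRED)
       author:   may be left out, with no value supplied (#IMPLIED)
       status:   treated as "open" whenever it is left out -->
  <!ATTLIST COMMENT
    category (red | green | blue | other) #REQUIRED
    author   CDATA                        #IMPLIED
    status   (open | closed)              "open">
]>
<!-- Valid: category is present, author is omitted, status defaults to "open" -->
<COMMENT category="blue">Verify the figures in section 3.</COMMENT>
```

Leaving out category, or giving it a value outside the four listed (for example category="purple"), would make the document invalid.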
Read Entity Declarations
Along with element and attribute definitions, you may also see entity definitions. Typically, these will appear in a group, often at the beginning of the DTD, and usually with explanatory comments.
An entity definition begins like this:
<!ENTITY
After the <!ENTITY keyword come the entity's name and its contents. The contents may be literal text or a pointer to an external file. For example, the following defines two entities, one called "copyright" and one called "trademark." The copyright text is given within the declaration itself, while trademark points to another file.
<!ENTITY copyright "Copyright 2000, As The World Spins Corp. All rights reserved. Please do not copy or use without authorization. For authorization contact firstname.lastname@example.org.">
<!ENTITY trademark SYSTEM "http://www.worldspins.com/legal/trademark.xml">
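Once an entity is declared, you use it in the document by writing its name between an ampersand and a semicolon. The sketch below shows this for the copyright entity. The FOOTER element is a hypothetical name introduced only to give the reference somewhere to live, and the entity text is abbreviated from the declaration above.

```xml
<?xml version="1.0"?>
<!DOCTYPE FOOTER [
  <!-- FOOTER is a hypothetical element used only for this illustration -->
  <!ELEMENT FOOTER (#PCDATA)>
  <!-- Entity text shortened from the declaration shown above -->
  <!ENTITY copyright "Copyright 2000, As The World Spins Corp. All rights reserved.">
]>
<!-- The parser replaces &copyright; with the entity's text -->
<FOOTER>&copyright;</FOOTER>
```

An external entity such as trademark is referenced the same way (&trademark;). In that case the parser reads the replacement content from the file named in the declaration.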