Where Did XML Come From?
XML is a simplified version of SGML and a cousin of HTML. It was developed by members of the W3C and released as a recommendation by the W3C in February 1998.
SGML, the parent of XML, is an international standard that has been in use as a markup language primarily for technical documentation and government applications since the early 1980s. It was developed to standardize the production process for large document sets. Think: Medical records. Company databases. Aircraft parts catalogs. Other really huge documents.
Marking up documents in SGML allows information to be passed from one system to the next without loss. With databases marked up in SGML, you can see what Widget A is all about and check whether Widget A is in stock.
Early on, people thought that SGML would be useful for the Web. In fact, HTML is really a very basic application of SGML! But HTML quickly became used for visual layout, so a group of people returned to the basics, determined to create something that had the strengths of SGML without being so difficult to implement, and the ease of use of HTML with more structural power. The result was XML.
The design goals of XML, taken from the XML Specification, are:
- XML shall be straightforwardly usable over the Internet.
- XML shall support a wide variety of applications.
- XML shall be compatible with SGML.
- It shall be easy to write programs which process XML documents.
- The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
- XML documents should be human-legible and reasonably clear.
- The XML design should be prepared quickly.
- The design of XML shall be formal and concise.
- XML documents shall be easy to create.
- Terseness in XML markup is of minimal importance.
In other words, XML is easy to create, easy to read, and designed for use over the Internet. What more could a Web designer ask for?
What Does XML Look Like?
If you've ever used HTML, XML is going to look very familiar!
When you view the source of a document written in XML, the first thing you'll see is the XML declaration, which in its simplest form looks like this:
<?xml version="1.0"?>
Then, in the body of the document, you'll see a lot of tags. The tags look familiar at first: they start with the usual less-than sign (<) and end with the usual greater-than sign (>).
But then you'll notice that the tags might not be quite the names you've come to expect! You'll see tags that seem to have made-up names. Tags like <dogchow> and <badcars> and <species>. In fact, if you view the source of an XML document, you'll see tags surrounding lots of words, maybe every word in the document. These tags define exactly what the content is. And the creator of the document had the power to create his or her own specific set of tags.
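To make the idea of made-up tags concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree parser. The small document, including the <pets> wrapper element and the text inside each tag, is invented for illustration. The point is only that a parser accepts whatever tag names the author chooses, as long as the tags are properly opened and closed.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names and their contents are
# invented for illustration and do not come from a real vocabulary.
DOCUMENT = """<?xml version="1.0"?>
<pets>
  <species>dog</species>
  <dogchow>kibble</dogchow>
</pets>"""

# Parsing succeeds because the document is well-formed, even though
# no parser has ever "heard of" these tag names.
root = ET.fromstring(DOCUMENT)

# Each tag name describes the content it surrounds.
described = {child.tag: child.text for child in root}

print(root.tag)
print(described)
```

The parser needs no advance knowledge of <species> or <dogchow>. It only checks that every opening tag has a matching closing tag and that the tags nest properly.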
Suppose you're looking at a Web page marked up in XML on The Canterbury Tales by Chaucer. You're looking specifically at lines 282-286 of "The Physician's Tale." The document source for that section might look like this:
The Physician's Tale
That no man woot therof but God and he.
For be he lewed man, or ellis lered,
He noot how soone that he shal been afered.
Therfore I rede yow this conseil take —
Forsaketh synne, er synne yow forsake.
The tags simply define that:
1) This document is the Canterbury Tales.
2) This section is the Physician's Tale.
3) Each line of the Physician's Tale is defined.
4) Each line ends, and the Physician's Tale and The Canterbury Tales end.
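With the tags written out, the structure in the list above could be sketched as follows. The element names (canterburytales, physicianstale, title, line) and the n attribute are assumptions chosen for illustration, since an XML author is free to pick any names. The Python wrapper simply confirms that the markup is well-formed and nests the way the list describes.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element names and the "n" attribute are
# assumptions for illustration; only the verse text is from the tale.
SOURCE = """<?xml version="1.0"?>
<canterburytales>
  <physicianstale>
    <title>The Physician's Tale</title>
    <line n="282">That no man woot therof but God and he.</line>
    <line n="283">For be he lewed man, or ellis lered,</line>
    <line n="284">He noot how soone that he shal been afered.</line>
    <line n="285">Therfore I rede yow this conseil take —</line>
    <line n="286">Forsaketh synne, er synne yow forsake.</line>
  </physicianstale>
</canterburytales>"""

root = ET.fromstring(SOURCE)

# 1) The outermost element is the whole work.
# 2) Inside it sits the section for this tale.
tale = root.find("physicianstale")

# 3) Each line of the tale is its own element.
lines = tale.findall("line")
line_numbers = [line.get("n") for line in lines]

print(root.tag)
print(tale.find("title").text)
print(line_numbers)
```

Point 4 in the list corresponds to the closing tags: each </line>, then </physicianstale>, then </canterburytales>.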
If the entire document were marked up like this, you could easily jump to a certain line or section. The entire document is annotated for easy reference and searching, and instead of viewing the entire document, users could request only specific sections of a document simply by calling the specific tags they want. Oh, and we don't recommend that you manually type out each line in the Canterbury Tales. Get a computer to count the lines for you.
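Both ideas, letting a computer count the lines and requesting only a specific line, can be sketched in a few lines of Python. This is one possible approach, not a prescribed one. The helper names mark_up and fetch_line, the element names, and the n attribute are hypothetical, and only the first three verses of the passage are used to keep the example short.

```python
import xml.etree.ElementTree as ET

# Plain text of the first three verses, one string per line.
VERSES = [
    "That no man woot therof but God and he.",
    "For be he lewed man, or ellis lered,",
    "He noot how soone that he shal been afered.",
]
FIRST_LINE = 282  # number of the first verse in the passage


def mark_up(verses, first_line):
    """Build the element tree, numbering each line automatically."""
    root = ET.Element("canterburytales")
    tale = ET.SubElement(root, "physicianstale")
    for number, verse in enumerate(verses, start=first_line):
        line = ET.SubElement(tale, "line", n=str(number))
        line.text = verse
    return root


def fetch_line(root, number):
    """Return the text of one numbered line, or None if it is absent."""
    match = root.find(f".//line[@n='{number}']")
    return None if match is None else match.text


root = mark_up(VERSES, FIRST_LINE)
print(fetch_line(root, 283))
```

Here enumerate does the counting, so no line number is typed by hand, and the lookup uses the limited XPath syntax that ElementTree supports to pick out a single line by its number.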