

XML 1.0 Superset Makes XML Concise : Page 2

Because XML was not designed for data, it has serious ambiguities and constraints. These limitations are hard for many to understand because most articles never address them. ConciseXML, a superset of XML 1.0, aims to solve not only these limitations but also the verbosity of XML.





Document Versus Data
XML is commonly used for representing data structures. A data structure is simply a way to represent data that obeys some well-defined structure. The Water language, using ConciseXML, can formally describe the structure of data by using Water Type and Water Contract. Using Water, you also can unambiguously represent static data.

Representing static data might seem straightforward, but XML 1.0 has design constraints carried over from the document markup world that can make representing data in XML quite confusing. The quandary between elements and attributes is a common example of this confusion.

Most programming languages and other technologies for representing data employ the concept of a data structure or object. This article, by convention, uses the term object. The word object is similar to other terms such as a record, structure, or tuple from other technologies.

In most programming languages, an object has fields, and those fields hold values that are also objects. Water objects have this property as well: An object is a collection of fields; each field has a key and a value; and the value can be any object.
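As a rough illustration of that object model outside of Water, the following Python sketch treats an object as a type name plus a collection of keyed fields. The class name Obj and the order object are hypothetical, chosen only for this example; they are not part of Water or ConciseXML.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Obj:
    """Hypothetical stand-in for an object: a type name plus keyed fields."""
    type_name: str
    fields: Dict[str, Any] = field(default_factory=dict)


# Three fields; the value of "size" is a number, not a string.
item = Obj("item", {"id": "xx283", "color": "blue", "size": 10})

# A field value can be any object, including another Obj (hypothetical "order").
order = Obj("order", {"first_item": item, "quantity": 2})

print(order.fields["first_item"].fields["color"])
```

The point of the sketch is only that every field has exactly one key and one value, and that nothing restricts the value to a string.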

The following ConciseXML is an example of an item object:

<item id="XL283" color="blue" size=10/>

The preceding ConciseXML could be described as creating an instance of an item object. The instance has three fields: id, color, and size. The value of the id field is the string "XL283", the value of the color field is "blue", and the value of the size field is the number 10.

The type or class of the object appears as the element's name, immediately following the opening angle bracket (<). The fields of the object are represented as key-value pairs within the element's start tag. Syntactically, an opening angle bracket is the start of an XML element, but semantically it performs a call. The call invokes either a method or the constructor of an object. Fields of an object have a clear and unambiguous key and value:

<item id="xx283" color="blue" size=10/>

In the preceding line, the instance of item has three fields. "id" is the key of the first field, and "xx283" is the value of the field. "color" is the key of the second field and "blue" is its value. "size" is the key of the third field and the integer 10 is its value.
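To see how this reading maps onto a plain XML 1.0 parser, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It assumes nothing about Water itself. Because XML 1.0 requires every attribute value to be quoted, the unquoted size=10 is rejected by the parser, and in the quoted variant the value comes back as the string "10" rather than a number. The helper name parses_as_xml10 is made up for this example.

```python
import xml.etree.ElementTree as ET

# ConciseXML form from the article: size=10 has no quotes.
concise = '<item id="xx283" color="blue" size=10/>'
# XML 1.0 form: every attribute value must be quoted.
strict = '<item id="xx283" color="blue" size="10"/>'


def parses_as_xml10(text: str) -> bool:
    """Return True if a plain XML 1.0 parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


elem = ET.fromstring(strict)
type_name = elem.tag          # the element name is read as the object's type
fields = dict(elem.attrib)    # each attribute is read as one key-value field

print(parses_as_xml10(concise))
print(type_name, fields)
```

Reading attributes as fields needs no guesswork: each attribute has exactly one key and one value, and it belongs to the element whose start tag contains it.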

It is very common, though, to see the following XML to represent the instance of item above:

<item>
  <id>xx283</id>
  <color>blue</color>
  <size>10</size>
</item>

To the vast majority of people, the above XML is normal and easily understood, but this is an example of XML in the flat-world model. The round-world model sees this as an ambiguous, poorly constructed XML data object. One problem (which is described in detail later in this article) is that the syntax of an XML element is used to represent two very different things: an object and a field of an object. Using one syntax for two different concepts is ambiguous, and that ambiguity becomes a serious problem when a machine tries to interpret the meaning of the XML data.

For a data structure to be useful, the distinction between objects and fields is extremely important. How, for example, do you know that <color>blue</color> represents a field of item and not an instance of type color? As humans, we use our gift of pattern recognition to deduce that color must be a field of item because it occurs within the content of item and it has blue in the content of the element.
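The ambiguity can be made concrete with a short Python sketch built on the standard xml.etree.ElementTree module. The two helper functions are hypothetical; they stand for two equally plausible conventions a program might adopt. The parser accepts both readings, and nothing in the XML itself says which one the author intended.

```python
import xml.etree.ElementTree as ET

doc = "<item> <id>xx283</id> <color>blue</color> <size>10</size> </item>"
item = ET.fromstring(doc)


def children_as_fields(elem):
    """Reading 1: each child element is a field (key = tag, value = text)."""
    return {child.tag: child.text for child in elem}


def children_as_objects(elem):
    """Reading 2: each child element is an object of the type named by its tag."""
    return [(child.tag, child.text) for child in elem]


as_fields = children_as_fields(item)
as_objects = children_as_objects(item)

print(as_fields)
print(as_objects)
```

Under the first reading, color is a key on item. Under the second, color is a separate object whose type is color and whose content is blue. Choosing between them requires knowledge that lives outside the document.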

To emphasize the ambiguity, what if you wrapped the item within another color element? Is item now a field of color? Did the meaning of item radically change because it moved to a different level in the structure? Consider the following example:

<color>
  <item>
    <id>xx283</id>
    <color>blue</color>
    <size>10</size>
  </item>
</color>

If a serious ambiguity appears in such a small example, imagine the scope of the problem when objects and data structures get more complex. At a minimum, a data structure needs to be unambiguous, and interpreting it must not depend on any outside knowledge.
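A small Python sketch, again using the standard xml.etree.ElementTree module, shows what a machine does with the wrapped example when it applies a single rule consistently. The rule here (a child element is a field of its parent) and the function name to_fields are assumptions made for illustration, not something XML 1.0 defines.

```python
import xml.etree.ElementTree as ET

wrapped = (
    "<color> <item> <id>xx283</id> <color>blue</color> "
    "<size>10</size> </item> </color>"
)


def to_fields(elem):
    """Apply one rule everywhere: a child element is a field of its parent."""
    children = list(elem)
    if not children:
        return elem.text  # leaf element: plain string content
    return {child.tag: to_fields(child) for child in children}


root = ET.fromstring(wrapped)
result = {root.tag: to_fields(root)}

print(result)
```

Under this rule, item becomes a field of the outer color, while the inner color is still a field of item. The same tag name plays two roles, and only the nesting level tells them apart.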

Water's use of XML makes a clear separation between objects and fields. An XML element represents an object. XML attributes represent fields of an object. The ConciseXML syntax allows any type of object as the value of an attribute; therefore, Water supports fields that can store any type of object—not just strings.
