Featured Discussion: Designing Mixed-element Schema

Recently, “Paul” wrote to the xml.general group asking:

I have the following XML element:

   Hello John, my name    is Paul. I would    like to tell you something.

The element can contain any number of two different child elements, in any order, but I cannot seem to determine how to construct the element within the schema file. I currently have in the schema something like this:

                  

The xs:all element allows for any ordering, but allows at most one of each child element. I need to have an unbounded number of both child elements in any order.

Diagnosis
There are two problems here. First, the two child elements appear in any order, and second, they appear an unforeseeable number of times within the parent element. Fortunately, XML schemas aren’t limited to sequences or set lists; you can create mixed content elements just as easily.

When you’re working with regular structured data, such as data extracted from a relational database table, it’s relatively easy to see how to write an XML schema that describes the data, because:

  • The database defines the data type for each column.
  • Any row holds one and only one data value for each column.

For example, here’s a simple customer record with three columns:

CustomerID   CustomerLName   CustomerFName
25           Foo             John

A schema definition for this might look like:
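
The original listing is not reproduced here. A minimal sketch of such a schema, assuming a root element named Customer and child elements named after the three columns (the element names and the xs:int and xs:string types are assumptions, not taken from the original listing), might be:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Element names and types below are assumed for illustration -->
  <xs:element name="Customer">
    <xs:complexType>
      <!-- xs:sequence: children must appear exactly in this order -->
      <xs:sequence>
        <xs:element name="CustomerID" type="xs:int"/>
        <xs:element name="CustomerLName" type="xs:string"/>
        <xs:element name="CustomerFName" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```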

                                                

The xs:sequence element defines a sequence of sub-elements that must appear in the defined order. The preceding schema would validate the following XML file:

   10   Doe   John
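
With the markup restored under the assumption that the elements are named Customer, CustomerID, CustomerLName, and CustomerFName (hypothetical names based on the column headings), the document above would read roughly as follows:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Element names are assumed; only the values 10, Doe, John come from the text -->
<Customer>
  <CustomerID>10</CustomerID>
  <CustomerLName>Doe</CustomerLName>
  <CustomerFName>John</CustomerFName>
</Customer>
```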

Working It Through
Sometimes, you need to ensure that a set of elements appears, but you don’t care about the order. For example, does it really matter if the CustomerID element always appears first in a document? If you’re simply scanning through an XML file, perhaps reading all elements into a Customer class, you would key off the element name, creating a new Customer instance each time you encounter a customer element, setting its ID whenever you encounter a CustomerID tag, and so on. As long as the tags contain the correct data, it makes little difference whether you set the Customer object’s ID property before setting its LastName property; any sequence of ID, LastName, and FirstName tags works equally well.

In that case, you can use the xs:all element, which lets a list of sub-elements appear in a document in any order.
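
As a sketch only, here is how an order-independent version of a customer schema could look. The element names (Customer, CustomerID, CustomerLName, CustomerFName) and the types are assumptions made for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Element names and types below are assumed for illustration -->
  <xs:element name="Customer">
    <xs:complexType>
      <!-- xs:all: each child appears exactly once, in any order -->
      <xs:all>
        <xs:element name="CustomerID" type="xs:int"/>
        <xs:element name="CustomerLName" type="xs:string"/>
        <xs:element name="CustomerFName" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The only structural change from an ordered schema is the use of xs:all as the content model instead of xs:sequence.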

                 

Now, you can validate either the customer XML document shown earlier or a customer XML document with a different sub-element sequence, for example:

   Doe   John   10
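
With the markup restored under the same kind of assumption (hypothetical element names Customer, CustomerLName, CustomerFName, and CustomerID), the reordered document above would read roughly as follows:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Element names are assumed; the ID now comes last instead of first -->
<Customer>
  <CustomerLName>Doe</CustomerLName>
  <CustomerFName>John</CustomerFName>
  <CustomerID>10</CustomerID>
</Customer>
```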

However, when the data in XML documents is less structured, as in Paul’s question, the schema structure is less clear. One answer received from Anthony Jones states that you can solve the problem of the unknown number of child elements by adding a maxOccurs attribute with a value of “unbounded.” Unbounded means an element may appear any number of times.

But that still doesn’t solve the problem; as Paul says, the xs:all element allows for a set of unsequenced elements, but only one of each element may occur in the set. “John” provides a more complete schema that also uses the maxOccurs="unbounded" attribute.

                   

But that’s not a complete solution, because the xs:all element doesn’t allow maxOccurs to be unbounded; it restricts the value of both the minOccurs and maxOccurs attributes to either 0 or 1.

A Working Solution
Rather than xs:all, use xs:choice, with the minOccurs attribute set to 0 and the maxOccurs attribute set to unbounded. Here’s the fixed schema.
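
The element names from the original question are not available here, so the following is a minimal sketch with hypothetical names (message, name, and emph). It also assumes the parent element is declared with mixed="true", since the sample in the question has text between its child elements:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "message", "name", and "emph" are hypothetical element names -->
  <xs:element name="message">
    <!-- mixed="true" permits character data between the child elements -->
    <xs:complexType mixed="true">
      <!-- The choice repeats zero or more times; each repetition
           matches one name element or one emph element -->
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="name" type="xs:string"/>
        <xs:element name="emph" type="xs:string"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A document such as this one, again using the hypothetical names, would validate against that sketch because the two child elements may repeat and interleave freely:

```xml
<message>Hello <name>John</name>, my name is <name>Paul</name>. I would like to tell you <emph>something</emph>.</message>
```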

                                        

The preceding schema lets the two sub-elements occur zero or more times, in any order. The solution is not intuitive because the xs:choice element implies that the validator will choose between alternatives, but because the choice itself may repeat, the validator makes a fresh choice for each child element it encounters, letting the document validate. There have been several calls for alterations to the xs:all element in XML Schema 1.1, adding support for the maxOccurs="unbounded" attribute value, but that version wasn’t available at the time of this discussion.

Schemas such as this one are critical when you need to validate unstructured data, where it’s impossible to know the order or number of elements in advance.

Go to the xml.general group now to participate or ask your own question.
