Taking XML Validation to the Next Level: Explore CAM's Expressive Power : Page 4

The generic-sounding Content Assembly Mechanism, or CAM, is an exciting step beyond XML Schema, but it's new and not well documented. This article series represents CAM: The Missing Manual. This last installment is a deep exploration of CAM's ability to express exactly what you need for data-centric documents.



Cardinality

Cardinality lets you have multiple occurrences of an element, be it chapters in a book, books in a catalog, or items in an order. XML Schema uses the minOccurs and maxOccurs attributes to express these in a straightforward manner. CAM uses a different model that may seem more complicated at first glance, but after you've worked with it for a while you should find it just as simple to use. CAM uses these predicates:

  • makeOptional: Indicates an element or attribute may or may not be present.
  • makeMandatory: (The converse of makeOptional). Indicates an element or attribute must be present.
  • makeRepeatable: Indicates an element may occur multiple times.
  • setLimit: Limits the number of times an element may repeat.
  • setRequired: Requires a minimum number of occurrences of an element.
Table 3 lists the principal cardinality combinations, showing how to express the same concept in both XML Schema and CAM.

Table 3. Cardinality Alternatives: This table shows how cardinality constraints map between XML Schema and CAM. The defaults (minOccurs="1" and maxOccurs="1" in XML Schema; makeMandatory() in CAM) may be safely omitted if desired.
Cardinality   XML Schema                            CAM
0 or 1        minOccurs="0" maxOccurs="1"           action="makeOptional()"
exactly 1     minOccurs="1" maxOccurs="1"           action="makeMandatory()"
                                                    action="setLimit(1)"
0 or more     minOccurs="0" maxOccurs="unbounded"   action="makeOptional()"
                                                    action="makeRepeatable()"
1 or more     minOccurs="1" maxOccurs="unbounded"   action="makeMandatory()"
                                                    action="makeRepeatable()"
0 to n        minOccurs="0" maxOccurs="n"           action="makeOptional()"
                                                    action="makeRepeatable()"
                                                    action="setLimit(n)"
1 to n        minOccurs="1" maxOccurs="n"           action="makeMandatory()"
                                                    action="makeRepeatable()"
                                                    action="setLimit(n)"
n or more     minOccurs="n" maxOccurs="unbounded"   action="makeRepeatable()"
                                                    action="setRequired(n)"
m to n        minOccurs="m" maxOccurs="n"           action="makeRepeatable()"
                                                    action="setRequired(m)"
                                                    action="setLimit(n)"

Note that both attributes and elements are mandatory by default with CAM. You may make either optional with the makeOptional predicate. With XML Schema, on the other hand, elements are mandatory but attributes are optional by default.
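
The mapping in Table 3 can be sketched as a small Python function. This is only an illustrative model of the table, not part of any CAM processor: the function name cam_to_occurs and the use of None to stand for "unbounded" are assumptions made for this example.

```python
from typing import List, Optional, Tuple


def cam_to_occurs(actions: List[str]) -> Tuple[int, Optional[int]]:
    """Model Table 3: turn CAM cardinality predicates into (minOccurs, maxOccurs).

    Hypothetical helper for illustration only. None stands for "unbounded".
    """
    min_occurs = 1                  # CAM default: items are mandatory
    repeatable = False              # CAM default: a single occurrence
    limit: Optional[int] = None     # set only by setLimit(n)

    for action in actions:
        # "setLimit(3)" -> name "setLimit", arg "3"; "makeOptional()" -> arg ""
        name, _, arg = action.strip().rstrip(")").partition("(")
        if name == "makeOptional":
            min_occurs = 0
        elif name == "makeMandatory":
            min_occurs = 1
        elif name == "makeRepeatable":
            repeatable = True
        elif name == "setLimit":
            limit = int(arg)
        elif name == "setRequired":
            min_occurs = int(arg)
        else:
            raise ValueError(f"not a cardinality predicate: {action}")

    if limit is not None:
        max_occurs: Optional[int] = limit   # an explicit limit wins
    elif repeatable:
        max_occurs = None                   # repeatable without a limit: unbounded
    else:
        max_occurs = 1
    return min_occurs, max_occurs


# A few rows of Table 3, with n = 5 and m = 2 chosen arbitrarily.
print(cam_to_occurs(["makeOptional()"]))
print(cam_to_occurs(["makeRepeatable()", "setRequired(2)"]))
print(cam_to_occurs(["makeRepeatable()", "setRequired(2)", "setLimit(5)"]))
```

Calling the function with an empty list returns the default cardinality of exactly one occurrence, which mirrors the note above that CAM items are mandatory unless you say otherwise.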

Compositors

Compositors allow you to organize XML content elements. XML Schema defines three standard compositors, and CAM constructs map nicely to the same concepts (see Table 4). Note that the table implicitly assumes specific cardinality constraints, namely the defaults for both XML Schema and CAM: one instance of an element. For simplicity, the following sections that detail each of these compositors assume this solitary cardinality. But by the proper application of cardinality you could, for example, change the rule for unordered elements from…

  • "all child elements must appear but may be in any order"
…to…

  • "all child elements may appear and may be in any order."
Similarly, you could change the rule for choice of elements from…

  • "one and only one of the child elements must appear"
…to…

  • "each child element may appear any number of times in any order."
Implementing the latter example involves a number of intricacies as you'll see in the discussion of the choice compositor.

Table 4. XML Schema and CAM Compositors: This table summarizes how compositors map between XML Schema and CAM.
Compositor Type      Description                                              XML Schema      CAM
Unordered Elements   All child elements must appear but may be in any order   <xs:all>        default
Ordered Elements     All child elements must appear in the declared order     <xs:sequence>   orderChildren()
Choice of Elements   One and only one of the child elements must appear       <xs:choice>     setChoice()
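
To make the three rules in Table 4 concrete, here is a minimal Python sketch that checks a parent's child names against each rule, plus the two relaxed variants described above. It only models the semantics; it is not how a CAM processor or an XML Schema validator is implemented, and every function name here is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import List


def child_names(xml_text: str) -> List[str]:
    """Return the tag names of the root element's direct children, in document order."""
    return [child.tag for child in ET.fromstring(xml_text)]


def valid_unordered(children: List[str], declared: List[str]) -> bool:
    """Unordered: every declared child appears exactly once, in any order."""
    return Counter(children) == Counter(declared)


def valid_ordered(children: List[str], declared: List[str]) -> bool:
    """Ordered: every declared child appears exactly once, in the declared order."""
    return children == declared


def valid_choice(children: List[str], declared: List[str]) -> bool:
    """Choice: one and only one of the declared children appears."""
    return len(children) == 1 and children[0] in declared


def valid_unordered_optional(children: List[str], declared: List[str]) -> bool:
    """Relaxed unordered: each declared child may appear at most once, in any order."""
    counts = Counter(children)
    return set(counts) <= set(declared) and all(n <= 1 for n in counts.values())


def valid_repeatable_choice(children: List[str], declared: List[str]) -> bool:
    """Relaxed choice: each declared child may appear any number of times, in any order."""
    return all(name in declared for name in children)


BOOK = ["Title", "Author", "ISBN"]
shuffled = child_names(
    "<Guidebook><ISBN>x</ISBN><Title>y</Title><Author>z</Author></Guidebook>"
)
print(valid_unordered(shuffled, BOOK))  # True: all three present, order ignored
print(valid_ordered(shuffled, BOOK))    # False: ISBN comes first
```

The same instance passes the unordered check and fails the ordered one, which is exactly the difference between the first two rows of the table.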

Unordered Child Elements Compositor

This compositor requires all its child elements to be present, but order is immaterial. XML Schema imposes several restrictions on the application of this useful construct. According to the xFront XML Schema Tutorial (slide 161), the <xs:all> element:

  • May not be nested within <sequence>, <choice> or another <all>.
  • Contains only elements—it may not include <sequence> or <choice>.
  • If in a complex type definition extending another type, the parent type must have empty content.
CAM has no such restrictions; this is the default behavior of any parent element. So to define a Guidebook element that contains a title, an author, and an ISBN, where the three elements may appear in any order, the structure can look like the one in Table 5.

Table 5. Unordered Child Elements: CAM uses this model by default so no additional constructs are required. XML Schema uses the <xs:all> container.

CAM:

<Guidebook>
  <Title>%string%</Title>
  <Author>%string%</Author>
  <ISBN>%string%</ISBN>
</Guidebook>

XML Schema:

<xs:element name="Guidebook">
  <xs:complexType>
    <xs:all>
      <xs:element name="Title" type="xs:string" />
      <xs:element name="Author" type="xs:string" />
      <xs:element name="ISBN" type="xs:string" />
    </xs:all>
  </xs:complexType>
</xs:element>




Ordered Child Elements Compositor

This compositor requires all child elements to be present in the order specified. For the CAM implementation, there is no difference in structure (compare the CAM structure of Table 5 to Table 6); the difference lies in the business rules. You simply specify the orderChildren() predicate in a rule pointing to the relevant parent element (Guidebook in this case). On the XML Schema side, the <sequence> element replaces the <all> element.

Table 6. Ordered Child Elements: You specify an ordered set of child nodes in CAM by adding a rule with the orderChildren predicate. In XML Schema, this requires changing the <xs:all> container to an <xs:sequence> container.

CAM:

<Guidebook>
  <Title>%string%</Title>
  <Author>%string%</Author>
  <ISBN>%string%</ISBN>
</Guidebook>

<as:constraint action="orderChildren(//Guidebook)" />

XML Schema:

<xs:element name="Guidebook">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Title" type="xs:string" />
      <xs:element name="Author" type="xs:string" />
      <xs:element name="ISBN" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
</xs:element>


Choice Compositor

This compositor requires exactly one of its child elements to be present. Again, the CAM structure looks very much like an XML instance except that all choices are present; the setChoice predicate in the rules section overlays the semantics of exclusive choice onto this structure (see Table 7).

Table 7. Choice of Elements: To require a choice of one element from a set, use a rule with the setChoice predicate in CAM. In XML Schema, use the <xs:choice> container.

CAM:

<duck-type>
  <diving>%string%</diving>
  <dabbling>%string%</dabbling>
</duck-type>

<as:constraint action="setChoice(//duck-type/*)" />

XML Schema:

<xs:element name="duck-type">
  <xs:complexType>
    <xs:choice>
      <xs:element name="diving" type="xs:string" />
      <xs:element name="dabbling" type="xs:string" />
    </xs:choice>
  </xs:complexType>
</xs:element>


 
Figure 3. Attaching a Rule to a Node: The typical rule is applied directly to the node it should affect. If you later select the same node in the Structure view, you will again see the rule attached to it in the ItemRules view. This seems terribly obvious…until you learn about applying rules with the setChoice predicate.
Before delving into the intricacies of setChoice semantics, here's a brief digression into the application of setChoice in the editor. You typically apply a predicate directly to the element that you wish it to act upon. In Figure 3, for example, the goal is to make the element <someOptionalElement> optional. To do this, select the <someOptionalElement> node in the Structure view, open its context menu, and select Add New Rule. In the rule wizard, select the makeOptional() predicate. When you close the wizard, the new rule shows up in the ItemRules view—firmly attached to <someOptionalElement>.

The setChoice() predicate, however, operates differently. You apply a rule with this predicate to the parent of the group of elements that you wish to turn into a set of choices, and then refer to the child choices within the predicate. Figure 4 illustrates how this works.

 
Figure 4. Attaching a Rule with setChoice: (1) Initially no rules are present on the <classification> element. Select Add New Rule from the context menu and change the Rule XPath settings to those shown (2), to update the XPath to //classification/*. When you close the wizard there's no rule attached to the <classification> node (3) because the rule XPath specified the node's children. Frames (4) and (5) confirm that the rule is attached to the child nodes.

Before defining a rule (see Figure 4, frame 1), note that the <classification> element is selected in the Structure view and that, according to the ItemRules view, no rules are defined for it. Opening the context menu on the <classification> node and selecting Add New Rule brings up the rule wizard (frame 2). Near the top, the Item field should confirm that you are on the "classification" item. Just below that, the XPath field (which is generated automatically) shows what the group of "Rule XPath" checkboxes designates. By default, just Parent and All are checked, so the XPath should initially show //bird/classification. Deselect Parent and select Children; the XPath will change as shown in the figure, to //classification/*. Finally, select setChoice as the action, and then close the wizard.

Frame 3 in Figure 4 shows the Structure and ItemRules views immediately after closing the wizard. The <classification> node is still selected, but surprisingly, there are still no rules defined for it—even though you just created one. So what happened to the rule? It went to the children—all of them—because that's what the XPath selection defined. Frames 4 and 5 are present simply to prove the point: the rule does indeed appear for each of those nodes, even though there's only one rule. You can prove this by deleting the rule from the ItemRules view for any of the child nodes; doing so removes it from all the child nodes.

The rule for this set of nodes uses the XPath //classification/* selector to indicate all children of classification nodes. You can express the same content but be more explicit about the child nodes using a more specific XPath expression such as:

//classification/*[ (name() = 'raptor' ) or (name() = 'waterfowl') or (name() = 'passerine') ]

The above expression selects only those children with matching names. The preceding example works only when your XML does not use namespaces; it will fail when the XML includes namespace qualifiers on nodes (e.g. foo:raptor). The following variation attempts to match nodes that contain the base node names when namespaces are in use:

//classification/*[ contains(name(),'raptor') or contains(name(),'waterfowl') or contains(name(),'passerine') ]

It would seem that this is more robust, but the contains() function opens the door for other elements as well; e.g., "velociraptor" matches as readily as "raptor". Clearly, ends-with() would fare no better. When using namespaces, either match the entire qualified names, or use an XPath expression that includes the substring-after function to strip off the namespace prefix in the comparison (note that substring-after returns an empty string when the name contains no colon, so this form matches only prefixed names):

//classification/*[ (substring-after(name(),':') = 'raptor' ) or (substring-after(name(),':') = 'waterfowl') or (substring-after(name(),':') = 'passerine') ]
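
The difference between the three variations is easy to see by modeling the XPath string functions in plain Python. This is a sketch only: it does not evaluate XPath, it reimplements contains() and substring-after() with Python string operations, and the helper names and the sample qualified names (including the foo: prefix) are assumptions for illustration.

```python
TARGETS = ("raptor", "waterfowl", "passerine")


def contains(text: str, part: str) -> bool:
    """Model of XPath contains(): true when part occurs anywhere in text."""
    return part in text


def substring_after(text: str, sep: str) -> str:
    """Model of XPath substring-after() for a non-empty sep.

    Returns the text following the first sep, or '' when sep is absent.
    """
    return text.partition(sep)[2]


def match_equal(qname: str) -> bool:
    """Equality variation: name() = 'raptor' or ..."""
    return qname in TARGETS


def match_contains(qname: str) -> bool:
    """contains() variation: contains(name(), 'raptor') or ..."""
    return any(contains(qname, target) for target in TARGETS)


def match_after_prefix(qname: str) -> bool:
    """substring-after variation: substring-after(name(), ':') = 'raptor' or ..."""
    return substring_after(qname, ":") in TARGETS


for qname in ("raptor", "foo:raptor", "foo:velociraptor"):
    print(qname, match_equal(qname), match_contains(qname), match_after_prefix(qname))
```

In this model the equality form accepts only the unprefixed name, the contains() form accepts all three names (including the unwanted velociraptor), and the substring-after form accepts only foo:raptor.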

The straightforward //classification/* notation is often adequate but there are two reasons to be aware of variations. First, CAM templates generated from schema files use either the contains or equality variations (this seems to be in a state of flux at the time of writing). This discussion should help you understand the generated templates. Second, and more importantly, the simple notation works only when you have proper hierarchy in your structure. The above example structure looks in part like this:

<bird>
  <classification>
    <raptor/>
    <waterfowl/>
    <passerine/>
  </classification>
  <waterfowl-category>
    <diving/>
    <dabbling/>
  </waterfowl-category>
</bird>

Both the <classification> and <waterfowl-category> have separate setChoice predicates on all their children. So the classification list of raptor, waterfowl, and passerine corresponds to all (*). But consider a flatter structure that does not isolate the children hierarchically:

<bird>
  <raptor/>
  <waterfowl/>
  <passerine/>
  <diving/>
  <dabbling/>
</bird>

In the preceding example, the XPath expression //bird/* would match diving and dabbling in addition to raptor, waterfowl, and passerine. In this case, you would have to use one of the more specific XPath expressions (you can see examples in the downloadable code in the files Compositors/compositors_flat.cam, .xml, and .xsd). Interestingly, note that this is another way to apply conditions—but without explicitly setting a rule to be conditional. The XPath expression that selects the nodes implements the conditionality.
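
The effect of flattening can be sketched with Python's standard xml.etree.ElementTree module. Its path syntax is only a small subset of XPath, so the name filter below is written as a Python set test rather than as an XPath predicate; the variable names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

NESTED = """<bird>
  <classification><raptor/><waterfowl/><passerine/></classification>
  <waterfowl-category><diving/><dabbling/></waterfowl-category>
</bird>"""

FLAT = "<bird><raptor/><waterfowl/><passerine/><diving/><dabbling/></bird>"

CLASSIFICATION = {"raptor", "waterfowl", "passerine"}

nested_root = ET.fromstring(NESTED)
flat_root = ET.fromstring(FLAT)

# Nested structure: "all children of classification" is already the right set.
nested_choice = [e.tag for e in nested_root.findall("classification/*")]

# Flat structure: "all children of bird" sweeps in diving and dabbling too.
flat_all = [e.tag for e in flat_root.findall("*")]

# Filtering by name recovers the intended set of choices.
flat_choice = [tag for tag in flat_all if tag in CLASSIFICATION]

print(nested_choice)
print(flat_all)
print(flat_choice)
```

With the nested document the wildcard alone selects the three classification choices; with the flat document it selects all five children, and only the name filter narrows the selection back to three.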

With this grounding in setChoice, the next section shows the intricacies of layering it with cardinality in practical examples.


