Browse DevX
Sign up for e-mail newsletters from DevX


Making Sense of the XML DataType in SQL Server 2005 : Page 5

In SQL Server 2005, XML becomes a first-class data type. Developers can make minor remote modifications to stored XML documents easily, taking advantage of new support for XML schema-based strong typing, and server-based XML data validation.




Building the Right Environment to Support AI, Machine Learning and Deep Learning

Typed XML
The SQL language (and SQL database servers, by extension) represents a type system for storing information. You define data types in databases all the time. That is called schema. You define types tables and views), with certain attributes (columns and data types in those columns), relationships between types (foreign keys), and rules about the data that can be stored in types (constraints and triggers). The same thing happens in XML. In many situations, you want to dictate rules about what can be stored in XML data types. In SQL Server 2005, you can register XML schemas with the database. These schemas can be used to specify what XML can be used in a particular situation. In particular, using schemas can allow you to extend the type system by using XML schemas to specify complex data types.

In SQL Server 2005, you can register XML schemas with the database.
SQL Server 2005 allows both the use of XML columns and variables with generic XML (as you have seen earlier in this article) and the use of XML columns and variables that are typed with XML schema. When you use the XML data type with schema information, any XML inserted into a typed XML column is validated against the schema. In this way, the database ensures that the data stored is not only well-formed, but also conforms to the schema.

Using Typed XML
The first step in using typed XML is registering a schema. This is done by using the new CREATE XML SCHEMA COLLECTION statement. This new statement allows you to store schemas for XML that are used to validate XML stores:

CREATE XML SCHEMA COLLECTION BaseballSchema AS '<?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="Team"> <xsd:complexType> <xsd:sequence> <xsd:element name="Pitcher"> <xsd:complexType> <xsd:attribute name="name" type="xsd:string" /> </xsd:complexType> </xsd:element> </xsd:sequence> <xsd:attribute name="name" type="xsd:string" /> </xsd:complexType> </xsd:element> </xsd:schema>'

The CREATE XML SCHEMA COLLECTION statement creates a collection of schemas, any of which can be used to validate XML typed with the name of the collection. This example shows a new schema called BaseballSchema being added to the database. The schema is entered as a string value. Once you have a schema registered, you can use the schema in new instances of the XML data type:

DECLARE @team xml(BaseballSchema) SET @team = '<Team name="Braves"> <Pitcher name="John Smoltz" /> </Team>' SELECT @team

Like you did earlier in the article, you can create a variable of XML type, but because you want to dictate the type of XML, use the parenthetical syntax specifying the schema name you registered. You can store the XML in the variable. You want a specific type of XML, one that conforms to the BaseballSchema. But what if the XML attempting to be stored does not conform to BaseballSchema? In this example, you try and store a piece of XML that has an undefined attribute (or role, if you prefer that term):

SET @team = '<Team name="Braves"> <Pitcher name="John Smoltz" role="Closer" /> </Team>'

This fails because the XML stored in the @team variable does not conform to the specific schema type. Executing the SET statement just shown yields a specific XML validation error:

XML Validation: Undefined or prohibited attribute specified: 'role'

There may be other reasons that specifying a schema is helpful, such as aiding in queries.

As you might expect, using typed XML is straightforward in table creation:

CREATE TABLE Team ( TeamID int identity not null, TeamDoc xml(BaseballSchema) )

Much like the XML variable example above, you can create new rows with the typed XML data, like so:

INSERT INTO Team (TeamDoc) VALUES (' <Team name="Braves"> <Pitcher name="John Smoltz" /> </Team>')

When the insertion happens, SQL Server 2005 validates the XML against the schema specified in the table declaration (BaseballSchema). The following insertion violates that schema, again by specifying an invalid role, so it will fail:

UPDATE Team SET TeamDocTyped = '<Team name="Braves"> <Pitcher name="John Smoltz" role="Closer" /> </Team>'

Being able to specify the types of XML that are allowed in a particular case is helpful in extending the type system to include complex types that are not allowed with standard SQL. If typed XML allows you to specify these complex types as type information, you need a way to manage the schemas to allow for extension and change of these types as the data matures.

Managing XML Schema
What happens when a schema changes? During development, this is likely to happen quite a bit. For example, if you wanted to add the new role attribute to the Pitcher type as defined in BaseballSchema, you would have to drop the entire schema collection:


But this schema is in use in the Team table, so SQL Server 2005 does not allow you to drop the schema collection:

Specified collection 'BaseballSchema' cannot be dropped because it is used by object 'Team'.

Instead, you can alter the table to drop the column referencing the schema, drop the schema itself, then re-create the schema with the new attribute, and finally, modify the table to add back the column that you dropped, this time referencing the new schema version, as seen below.

ALTER TABLE Team DROP COLUMN TeamDoc GO DROP XML SCHEMA COLLECTION BaseballSchema GO CREATE XML SCHEMA COLLECTION BaseballSchema AS '<?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="Team"> <xsd:complexType> <xsd:sequence> <xsd:element name="Pitcher"> <xsd:complexType> <xsd:attribute name="name" type="xsd:string" /> <xsd:attribute name="role" type="xsd:string" /> </xsd:complexType> </xsd:element> </xsd:sequence> <xsd:attribute name="name" type="xsd:string" /> </xsd:complexType> </xsd:element> </xsd:schema>' GO ALTER TABLE Team ADD TeamDoc xml (BaseballSchema) GO

If you were allowed to change the schema in place, you would have to re-validate all the data in the database, so SQL Server 2005's approach of just not allowing such a change seems reasonable. There is no good solution when you need to change the schema of existing typed XML.

You can extend the schema collection by adding new schemas to them to allow new types of typed XML. This is done with the ALTER XML SCHEMA COLLECTION statement:

ALTER XML SCHEMA COLLECTION BaseballSchema ADD ' <?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:element name="Score"> <xsd:complexType> <xsd:attribute name="HomeTeam" type="xsd:string" /> <xsd:attribute name="AwayTeam" type="xsd:string" /> <xsd:attribute name="HomeScore" type="xsd:int" /> <xsd:attribute name="AwayScore" type="xsd:int" /> </xsd:complexType> </xsd:element> </xsd:schema>'

The ALTER XML SCHEMA COLLECTION statement allows you to alter a schema collection to create new top-level types to be used in the same schema collection. By using the ALTER XML SCHEMA COLLECTION syntax and adding the new Score element type, you extend the types allowable within the BaseballSchema. After you add this schema, you can use the new schema types:

DECLARE @team xml(BaseballSchema) SET @team = '<Score HomeTeam="Braves" AwayTeam="RedSox" HomeScore="5" AwayScore="4" />'

As mentioned earlier, because the XML is really typed to the schema collection rather than a single schema, this is perfectly acceptable. You should note that the XML data type schema name is still the same (BaseballSchema), but you can now use the new types in the XML.

Thanks for your registration, follow us on our social networks to keep up-to-date