XML.com: XML From the Inside Out

The .NET Schema Object Model

December 04, 2002

Despite the many articles explaining W3C XML Schema (WXS), it's not enough to discuss WXS as a specification only. Educational materials should also discuss tools which aid the development of XML applications which employ WXS. This article focuses on an API in the .NET platform, the XML Schema Object Model (SOM). SOM is a rich API which allows developers to create, edit, and validate schemas programmatically -- one of the few such tools available so far.

SOM operates on schema documents much as DOM operates on XML documents. Schema documents are valid XML files that, once loaded into the SOM, convey meaning about the structure and validity of other XML documents which conform to the schema. SOM is indispensable for certain classes of application, such as a schema editor, that need to construct a schema in memory and check its validity according to the WXS specification.

The SOM comprises an extensive set of classes corresponding to the elements in a schema. For example, the <xsd:schema> ... </xsd:schema> element maps to the XmlSchema class -- all schema information that can possibly be contained within those tags can be represented using the XmlSchema class.

Similarly, <xsd:element> ... </xsd:element> maps to XmlSchemaElement, <xsd:attribute> ... </xsd:attribute> maps to XmlSchemaAttribute, and so on. This one-to-one mapping makes the API easy to learn and use. For a complete listing of all the classes available in the System.Xml.Schema namespace, refer to the .NET Framework Class Library Reference.

Creating A Schema Using The SOM

The following is a simple customer schema, with a top-level Customer element that has two child elements, FirstName and LastName, and one attribute, CustID.

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:tns="http://tempuri.org" 
           targetNamespace="http://tempuri.org"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string" />
        <xs:element name="LastName" type="tns:LastNameType" />
      </xs:sequence>
      <xs:attribute name="CustID" type="xs:positiveInteger"
                    use="required" />
    </xs:complexType>
  </xs:element>
  <xs:simpleType name="LastNameType">
    <xs:restriction base="xs:string">
      <xs:maxLength value="20"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>

The following code creates the customer schema in memory using the SOM API. We will build the schema bottom-up: first the child elements, the attribute, and their corresponding types, and then the top-level components.

// Create the FirstName and LastName elements
XmlSchemaElement firstNameElem = new XmlSchemaElement();
firstNameElem.Name = "FirstName";
XmlSchemaElement lastNameElem = new XmlSchemaElement();
lastNameElem.Name = "LastName";

// Create CustID attribute
XmlSchemaAttribute idAtt = new XmlSchemaAttribute();
idAtt.Name = "CustID";
idAtt.Use = XmlSchemaUse.Required; 
// The XmlSchemaUse enumeration has the values 
// Required/Optional/Prohibited/None.
// Default is XmlSchemaUse.Optional

Apart from the Name property, which corresponds to the "name" attribute of <xs:element> or <xs:attribute> in a schema, all other attributes allowed by the schema (default, fixed, and form, to name a few) also have corresponding properties in the XmlSchemaElement and XmlSchemaAttribute classes (DefaultValue, FixedValue, and Form, respectively).
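As a minimal sketch of how these properties map onto schema attributes, the following sets default, form, and fixed on two components. The Country element and Version attribute are hypothetical names chosen for illustration; they are not part of the customer schema.

```csharp
using System;
using System.Xml.Schema;

public static class OptionalPropertiesSketch
{
    // Hypothetical local element, roughly:
    // <xs:element name="Country" default="US" form="qualified" />
    public static XmlSchemaElement BuildCountryElement()
    {
        XmlSchemaElement countryElem = new XmlSchemaElement();
        countryElem.Name = "Country";
        countryElem.DefaultValue = "US";             // "default" attribute
        countryElem.Form = XmlSchemaForm.Qualified;  // "form" attribute
        return countryElem;
    }

    // Hypothetical attribute, roughly:
    // <xs:attribute name="Version" fixed="1.0" />
    public static XmlSchemaAttribute BuildVersionAttribute()
    {
        XmlSchemaAttribute versionAtt = new XmlSchemaAttribute();
        versionAtt.Name = "Version";
        versionAtt.FixedValue = "1.0";               // "fixed" attribute
        return versionAtt;
    }

    public static void Main()
    {
        Console.WriteLine(BuildCountryElement().DefaultValue);
        Console.WriteLine(BuildVersionAttribute().FixedValue);
    }
}
```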

The content of elements and attributes is defined by their types. To create elements or attributes whose types are one of the built-in schema types as defined in XML Schema Part 2: Datatypes, the SchemaTypeName property on XmlSchemaElement or XmlSchemaAttribute is set with the corresponding qualified name of the type. In order to create a user-defined type for the element or attribute, a new simple or complex type is created using the XmlSchemaSimpleType or XmlSchemaComplexType class.

In the customer schema the FirstName element's type is the built-in type "string". The LastName element's type is a named simple type that is a restriction of the built-in type "string", with a maxLength facet value of 20. If we don't want to create named simple or complex types at the top-level, but want the types to be anonymous children of the element or attribute (only simple types apply for attributes), we must set the SchemaType property, with the unnamed simple or complex type, instead of the SchemaTypeName property.
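The anonymous-type alternative described above might look like the following sketch. It builds the same maxLength restriction as LastNameType but leaves the type unnamed and attaches it through SchemaType. The class and method names are illustrative only.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

public static class AnonymousTypeSketch
{
    const string XsdNs = "http://www.w3.org/2001/XMLSchema";

    // Builds a LastName element whose restriction of xs:string is an
    // anonymous child type rather than the named LastNameType.
    public static XmlSchemaElement BuildLastNameElement()
    {
        XmlSchemaSimpleTypeRestriction restriction =
            new XmlSchemaSimpleTypeRestriction();
        restriction.BaseTypeName = new XmlQualifiedName("string", XsdNs);

        XmlSchemaMaxLengthFacet maxLength = new XmlSchemaMaxLengthFacet();
        maxLength.Value = "20";
        restriction.Facets.Add(maxLength);

        // No Name is assigned, so the simple type stays anonymous.
        XmlSchemaSimpleType anonymousType = new XmlSchemaSimpleType();
        anonymousType.Content = restriction;

        XmlSchemaElement lastNameElem = new XmlSchemaElement();
        lastNameElem.Name = "LastName";
        // SchemaType is set instead of SchemaTypeName.
        lastNameElem.SchemaType = anonymousType;
        return lastNameElem;
    }

    public static void Main()
    {
        XmlSchemaElement elem = BuildLastNameElement();
        Console.WriteLine(elem.SchemaType.GetType().Name);
    }
}
```

With this variant there is no top-level type to add to the schema's Items collection; the type travels with the element.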

WXS allows both anonymous and named simple types to be derived by restriction from other simple types (built-in or user-defined) or constructed as a list or union of other simple types. In our schema, the XmlSchemaSimpleTypeRestriction class is used to create LastNameType by restricting the built-in string type. We can use the XmlSchemaSimpleTypeList or XmlSchemaSimpleTypeUnion classes to create list or union types. The Content property of XmlSchemaSimpleType denotes whether it is a simple type restriction, list, or union.
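For comparison with the restriction case, here is a rough sketch of a list type and a union type. The type names IntegerList and IntegerOrString are hypothetical and do not appear in the customer schema.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

public static class ListAndUnionSketch
{
    const string XsdNs = "http://www.w3.org/2001/XMLSchema";

    // Hypothetical type: a whitespace-separated list of xs:integer values.
    public static XmlSchemaSimpleType BuildIntegerList()
    {
        XmlSchemaSimpleTypeList list = new XmlSchemaSimpleTypeList();
        list.ItemTypeName = new XmlQualifiedName("integer", XsdNs);

        XmlSchemaSimpleType listType = new XmlSchemaSimpleType();
        listType.Name = "IntegerList";
        listType.Content = list;      // Content marks this type as a list
        return listType;
    }

    // Hypothetical type: a value that is either xs:integer or xs:string.
    public static XmlSchemaSimpleType BuildIntegerOrString()
    {
        XmlSchemaSimpleTypeUnion union = new XmlSchemaSimpleTypeUnion();
        union.MemberTypes = new XmlQualifiedName[] {
            new XmlQualifiedName("integer", XsdNs),
            new XmlQualifiedName("string", XsdNs)
        };

        XmlSchemaSimpleType unionType = new XmlSchemaSimpleType();
        unionType.Name = "IntegerOrString";
        unionType.Content = union;    // Content marks this type as a union
        return unionType;
    }

    public static void Main()
    {
        Console.WriteLine(BuildIntegerList().Content.GetType().Name);
        Console.WriteLine(BuildIntegerOrString().Content.GetType().Name);
    }
}
```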

// Create the simple type for the LastName element
XmlSchemaSimpleType lastNameType = new XmlSchemaSimpleType();
lastNameType.Name = "LastNameType";
XmlSchemaSimpleTypeRestriction lastNameRestriction =
    new XmlSchemaSimpleTypeRestriction();
lastNameRestriction.BaseTypeName =
    new XmlQualifiedName("string", "http://www.w3.org/2001/XMLSchema");
XmlSchemaMaxLengthFacet maxLength = new XmlSchemaMaxLengthFacet();
maxLength.Value = "20";
lastNameRestriction.Facets.Add(maxLength);
lastNameType.Content = lastNameRestriction;

// Associate the elements/attributes with their types
firstNameElem.SchemaTypeName = new XmlQualifiedName("string",
    "http://www.w3.org/2001/XMLSchema"); // Built-in type
lastNameElem.SchemaTypeName  = new XmlQualifiedName("LastNameType",
    "http://tempuri.org");               // User-defined type
idAtt.SchemaTypeName         = new XmlQualifiedName("positiveInteger",
    "http://www.w3.org/2001/XMLSchema"); // Built-in type

// Create the top-level Customer element
XmlSchemaElement custElem = new XmlSchemaElement();
custElem.Name = "Customer";

// Create an anonymous complex type for the Customer element
XmlSchemaComplexType customerType = new XmlSchemaComplexType();
XmlSchemaSequence seq = new XmlSchemaSequence();
seq.Items.Add(firstNameElem);
seq.Items.Add(lastNameElem);
customerType.Particle = seq;

We can also create XmlSchemaChoice or XmlSchemaAll as the particle of the complex type to replicate <xs:choice> or <xs:all> semantics. The sequence is the container to which we add the child elements (FirstName and LastName) for our customer schema.
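To illustrate the <xs:choice> case, the sketch below builds a complex type whose particle is an XmlSchemaChoice. The Email and Phone elements are hypothetical and are not part of the customer schema; XmlSchemaAll would be used the same way.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

public static class ChoiceParticleSketch
{
    const string XsdNs = "http://www.w3.org/2001/XMLSchema";

    static XmlSchemaElement StringElement(string name)
    {
        XmlSchemaElement elem = new XmlSchemaElement();
        elem.Name = name;
        elem.SchemaTypeName = new XmlQualifiedName("string", XsdNs);
        return elem;
    }

    // Builds an anonymous complex type that allows either an Email
    // or a Phone child element (hypothetical names), but not both.
    public static XmlSchemaComplexType BuildContactType()
    {
        XmlSchemaChoice choice = new XmlSchemaChoice();
        choice.Items.Add(StringElement("Email"));
        choice.Items.Add(StringElement("Phone"));

        XmlSchemaComplexType contactType = new XmlSchemaComplexType();
        contactType.Particle = choice;   // choice instead of sequence
        return contactType;
    }

    public static void Main()
    {
        XmlSchemaComplexType contactType = BuildContactType();
        XmlSchemaChoice choice = (XmlSchemaChoice)contactType.Particle;
        Console.WriteLine(choice.Items.Count);
    }
}
```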

// Add attribute CustID to the complex type
customerType.Attributes.Add(idAtt);

// Set the SchemaType of Customer element to the 
// anonymous complex type created above
custElem.SchemaType = customerType;

// Create an empty schema
XmlSchema custSchema = new XmlSchema();
custSchema.TargetNamespace = "http://tempuri.org";

// Add all top-level elements and types to the schema
// (In this case customer element custElem and the 
// simple type LastNameType)
custSchema.Items.Add(custElem);
custSchema.Items.Add(lastNameType);

Now we compile the schema and write it to stdout or to a file.

// Compile schema
custSchema.Compile(new ValidationEventHandler(ValidationCallbackOne));
custSchema.Write(Console.Out);

The Compile method of the XmlSchema object validates the schema document against the rules specified in the WXS specification. Compilation also makes the post-schema-compilation properties available. All such properties on the different SOM classes form the Post-Schema-Compilation Infoset.

The ValidationEventHandler, passed as a parameter to the Compile method, is a delegate that wraps the callback function ValidationCallbackOne, which is invoked to handle validation errors and warnings. For more information about validation event handling, refer to Validation and the Schema Object Model.
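The body of ValidationCallbackOne is not shown in this part of the article, so the following is only one plausible shape for it. It assumes a simple policy of printing each message and counting errors; the deliberately broken schema (an element that references an undeclared type, MissingType) is made up to trigger the callback.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

public static class ValidationCallbackSketch
{
    public static int ErrorCount = 0;

    // The signature must match the ValidationEventHandler delegate.
    public static void ValidationCallbackOne(object sender,
                                             ValidationEventArgs args)
    {
        if (args.Severity == XmlSeverityType.Warning)
        {
            Console.WriteLine("Warning: " + args.Message);
        }
        else
        {
            ErrorCount++;
            Console.WriteLine("Error: " + args.Message);
        }
    }

    // Compiles a schema whose only element refers to a type that is
    // never declared, and returns the number of errors reported.
    public static int CompileBrokenSchema()
    {
        ErrorCount = 0;
        XmlSchema schema = new XmlSchema();
        schema.TargetNamespace = "http://tempuri.org";

        XmlSchemaElement elem = new XmlSchemaElement();
        elem.Name = "Customer";
        elem.SchemaTypeName =
            new XmlQualifiedName("MissingType", "http://tempuri.org");
        schema.Items.Add(elem);

        // Compile is the call used in this article; later framework
        // versions mark it obsolete in favor of XmlSchemaSet.
        schema.Compile(new ValidationEventHandler(ValidationCallbackOne));
        return ErrorCount;
    }

    public static void Main()
    {
        Console.WriteLine("Errors reported: " + CompileBrokenSchema());
    }
}
```

Passing a handler lets compilation report every problem it finds; without one, the first error surfaces as an exception instead.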

It would be more useful to write the schema out to a file using the XmlTextWriter. Replace the call to custSchema.Write with a call to the following method:

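// Requires: using System.IO; using System.Text;
//           using System.Xml; using System.Xml.Schema;
private void WriteCustomerSchema(XmlSchema custSchema) {
    FileStream file = new FileStream("Customer.xsd",
                                     FileMode.Create,
                                     FileAccess.ReadWrite);
    XmlTextWriter writer = new XmlTextWriter(file,
                                     new UTF8Encoding());
    writer.Formatting = Formatting.Indented;
    try {
        custSchema.Write(writer);
    }
    finally {
        // Close flushes buffered output and closes the underlying stream
        writer.Close();
    }
}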

Post-Schema-Compilation Infoset (PSCI)

Once the schema is compiled, the SOM classes that correspond to each schema component expose the post-schema-compilation infoset properties. Each class also has a set of properties that can be set when its objects are created in the object model; these are the pre-schema-compilation infoset properties. In the customer schema created above, the SchemaTypeName property, used to set the name of the simple or complex type of the element or attribute, is a property in the pre-schema-compilation infoset, and the ElementType or AttributeType property is the corresponding post-schema-compilation property.

For the customer schema the following table lists the corresponding post-schema-compilation properties:

XML Representation        SOM Object            Property             Type (Value)
------------------------  --------------------  -------------------  ---------------------------------------
Customer element          XmlSchemaElement      QualifiedName        XmlQualifiedName (tns:Customer)
                                                ElementType          object (Anonymous XmlSchemaComplexType)
                                                BlockResolved        XmlSchemaDerivationMethod (Empty)
                                                FinalResolved        XmlSchemaDerivationMethod (Empty)

Anonymous complex type    XmlSchemaComplexType  QualifiedName        XmlQualifiedName (Empty)
                                                BaseSchemaType       XmlSchemaComplexType (anyType)
                                                DerivedBy            XmlSchemaDerivationMethod (Restriction)
                                                Datatype             XmlSchemaDatatype (Null)
                                                ContentType          XmlSchemaContentType (ElementOnly)
                                                FinalResolved        XmlSchemaDerivationMethod (Empty)
                                                ContentTypeParticle  XmlSchemaSequence
                                                AttributeUses        XmlSchemaObjectTable with one attribute
                                                AttributeWildcard    XmlSchemaAnyAttribute (Null)

CustID attribute          XmlSchemaAttribute    QualifiedName        XmlQualifiedName (CustID)
                                                AttributeType        XmlSchemaDatatype (positiveInteger)

LastNameType simple type  XmlSchemaSimpleType   QualifiedName        XmlQualifiedName (tns:LastNameType)
                                                BaseSchemaType       XmlSchemaDatatype (string)
                                                DerivedBy            XmlSchemaDerivationMethod (Restriction)
                                                Datatype             XmlSchemaDatatype (string)
                                                FinalResolved        XmlSchemaDerivationMethod (Empty)

Table 1
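A few of the post-schema-compilation properties in Table 1 can be inspected with a sketch like the one below. To stay short it builds a reduced version of the customer schema (the Customer element with only a FirstName child), so the values it prints cover only part of the table. The class and handler names are illustrative.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

public static class PostCompilationSketch
{
    const string XsdNs = "http://www.w3.org/2001/XMLSchema";
    const string Tns = "http://tempuri.org";

    // Builds a reduced customer schema, compiles it, and returns
    // the top-level Customer element.
    public static XmlSchemaElement BuildAndCompile()
    {
        XmlSchemaElement firstNameElem = new XmlSchemaElement();
        firstNameElem.Name = "FirstName";
        firstNameElem.SchemaTypeName = new XmlQualifiedName("string", XsdNs);

        XmlSchemaSequence seq = new XmlSchemaSequence();
        seq.Items.Add(firstNameElem);
        XmlSchemaComplexType customerType = new XmlSchemaComplexType();
        customerType.Particle = seq;

        XmlSchemaElement custElem = new XmlSchemaElement();
        custElem.Name = "Customer";
        custElem.SchemaType = customerType;   // pre-compilation property

        XmlSchema schema = new XmlSchema();
        schema.TargetNamespace = Tns;
        schema.Items.Add(custElem);

        // Compile is the call used in this article; later framework
        // versions mark it obsolete in favor of XmlSchemaSet.
        schema.Compile(new ValidationEventHandler(OnValidation));
        return custElem;
    }

    static void OnValidation(object sender, ValidationEventArgs args)
    {
        Console.WriteLine(args.Severity + ": " + args.Message);
    }

    public static void Main()
    {
        XmlSchemaElement custElem = BuildAndCompile();

        // Post-compilation properties are populated by Compile.
        Console.WriteLine(custElem.QualifiedName);
        XmlSchemaComplexType type =
            custElem.ElementType as XmlSchemaComplexType;
        if (type != null)
        {
            Console.WriteLine(type.ContentType);
            Console.WriteLine(type.DerivedBy);
        }
    }
}
```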

Manipulating the SOM

In order to manipulate the schema loaded into the SOM, we should be able to traverse the SOM and get at the elements, attributes, and types stored in it. The XmlSchema class has the following properties that provide access to the collection of all global items added to the schema.

Property Name        Item type stored in the collection/array
-------------------  ------------------------------------------------------------------
Elements             XmlSchemaElement
Attributes           XmlSchemaAttribute
AttributeGroups      XmlSchemaAttributeGroup
Groups               XmlSchemaGroup
Includes             XmlSchemaExternal (can be XmlSchemaInclude/XmlSchemaImport/
                     XmlSchemaRedefine)
Items                XmlSchemaObject (can access all global-level items:
                     elements/attributes/types added to this collection)
Notations            XmlSchemaNotation
SchemaTypes          XmlSchemaType (can be XmlSchemaSimpleType/XmlSchemaComplexType)
UnhandledAttributes  XmlAttribute (attributes that do not belong to the schema
                     namespace)

The Items property is a pre-schema-compilation property that can be queried to look up Elements, Attributes, Types, Groups, and so on.

The UnhandledAttributes property provides access to all the non-schema-namespace attributes found in the schema. These attributes are not processed by the schema processor. All the other properties are post-schema-compilation properties and become available once the schema is compiled.
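The difference between Items and the post-compilation collections can be seen in a short traversal sketch. It assumes a reduced schema with one global element and one global simple type; the Customer element is typed as xs:string here purely to keep the example brief.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

public static class TraversalSketch
{
    const string XsdNs = "http://www.w3.org/2001/XMLSchema";
    const string Tns = "http://tempuri.org";

    // Builds an uncompiled schema with two global items.
    public static XmlSchema BuildSchema()
    {
        XmlSchemaElement custElem = new XmlSchemaElement();
        custElem.Name = "Customer";
        custElem.SchemaTypeName = new XmlQualifiedName("string", XsdNs);

        XmlSchemaSimpleTypeRestriction restriction =
            new XmlSchemaSimpleTypeRestriction();
        restriction.BaseTypeName = new XmlQualifiedName("string", XsdNs);
        XmlSchemaSimpleType lastNameType = new XmlSchemaSimpleType();
        lastNameType.Name = "LastNameType";
        lastNameType.Content = restriction;

        XmlSchema schema = new XmlSchema();
        schema.TargetNamespace = Tns;
        schema.Items.Add(custElem);
        schema.Items.Add(lastNameType);
        return schema;
    }

    static void OnValidation(object sender, ValidationEventArgs args)
    {
        Console.WriteLine(args.Severity + ": " + args.Message);
    }

    public static void Main()
    {
        XmlSchema schema = BuildSchema();

        // Items is available before compilation.
        foreach (XmlSchemaObject item in schema.Items)
        {
            Console.WriteLine("Item: " + item.GetType().Name);
        }

        // Compile is the call used in this article; later framework
        // versions mark it obsolete in favor of XmlSchemaSet.
        schema.Compile(new ValidationEventHandler(OnValidation));

        // Elements and SchemaTypes are filled in by compilation.
        foreach (XmlSchemaElement elem in schema.Elements.Values)
        {
            Console.WriteLine("Element: " + elem.QualifiedName);
        }
        foreach (XmlSchemaType type in schema.SchemaTypes.Values)
        {
            Console.WriteLine("Type: " + type.QualifiedName);
        }
    }
}
```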
