XML.com: XML From the Inside Out

The .NET Schema Object Model
by Priya Lakshminarayanan

Including or Importing Schema Documents

A schema may contain import, include, and redefine elements. These point to other schema documents and augment the structure of the main schema in some way.

For example, the address.xsd sample schema in the WXS Primer can be incorporated into our customer schema so that different address types are available for use. We can incorporate it either as an <xs:include> or <xs:import>, to use its components as-is, or as <xs:redefine> to modify any of its components to suit our needs. Since the address schema has a targetNamespace that is different from that of our customer schema, we use the import semantics.

The XmlSchema object has an Includes property that exposes all the includes, imports, and redefines added to the main schema. Each component in this collection is of type XmlSchemaExternal. Since the XmlSchema object also exposes an Items collection, one might incorrectly assume that the includes and imports defined in the schema are added to it. In fact, they are kept in the separate collection returned by the Includes property. In our example, the imported address schema is added to the Includes collection, not to the Items collection, of the XmlSchema object for the customer schema. The following sample shows how to import the address.xsd schema into the Customer schema:

private void IncludeSchema(string fileName) {

    // Read address.xsd and customer.xsd into the SOM and compile
    XmlSchema addrSchema = ReadAndCompileSchema("address.xsd");
    XmlSchema custSchema = ReadAndCompileSchema("Customer.xsd");

    //Create the import
    XmlSchemaImport imp = new XmlSchemaImport();
    imp.Namespace = "http://www.example.com/IPO";
    imp.Schema = addrSchema;

    //Add the import to the customer schema
    custSchema.Includes.Add(imp);
    custSchema.Compile(new 
       ValidationEventHandler(ValidationCallbackOne));
    custSchema.Write(Console.Out);
}

We can also set the SchemaLocation property on the XmlSchemaImport object, instead of the Schema property. In this case, during compilation, the schema will be fetched from the location provided.

The following schema is generated from the sample code:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:tns="http://tempuri.org" 
           targetNamespace="http://tempuri.org"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:import namespace="http://www.example.com/IPO"/>
  <xs:element name="Customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="tns:FirstNameCT" />
        <xs:element name="LastName" type="tns:LastNameType" />
        <xs:element name="PhoneNumber">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:pattern value="\d{3}-\d{3}-\d{4}" />
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="CustID" type="xs:positiveInteger"
                    use="required" />
    </xs:complexType>
  </xs:element>
  <xs:simpleType name="LastNameType">
    <xs:restriction base="xs:string">
      <xs:maxLength value="20"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="FirstNameCT">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="Title" type="xs:string" />
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>
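
As a quick check of the two points made above (externals live in Includes rather than Items, and SchemaLocation can stand in for Schema), here is a minimal sketch. It builds a small schema from a string instead of reading Customer.xsd. The inline schema text, the class and method names, and the relative address.xsd location are assumptions made for illustration. Nothing is compiled here, so the location is never fetched.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ImportByLocationSketch
{
    // Stand-in for Customer.xsd (assumption: trimmed to a single element).
    const string CustomerXsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' " +
        "targetNamespace='http://tempuri.org'>" +
        "<xs:element name='Customer' type='xs:string'/>" +
        "</xs:schema>";

    public static XmlSchema BuildSchemaWithImport()
    {
        XmlSchema custSchema =
            XmlSchema.Read(new StringReader(CustomerXsd), null);

        // Describe the import by location instead of by XmlSchema object.
        XmlSchemaImport imp = new XmlSchemaImport();
        imp.Namespace = "http://www.example.com/IPO";
        imp.SchemaLocation = "address.xsd"; // hypothetical relative location

        // The import goes into Includes; Items is left untouched.
        custSchema.Includes.Add(imp);
        return custSchema;
    }

    public static void Main()
    {
        XmlSchema s = BuildSchemaWithImport();
        Console.WriteLine("Includes: " + s.Includes.Count);
        Console.WriteLine("Items:    " + s.Items.Count);
        s.Write(Console.Out);
    }
}
```

Compiling this schema with a non-null resolver would attempt to load address.xsd from the given location, so the file would have to exist there.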

If we have included or imported a number of related schemas into the Customer schema -- Order.xsd, address.xsd, orderdetail.xsd and so on -- the following code demonstrates recursive iteration through them and each of their includes or imports:

private void TraverseExternals() {
    XmlSchema custSchema = ReadAndCompileSchema("Customer.xsd");
    RecurseExternals(custSchema);
}

private void RecurseExternals(XmlSchema schema) {
    foreach(XmlSchemaExternal ext in schema.Includes) {
        if(ext.SchemaLocation != null) {
            Console.WriteLine("External SchemaLocation: " +
                ext.SchemaLocation);
        }
        if (ext is XmlSchemaImport) {
            XmlSchemaImport imp = (XmlSchemaImport)ext;
            Console.WriteLine("Imported Namespace: " + 
                imp.Namespace);
        }

        if (ext.Schema != null) {
            ext.Schema.Write(Console.Out);
            //Traverse its externals
            RecurseExternals(ext.Schema);
        }
    }
}
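
One thing the recursive walk above does not guard against is a cycle, for example two schemas that import each other. If both imports carry resolved Schema objects, the recursion would keep revisiting the same pair. The sketch below shows one possible guard: remember each XmlSchema object already visited. The class and method names and the two hand-wired in-memory schemas are hypothetical. It also assumes a runtime with generic collections.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Schema;

public static class ExternalsWalker
{
    // Returns the namespaces of all imports reachable from 'schema',
    // visiting each XmlSchema object at most once.
    public static List<string> CollectImportedNamespaces(XmlSchema schema)
    {
        List<string> found = new List<string>();
        Walk(schema, new HashSet<XmlSchema>(), found);
        return found;
    }

    static void Walk(XmlSchema schema, HashSet<XmlSchema> visited,
                     List<string> found)
    {
        // HashSet.Add returns false if the schema was already visited.
        if (!visited.Add(schema)) return;

        foreach (XmlSchemaExternal ext in schema.Includes)
        {
            XmlSchemaImport imp = ext as XmlSchemaImport;
            if (imp != null) found.Add(imp.Namespace);

            // Only recurse when the external has been resolved.
            if (ext.Schema != null) Walk(ext.Schema, visited, found);
        }
    }

    public static void Main()
    {
        // Two schemas wired to import each other (hypothetical namespaces).
        XmlSchema a = new XmlSchema();
        a.TargetNamespace = "urn:a";
        XmlSchema b = new XmlSchema();
        b.TargetNamespace = "urn:b";

        XmlSchemaImport aToB = new XmlSchemaImport();
        aToB.Namespace = "urn:b";
        aToB.Schema = b;
        a.Includes.Add(aToB);

        XmlSchemaImport bToA = new XmlSchemaImport();
        bToA.Namespace = "urn:a";
        bToA.Schema = a;
        b.Includes.Add(bToA);

        foreach (string ns in CollectImportedNamespaces(a))
            Console.WriteLine("Imported Namespace: " + ns);
    }
}
```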

A Schema Library -- The XmlSchemaCollection

Most applications that use schemas need to load a set of schemas together for validating instance documents and to cache those schemas for later reuse. The XmlSchemaCollection is designed to serve this purpose. It can store a set of WXS schemas or XML-Data Reduced (XDR) schemas.

The following code sample illustrates the usage of this class:

private static XmlSchemaCollection CreateSchemaCollection() {
    XmlSchemaCollection xsc = new XmlSchemaCollection();
    xsc.ValidationEventHandler += 
        new ValidationEventHandler(ValidationCallbackOne);
    xsc.Add(null,"address.xsd");
    xsc.Add(null,"Customer.xsd");
    return xsc;
}

The Add method loads the schema document into an XmlSchema object, then compiles the schema and adds it to the collection. The first parameter is the namespace that the schema belongs to, which is normally the same as the targetNamespace of the schema. If this parameter is null, the namespace defaults to the targetNamespace defined in the schema.
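
To illustrate the null-namespace behavior described above, the following minimal sketch adds a schema from an in-memory string and then looks it up by its targetNamespace. The inline schema and the class and method names are assumptions made for illustration. The pragma only silences the obsolete warning that later framework versions attach to XmlSchemaCollection.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

#pragma warning disable 618 // XmlSchemaCollection is obsolete in later versions

public static class SchemaCollectionSketch
{
    public const string Ns = "http://tempuri.org";

    // Stand-in for Customer.xsd (assumption: trimmed to a single element).
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' " +
        "targetNamespace='http://tempuri.org'>" +
        "<xs:element name='Customer' type='xs:string'/>" +
        "</xs:schema>";

    public static XmlSchemaCollection Load()
    {
        XmlSchemaCollection xsc = new XmlSchemaCollection();
        // Passing null lets the collection key the schema on its own
        // targetNamespace.
        xsc.Add(null, new XmlTextReader(new StringReader(Xsd)));
        return xsc;
    }

    public static void Main()
    {
        XmlSchemaCollection xsc = Load();
        Console.WriteLine("Contains " + Ns + ": " + xsc.Contains(Ns));
        Console.WriteLine("Schemas in collection: " + xsc.Count);
    }
}
```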

Validation using the XmlSchemaCollection

The XmlValidatingReader class in the .NET Framework validates XML documents against a schema or a set of schemas. The validating reader keeps its schemas in an XmlSchemaCollection for efficient validation; this collection is exposed through the reader's Schemas property. We can add either individual schemas or an entire collection to it. The following code validates the Customer.xml file against the Customer schema.

private static void Validate() {
    XmlTextReader tr = new XmlTextReader("Customer.xml", 
                                         new NameTable());
    XmlValidatingReader vr = new XmlValidatingReader(tr);
    vr.ValidationEventHandler += 
        new ValidationEventHandler(ValidationCallbackOne);
    vr.ValidationType = ValidationType.Schema;
    vr.Schemas.Add(CreateSchemaCollection());
    while(vr.Read()) {} //Validate
}
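
The same pattern can be tried without any files on disk. The sketch below validates an in-memory document against an in-memory schema and collects the validation events in a list instead of using ValidationCallbackOne. The inline schema (which reuses the PhoneNumber pattern from the Customer schema), the class and method names, and the sample values are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

#pragma warning disable 618 // XmlValidatingReader is obsolete in later versions

public static class InMemoryValidation
{
    // Hypothetical one-element schema using the PhoneNumber pattern.
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' " +
        "targetNamespace='http://tempuri.org'>" +
        "<xs:element name='PhoneNumber'>" +
        "<xs:simpleType><xs:restriction base='xs:string'>" +
        "<xs:pattern value='\\d{3}-\\d{3}-\\d{4}'/>" +
        "</xs:restriction></xs:simpleType>" +
        "</xs:element></xs:schema>";

    // Returns one message per validation event; empty means valid.
    public static List<string> Validate(string xml)
    {
        List<string> problems = new List<string>();
        XmlValidatingReader vr = new XmlValidatingReader(
            new XmlTextReader(new StringReader(xml)));
        vr.ValidationType = ValidationType.Schema;
        vr.ValidationEventHandler +=
            (sender, e) => problems.Add(e.Severity + ": " + e.Message);
        vr.Schemas.Add(null, new XmlTextReader(new StringReader(Xsd)));
        try
        {
            while (vr.Read()) { } // Reading drives validation
        }
        finally
        {
            vr.Close();
        }
        return problems;
    }

    public static void Main()
    {
        string good = "<PhoneNumber xmlns='http://tempuri.org'>" +
                      "555-123-4567</PhoneNumber>";
        string bad = "<PhoneNumber xmlns='http://tempuri.org'>" +
                     "5551234567</PhoneNumber>";
        Console.WriteLine("good: " + Validate(good).Count + " problem(s)");
        foreach (string p in Validate(bad)) Console.WriteLine("bad: " + p);
    }
}
```

Collecting messages in a list rather than writing them to the console makes it easier for calling code to decide what to do with an invalid document.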

Security in the SOM -- Can You Trust Your Schema?

One reason WXS schemas are important to XML is that they set the ground rules for communication across different XML applications. When schemas have to be exchanged across applications, situations may arise in which you cannot trust the source from which your schemas originated. A question was recently asked on one of our discussion lists about how security is handled in the SOM:

"The larger question I’m investigating is that my app reads XSD schemas, and I’d like to attempt to find externally referenced schemas for the user. I’m trying to figure out if I can safely open imports or includes that have a schemaLocation value without allowing a malicious schema to call a harmful URL with my user’s credentials (like schemaLocation="http://my401k.com/sellEverything?sendCheckTo=MyAddress")."

The SOM lets you choose not to resolve the externals when the schema does not come from a trustworthy source. The Compile method on the XmlSchema class has an overload that takes an XmlResolver. The Add() methods on XmlSchemaCollection have corresponding overloads that take an XmlResolver as well. In both cases, you can pass in null for the resolver so that the schemaLocation values of the includes, imports, and redefines are not resolved. This is available in the Beta 1 release of the .NET Framework. You can also create your own custom resolver by subclassing the XmlResolver class and overriding the GetEntity method.
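
As a minimal sketch of the null-resolver option described above, the following code compiles a schema whose import carries a schemaLocation and then checks that the import was left unresolved. The inline schema, the class and method names, and the example.com location are hypothetical. The pragma only silences the obsolete warning that later framework versions attach to XmlSchema.Compile.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

#pragma warning disable 618 // XmlSchema.Compile is obsolete in later versions

public static class UntrustedSchemaSketch
{
    // Stand-in for a schema from an untrusted source. The schemaLocation
    // URL is hypothetical and is never contacted in this sketch.
    const string UntrustedXsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' " +
        "targetNamespace='http://tempuri.org'>" +
        "<xs:import namespace='http://www.example.com/IPO' " +
        "schemaLocation='http://www.example.com/schemas/address.xsd'/>" +
        "<xs:element name='Customer' type='xs:string'/>" +
        "</xs:schema>";

    // Compiles the schema with a null resolver so that externals are
    // not fetched; validation events are appended to 'events'.
    public static XmlSchema CompileWithoutResolving(List<string> events)
    {
        XmlSchema schema =
            XmlSchema.Read(new StringReader(UntrustedXsd), null);
        schema.Compile(
            (sender, e) => events.Add(e.Severity + ": " + e.Message),
            null);
        return schema;
    }

    public static void Main()
    {
        List<string> events = new List<string>();
        XmlSchema schema = CompileWithoutResolving(events);

        foreach (XmlSchemaExternal ext in schema.Includes)
        {
            Console.WriteLine("SchemaLocation: " + ext.SchemaLocation);
            Console.WriteLine("Resolved: " + (ext.Schema != null));
        }
        foreach (string msg in events) Console.WriteLine(msg);
    }
}
```

With the externals left unresolved, the application can inspect each SchemaLocation itself and decide whether it is safe to load before fetching anything.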

Further Reading

The following are links to the complete reference on the Schema Object Model, the class hierarchy in the System.Xml.Schema namespace in the .NET Framework class library.

Acknowledgments

Thanks to Dare Obasanjo, Mark Fussell, Tejal Joshi and Yan Leshinsky for reviewing this article and providing feedback.


