Why You Should Very Carefully Use Restriction Of Complex Types
Restriction of complex types involves creating a derived complex type whose content model is a subset of the base type.
The parts of the WXS spec that describe derivation by restriction in complex types (Section 3.4.6 and Section 3.9.6) are generally considered to be its most complex parts. Most bugs in implementations cluster around this feature, and it is quite common to see implementers express exasperation when discussing the various nuances of derivation by restriction in complex types. Further, this kind of derivation does not map neatly to concepts in either object-oriented programming or relational database theory, the two worlds that are the primary producers and consumers of XML data. This is the exact opposite of the situation with derivation by extension of complex types.
Another challenge in using derivation by restriction of complex types
arises from the way in which restrictions are declared: when a complex
type is derived by restriction from another complex type, the base type's
content model must be duplicated and refined. This replicates
definitions, possibly down a long derivation chain, so any modification
to an ancestor type must be manually propagated down the derivation tree.
Furthermore, such replication cannot cross namespace boundaries: deriving
ns2:SlowCar from ns1:Car may not work if ns2:SlowCar has a child element,
ns2:MaxSpeed, because it cannot be correctly derived from ns1:Car's child
element, ns1:MaxSpeed.
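The namespace problem can be sketched with two small schema documents. Everything here beyond the names ns1:Car, ns2:SlowCar and MaxSpeed is an assumption made for illustration: the namespace URIs, the file name ns1.xsd, the element types, and the choice of elementFormDefault="qualified" (which is what puts local elements into the target namespace).

```xml
<!-- ns1.xsd (hypothetical): the base type -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.example.com/ns1"
           elementFormDefault="qualified">
  <xs:complexType name="Car">
    <xs:sequence>
      <!-- local element, so its qualified name is {ns1}MaxSpeed -->
      <xs:element name="MaxSpeed" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

```xml
<!-- ns2.xsd (hypothetical): the attempted restriction -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ns1="http://www.example.com/ns1"
           targetNamespace="http://www.example.com/ns2"
           elementFormDefault="qualified">
  <xs:import namespace="http://www.example.com/ns1" schemaLocation="ns1.xsd"/>
  <xs:complexType name="SlowCar">
    <xs:complexContent>
      <xs:restriction base="ns1:Car">
        <xs:sequence>
          <!-- declared locally in ns2, so this is {ns2}MaxSpeed;
               it does not match {ns1}MaxSpeed in the base content model -->
          <xs:element name="MaxSpeed">
            <xs:simpleType>
              <xs:restriction base="xs:integer">
                <xs:maxInclusive value="60"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:element>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

A conforming processor should reject SlowCar, because a restricted element particle must have the same name and target namespace as the particle it restricts.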
The following schema uses derivation by restriction to restrict a complex
type, which describes a subscriber to the XML-DEV mailing list, to a type
that describes me. Any element that conforms to the DareObasanjo type can
also be validated as an instance of the XML-Deviant type.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- base type -->
  <xs:complexType name="XML-Deviant">
    <xs:sequence>
      <xs:element name="numPosts" type="xs:integer" minOccurs="0" maxOccurs="1" />
      <xs:element name="signature" type="xs:string" nillable="true" />
    </xs:sequence>
    <xs:attribute name="firstSubscribed" type="xs:date" use="optional" />
    <xs:attribute name="mailReader" type="xs:string"/>
  </xs:complexType>

  <!-- derived type -->
  <xs:complexType name="DareObasanjo">
    <xs:complexContent>
      <xs:restriction base="XML-Deviant">
        <xs:sequence>
          <xs:element name="numPosts" type="xs:integer" minOccurs="1" />
          <xs:element name="signature" type="xs:string" nillable="false" />
        </xs:sequence>
        <xs:attribute name="firstSubscribed" type="xs:date" use="required" />
        <xs:attribute name="mailReader" type="xs:string" fixed="Microsoft Outlook" />
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
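To see the restriction at work, consider an instance document. The schema above declares no global elements, so this sketch assumes a hypothetical declaration named subscriber was added to it; the element name and all data values are made up for illustration.

```xml
<!-- assumed addition to the schema above, not part of the original -->
<xs:element name="subscriber" type="XML-Deviant"
            xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
```

```xml
<!-- instance that selects the restricted type through xsi:type -->
<subscriber xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:type="DareObasanjo"
            firstSubscribed="2001-01-01"
            mailReader="Microsoft Outlook">
  <numPosts>100</numPosts>
  <signature>placeholder signature</signature>
</subscriber>
```

Removing the xsi:type attribute leaves the instance valid against XML-Deviant. The reverse does not hold: an XML-Deviant instance that omits numPosts or firstSubscribed, or that names a different mailReader, is not a valid DareObasanjo.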
Derivation by restriction of complex types is a multifaceted feature that is useful in situations where secondary types need to conform to a generic primary type, but also add their own constraints which go beyond those of the primary type. However, its extreme complexity requires that it be used only by those who have a firm grasp of WXS.
Why You Should Carefully Use Abstract Types
Borrowing a concept from OOP languages like C# and Java, both element declarations and complex type definitions can be made abstract. An abstract element declaration cannot be used to validate an element in an XML instance document and can only appear in content models via substitution. An abstract complex type definition similarly cannot be used to validate an element in an XML instance document, but it can be used as the abstract parent of an element's derived type or in cases where the element's type is overridden in the instance using xsi:type.
Abstract complex types and element declarations are useful for creating
generic base types which contain information common to a set of types
(for example, Shape vs. Circle or Square), yet the definition is
not deemed "complete" unless further derivation (extension or restriction)
has been applied. While this feature is not complicated to use, some
implications of its use are subtle and complex. Abstract types should be
used with care.
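A minimal sketch of the Shape and Circle case follows. The child elements label and radius, and the global element shape, are hypothetical names chosen for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- abstract base: holds what all shapes share, never used directly -->
  <xs:complexType name="Shape" abstract="true">
    <xs:sequence>
      <xs:element name="label" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- concrete type derived by extension -->
  <xs:complexType name="Circle">
    <xs:complexContent>
      <xs:extension base="Shape">
        <xs:sequence>
          <xs:element name="radius" type="xs:double"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- declared with the abstract type, so instances must name a concrete one -->
  <xs:element name="shape" type="Shape"/>

</xs:schema>
```

```xml
<shape xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:type="Circle">
  <label>unit circle</label>
  <radius>1.0</radius>
</shape>
```

Without the xsi:type attribute the shape element would be invalid, because its declared type is abstract. That requirement on every instance is one of the subtler consequences of the design.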
Do Use Wildcards to Provide Well Defined Points Of Extensibility
WXS provides the wildcards xs:any and xs:anyAttribute, which can be used
to allow elements and attributes from specified namespaces to occur in a
content model. Wildcards allow schema authors to enable extensibility of
the content model while maintaining a degree of control over the
occurrence of elements and attributes. A good discussion of the benefits
of using wildcards is available in an XML.com article, "W3C XML Schema
Design Patterns: Dealing With Change".
Cautious schema authors, concerned with the problems posed by type
derivation, may choose to block attempts at type derivation using the
final attribute on complex type definitions and element declarations
(similar to sealed in C# and final in Java). They may then choose to
allow extensibility at specific parts of the content model by using
wildcards. This gives schema authors more control over the content
models they define and may reduce some of the problems with various
aspects of complex type derivation (specifically derivation by
extension).
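As a rough sketch of this approach, the following type is closed to derivation but open to foreign content at one defined point. The type name Book, its child elements, and the target namespace are all hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.example.com/books"
           elementFormDefault="qualified">

  <!-- final="#all" blocks derivation by both extension and restriction -->
  <xs:complexType name="Book" final="#all">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="author" type="xs:string"/>
      <!-- the single extension point: any elements from other namespaces -->
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <!-- attributes from other namespaces are allowed as well -->
    <xs:anyAttribute namespace="##other" processContents="lax"/>
  </xs:complexType>

</xs:schema>
```

Restricting the wildcards to ##other keeps extensions out of the schema's own namespace, and processContents="lax" validates extension content only when a declaration for it is available.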
Wildcards must themselves be used with care, because a carelessly placed
wildcard can make a content model ambiguous, as in the following schema.

<?xml version="1.0" encoding="utf-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.example.com/fruit/"
           elementFormDefault="qualified">
  <xs:complexType name="myKitchen">
    <xs:choice maxOccurs="unbounded">
      <xs:any processContents="skip" />
      <xs:element name="apple" type="xs:string"/>
      <xs:element name="cherry" type="xs:string"/>
    </xs:choice>
  </xs:complexType>
</xs:schema>
The content model of the myKitchen type is such that it can contain one
or more apple, cherry, or any other element. However, during validation,
if an apple element is seen, the compiler cannot tell whether it should
be validated against the wildcard or the apple element declaration.
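One way to remove the ambiguity in the myKitchen example, offered as a sketch rather than the only fix, is to limit the wildcard to namespaces other than the target namespace. Because elementFormDefault is "qualified", apple and cherry are in the target namespace and can then never match the wildcard.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.example.com/fruit/"
           elementFormDefault="qualified">
  <xs:complexType name="myKitchen">
    <xs:choice maxOccurs="unbounded">
      <!-- ##other excludes the target namespace, so apple and cherry
           can only match their own declarations -->
      <xs:any namespace="##other" processContents="skip" />
      <xs:element name="apple" type="xs:string"/>
      <xs:element name="cherry" type="xs:string"/>
    </xs:choice>
  </xs:complexType>
</xs:schema>
```

The trade-off is that the type no longer accepts arbitrary elements from its own namespace, which is usually the intent for an extension point anyway.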
There are subtle but potentially profound ramifications to the
selection of both the namespace attribute and the
processContents attribute. Overly restrictive values can
impede extensibility; overly loose values can open the schema up to
abuse. Controlling the supported namespaces for a wildcard can also be
bewildering, especially when the set of allowable namespaces is subject to
Do Not Use Group or Type Redefinition
Redefinition is a feature of WXS that allows you to change the meaning of an included type or group definition. Using xs:redefine, schema authors can include type or group definitions from schema documents and alter these definitions in a pervasive manner. Redefinition is pervasive because it affects type or group definitions not only in the including schema but also in the included schema. Thus all references to the original type or group in both schemas refer to the redefined type, while the original definition is overshadowed. This leads to the problems pointed out in "W3C XML Schema Design Patterns: Dealing With Change":
This causes a certain degree of fragility because redefined types can adversely interact with derived types and generate conflicts. A common conflict is when a derived type uses extension to add an element or attribute to a type's content model, and a redefinition also adds a similarly named element or attribute to the content model.
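The conflict described in that passage can be sketched with two hypothetical schema documents. The type names Person and Employee, the element names, and the file name base.xsd are assumptions made for illustration.

```xml
<!-- base.xsd (hypothetical): Employee extends Person with an id element -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Person">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Employee">
    <xs:complexContent>
      <xs:extension base="Person">
        <xs:sequence>
          <xs:element name="id" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

```xml
<!-- redefining schema (hypothetical): also adds an id element to Person -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:redefine schemaLocation="base.xsd">
    <xs:complexType name="Person">
      <xs:complexContent>
        <xs:extension base="Person">
          <xs:sequence>
            <xs:element name="id" type="xs:integer"/>
          </xs:sequence>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>
  </xs:redefine>
</xs:schema>
```

Because the redefinition is pervasive, Employee now derives from the redefined Person, and its content model ends up with two id elements of different types. A conforming processor should report this as an inconsistent pair of element declarations, even though neither document looks wrong on its own.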
A major problem with type redefinition is that, unlike type derivation,
it cannot be prevented by using the block or final attributes. Any schema
can therefore have its types redefined in a pervasive manner, altering
their semantics completely. It is advisable to avoid this feature due to
the potential conflicts it can cause.
Many schema authors attempt to use type redefinition to increase the value space of an enumeration, but this does not work. The only way to increase the number of values accepted by an enumeration used as a base type is to create a union. However, those additional values are available only where the resulting union type is used, not where the original base type is used. Also note that chained redefinitions (redefining a redefine) can be problematic, resulting in unexpected definition clashes.
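The union approach can be sketched as follows. The type names and enumeration values are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- the original enumeration, left untouched -->
  <xs:simpleType name="PrimaryColor">
    <xs:restriction base="xs:string">
      <xs:enumeration value="red"/>
      <xs:enumeration value="green"/>
      <xs:enumeration value="blue"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- the additional values, declared as their own enumeration -->
  <xs:simpleType name="ExtraColor">
    <xs:restriction base="xs:string">
      <xs:enumeration value="orange"/>
      <xs:enumeration value="purple"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- a new type that accepts the values of both member types -->
  <xs:simpleType name="AnyColor">
    <xs:union memberTypes="PrimaryColor ExtraColor"/>
  </xs:simpleType>

</xs:schema>
```

Elements and attributes declared with PrimaryColor still accept only the original three values; only declarations that use AnyColor see the larger set.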
The WXS recommendation is a complex specification because it attempts to solve complex problems. One can reduce its burdens by utilizing its simpler aspects. Schema authors should ensure that their schemas validate in multiple schema processors. Schemas are an important facilitator of interoperability. It's foolish to depend on the nuances of a specific implementation and inadvertently give up this interoperability.
I'd like to thank Priya Lakshminarayanan and Mark Feblowitz for their help with this article.