XML.com

Co-occurrence constraints and Conditional Type Assignment, with XML Schema 1.1

May 29, 2018

Mukul Gandhi

This article discusses two useful features of XML Schema 1.1: "co-occurrence constraints" and "Conditional Type Assignment".

1. Introduction

This article discusses the XML Schema 1.1 language, and in particular two of its features in detail: "co-occurrence constraints" and "Conditional Type Assignment". The article assumes that the reader knows XML and XML Namespaces, and has a basic knowledge of XML Schema (most likely version 1.0 of the language). The very basics of these technologies are not covered here. XML Schema 1.0 Second Edition has been a W3C standard since October 2004, and it is widely implemented in numerous products and libraries. XML Schema 1.1 became a W3C standard in April 2012, and implementations are already available in a few products and libraries. This article uses Xerces-J to test the examples presented. Other compliant products should exhibit similar behavior.

Some forms of co-occurrence constraints were already available in XML Schema 1.0, and they are discussed in this article as well. Conditional Type Assignment is a completely new facility introduced in the XML Schema 1.1 language.

The XML Schema 1.1 language is backward compatible with XML Schema 1.0. This means that XML Schema 1.0 validations will run fine with an XML Schema 1.1 processor. An XML Schema document describes the structure and data types for a certain class of XML documents (for example, a schema document can describe what a purchase order should look like as an XML document). To do this, the schema document uses notions like element and attribute declarations, and complex and simple type definitions. An XML Schema document is also a well-formed XML document, so it can be processed just like any other XML document. But XML Schema documents are special in the sense that they contain elements and attributes from the XML namespace "http://www.w3.org/2001/XMLSchema" (a schema document can also contain XML items from other namespaces, as well as items in no namespace).

2. XML Schema example

Before going deeper into the main topics of this article, I would like to present a simple example consisting of an XML instance document, its corresponding schema, and the validation of the instance document against that schema. This is intended to set the right technical context for the rest of the article.

Consider the following example, in which an XML document holding data about a person is validated with an XML Schema.

XML Schema document:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="person">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="fName" type="xs:string"/>
                <xs:element name="mName" type="xs:string" minOccurs="0"/>
                <xs:element name="lName" type="xs:string"/>
                <xs:element name="gender">
                    <xs:simpleType>
                        <xs:restriction base="xs:string">
                            <xs:enumeration value="M"/>
                            <xs:enumeration value="F"/>
                        </xs:restriction>
                    </xs:simpleType>
                </xs:element>
                <xs:element name="dob" type="xs:date" minOccurs="0"/> 
            </xs:sequence> 
        </xs:complexType> 
    </xs:element>

</xs:schema>

The following XML instance document is valid according to the above schema document:

<person>
    <fName>Mukul</fName>
    <lName>Gandhi</lName>
    <gender>M</gender>
</person>

(the XML instance document omits the elements "mName" and "dob", which are declared optional in the XML schema).
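
For contrast, here is a sketch of an instance that the same schema should reject (the values are made up for illustration). The "gender" value is not one of the enumerated values "M" and "F", and the required "lName" element is missing:

```xml
<person>
    <fName>Mukul</fName>
    <gender>X</gender>
</person>
```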

Following a well-known convention, XML documents that are validated with a schema are referred to as XML instance documents in this article.

3. Why use co-occurrence constraints and conditional type assignments in XML validations?

The nature of data (in particular XML data) is such that co-occurrence constraints and conditional type assignment are among the natural constraints that can exist in XML documents. Users of the XML Schema language have been asking for these features for a long time, and they are now available in XML Schema 1.1. The paragraphs below describe the benefits of these features.

Co-occurrence constraints: It is a common requirement that different elements and attributes in an XML document relate to each other by certain conditions. Some examples are:

  1. On an XML element, attribute "min" must be less than attribute "max".
  2. The sum of data in a sequence of elements must meet a certain condition (for example, it must be equal to, less than, or greater than some value).
  3. A specific element's value must relate in a certain way to a specific attribute's value.

We can imagine many other such constraints, which may collectively be termed "co-occurrence constraints".

Conditional type assignment: This is a specific kind of constraint that solves the following problem when modeling XML data with XML schemas: some properties of an element (mainly the presence or absence of its attributes, or their values) may require a particular type (simple or complex) to be assigned to the element.

These aspects will become clearer in the following sections of this article.

4. Co-occurrence constraints

The XML Schema 1.1 specification defines schema co-occurrence constraints as follows:

"constraints which make the presence of an attribute or element, or the values allowable for it, depend on the value or presence of other attributes or elements".

XML Schema 1.0 provided certain kinds of co-occurrence constraints through the following elements in the schema document: unique, key and keyref (these constructs are known as Identity Constraints, or IDC, in the XML Schema language). These elements are specified within element declarations in a schema document to establish co-occurrence constraints. The unique, key and keyref constraints are available in the XML Schema 1.1 language as well.

At a high level, both "key" and "unique" constraints require that certain values in the XML document be distinct (i.e. all different).

Following is an example of using the "key" element in an XML Schema document.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

        <xs:element name="students">
            <xs:complexType>
               <xs:sequence>
                  <xs:element name="student" maxOccurs="unbounded" type="Student"/>
               </xs:sequence> 
            </xs:complexType>
            <xs:key name="id_key">
                <xs:selector xpath="student"/>
                <xs:field xpath="id"/>
            </xs:key>
        </xs:element>

        <xs:complexType name="Student">
            <xs:sequence>
                <xs:element name="id" type="xs:integer" minOccurs="0"/>
            </xs:sequence>
        </xs:complexType>

</xs:schema>

Interestingly, the "id" element within the "student" element is specified as optional in the above schema document. I've chosen this to illustrate the difference between the "key" and "unique" IDC elements.

Following is one valid XML document according to the above schema document:

<?xml version="1.0"?>
<students>
    <student>
       <id>1</id>
    </student>
    <student>
       <id>2</id>
    </student>
    <student>
       <id>3</id>
    </student>
    <student>
       <id>4</id>
    </student>
    <student>
       <id>5</id>
    </student>
</students>

Following are two invalid XML documents for the schema presented in this example:

<?xml version="1.0"?>
<students>
    <student>
       <id>1</id>
    </student>
    <student>
       <id>2</id>
    </student>
    <student>
       <id>2</id>
    </student>
    <student>
       <id>4</id>
    </student>
    <student>
       <id>5</id>
    </student>
</students>

(in this example, a particular "id" value occurs more than once, which violates the "key" constraint)

and,

<?xml version="1.0"?>
<students>
    <student>
       <id>1</id>
    </student>
    <student>
       <id>2</id>
    </student>
    <student>
       <id>3</id>
    </student>
    <student>
       
    </student>
    <student>
       <id>5</id>
    </student>
</students>

(in this example, one "student" element has no "id" child; although the "id" element is declared optional in the schema, it must be present when a "key" constraint is specified on it)

If, in the XSD document of this example, we specified a "unique" element instead of "key", as follows:

<xs:unique name="id_unique">
    <xs:selector xpath="student"/>
    <xs:field xpath="id"/>
</xs:unique>

and everything else in the XSD document remained the same, then the following two XML documents would be reported as valid by such an XSD document:

<?xml version="1.0"?>
<students>
    <student>
        <id>1</id>
    </student>
    <student>
        <id>2</id>
    </student>
    <student>

    </student>
    <student>
        <id>4</id>
    </student>
    <student>
        <id>5</id>
    </student>
</students>

and,

<?xml version="1.0"?>
<students>
    <student>
        <id>1</id>
    </student>
    <student>
        <id>2</id>
    </student>
    <student>

    </student>
    <student>
        
    </student>
    <student>
        <id>5</id>
    </student>
</students>

The "unique" element in an XSD document accepts that the XML item pointed to by xs:field can be absent. The "key" element does not allow this (all XML items pointed to by xs:field must exist, and must have valid values).

I won't explain the "keyref" element in this article; readers who are interested can read about it elsewhere.
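
Note that "unique" still requires the values that are present to be distinct. As an illustration (this instance is a sketch, not one of the documents shown above), the following document should be reported as invalid with both the "key" and the "unique" variants of the schema, because the "id" value 2 occurs twice:

```xml
<?xml version="1.0"?>
<students>
    <student>
        <id>1</id>
    </student>
    <student>
        <id>2</id>
    </student>
    <student>

    </student>
    <student>
        <id>2</id>
    </student>
</students>
```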

4.1 XML Schema 1.1 co-occurrence constraints

In this section, I explain the use of the following XML Schema 1.1 constructs: <xs:assert> and <xs:assertion>. The <xs:assert> construct provides co-occurrence constraints. <xs:assertion> looks similar to <xs:assert>, but it is specified as a facet in the definitions of XSD simple types. Although <xs:assertion> is not a co-occurrence construct, I explain it in this section as well, since <xs:assert> and <xs:assertion> are syntactically very similar.

I'll copy a few definitions from the XML Schema 1.1 specification below, to illustrate where in an XSD document the <xs:assert> element can occur (look for "assert*" at the end of the content models).

XML Representation Summary: complexType Element Information Item

<complexType
    abstract = boolean : false
    block = (#all | List of (extension | restriction)) 
    final = (#all | List of (extension | restriction)) 
    id = ID
    mixed = boolean
    name = NCName
    defaultAttributesApply = boolean : true
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, (simpleContent | complexContent | (openContent?, (group | all | choice | sequence)?, ((attribute | attributeGroup)*, anyAttribute?), assert*)))
</complexType>

XML Representation Summary: simpleContent Element Information Item et al.

<simpleContent
    id = ID
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, (restriction | extension))
</simpleContent>
<restriction
    base = QName
    id = ID
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, (simpleType?, (minExclusive | minInclusive | maxExclusive | maxInclusive | totalDigits | fractionDigits | length | minLength | maxLength | enumeration | whiteSpace | pattern | assertion | {any with namespace: ##other})*)?, ((attribute | attributeGroup)*, anyAttribute?), assert*)
</restriction>
<extension
    base = QName
    id = ID
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, ((attribute | attributeGroup)*, anyAttribute?), assert*)
</extension>

XML Representation Summary: complexContent Element Information Item et al.

<complexContent
    id = ID
    mixed = boolean
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, (restriction | extension))
</complexContent>
<restriction
    base = QName
    id = ID
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, openContent?, (group | all | choice | sequence)?, ((attribute | attributeGroup)*, anyAttribute?), assert*)
</restriction>
<extension
    base = QName
    id = ID
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, openContent?, ((group | all | choice | sequence)?, ((attribute | attributeGroup)*, anyAttribute?), assert*))
</extension>

As can be seen from the above definitions, <xs:assert> elements can be written in complex type definitions, and they can occur zero or more times. The asserts are written at the end of complex type definitions.

The element <xs:assert> is itself defined as follows in the XML Schema 1.1 specification:
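
As a minimal sketch of that placement (the element name "range" and its children and attributes are hypothetical, chosen only for illustration, and the fragment is assumed to sit inside an <xs:schema> element), the assert comes after the content model and the attribute declarations:

```xml
<xs:element name="range">
    <xs:complexType>
        <!-- 1. content model -->
        <xs:sequence>
            <xs:element name="label" type="xs:string"/>
        </xs:sequence>
        <!-- 2. attribute declarations -->
        <xs:attribute name="low" type="xs:int" use="required"/>
        <xs:attribute name="high" type="xs:int" use="required"/>
        <!-- 3. zero or more asserts, always last -->
        <xs:assert test="@low le @high"/>
    </xs:complexType>
</xs:element>
```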

XML Representation Summary: assert Element Information Item

<assert
    id = ID
    test = an XPath expression
    xpathDefaultNamespace = (anyURI | (##defaultNamespace | ##targetNamespace | ##local)) 
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?)
</assert>

Quoting from the XML Schema 1.1 specification, "An assertion is a predicate associated with a type, which is checked for each instance of the type. If an element or attribute information item fails to satisfy an assertion associated with a given type, then that information item is not locally valid with respect to that type."

What this essentially means is that if assertions are specified on a complex type, for example, then every XML instance fragment validated by that complex type is also validated by the rules expressed in the assertions. An assertion, when evaluated on an XML instance document, produces a boolean 'true' or 'false' outcome. If the evaluation of any assertion results in 'false', then the XML instance fragment concerned, and the XML instance document as a whole, is reported as invalid by the XML Schema 1.1 processor. Assertions are specified as XPath 2.0 expressions, which are evaluated on a strongly typed XML tree (the types being XML Schema types) rooted at the element that the complex type is validating.

Let us look at a few XML Schema use cases where assertions could be useful. The following examples, with relevant explanations, illustrate this.

Assertion example 1:

Consider the following XSD 1.1 schema:

(the complex type of this XSD document is borrowed from the XML Schema 1.1 specification)

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="data">
        <xs:complexType>
            <xs:attribute name="min" type="xs:int"/>
            <xs:attribute name="max" type="xs:int"/>
            <xs:assert test="@min le @max"/>
        </xs:complexType>
    </xs:element>

</xs:schema>

(the assertion in this example requires that the value of the "min" attribute be less than or equal to that of the "max" attribute for the XML document to be considered valid)

Following is a valid XML document, when validated by the above schema:

<?xml version="1.0"?>
<data min="5" max="10"/>

Whereas the following XML document is invalid when validated by the schema in this example:

<?xml version="1.0"?>
<data min="5" max="2"/>

The XML Schema 1.1 specification allows any number of <xs:assert> elements (zero or more) at a particular point. Consider the following XSD document, which is a slight variation of the XSD document specified above:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="data">
        <xs:complexType>
            <xs:attribute name="min" type="xs:int"/>
            <xs:attribute name="max" type="xs:int"/>
            <xs:assert test="@min le @max"/>
            <xs:assert test="@min gt 2"/>
        </xs:complexType>
    </xs:element>

</xs:schema>

In this example we have two <xs:assert> elements, both of which must evaluate to the boolean value 'true' for the validation to pass. The second <xs:assert> specifies that the "min" attribute must be greater than 2.

Following is a valid XML document for the above schema:

<?xml version="1.0"?>
<data min="5" max="10"/>

Whereas the following XML document will be reported as invalid against the schema specified in this example:

<?xml version="1.0"?>
<data min="1" max="10"/>

Interestingly, the two asserts specified in this example can be combined into the following single <xs:assert>:

<xs:assert test="(@min le @max) and (@min gt 2)"/>

(note the "and" operator in the assert's XPath expression)

Note that any particular assert can use the full schema-aware XPath 2.0 language (see the References section for a link to the XPath 2.0 language specification).
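
One detail of example 1 may be worth a sketch. Both attributes are optional in that schema, and in XPath 2.0 a value comparison such as "le" with an empty operand yields an empty sequence, whose effective boolean value is 'false'. An instance like <data/> should therefore fail the assertion. If that is not the intent, one option (an assumption about the desired behavior, not part of the original example) is to make both attributes required:

```xml
<xs:element name="data">
    <xs:complexType>
        <!-- use="required" is an addition made for this sketch -->
        <xs:attribute name="min" type="xs:int" use="required"/>
        <xs:attribute name="max" type="xs:int" use="required"/>
        <xs:assert test="@min le @max"/>
    </xs:complexType>
</xs:element>
```

Another option is to keep the attributes optional and guard the comparison, for example with the test expression not(@min) or not(@max) or (@min le @max).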

Assertion example 2:

Consider the following XSD document:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="IDVals">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="id" type="xs:integer" maxOccurs="unbounded"/>
            </xs:sequence>
            <xs:assert test="count(id) eq count(distinct-values(id))"/>
        </xs:complexType>
    </xs:element>

</xs:schema>

The assertion in the above XSD document specifies that all "id" element values must be distinct.

Following is a valid XML document as per the schema above:

<?xml version="1.0"?>
<IDVals>
    <id>1</id>
    <id>2</id>
    <id>3</id>
    <id>4</id>
    <id>5</id>
</IDVals>

Whereas the following is an invalid XML document when validated by the schema document provided in this example:

<?xml version="1.0"?>
<IDVals>
    <id>1</id>
    <id>2</id>
    <id>3</id>
    <id>4</id>
    <id>2</id>
</IDVals>

(the value 2 has occurred more than once)

It can be argued that the assertion in this example specifies an IDC-like constraint (as with the "key" or "unique" XSD elements). It is really up to the user's taste whether to use IDC constraints or assertions for a requirement like the one in this example.
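
For comparison, here is a sketch of how the same requirement might be written with an IDC instead of an assertion. The constraint name "id_unique" is chosen for illustration; the selector picks the "id" children of "IDVals", and the field "." refers to each selected element's own value:

```xml
<xs:element name="IDVals">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="id" type="xs:integer" maxOccurs="unbounded"/>
        </xs:sequence>
    </xs:complexType>
    <xs:unique name="id_unique">
        <xs:selector xpath="id"/>
        <xs:field xpath="."/>
    </xs:unique>
</xs:element>
```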

Assertion example 3:

Let us look at another XSD example, which checks sort order using an <xs:assert> expression. Consider the following XSD document:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="IDVals">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="id" type="xs:integer" maxOccurs="unbounded"/>
            </xs:sequence>
            <xs:assert test="every $val in (for $x in 1 to count(id)-1 return (id[$x] le id[$x+1])) satisfies ($val eq true())"/>
        </xs:complexType>
    </xs:element>

</xs:schema>

The <xs:assert> element in this XSD document requires that the values of the "id" elements occur in ascending sorted order. In this example, consecutive equal values are considered sorted.

Let's try to understand the XPath 2.0 expression in this example. A "for" expression is embedded in an "every" expression. The "for" expression returns a sequence of boolean values, one for each pair of adjacent "id" elements (a 'true' means that a value is less than or equal to the value that follows it). The "every" construct specifies that each item in the sequence returned by the "for" expression must be 'true'. The XPath expression is a naive implementation of a sorted-order check.
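
The same check can likely be written more compactly, because "every ... satisfies" can range directly over the positions. The following is a sketch of an equivalent assertion, not the form used in the example above:

```xml
<xs:assert test="every $x in 1 to count(id) - 1 satisfies id[$x] le id[$x + 1]"/>
```

With a single "id" element, the range "1 to 0" is empty, so the "every" expression is 'true' and the document passes.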

Assertion example 4:

Assertions also provide new string processing capabilities for complex types with "mixed" content models. Following is one example of this. Consider the following XSD document:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="X">
        <xs:complexType mixed="true">
            <xs:sequence>
                <xs:element name="x" type="xs:string" maxOccurs="unbounded"/>
            </xs:sequence>
            <xs:assert test="not(contains(string-join(text(),''),'prohibited'))"/>
        </xs:complexType>
    </xs:element>

</xs:schema>

(note mixed="true" on complexType, and the <xs:assert> element)

The assertion specifies that the word "prohibited" must not occur in the text nodes that are children of the element node "X".
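
As a variation (a sketch, not part of the example schema), the check could be made case-insensitive with the XPath 2.0 function "lower-case", so that "Prohibited" or "PROHIBITED" in the child text nodes is rejected as well:

```xml
<xs:assert test="not(contains(lower-case(string-join(text(), '')), 'prohibited'))"/>
```

The documents that follow are discussed against the original, case-sensitive assertion.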

Following are two valid XML documents for the above schema document:

<?xml version="1.0"?>
<X>
    <x>a</x>
    <x>b</x>
    <x>c</x>
    <x>d</x>
    <x>e</x>
</X>

and,

<?xml version="1.0"?>
<X>
    <x>a</x>
    <x>b</x>
    <x>c</x>
    f
    <x>d</x>
    <x>e</x>
</X>

Whereas the following is one invalid XML document when validated by the XSD document provided in this example:

<?xml version="1.0"?>
<X>
    <x>a</x>
    <x>b</x>
    <x>c</x>
    prohibited
    <x>d</x>
    <x>e</x>
</X>

Assertion example 5:

Assertions also allow arbitrary string processing when using the <xs:any> wildcard schema element (the <xs:any> wildcard in the schema allows any element to occur at its position in the XML instance document). Consider the following XSD document:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="X">
        <xs:complexType>
            <xs:sequence>
                <xs:any processContents="skip"/>
            </xs:sequence>
            <xs:assert test="not(temp)"/>
            <xs:assert test="string-length(string(*[1])) lt 10"/>
        </xs:complexType>
    </xs:element>

</xs:schema>

The first assertion specifies that the child element of "X" must not be named "temp". The second assertion specifies that the string value of the child element of "X" must have a maximum length of 9.

Following is one valid XML document for the XSD schema shown above:

<?xml version="1.0"?>
<X>
   <x>abcde</x>
</X>

Whereas the following two XML documents would be invalid when validated by the same XSD schema:

<?xml version="1.0"?>
<X>
   <temp>abcde</temp>
</X>

(element "temp" is not allowed by the <xs:assert>)

and,

<?xml version="1.0"?>
<X>
   <x>abcdefghij</x>
</X>

(the string "abcdefghij" has length 10, which exceeds the maximum length allowed by the <xs:assert>)

Assertion example 6:

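In this example, we'll see how <xs:assert> works in an XSD complex type derivation. A derived complex type inherits the assertions of its base type, so an instance must satisfy the assertions of both. Consider the following XSD document: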

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="X">
        <xs:complexType>
            <xs:complexContent>
                <xs:extension base="xSpec">
                    <xs:attribute name="id" type="xs:integer"/>
                    <xs:assert test="@id mod 2 = 0"/>
                </xs:extension>
            </xs:complexContent>
        </xs:complexType>
    </xs:element>

    <xs:complexType name="xSpec">
        <xs:sequence>
            <xs:element name="x" type="xs:string"/>
        </xs:sequence>
        <xs:assert test="contains(x, 'hello')"/>
    </xs:complexType>

</xs:schema>

Following is one valid XML document, when validated by the above XSD schema:

<?xml version="1.0"?>
<X id="2">
    <x>abcdefhello</x>
</X>

(the value of attribute "id" is an even integer, and the value of element "x" contains the word "hello")

Following are two invalid XML documents, when validated by the XSD document specified in this example:

<?xml version="1.0"?>
<X id="2">
   <x>abcdef</x>
</X>

(the value of element "x" doesn't contain the word "hello")

<?xml version="1.0"?>
<X id="1">
   <x>abcdefhello</x>
</X>

(the value of "id" attribute is not even)

Other useful XSD 1.1 schemas can be written with <xs:assert> by using XPath 2.0 functions like "avg", "max", "min" and "sum" (in fact, the whole XPath 2.0 functions and operators library can be used in <xs:assert> XPath expressions).

With Xerces-J, XML comments and processing instructions (PIs) are by default not available in the XPath Data Model (XDM) trees that <xs:assert> expressions can access. Setting a specific validation feature to boolean 'true' through the API makes comments and PIs present in the XDM trees during assertion evaluation. With this feature enabled, <xs:assert> expressions can check for the presence or absence of comments and PIs, and do string processing on them.
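
As a sketch of such a schema (the element names "order", "item" and "total" are hypothetical, chosen only for illustration), the following assertion uses "sum" to require that the "total" element equal the sum of the "item" values:

```xml
<xs:element name="order">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="item" type="xs:decimal" maxOccurs="unbounded"/>
            <xs:element name="total" type="xs:decimal"/>
        </xs:sequence>
        <!-- the declared total must match the computed sum of the items -->
        <xs:assert test="sum(item) eq total"/>
    </xs:complexType>
</xs:element>
```

Under this sketch, an "order" with "item" values 10.50 and 4.50 and a "total" of 15.00 should pass, while the same items with a "total" of 16.00 should be reported as invalid.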

4.2 Simple Type facet <xs:assertion>

The XML Schema 1.1 language has introduced a new facet for simple types, named <xs:assertion>. Although <xs:assertion> doesn't provide co-occurrence constraints, I'm describing it here because it is quite similar to <xs:assert> at the syntactic level (<xs:assert> and <xs:assertion> both use XPath 2.0 expressions as a predicate language).

Following is an excerpt from the XML Schema 1.1 specification, showing the grammar of simple types:

XML Representation Summary: simpleType Element Information Item et al.

<simpleType
    final = (#all | List of (list | union | restriction | extension)) 
    id = ID
    name = NCName
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, (restriction | list | union))
</simpleType>
<restriction
    base = QName
    id = ID
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, (simpleType?, (minExclusive | minInclusive | maxExclusive | maxInclusive | totalDigits | fractionDigits | length | minLength | maxLength | enumeration | whiteSpace | pattern | assertion | explicitTimezone | {any with namespace: ##other})*))
</restriction>
<list
    id = ID
    itemType = QName
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, simpleType?)
</list>
<union
    id = ID
    memberTypes = List of QName
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, simpleType*)
</union>

The element <xs:assertion> is itself defined as follows in the XML Schema 1.1 specification:

XML Representation Summary: assertion Element Information Item

<assertion
    id = ID
    test = an XPath expression
    xpathDefaultNamespace = (anyURI | (##defaultNamespace | ##targetNamespace | ##local)) 
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?)
</assertion>

Let's see a fairly simple example of using the <xs:assertion> facet in an XSD schema. Consider the following XSD document:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="IDVals">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="id" type="ID" maxOccurs="unbounded"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>

    <xs:simpleType name="ID">
        <xs:restriction base="xs:nonNegativeInteger">
            <xs:assertion test="$value mod 2 = 0"/>
        </xs:restriction>
    </xs:simpleType>

</xs:schema>

Following is one valid XML document, when validated by the above XSD document:

<?xml version="1.0"?>
<IDVals>
    <id>2</id>
    <id>4</id>
    <id>6</id>
    <id>8</id>
    <id>10</id>
</IDVals>

(every "id" value must be an even integer, as required by the schema; in the <xs:assertion> test, the variable $value refers to the value being validated)

Whereas the following is one invalid XML document, when validated by the XSD document specified in this example:

<?xml version="1.0"?>
<IDVals>
    <id>1</id>
    <id>2</id>
    <id>3</id>
    <id>4</id>
    <id>5</id>
</IDVals>

(the "id" values 1, 3 and 5 are not valid according to the XSD schema)
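
The <xs:assertion> facet can also be combined with the other facets of a simple type. The following sketch (the type name "EvenPercent" is hypothetical) should accept only even integers from 0 to 100:

```xml
<xs:simpleType name="EvenPercent">
    <xs:restriction base="xs:nonNegativeInteger">
        <!-- an ordinary facet: upper bound of the value -->
        <xs:maxInclusive value="100"/>
        <!-- the assertion facet: the value must be even -->
        <xs:assertion test="$value mod 2 = 0"/>
    </xs:restriction>
</xs:simpleType>
```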

5. Conditional Type Assignment

The XML Schema 1.1 specification also refers to this XSD construct as "Type Alternatives". In this article, the feature is often referred to as CTA. The "Type Alternative" XSD feature is also a form of co-occurrence constraint (as we shall see in this section), but I have chosen to describe it in a section of its own. Assertions are the generic co-occurrence constraint feature, while CTAs provide one specific kind of co-occurrence constraint.

We can specify zero or more <xs:alternative> elements as children of an <xs:element> construct.

The following XSD grammar fragments illustrate the syntax of the <xs:alternative> construct:

XML Representation Summary: element Element Information Item

<element
    abstract = boolean : false
    block = (#all | List of (extension | restriction | substitution)) 
    default = string
    final = (#all | List of (extension | restriction)) 
    fixed = string
    form = (qualified | unqualified)
    id = ID
    maxOccurs = (nonNegativeInteger | unbounded) : 1
    minOccurs = nonNegativeInteger : 1
    name = NCName
    nillable = boolean : false
    ref = QName
    substitutionGroup = List of QName
    targetNamespace = anyURI
    type = QName
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, ((simpleType | complexType)?, alternative*, (unique | key | keyref)*))
</element>
<alternative
    id = ID
    test = an XPath expression
    type = QName
    xpathDefaultNamespace = (anyURI | (##defaultNamespace | ##targetNamespace | ##local)) 
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?, (simpleType | complexType)?)
</alternative>

Let's look at a simple example illustrating the functionality of the <xs:alternative> element. Consider the following XSD document:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="Addresses">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="address" minOccurs="2" maxOccurs="2">
                    <xs:alternative test="@format ='US'" type="USAddress"/>
                    <xs:alternative test="@format ='Canada'" type="CanadaAddress"/>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>

    <xs:complexType name="USAddress">
        <xs:sequence>
            <xs:element name="street" type="xs:string"/>
            <xs:element name="city" type="xs:string"/>
            <xs:element name="state" type="xs:string"/>
            <xs:element name="zip" type="xs:positiveInteger"/> 
        </xs:sequence>
        <xs:attribute name="format" type="xs:string" fixed="US"/>
    </xs:complexType>

    <xs:complexType name="CanadaAddress">
        <xs:sequence>
            <xs:element name="civicAddress" type="xs:string"/>
            <xs:element name="municipality" type="xs:string"/>
            <xs:element name="province" type="xs:string"/>
            <xs:element name="postalCode" type="xs:string"/>
        </xs:sequence>
        <xs:attribute name="format" type="xs:string" fixed="Canada"/>
    </xs:complexType>

</xs:schema>

According to the XSD document above, an "address" element in an XML instance document will have the type "USAddress" if its "format" attribute has the value "US". If the "format" attribute has the value "Canada", the "address" element will have the type "CanadaAddress". That is, depending on the value of an attribute in an XML instance document, we can assign different XSD types to an element.

Following is one valid XML document, according to the XSD document presented above:

<?xml version="1.0"?>
<Addresses>
    <address format="US">
        <street>123 Main Street</street>
        <city>Lansing</city>
        <state>Michigan</state>
        <zip>48864</zip>
    </address>
    <address format="Canada">
        <civicAddress>10-123 1/2 Main ST SE</civicAddress>
        <municipality>Montreal</municipality>
        <province>QC</province>
        <postalCode>H3Z 2Y7</postalCode>
    </address>
</Addresses>

Let's also look at the following variations on the above XSD validation:

1) In an XML instance document, we specify a value other than "US" or "Canada" for the attribute "format" of the XML element "address".

In this case, the validation will still pass. Let's discuss the reason for this. The <xs:element> declaration with <xs:alternative> elements in it is implicitly equivalent to the following:

<xs:element name="address" minOccurs="2" maxOccurs="2" type="xs:anyType">
   <xs:alternative test="@format ='US'" type="USAddress"/>
   <xs:alternative test="@format ='Canada'" type="CanadaAddress"/>
</xs:element>

That is, if the "format" attribute does not have the value "US" or "Canada", then the "address" element will have type "xs:anyType".

Also consider the following syntax:

<xs:element name="E1" type="T3">
   <xs:alternative test="..." type="T1"/>
   <xs:alternative test="..." type="T2"/>
</xs:element>

The XML instance element "E1" can have type T1, T2 or T3. Types T1 and T2 must derive from type T3 (or be xs:error). This constraint is required by the XML Schema 1.1 specification.

2) We change the element declaration in the XSD document to the following:
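
To make the derivation requirement concrete, here is a sketch in which the elided test expressions are replaced by made-up ones. The attribute "kind" and the child elements "a" and "b" are hypothetical; the point is only that T1 and T2 are both derived from T3:

```xml
<xs:element name="E1" type="T3">
    <xs:alternative test="@kind = 'a'" type="T1"/>
    <xs:alternative test="@kind = 'b'" type="T2"/>
</xs:element>

<!-- declared type of E1; used when no alternative's test is true -->
<xs:complexType name="T3">
    <xs:attribute name="kind" type="xs:string"/>
</xs:complexType>

<!-- T1 and T2 both derive from T3 by extension -->
<xs:complexType name="T1">
    <xs:complexContent>
        <xs:extension base="T3">
            <xs:sequence>
                <xs:element name="a" type="xs:string"/>
            </xs:sequence>
        </xs:extension>
    </xs:complexContent>
</xs:complexType>

<xs:complexType name="T2">
    <xs:complexContent>
        <xs:extension base="T3">
            <xs:sequence>
                <xs:element name="b" type="xs:integer"/>
            </xs:sequence>
        </xs:extension>
    </xs:complexContent>
</xs:complexType>
```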

<xs:element name="address" minOccurs="2" maxOccurs="2">
    <xs:alternative test="@format ='US'" type="USAddress"/>
    <xs:alternative test="@format ='Canada'" type="CanadaAddress"/>
    <xs:alternative type="xs:error"/>
</xs:element>

Now if, in the XML instance document, the value of the attribute "format" is anything other than "US" or "Canada", the type xs:error will be assigned to the element "address". xs:error is an XSD simple type with no valid instances, so any element or attribute that is assigned the type xs:error is invalid. Therefore, in this case the XSD validation fails.
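
For illustration (the "UK" address below is made up), an instance like the following would pass with the original declaration, but should be rejected once the xs:error alternative is added, because the second "address" matches neither test:

```xml
<?xml version="1.0"?>
<Addresses>
    <address format="US">
        <street>123 Main Street</street>
        <city>Lansing</city>
        <state>Michigan</state>
        <zip>48864</zip>
    </address>
    <address format="UK">
        <street>10 High Street</street>
        <city>London</city>
    </address>
</Addresses>
```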

"test" attribute on <xs:alternative>: The value of the "test" attribute on the <xs:alternative> element is an XPath expression that evaluates to boolean 'true' or 'false'. The context item for the "test" evaluation is the element node on which the <xs:alternative> constructs are specified; only that element's attributes are visible to the expression, not its children or ancestors. The XML Schema 1.1 specification says that these "test" expressions can be evaluated using a CTA-specific XPath subset (defined in the XSD 1.1 specification), or using the full XPath 2.0 language. Xerces-J uses the CTA-specific XPath subset by default, but it can use the full XPath 2.0 language when a specific feature is set on the validation API. Other XSD 1.1 validators can choose to implement this in their own ways.

6. Summary

I hope this article has helped readers understand how co-occurrence constraints and type alternatives work in the XSD 1.1 language.

While discussing assertions and conditional type assignment, I haven't covered the XML namespace related features of the assertion and CTA constructs. They are simple to understand if we know the fundamentals of XML namespaces. I leave it to readers to explore those features if they want to.

7. References