Extensibility, XML Vocabularies, and XML Schema
by David Orchard
Indicating Incompatible Changes
Given adoption of the Must Ignore rule, the creator of an extension or a new version often wants to require that the consumer understand the extension, overriding the Must Ignore rule. The previous section showed how a version author could use new namespace names, element names, or version numbers to indicate an incompatible change. An extension author does not have these mechanisms available for indicating an incompatible or mandatory extension. A language provider that wants to allow extension authors to indicate incompatible extensions must provide a mechanism for indicating that consumers must understand the extension.
10. Provide Must Understand rule: Container languages SHOULD provide a “Must Understand” model for dealing with the extensions that may optionally override a default Must Ignore rule.
This rule and the Must Ignore rule work together to provide a stable and flexible processing model for extensions.
Must Understand Flag
Arguably the simplest and most flexible override technique is a Must Understand flag that indicates whether the item must be understood. The SOAP [8], WSDL [9], and WS-Policy [10] attributes and values for requiring understanding are, respectively, soap:mustUnderstand="1", wsdl:required="1", and wsp:Usage="wsp:Required". SOAP is probably the most common case of a container that provides a Must Understand model. The default value of soap:mustUnderstand is 0, which is effectively the Must Ignore rule.
A language designer can reuse an existing Must Understand model by constraining their language to an existing Must Understand model. A number of web services specifications have done this by specifying that the components are SOAP header blocks, which explicitly brings in the SOAP Must Understand model.
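As a rough illustration of reusing the SOAP model, the sketch below shows a SOAP 1.1 envelope in which an extension travels as a header block flagged with soap:mustUnderstand="1". The ext:Priority element and its namespace are hypothetical and exist only for this example; the envelope namespace is the SOAP 1.1 one.

```xml
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <!-- Hypothetical header block; its name and namespace are illustrative only -->
    <ext:Priority xmlns:ext="http://www.example.org/ext/1"
                  soap:mustUnderstand="1">high</ext:Priority>
  </soap:Header>
  <soap:Body>
    <name xmlns="http://www.openuri.org/name/1">
      <first>Dave</first>
      <last>Orchard</last>
    </name>
  </soap:Body>
</soap:Envelope>
```

A SOAP 1.1 receiver that does not recognize the flagged block is expected to generate a fault rather than silently ignore it. Omitting the attribute, or setting it to "0", falls back to Must Ignore behavior.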
A language designer can design a Must Understand model into their language. A Must Understand flag allows the producer to insert extensions into the container and use the Must Understand attribute to override the Must Ignore rule. This allows producers to extend instances without changing the namespace of the extension element’s parent, retaining backwards compatibility. Obviously the consumer must be extended to handle new extensions, but there is now a loose coupling between the language’s processing model and the extension’s processing model. A schema that provides a Must Understand flag is shown below:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.openuri.org/name/1"
    xmlns:name="http://www.openuri.org/name/1">
  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:element name="middle" type="xs:string" minOccurs="0"/>
      <!-- Extensions in the target namespace go inside the Extension wrapper -->
      <xs:element name="Extension" type="name:ExtensionType"
          minOccurs="0" maxOccurs="1"/>
      <!-- Extensions in other namespaces may appear directly -->
      <xs:any namespace="##other" processContents="lax"
          minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute ref="name:mustUnderstand"/>
    <xs:anyAttribute/>
  </xs:complexType>
  <xs:complexType name="ExtensionType">
    <xs:sequence>
      <xs:any processContents="lax" minOccurs="1"
          maxOccurs="unbounded" namespace="##targetNamespace"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType>
  <!-- Global attribute used to override the Must Ignore rule -->
  <xs:attribute name="mustUnderstand" type="xs:boolean"/>
</xs:schema>
Example 13 – New components in existing or new namespace(s) with Extension Type Schema and Must Understand
An example of an instance of a third party indicating that a prefix component is an incompatible change:
<name xmlns="http://www.openuri.org/name/1"
    xmlns:name="http://www.openuri.org/name/1">
  <first>Dave</first>
  <last>Orchard</last>
  <pref2:prefix xmlns:pref2="http://www.example.org/name/pref/1"
      name:mustUnderstand="true">
    Mr.
  </pref2:prefix>
</name>
Example 14 – New components in existing or new namespace(s) instance with Must Understand
A Must Understand flag must be specified carefully, as it can be computationally expensive to process. Typically a processor will either scan the instance for Must Understand components up front to ensure it can process the entire document, or process the instance incrementally and be prepared to roll back or undo any processing if it encounters a Must Understand component that it does not understand.
There are other refinements related to Must Understand. One example is providing an element that indicates which extension namespaces must be understood, which avoids the scan of the instance for Must Understand flags.
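One way to picture this refinement is an instance that declares its mandatory extension namespaces once, near the top. The sketch below is only an illustration: the mustUnderstandNamespaces element is hypothetical, is not part of the Example 13 schema, and would have to be defined by the language designer in the language itself so that every consumer knows to look for it.

```xml
<name xmlns="http://www.openuri.org/name/1">
  <!-- Hypothetical element: lists the extension namespaces that must be understood -->
  <mustUnderstandNamespaces>http://www.example.org/name/pref/1</mustUnderstandNamespaces>
  <first>Dave</first>
  <last>Orchard</last>
  <pref2:prefix xmlns:pref2="http://www.example.org/name/pref/1">Mr.</pref2:prefix>
</name>
```

A consumer could read this single element, compare each listed namespace against those it supports, and reject the instance before doing any other work. Placing the declaration first is a design choice that suits streaming consumers, since they see it before any extension content.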
Type Extension
Another option for indicating mandatory requirements is allowing extension authors to use other schema mechanisms for extending the main type, such as type extension. The language designer allows for type extension, and they must specify that type extensions must be understood.
<nameWithPrefix xmlns="http://www.openuri.org/name/prefix/1">
  <first>Dave</first>
  <last>Orchard</last>
  <prefix>Mr.</prefix>
</nameWithPrefix>
Example 15 – New components in existing or new namespace(s) with Type Extension instance
The nameWithPrefix type extends the name type by adding a prefix element.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.openuri.org/name/pref/1"
    xmlns:pref="http://www.openuri.org/name/pref/1"
    xmlns:name="http://www.openuri.org/name/1">
  <xs:import namespace="http://www.openuri.org/name/1"/>
  <xs:complexType name="nameWithPrefix">
    <xs:complexContent>
      <!-- Derive from the name type and append the mandatory prefix element -->
      <xs:extension base="name:name">
        <xs:sequence>
          <xs:element name="prefix" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
Example 16 – Illegal new components in existing or new namespace(s) with Type Extension schema mandatory extension
Like many of the attempts to write a schema so far, this schema and other variations are problematic. This schema is illegal because the content model is non-deterministic: a validator cannot tell whether a prefix element in the pref namespace matches the new prefix declaration or the ##other wildcard inherited from the base type. An alternative is to not have the wildcard at all, and rely upon subtyping for extension. But this prevents any kind of compatible evolution, as both sides must have the new schema to understand the type. The language designer has to choose between allowing compatible extensibility/versioning OR incompatible extensibility when subtyping is used.
The language designer has the option of using subtyping for incompatible versioning as they could create a nameWithPrefix type that adds the prefix in the same namespace. This does not enable extension authors to indicate incompatible extensions.
Substitution Groups
Another mechanism for extending a type in XML Schema is substitution groups. Substitution groups enable an element to be declared as substitutable for another. This can only be used for incompatible extensions as the consumer must understand the substitution type. Substitution groups require that elements are available for substitution, so the name designer must have provided a name element in addition to the name type.
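A minimal sketch of what an extension author's schema might look like follows. It rests on two assumptions that the source does not show: that the name schema declares a global element, for example `<xs:element name="name" type="name:name"/>`, and that the name type has no ##other wildcard, since otherwise the derived type runs into the non-determinism problem described for Example 16. The http://www.example.org/name/pref/1 namespace and the nameWithPrefixType name are illustrative.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.example.org/name/pref/1"
    xmlns:pref="http://www.example.org/name/pref/1"
    xmlns:name="http://www.openuri.org/name/1">
  <xs:import namespace="http://www.openuri.org/name/1"/>
  <!-- Assumes the base name type carries no ##other wildcard -->
  <xs:complexType name="nameWithPrefixType">
    <xs:complexContent>
      <xs:extension base="name:name">
        <xs:sequence>
          <xs:element name="prefix" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <!-- May appear wherever the assumed global name:name element is allowed -->
  <xs:element name="nameWithPrefix" type="pref:nameWithPrefixType"
      substitutionGroup="name:name"/>
</xs:schema>
```

XML Schema requires the type of a substituting element to be the same as, or derived from, the type of the head element, which is why nameWithPrefixType extends name:name here.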
Substitution groups do allow a single extension author to indicate that his or her changes are mandatory. The limitation is that the extension author has now taken over the type’s extensibility. A visual way of imagining this is that the type tree has now been moved from the language designer over to the extension author. And the language designer probably does not want his or her type to be “hijacked”.
However, this is not substantially different than an extension being marked with a “Must Understand.” In either case -- with the extensions higher up in the tree (sometimes called top-typing) or lower in the tree (bottom-typing) -- a new type is effectively created.
The difference is that there can only be one element at the top of an element hierarchy. If multiple mandatory extensions are added, then the only way to compose them together is at the bottom of the type because that is where the extensibility is.
Substitution groups do not allow a language designer and an extension author to incompatibly change the language, since they end up conflicting over what to call the name element. Thus substitution groups are a poor mechanism for allowing an extension author to indicate that their changes are incompatible. A Must Understand flag is a superior method because it allows multiple extension authors to mix their mandatory extensions with a language designer’s versioning strategy.
Therefore, language designers should prevent substitution groups and provide a Must Understand flag or other model when they wish to allow third parties to make incompatible changes.
In some cases, a language does not provide a Must Understand mechanism. In the absence of a Must Understand model, the only way to force consumers to reject a message if they don’t understand the extension namespace is to change the namespace name of the root element, but this is rarely desirable.
Extension v. Versioning
The usage of namespace names for identifying components has led to the interesting situation where the distinction between an extension and a version can be quite blurred, depending upon the language designer’s choices.
One rough way of thinking of these two concepts is that extension is typically the addition of components over space; that is, designers other than the language’s creator are adding components. Versioning is typically the addition of components over time, under the designer’s explicit control. In either case, a change to the language may be done in a compatible or an incompatible way. We typically distinguish the terms through the simple cases: extensions are compatible, decentralized additions, and versions are compatible or incompatible centralized changes. But these distinctions break down depending upon how the language is designed.
There are a couple of scenarios that illustrate the ambiguity in these terms. Imagine that version 1.0 of a Name consists of “First” and “Last” elements. A third party author extends the Name with a “middle” element in a new namespace that he or she controls.
In scenario 1, the Name author decides to formally incorporate the middle name as an optional (and hence compatible) addition to the name, producing version 1.1 of the Name type. He does this by referring to the third party’s definition and namespace for middle names. This is typically considered a new “version” of the Name and would probably result in a new schema definition. If the Name author reuses namespace names for compatible revisions, there will be no difference in an instance document containing middle that is of Version 1.0 or Version 1.1 type. The instance documents are the same, and thus the distinction between a “version” and an “extension” is meaningless for an individual document.
In scenario 2, the middle author decides that the middle name is a mandatory part of the Name type. He or she was provided with a mechanism for indicating an incompatible change and uses it. Now an instance of Name with the middle is incompatible with version 1.0 of the Name. What “version” of the Name is this middle, and is the middle an “extension” or a “version”? It isn’t 1.0. It’s probably more accurately thought of as a version defined by the third party. Again, the presence of the “extension” is actually an incompatible change.
These two examples -- a third-party extension being added into a compatible version and a third-party extension resulting in an incompatible version -- show that the ability to specify (in)compatibility has blurred the distinction between these two terms.