W3C XML Schema Datatypes Reference
by Rick Jelliffe
November 29, 2000
This quick reference helps you easily locate the definition of
datatypes in the XML Schema specification. A "What You Need To Know"
section gives a brief introduction to the way datatypes work.
What You Need To Know
- The W3C XML Schema specification defines many built-in datatypes.
These datatypes can be used to constrain the values of attributes, or
of elements that contain only simple content. They are not available
for constraining data in mixed content.
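As a small illustration of the point above, the following sketch declares one element with simple content and one attribute, each constrained by a built-in datatype. The names price, order and shipDate are hypothetical. The schema namespace shown is that of the final Recommendation, which is an assumption; a processor written for an earlier draft would expect a different namespace URI.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <!-- Element with simple content: its text must be a decimal number. -->
  <element name="price" type="decimal"/>

  <!-- Attribute constrained by a built-in datatype: the value must be a date. -->
  <element name="order">
    <complexType>
      <attribute name="shipDate" type="date"/>
    </complexType>
  </element>
</schema>
```

An instance such as <price>19.95</price> or <order shipDate="2000-11-29"/> would satisfy these declarations.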
Derivation and Facets
- All simple datatypes are derived from their base type by
restricting the values allowed in their lexical spaces or their
value spaces.
- Every datatype has a set of facets that characterize the
properties of the datatype. Examples are the length of a string and
the encoding of a binary type (i.e., whether hex encoding or base64).
By restricting one or more of these facets, a new datatype can be
derived.
- There are three varieties of datatypes that you can use when
deriving your own datatypes. As well as an atomic datatype, where
the data contains a single value, you can derive a list, where the
data is treated as a whitespace-separated list of tokens, or a union
type, where the lexical value of the data determines which of the
member types is used.
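The three varieties described above can be sketched as follows. This is only an illustration: the type names and the target namespace http://example.org/types are hypothetical, and the syntax and schema namespace are those of the final Recommendation, which is an assumption. The t prefix is needed because unprefixed type names resolve to the default namespace, which here is the schema namespace itself.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://example.org/types"
        xmlns:t="http://example.org/types">

  <!-- Atomic: derived from string by restricting the length facet. -->
  <simpleType name="productCode">
    <restriction base="string">
      <length value="8"/>
    </restriction>
  </simpleType>

  <!-- List: a whitespace-separated sequence of productCode tokens. -->
  <simpleType name="productCodeList">
    <list itemType="t:productCode"/>
  </simpleType>

  <!-- Atomic: a token limited to three enumerated values. -->
  <simpleType name="sizeName">
    <restriction base="token">
      <enumeration value="small"/>
      <enumeration value="medium"/>
      <enumeration value="large"/>
    </restriction>
  </simpleType>

  <!-- Union: the lexical value decides whether the data is read as a
       positive integer or as one of the size names. -->
  <simpleType name="size">
    <union memberTypes="positiveInteger t:sizeName"/>
  </simpleType>
</schema>
```

With these definitions, a value of 42 or of medium would be acceptable for the size type, while "ABCD1234 WXYZ5678" would be a two-item productCodeList.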
Usage of the string datatype
The string datatype should not be used for general text. Use a complex
type instead, one that allows mixed content and is "wildcarded" to
allow elements from other namespaces. This kind of declaration is more
future-proof, because an element declared to have simple content
cannot later be extended to contain sub-elements. Here is a
definition that may be more suitable:
<complexType name="kindToStrangersText" mixed="true">
  <annotation>
    <documentation xml:lang="en">
      This is a type definition for generic text in XML.
      For maintenance reasons, it is preferable to use
      something like this rather than the built-in datatype
      string, unless you have an absolute requirement to
      use a simple datatype.
    </documentation>
  </annotation>
  <!-- Any number of elements from other namespaces, mixed with text. -->
  <sequence minOccurs="0" maxOccurs="unbounded">
    <any namespace="##other"/>
  </sequence>
  <!-- xml:lang and xml:space, from the imported xml.xsd. -->
  <attributeGroup ref="xml:specialAttrs"/>
  <anyAttribute namespace="##any"/>
</complexType>
You will have to import the xml:lang and xml:space definitions
too:
<import namespace="http://www.w3.org/XML/1998/namespace"
schemaLocation="http://www.w3.org/2000/10/xml.xsd" />
The schema element itself should probably also have a namespace
declaration for the xml prefix:
xmlns:xml="http://www.w3.org/XML/1998/namespace"
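To show what the kindToStrangersText type is meant to permit, here is a sketch of an instance document. It assumes an element named para has been declared with that type; the element names and both example.org namespace URIs are invented for this illustration. Under the final Recommendation a wildcard that does not set processContents is processed strictly, so a validator would also need a declaration for x:term to accept this instance.

```xml
<!-- Assumes a hypothetical declaration in the schema:
       <element name="para" type="d:kindToStrangersText"/>
     where d is bound to the schema's (hypothetical) target namespace. -->
<d:para xmlns:d="http://example.org/doc"
        xmlns:x="http://example.org/extension"
        xml:lang="en">
  Text may appear directly in the element, alongside
  <x:term>elements from other namespaces</x:term>, because the
  type is mixed and wildcarded.
</d:para>
```

Had para been declared with the string datatype instead, the x:term child could not have been admitted later without redefining the element.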
Limitations
There is no provision for:
- overriding facets in the instance document,
- creating quantity/unit pairs,
- declaring arrays of tokens with more than one dimension,
- specifying inheritance effects,
- declaring complex constraints in which the value of some other
information item in the instance (e.g., an attribute) affects the
current datatype.
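To make the quantity/unit limitation concrete, the sketch below shows one common workaround, which is not part of the datatype system itself: the number is kept as simple content and the unit is carried in an attribute on a complex type. All names and the target namespace http://example.org/measure are hypothetical, and the syntax is that of the final Recommendation, which is an assumption.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://example.org/measure"
        xmlns:m="http://example.org/measure">

  <!-- The permitted unit names, as an enumerated token type. -->
  <simpleType name="unitName">
    <restriction base="token">
      <enumeration value="mm"/>
      <enumeration value="cm"/>
      <enumeration value="in"/>
    </restriction>
  </simpleType>

  <!-- A decimal quantity paired with a required unit attribute. -->
  <complexType name="measure">
    <simpleContent>
      <extension base="decimal">
        <attribute name="unit" type="m:unitName" use="required"/>
      </extension>
    </simpleContent>
  </complexType>

  <element name="width" type="m:measure"/>
</schema>
```

An instance such as <m:width unit="mm">12.5</m:width> would be accepted. The pairing lives in a complex type rather than in a datatype, so it cannot be used for an attribute value, and the schema still cannot make the permitted range of the number depend on the chosen unit, which is the last limitation listed above.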