W3C XML Schema Datatypes Reference

November 29, 2000

Rick Jelliffe

This quick reference helps you easily locate the definition of datatypes in the XML Schema specification. A "What You Need To Know" section gives a brief introduction to the way datatypes work.

Specification Map

What You Need To Know

  • The W3C XML Schema specification defines many built-in datatypes. They can constrain the values of attributes and of elements that contain only simple content. They cannot constrain data in mixed content.
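
As a rough illustration of the point above, the fragment below applies built-in datatypes to an element with simple content and to an attribute. The names orderDate, lineItem and quantity are hypothetical, and the fragment assumes (as the listings in this article do) that the schema namespace is the default namespace.

```xml
<!-- Fragment for the body of a schema element. -->
<!-- Assumption: the schema namespace is the default namespace. -->

<!-- An element with simple content, constrained by the built-in type date. -->
<element name="orderDate" type="date"/>

<!-- An attribute constrained by the built-in type positiveInteger. -->
<complexType name="lineItem">
  <attribute name="quantity" type="positiveInteger"/>
</complexType>
```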

Derivation and Facets

  • A simple datatype is derived from its base type by restricting the values allowed in its lexical space or its value space.
  • Every datatype has a set of facets that characterize its properties, for example the length of a string or the encoding of a binary type (hex or base64). Restricting one or more of these facets derives a new datatype.
  • There are three varieties of datatype that you can use when deriving your own: atomic, where the data contains a single value; list, where the data is treated as a whitespace-separated list of tokens; and union, where the lexical value of the data determines which of the member types is used.
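
The three varieties might look something like the sketch below. All type names (partCode, sizeList, sizeOrKeyword) and the facet values are invented for illustration, and the schema namespace is again assumed to be the default namespace. The union uses an inline anonymous type so that no prefixed reference to a user-defined type is needed.

```xml
<!-- Fragments for the body of a schema element. -->

<!-- Atomic, derived by restriction: a string narrowed by two facets. -->
<simpleType name="partCode">
  <restriction base="string">
    <maxLength value="8"/>
    <pattern value="[A-Z]{2}-[0-9]+"/>
  </restriction>
</simpleType>

<!-- List: a whitespace-separated list of integers, such as "10 20 30". -->
<simpleType name="sizeList">
  <list itemType="integer"/>
</simpleType>

<!-- Union: the lexical value is either an integer or one of two keywords. -->
<simpleType name="sizeOrKeyword">
  <union memberTypes="integer">
    <simpleType>
      <restriction base="NMTOKEN">
        <enumeration value="small"/>
        <enumeration value="large"/>
      </restriction>
    </simpleType>
  </union>
</simpleType>
```

With these definitions, a value such as AB-1234 would satisfy partCode, while ab-1234 would fail the pattern facet.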

Usage of the string datatype

The string datatype should not be used for general text. Use a complex type instead, one that allows mixed content and uses a wildcard to admit elements from other namespaces. Such a declaration is more future-proof, because an element declared to have simple content can never be extended to contain sub-elements. Here is a definition that may be more suitable:


<complexType name="kindToStrangersText"  mixed="true" >
  <annotation>
    <documentation xml:lang="en" >
    This is a type definition for generic text in XML. 
    For maintenance reasons, it is preferable to use
    something like this rather than the built-in datatype
    string, unless you have an absolute requirement to
    use a simple datatype.
    </documentation>
  </annotation>
  <group minOccurs="0" maxOccurs="unbounded" >
    <any namespace="##other" />
  </group>
  <attributeGroup ref="xml:specialAttrs"/>
  <anyAttribute namespace="##any" />
</complexType>

You will have to import the xml:lang and xml:space definitions too:


<import namespace="http://www.w3.org/XML/1998/namespace"
   schemaLocation="http://www.w3.org/2000/10/xml.xsd" />

The schema element itself should probably also carry a namespace declaration for the xml prefix, even though that prefix is bound to this namespace by definition:


xmlns:xml="http://www.w3.org/XML/1998/namespace"
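
To show how these pieces might fit together, here is a minimal sketch of a schema shell. The target namespace http://example.org/notes, the prefix n and the element name para are hypothetical. The schema namespace URI is assumed to be the one from the October 2000 Candidate Recommendation, to match the xml.xsd location used above.

```xml
<schema xmlns="http://www.w3.org/2000/10/XMLSchema"
        xmlns:xml="http://www.w3.org/XML/1998/namespace"
        xmlns:n="http://example.org/notes"
        targetNamespace="http://example.org/notes">

  <!-- Brings in the xml:lang and xml:space definitions. -->
  <import namespace="http://www.w3.org/XML/1998/namespace"
          schemaLocation="http://www.w3.org/2000/10/xml.xsd"/>

  <!-- The kindToStrangersText definition from the listing above goes here. -->

  <!-- Declare general-text elements with that type rather than with string. -->
  <element name="para" type="n:kindToStrangersText"/>

</schema>
```

Because the schema namespace is the default namespace here, the reference to the user-defined type needs the n prefix. An unprefixed name would be looked up in the schema namespace instead.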

Limitations

There is no provision for

  • overriding facets in the instance document,
  • creating quantity/unit pairs,
  • declaring arrays of tokens with more than one dimension,
  • specifying inheritance effects,
  • declaring complex constraints where the value of some other information item in the instance (e.g. an attribute) has an effect on the current datatype.
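
As an illustration of the quantity/unit limitation: since a datatype cannot itself carry a unit, one possible workaround is to keep the number in simple content and put the unit in an attribute beside it. This is only a sketch, not something the datatypes specification prescribes. The names measure and unit are hypothetical, and the schema namespace is assumed to be the default namespace.

```xml
<!-- Fragment for the body of a schema element. -->
<complexType name="measure">
  <simpleContent>
    <extension base="decimal">
      <!-- The unit sits next to the value, not inside the datatype. -->
      <attribute name="unit" type="NMTOKEN" use="required"/>
    </extension>
  </simpleContent>
</complexType>
```

An element of this type could then appear in an instance as, say, a width element with content 12.5 and unit="cm". The schema still knows nothing about how the unit and the number relate.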