XML.com: XML From the Inside Out
oreilly.comSafari Bookshelf.Conferences.


UML For W3C XML Schema Design

UML For W3C XML Schema Design

August 07, 2002

Even with the clear advantages it offers over the fast-receding DTD grammar, W3C XML Schema (WXS) cannot be praised for its concision. Indeed, in discussions of XML vocabulary design, the DTD notation is often thrown up on a whiteboard solely for its ability to quickly and completely communicate an idea; the corresponding WXS notation would be laughably awkward, even when WXS will be the implementation language. Thus, UML, a graphical design notation, is all the more attractive for WXS design.

UML is meant for greater things than simple description of data structures. Still the UML metamodel can support Schema design quite well, for wire-serializable types, persistence schema, and many other XML applications. UML and XML are likely to come in frequent professional contact; it would be nice if they could get along. The highest possible degree of integration of code-design and XML-design processes should be sought.

Pushing the Envelope

Related Reading

XML Schema

XML Schema
The W3C's Object-Oriented Descriptions for XML
By Eric van der Vlist

Application of UML to just about any type model requires an extension profile. There are many possible profiles and mappings between UML and XML, not all of which address the same goals. The XML Metadata Interchange and XMI Production for W3C XML Schema specifications, from the OMG, offer a standard mapping from UML/MOF to WXS for the purpose of exchanging models between UML tools. The model in question may not even be intended for XML production. WXS simply serves as a reliable XML expression of metadata for consumption in some other tool or locale.

My purpose here is to discuss issues in mapping between these two metamodels and to advance a UML profile that will support complete expression of an WXS information set. The major distinction is that XMI puts UML first, so to speak, in some cases settling for a mapping that fails to capture some useful WXS construct, so long as the UML model is well expressed. My aim is to put WXS first and to develop a UML profile for use specifically in WXS design:

  • The profile should capture every detail of an XML vocabulary that an WXS could express.
  • It should support two-way generation of WXS documents.

I suggest a few stereotypes and tags, many of which dovetail with the XMI-Schema mapping. I discuss specific notation issues as the story unfolds, and highlight the necessary stereotypes and tags.

David Carlson has also done some excellent work in this area, and has proposed an extension profile for this purpose. I disagree with him on at least one major point of modeling and one minor point of notation, but much of what is developed here lines up well with Carlson's profile.

Modeling Simple Types

Derived simple types
Problem: simple types can be derived as part of a schema design.
Solution: simpleType stereotype of UML classes; constraints define constraining facets.

Traditionally UML models are built with a set of primitive types assumed, per target type model. In XML, things are trickier, since even simple types can be invented in the scope of a single design. To support this, a simpleType stereotype of UML classes is used. UML specialization comes in handy here: a derived simpleType can identify its base type via a specialization relationship.

Under simpleType, the desired constraining facets might be modeled by overloading UML attributes and initial values, making for a convenient and readable notation. (Carlson goes this route, adding a stereotype for attributes as facets.) Even under a stereotype, though, attributes imply some actual state elements, and thus would be misleading. UML constraints offer a more correct notation for constraining facets.

Enumerated types are an exception to either notational choice for facets, since UML offers a standard stereotype for enumerations.

Simple Types
<xs:simpleType name="Review">
  <xs:restriction base="xs:decimal">
    <xs:minInclusive value="0" />
    <xs:maxInclusive value="5" />
    <xs:pattern value="[0-5](.5)?" />
  <xs:restriction base="xs:string" >
    <xs:enumeration value="G" />
    <xs:enumeration value="PG" />
    <xs:enumeration value="PG-13" />
    <xs:enumeration value="R" />
    <xs:enumeration value="NC-17" />
    <xs:enumeration value="X" />
<xs:complexType name="Movie">
    <xs:element name="Title" type="xs:string" />
    <xs:element name="review" type="Review" />
    <xs:element name="rating" type="Rating" />
List and union types
Problem: lists and unions of other types can be defined.
Solution: xs:list as a parameterized type; xs:union implied by {xor} constraint on multiple specializations.

List and union types pose a slightly different problem. Unions allow values from any one of several spaces as defined by other types. To model this, UML specialization with an {xor} constraint seems the cleanest expression, although no precedent for this combination has been found. The list is a more obvious mapping: it is a parameterized type, instantiable on any other simple or complex type in the model.

List and Union Types
<xs:simpleType name="IntegerList">
  <xs:list itemType="xs:integer" />

<!-- Enumeration Settings not shown. -->

<xs:simpleType name="OvenSetting">
  <xs:union memberTypes="xs:integer Settings" />

Modeling Complex Types

The UML class is well suited to model the WXS complex type. UML attributes pose a slight problem. In most UML-supported type models, there is only one way of implementing a state element, but in XML there are two, attributes and child elements. The latter are the only option when the state element is itself of complex type, but for single values either will do. In some schools, either attribute-only or element-only styles are favored, but for our purposes it's important to support a choice between the two.

UML attributes
Problem: necessary to distinguish XML attributes from XML elements.
Solution: attribute stereotype of UML attributes, conventionally presented as the single character '@'

Formally, this specialization of a metamodel element is best interpreted as a UML stereotype: the attribute stereotype of UML attributes. XMI allows for exactly this stereotype in "tailoring" schema production. This might seem unwieldy, but UML allows for graphical or textual shorthands for common stereotypes.

With the increasing prevalence of XPath in XML documents, application code, and design discussion, the XPath @ prefix for attribute names has much to recommend it. It is wonderfully brief, already in common parlance, and fits neatly as a shorthand for the attribute stereotype, in graphical or textual representations.

The type of a UML attribute can be specified in the usual way, whether simple or complex. Built-in WXS types may be represented by their local names or by using one of the common namespace prefixes xs: or xsd:. Also, compositional relationships can be drawn to identify (UML) attribute type, as we'll see in a moment.

Class with Attributes
<xs:complexType name="Part">
    <xs:element name="name" type="xs:string" />
    <xs:element name="price" type="xs:decimal" />
  <xs:attribute name="partID" type="xs:token" />
  <xs:attribute name="inStock" type="xs:positiveInteger" />

Pages: 1, 2

Next Pagearrow