XML.com: XML From the Inside Out

Modeling XML Vocabularies with UML: Part II

September 19, 2001

In Part I of this series, I emphasized that models are an inevitable part of system analysis and design, even if a model is sometimes only in the developer's mind. By using UML to capture a conceptual model of the planned vocabulary, we are able to clarify the essential terms and relationships without getting caught up in the syntactic issues of the chosen schema language. In fact, industry standards groups may wish to use UML as the primary definition for their vocabularies and leave the final choice of schema language(s) to implementing vendors.

I also want to emphasize that choosing a model-driven approach to schema design does not force you into a long waterfall development process. The approach described in these articles illustrates an evolutionary and incremental development process. The first schema produced using default mapping rules from this purchase order model may not be ideal, but it accurately captures the domain semantics that were modeled. Part III of this series describes how the model may be specialized to capture design characteristics that are unique to XML schema generation. This approach is compatible with the contemporary methodologies for agile programming and modeling, where the models fulfill a very pragmatic role in the development process. (See XMLmodeling.com, a web portal that I have created to gather case studies and modeling resources.)

In order to achieve these rather lofty objectives, it's essential that we have a complete, flexible mapping specification between UML and XML schemas. The following examples do not present the complete picture but attempt to ease you into a maze of terminology from UML and the W3C XML Schema Definition Language (which I'll refer to hereafter as XSD).

Mapping UML Models to XML Schema

This is where the rubber meets the road when using UML in the development of XML schemas. A primary goal guiding the specification of this mapping is to allow sufficient flexibility to encompass most schema design requirements, while retaining a smooth transition from the conceptual vocabulary model to its detailed design and generation.

A related goal is to allow a valid XML schema to be automatically generated from any UML class diagram, even if the modeler has no familiarity with XML schema syntax. This ability enables a rapid development process and supports reuse of the model vocabularies in several different deployment languages or environments, because the core model is not overly specialized to XML.

Please note that the schema examples in this article are not fully compatible with the corresponding example in the XML Schema Primer. Nonetheless, the following schema fragments are still valid interpretations of the conceptual model. The third article in this series will continue the refinement process to its logical conclusion where the resulting schema can validate the XSD Primer example.

The conceptual model for purchase orders shown in Figure 1 is duplicated with very slight modification from the first article. We'll dissect this diagram into all of its major structures and map each part to the W3C XML Schema definition language. I'll note several situations where other alternatives are possible and also point out where the schema differs from the XSD Primer example.

Figure 1. Conceptual model of purchase order vocabulary.

Class and Attribute

A class in UML defines a complex data structure (and associated behavior) that maps by default to a complexType in XSD. As a first step, the PurchaseOrder class and its UML attributes produce the following XML Schema definition:

<xs:complexType name="PurchaseOrder">
  <xs:all>
    <xs:element name="orderDate" type="xs:date" 
                minOccurs="0" maxOccurs="1"/>
    <xs:element name="comment" type="xs:string" 
                minOccurs="0" maxOccurs="1"/>
  </xs:all>
</xs:complexType>

The attributes in a UML class are not restricted to a particular order, so an XSD <xs:all> element is used to create an unordered model group. In addition, a UML class creates a distinct namespace for its attribute names (i.e., two classes can contain attributes with the same name), so the attributes are produced as local element definitions in the schema. See "A New Kind of Namespace" for more explanation of this topic. Both of these UML attributes are optional, as indicated by [0..1] in Figure 1; this multiplicity is mapped to the minOccurs and maxOccurs attributes in the XSD.

The UML attributes are defined using primitive data types from the XSD specification, so these types are written directly to the generated schema using the appropriate namespace prefix. If other data types are used in the UML model, then an XSD type library can be created to define these types for use in a schema. For example, I have created an XSD type library for the Java primitive types and common Java classes such as Date, String, and Boolean.
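To give a rough idea of what such a type library could contain, here is a minimal sketch. It is not the library mentioned above: the type names and the choice of base types (for example, mapping a Java Date to xs:dateTime) are assumptions made purely for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical type library: names mirror Java types, base types are assumptions -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- java.lang.String as an unrestricted string -->
  <xs:simpleType name="String">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>

  <!-- java.lang.Boolean as an XSD boolean -->
  <xs:simpleType name="Boolean">
    <xs:restriction base="xs:boolean"/>
  </xs:simpleType>

  <!-- java.util.Date holds a point in time, so xs:dateTime is assumed here -->
  <xs:simpleType name="Date">
    <xs:restriction base="xs:dateTime"/>
  </xs:simpleType>

  <!-- Java primitive int is a 32-bit signed integer, as is xs:int -->
  <xs:simpleType name="int">
    <xs:restriction base="xs:int"/>
  </xs:simpleType>

</xs:schema>
```

A generated schema could then refer to these names in its type attributes instead of the built-in xs: types.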

As a useful default, a top-level element is automatically created for each complexType in the schema. The default name for this element is the same as the class name; this is allowed in W3C XML Schema because it uses separate namespaces within the schema itself for complexTypes and top-level elements. For PurchaseOrder, the top-level schema element is created as follows:

<xs:element name="PurchaseOrder" type="PurchaseOrder"/>
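As an illustration, an instance document that this first schema would accept might look like the following. This is a minimal sketch that assumes the two fragments above are wrapped in an <xs:schema> element with no target namespace; the values are invented for the example.

```xml
<?xml version="1.0"?>
<!-- Assumes a schema with no targetNamespace; sample values are illustrative -->
<PurchaseOrder>
  <!-- xs:all permits the children in any order -->
  <comment>Please deliver before the weekend.</comment>
  <orderDate>2001-09-19</orderDate>
</PurchaseOrder>
```

Because <xs:all> is unordered, comment may appear before orderDate, and since both children have minOccurs="0", either one (or both) could be omitted.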

If you refer to the XSD Primer example, you'll see three differences. First, orderDate is modeled as an XML attribute, not as a local element in PurchaseOrder. Second, the Primer uses a <sequence> model group instead of <all>. Third, the top-level element is defined in the Primer with a lower-case first letter, i.e., purchaseOrder (often called "lower camel case" format). All of these differences are addressed in the third article by using a UML profile to expand the mapping to XML schemas.
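For comparison, the following sketch applies those Primer-style choices to the same two UML attributes. It is not copied from the Primer; it keeps this article's type name and is meant only to show the shape of the alternative.

```xml
<!-- Sketch only: to be placed inside an xs:schema element -->
<!-- Top-level element name in lower camel case -->
<xs:element name="purchaseOrder" type="PurchaseOrder"/>

<xs:complexType name="PurchaseOrder">
  <!-- An ordered model group replaces xs:all -->
  <xs:sequence>
    <xs:element name="comment" type="xs:string" minOccurs="0"/>
  </xs:sequence>
  <!-- orderDate becomes an XML attribute; attributes are optional by default -->
  <xs:attribute name="orderDate" type="xs:date"/>
</xs:complexType>
```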

Association

The PurchaseOrder type is specified not only by its UML attributes but also by its associations to other classes in the model. Figure 1 includes three associations that originate at PurchaseOrder, as indicated by the navigation arrows at the opposite ends. Each association has a role name and a multiplicity that specify how the target class is related. These associations are added to the model group of the XSD complexType along with the elements created from the UML attributes.

<xs:complexType name="PurchaseOrder">
  <xs:all>
    <xs:element name="orderDate" type="xs:date" 
                minOccurs="0" maxOccurs="1"/>
    <xs:element name="comment" type="xs:string" 
                minOccurs="0" maxOccurs="1"/>
    <xs:element name="shipTo">
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="Address"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
    <xs:element name="billTo">
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="Address"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
    <xs:element name="items" minOccurs="0" maxOccurs="1">
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="Item" 
                      minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:all>
</xs:complexType>

Because the UML attributes for orderDate and comment have primitive data types, the schema embeds these values as element content. However, the default mapping for associations creates a wrapper element in XSD corresponding to the role name in UML. This element then contains the instances of the associated class, which the schema refers to using the top-level element created for each complexType.
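The resulting nesting is easiest to see in an instance document. The outline below is a sketch under the same no-namespace assumption as the schema fragments. The contents of Address and Item are left as comments because their complexTypes are not defined in this section, so the outline is not valid until those children are filled in.

```xml
<?xml version="1.0"?>
<PurchaseOrder>
  <orderDate>2001-09-19</orderDate>
  <!-- Wrapper elements are named after the UML role names -->
  <shipTo>
    <Address>
      <!-- content defined by the Address complexType (not shown in this section) -->
    </Address>
  </shipTo>
  <billTo>
    <Address>
      <!-- content defined by the Address complexType (not shown in this section) -->
    </Address>
  </billTo>
  <items>
    <!-- zero or more Item elements may appear here -->
    <Item>
      <!-- content defined by the Item complexType (not shown in this section) -->
    </Item>
  </items>
</PurchaseOrder>
```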

If you want to create a W3C XML Schema using the <all> content model, then a wrapper element is necessary whenever the associated class has more than one occurrence. This is because <all> can be used only when the contained elements have either [0..1] or [1..1] multiplicity. So when generating the wrapper element for the association with Item, the element named items allows zero or one instance, which holds zero or more Item elements within it.
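To make the restriction concrete, the fragment below shows what the mapping would look like without the wrapper. It is a counter-example, not generated output: under XSD 1.0 rules a schema processor should reject it, because an element placed directly inside <xs:all> may not have maxOccurs greater than 1.

```xml
<!-- Counter-example: NOT valid in XSD 1.0 -->
<xs:complexType name="PurchaseOrder">
  <xs:all>
    <xs:element name="comment" type="xs:string"
                minOccurs="0" maxOccurs="1"/>
    <!-- maxOccurs="unbounded" is not allowed on a direct child of xs:all -->
    <xs:element ref="Item"
                minOccurs="0" maxOccurs="unbounded"/>
  </xs:all>
</xs:complexType>
```

Moving the repeating Item reference into an <xs:sequence> inside the wrapper element avoids the problem, since the wrapper itself occurs at most once.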

The difference between this default schema generated from UML and the schema included in the XSD Primer is that the Primer's shipTo and billTo roles contain the address content directly, without use of an element for the associated class. In other words, child elements for name, street, city, etc. are contained directly within shipTo and billTo. This design alternative is covered in the extensions presented in the third article.
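A sketch of that alternative, assuming the Address complexType keeps the name used in this article, could declare the role elements with the type directly instead of wrapping a reference to the top-level Address element:

```xml
<!-- Sketch only: these declarations would sit inside the PurchaseOrder model group -->
<!-- The role element takes the Address type itself, so no nested Address element appears -->
<xs:element name="shipTo" type="Address"/>
<xs:element name="billTo" type="Address"/>
```

With this form, the children declared by Address appear immediately inside shipTo and billTo in an instance document.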
