XML.com: XML From the Inside Out

Modeling XML Vocabularies with UML: Part II
by Dave Carlson

User-Defined Datatype

The default mapping to XSD would produce a complexType definition for SKU and QuantityType, but we want these to become user-defined simple datatypes in the XML Schema. This is easily achieved by adding a UML stereotype to each of these two classes, which is shown as <<XSDsimpleType>> in Figure 1. This ability to include stereotypes is an integral part of the UML standard and is used to specify additional model characteristics that are usually unique to a particular domain; in this case, unique to XML schema design.

Using the stereotype, the schema generator knows to create the following definition for SKU:

<xs:simpleType name="SKU">
  <xs:annotation>
    <xs:documentation>Stock Keeping Unit, a code  
         for identifying products</xs:documentation>
  </xs:annotation>
  <xs:restriction base="xs:string">
    <xs:pattern value="\d{3}-[A-Z]{2}"/>
  </xs:restriction>
</xs:simpleType>

A UML model may also include documentation for any of its model elements, which is passed through to the XML schema definition as shown in this example. The UML generalization relationship indicates which existing simple datatype should be used as the base for this user-defined type. Finally, the pattern attribute on SKU is mapped to an XSD facet that constrains the SKU string value.
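
To see what the pattern facet accepts, the SKU constraint can be tried out with a short script. This is a minimal sketch, not part of the schema generator: the function name is_valid_sku is hypothetical, and Python's re module stands in for an XSD regular expression engine. The two dialects agree for this simple pattern, but they are not identical in general.

```python
import re

# Pattern copied from the xs:pattern facet of the SKU simpleType above.
SKU_PATTERN = re.compile(r"\d{3}-[A-Z]{2}")


def is_valid_sku(value: str) -> bool:
    """Return True if the whole string matches the SKU pattern.

    XSD patterns are implicitly anchored at both ends, so fullmatch()
    is used here rather than search() or match().
    """
    return SKU_PATTERN.fullmatch(value) is not None


print(is_valid_sku("926-AA"))   # True
print(is_valid_sku("926-aa"))   # False: lowercase letters are rejected
print(is_valid_sku("9260-AA"))  # False: exactly three digits are required
```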

The second module in the purchase order schema definition represents a reusable set of specifications for addresses, as shown in Figure 2. These definitions are taken directly from section 4.1 of the XSD Primer. Two additional schema constructs are required by this model, in addition to those used when producing a schema from Figure 1.

Figure 2. Modularized Address schema component

Generalization

A fundamental and pervasive concept in object-oriented analysis and design is generalization from one class to another. The specialized subclass inherits attributes and associations from all of its parent classes. This is easily represented in W3C XML Schema, although it requires more indirect mechanisms when producing other XML schema languages.

In Figure 2, the Address class is shown in italic font, which is used in UML to indicate that this is an abstract class, only intended to be used for deriving other specialized classes. Following the same default rules used for PurchaseOrder, the complexType definitions for Address and USAddress are produced as follows:

<xs:element name="Address" type="Address" abstract="true"/>
<xs:complexType name="Address" abstract="true">
  <xs:all>
    <xs:element name="name" type="xs:string"/>
    <xs:element name="street" type="xs:string"/>
    <xs:element name="city" type="xs:string"/>
  </xs:all>
</xs:complexType>
   
<xs:element name="USAddress" type="USAddress" 
            substitutionGroup="Address"/>
<xs:complexType name="USAddress">
  <xs:complexContent>
    <xs:extension base="Address">
      <xs:all>
        <xs:element name="state" type="USState"/>
        <xs:element name="zip" type="xs:positiveInteger"/>
      </xs:all>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

There are three differences from previous examples. First, the top-level element and complexType definitions for Address include the XSD attribute abstract="true". Second, the USAddress element includes substitutionGroup="Address", which means that whenever the Address element is required as a content element, then USAddress may be substituted in its place. Thus, we may use USAddress (or, similarly, UKAddress) as the content of shipTo and billTo in the PurchaseOrder.

Third, the complexType definition for USAddress is extended from the base complexType named Address. There is, however, a significant point of difference in how this inheritance structure is interpreted in UML versus in XSD. In UML, the order of attributes and associations within a class is not specified and the features inherited from parent classes are freely intermingled with locally defined attributes and associations in a subclass. In XSD, inherited elements are treated as a group, so the three elements inherited from Address are an unordered group in USAddress, followed in sequence by another unordered group of the two elements defined in USAddress. You cannot define an unordered group of the five elements when one or more are inherited.
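
The grouping described above can be made concrete with a small script that reads the two definitions and lists the element groups in the order the article describes. This is only an illustrative sketch using Python's standard ElementTree module: the helper names effective_groups and substitutes are hypothetical, the fragment is wrapped in an xs:schema root so that it parses, and the script does not validate the schema or any instance document.

```python
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# The Address and USAddress definitions from the article, wrapped in a
# schema root element (an addition made here so the fragment parses).
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Address" type="Address" abstract="true"/>
  <xs:complexType name="Address" abstract="true">
    <xs:all>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
    </xs:all>
  </xs:complexType>
  <xs:element name="USAddress" type="USAddress"
              substitutionGroup="Address"/>
  <xs:complexType name="USAddress">
    <xs:complexContent>
      <xs:extension base="Address">
        <xs:all>
          <xs:element name="state" type="USState"/>
          <xs:element name="zip" type="xs:positiveInteger"/>
        </xs:all>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
"""

root = ET.fromstring(SCHEMA)
types = {t.get("name"): t for t in root.findall("xs:complexType", NS)}


def local_names(parent):
    """Element names in the xs:all group directly under parent, if any."""
    group = parent.find("xs:all", NS)
    if group is None:
        return []
    return [e.get("name") for e in group.findall("xs:element", NS)]


def effective_groups(type_name):
    """List the element groups of a type: inherited groups first, then local."""
    ctype = types[type_name]
    ext = ctype.find("xs:complexContent/xs:extension", NS)
    if ext is None:
        return [local_names(ctype)]
    return effective_groups(ext.get("base")) + [local_names(ext)]


def substitutes(head):
    """Names of top-level elements declared as substitutable for head."""
    return [e.get("name") for e in root.findall("xs:element", NS)
            if e.get("substitutionGroup") == head]


print(effective_groups("USAddress"))
# [['name', 'street', 'city'], ['state', 'zip']]
print(substitutes("Address"))
# ['USAddress']
```

The inherited elements and the local elements stay in two separate groups, which is the point of difference from UML noted above.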

Enumerated Datatype

The state element of USAddress refers to a simple type definition for USState, which is generated from a UML enumeration. In Figure 2, USState is shown with an <<enumeration>> stereotype that notifies the schema generator to create an XSD enumeration value for each of the attributes defined for this class. An enumerated type in XSD is just a specialized kind of simpleType definition, so the UML enumeration must also specify a superclass to use as its base type in XSD. The schema is generated as follows:

<xs:simpleType name="USState">
  <xs:restriction base="xs:string">
    <xs:enumeration value="AK"/>
    <xs:enumeration value="AL"/>
    <xs:enumeration value="AR"/>
    <xs:enumeration value="PA"/>
  </xs:restriction>
</xs:simpleType>
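
The mapping from an enumeration class to this definition is mechanical enough to sketch in a few lines. The following is a hedged illustration, not the actual schema generator: the function enumeration_to_simple_type and its parameters are hypothetical, and a plain Python list stands in for the attributes of the UML class.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)  # serialize with the xs: prefix


def enumeration_to_simple_type(name, base, literals):
    """Build an xs:simpleType that restricts base to the given literals.

    name     -- name of the UML class carrying the enumeration stereotype
    base     -- QName of the XSD base type taken from the UML superclass
    literals -- attribute names of the UML class, one per enumeration value
    """
    simple_type = ET.Element(f"{{{XS}}}simpleType", {"name": name})
    restriction = ET.SubElement(
        simple_type, f"{{{XS}}}restriction", {"base": base})
    for literal in literals:
        ET.SubElement(restriction, f"{{{XS}}}enumeration", {"value": literal})
    return simple_type


us_state = enumeration_to_simple_type(
    "USState", "xs:string", ["AK", "AL", "AR", "PA"])
print(ET.tostring(us_state, encoding="unicode"))
```

The printed element carries its own xmlns:xs declaration because it is serialized on its own; inside a complete schema document that declaration would sit on the xs:schema root instead.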

Conclusions

The default mapping rules described in this article can be used to generate a complete XML schema from any UML class diagram. This might be a pre-existing application model that now must be deployed within an XML web services architecture, or it might be a new XML vocabulary model intended as a B2B data interchange standard. In either case, the default schema provides a first iteration that can be used immediately in an initial application deployment, although it may require refinement to meet other architectural and design requirements.

The first article in this series presented a process flow for schema design that emphasized the distinction between designing for data-oriented applications versus text-oriented applications. The default mapping rules are often sufficient for data-oriented applications. In fact, these defaults are aligned with the OMG's XML Metadata Interchange (XMI) version 2.0 specification for using XML as a model interchange format. This approach is also well aligned with the OMG's new initiative for Model Driven Architecture (MDA).

Text-oriented schemas, and any other schema that might be authored by humans and used as content for HTML portals, often must be refined to simplify the XML document structure. For example, many schema designers eliminate the wrapper elements corresponding to an association role name (but this also prevents use of the XSD <all> model group). This refinement and many others can be specified in a vocabulary model by setting a new default parameter for one UML package, which then applies to all of its contained classes.

We saw two examples of UML stereotypes in this article, which were used to indicate a specialized use of a UML class. More generally, these stereotypes and their associated property values are part of a UML profile for XML Schemas that I initially developed as part of my book on modeling XML applications. The third article in this series provides additional examples of using other stereotypes to customize the generated schema. I will also include a description of a web-based tool we have developed that implements the complete UML profile for schema design and transforms any UML class model to either a W3C XML Schema or an OASIS RELAX NG grammar.

Tips for Success

To help you apply these ideas to your own e-business projects, I offer the following tips for success.

  • Plan for conceptual models of your business vocabularies that are reusable in several different deployment contexts, e.g. W3C XML Schema, DTD, relational DBMS, Java or EJB. Alternative UML profiles can be used to transform the common business model to alternative platforms. But be aware that full realization of this goal is beyond the capabilities of many current UML tools.
  • Pre-existing UML models might be specialized to their deployment platform, platform libraries, and datatypes (Java, .NET, etc.). Isolate the platform-independent domain model to enable its reuse and to generate XML schemas for data interchange.
  • Use consistent modeling guidelines for naming and structure, both within a single vocabulary and across a set of related models. For example, the FpML architecture specification provides clear guidelines for writing DTDs that are easily transferred to UML models, or any other object-oriented framework.
