Modeling XML Vocabularies with UML: Part III
by Dave Carlson
Customizing the PO Schema Design Model
Consider the following sample XML document, which is a fragment from an example in the XSD Primer:
<ipo:purchaseOrder
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ipo="http://www.example.com/IPO"
orderDate="1999-12-01">
<ipo:shipTo exportCode="1" xsi:type="ipo:UKAddress">
<ipo:name>Helen Zoe</ipo:name>
<ipo:street>47 Eden Street</ipo:street>
<ipo:city>Cambridge</ipo:city>
<ipo:postcode>CB1 1JR</ipo:postcode>
</ipo:shipTo>
. . .
</ipo:purchaseOrder>
We'll use this instance to derive requirements for refining the UML model. These design requirements are divided into four categories:
- Should the attributes of a UML class be produced as XML attributes or child elements in the schema?
- Which kind of model group (all, sequence, or choice) should be used to validate an element's content?
- Should the XML document include or omit the element tags that represent class names and association roles in the UML model?
- How do we map UML class names to XML element names?
The UML class diagram shown in Figure 1 includes profile extensions that resolve all of these design choices. This purchase order model should be very familiar by now. It was presented as a conceptual model of the vocabulary in two previous articles and is now refined to include stereotypes and properties that specify the XML schema design model. Note that this is exactly the same structure shown in the previous diagrams, with only a few labels added.
Figure 1: Design model of purchase order vocabulary
After applying these profile extensions, the following schema is produced for the PurchaseOrder class and its associations:
<xs:element name="purchaseOrder" type="ipo:PurchaseOrder"/>
<xs:complexType name="PurchaseOrder">
<xs:sequence>
<xs:element name="shipTo" type="ipo:Address"/>
<xs:element name="billTo" type="ipo:Address"/>
<xs:element name="comment" type="xs:string" minOccurs="0" maxOccurs="1"/>
<xs:element name="items" minOccurs="0" maxOccurs="1">
<xs:complexType>
<xs:sequence>
<xs:element ref="ipo:item" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="orderDate" type="xs:date"/>
</xs:complexType>
If you want to work through these examples in more detail, the complete sample document is available, as are the PO module schema and the Address module schema.
The sample purchase order instance document includes two XML attributes:
orderDate on the purchaseOrder element and
exportCode on the shipTo element. By assigning an
<<XSDattribute>> stereotype to the orderDate
attribute in UML, we specify that it should be represented as an attribute
in XML. The exportCode attribute is similarly stereotyped on
the UKAddress class, although it's not shown here. The
comment UML attribute in the PurchaseOrder class follows the
default mapping to an element in the schema.
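For reference, the XSD Primer's international purchase order schema declares exportCode as an attribute on the UKAddress extension type. The fragment below follows the Primer's declaration, rewritten with the xs: prefix used in this article; it assumes the xs and ipo prefixes are bound as in the listings above, and the schema generated from the UML model may differ in detail.

```xml
<!-- UKAddress extends Address; Address and UKPostcode are defined in the Address module -->
<xs:complexType name="UKAddress">
  <xs:complexContent>
    <xs:extension base="ipo:Address">
      <xs:sequence>
        <!-- UML attribute following the default mapping to a child element -->
        <xs:element name="postcode" type="ipo:UKPostcode"/>
      </xs:sequence>
      <!-- UML attribute stereotyped as XSDattribute, so it becomes an XML attribute -->
      <xs:attribute name="exportCode" type="xs:positiveInteger" fixed="1"/>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
```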
The XSD Primer uses a <sequence> model group for all complexType
content, whereas the default UML mapping uses an <all> unordered
content model. To modify this mapping we assign the
<<XSDcomplexType>> stereotype to the PurchaseOrder class and set
the modelGroup property to 'sequence'.
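To make the difference concrete, here is a rough sketch of what the PurchaseOrder type might look like if only the model group were left at its default and every other design choice were held fixed. This is an illustration rather than actual tool output, and it assumes the same xs and ipo prefix bindings as the generated schema shown earlier.

```xml
<!-- Hypothetical result with the default model group: children may appear in any order -->
<xs:complexType name="PurchaseOrder">
  <xs:all>
    <xs:element name="shipTo" type="ipo:Address"/>
    <xs:element name="billTo" type="ipo:Address"/>
    <xs:element name="comment" type="xs:string" minOccurs="0"/>
    <xs:element name="items" minOccurs="0">
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="ipo:item" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:all>
  <xs:attribute name="orderDate" type="xs:date"/>
</xs:complexType>
```

Under this content model an instance could list billTo before shipTo and still validate. Setting modelGroup to 'sequence' replaces xs:all with xs:sequence, which fixes the order of the child elements.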
But the use of a sequence model group raises a new issue when mapping from UML to XML schemas. UML attributes and associations are inherently unordered within their owning class. So each UML attribute and association end that is part of a sequence group must be annotated with a profile property that specifies its position. These position property values are shown as annotations in Figure 1. The procedure for adding profile stereotypes and property values is different in each UML tool, although any tool that claims compliance with the UML specification must provide some means for adding them.
The default mapping rules place an Address element (or an element for one
of its subclasses) inside the association role elements
shipTo and billTo (see Part II of this series), whereas the
required instance document omits the Address tag and embeds its
element and attribute content directly within the role tag. To specify this
design choice, the <<XSDelement>> stereotype is
assigned to the association ends connected to the Address class and the
anonymousType property is set to 'true'. The stereotype label
is omitted from the diagram to minimize clutter, but the tagged value
properties are listed within curly braces.
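The two instance forms can be compared side by side. The first fragment is a sketch of the default mapping as described above, with the class element nested inside the role element; the exact element name produced by the default rules is an assumption here. The second fragment is taken from the sample document at the start of this article. Both assume the ipo and xsi prefixes are bound as in that sample.

```xml
<!-- Default mapping (sketch): the class element appears inside the role element -->
<ipo:shipTo>
  <ipo:UKAddress exportCode="1">
    <ipo:name>Helen Zoe</ipo:name>
    <ipo:street>47 Eden Street</ipo:street>
    <ipo:city>Cambridge</ipo:city>
    <ipo:postcode>CB1 1JR</ipo:postcode>
  </ipo:UKAddress>
</ipo:shipTo>

<!-- anonymousType set to 'true': the class tag is omitted and its content moves into the role element -->
<ipo:shipTo exportCode="1" xsi:type="ipo:UKAddress">
  <ipo:name>Helen Zoe</ipo:name>
  <ipo:street>47 Eden Street</ipo:street>
  <ipo:city>Cambridge</ipo:city>
  <ipo:postcode>CB1 1JR</ipo:postcode>
</ipo:shipTo>
```

In the second form the class name no longer appears as a tag, so the sample instance uses xsi:type to indicate that this shipTo holds a UKAddress rather than a plain Address.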
Because the items role on the association to the
Item class is not specified as an
anonymousType, its definition in the schema shown above retains
the role's container element to hold elements for the related class. The
document instance for purchase order items looks like this:
<ipo:purchaseOrder>
<ipo:items>
<ipo:item partNum="833-AA">
<ipo:productName>Lapis necklace</ipo:productName>
<ipo:quantity>1</ipo:quantity>
<ipo:USPrice>99.95</ipo:USPrice>
<ipo:comment>Want this for the holidays!</ipo:comment>
<ipo:shipDate>1999-12-05</ipo:shipDate>
</ipo:item>
</ipo:items>
</ipo:purchaseOrder>
In this situation "anonymous type" has a slightly different, more general
meaning than is used in the W3C XML Schema specification. It's easiest to
understand the meaning I intend by looking at the UML class diagram in
Figure 1 rather than at the XSD Schema document. In the class diagram, if an
association end is marked as an anonymousType, then the name of
the associated class is anonymous when its instances appear in XML
documents, regardless of which schema language is actually used to define
those documents. The concept of anonymous types is realized differently in
different schema languages.
You may have noticed that the XML document elements for
purchaseOrder and item appear with a lower-case
first character; this is often called "lower camel case" format. However,
the default mapping from UML sets these element names equal to the class
names, which begin with upper-case letters. The "upper camel case"
convention used in the UML diagram is commonly used in object-oriented
models and languages, whereas a variety of conventions are followed in
current XML schema vocabularies.
This issue is resolved by adding an elementNameMapping
property to a UML class along with the
<<XSDcomplexType>> stereotype. This profile
property allows an XML schema designer to choose a preferred naming
convention when modeling the schema details. Like many other profile
properties, this value can be set as a default for the entire model so that
all class names will be mapped to XML element names in the same way.
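As a small illustration, the two global element declarations below contrast the default naming with the lower camel case naming used in the generated schema earlier in this article. The first declaration is a sketch of the default behavior described above, not actual tool output, and the property value that selects the naming convention is not shown because this article does not give it.

```xml
<!-- Default mapping (sketch): the element name equals the UML class name -->
<xs:element name="PurchaseOrder" type="ipo:PurchaseOrder"/>

<!-- With elementNameMapping selecting lower camel case: matches the generated schema -->
<xs:element name="purchaseOrder" type="ipo:PurchaseOrder"/>
```

Only the element name changes; the complex type keeps the class name PurchaseOrder in both cases.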
