UML For W3C XML Schema Design
by Will Provost
Modeling Relationships
Relationships between UML classes line up nicely opposite the WXS
options. In UML, composition (aggregation by value) is not the most basic
relationship type, but in XML it is, and so we start there. Composition
maps to composition, and cardinality maps to occurrence constraints. The
role name can be mapped to the desired attribute or element name; it can
take advantage of the attribute stereotype and the @ prefix.
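As a rough illustration of this mapping, consider a hypothetical UML class Order that composes one or more LineItem objects under the role name item, an optional note, and a role written @id. None of these names come from the article's model; they are assumptions made for the sketch. A generator following the rules above might emit something like:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical model: Order composes LineItem (role "item", 1..*),
       a string (role "note", 0..1), and an attribute role "@id". -->
  <xs:complexType name="Order">
    <xs:sequence>
      <!-- UML cardinality 1..* becomes minOccurs=1 (the default), maxOccurs=unbounded -->
      <xs:element name="item" type="LineItem" maxOccurs="unbounded"/>
      <!-- UML cardinality 0..1 becomes minOccurs=0, maxOccurs=1 (the default) -->
      <xs:element name="note" type="xs:string" minOccurs="0"/>
    </xs:sequence>
    <!-- The @ prefix on the role name selects an attribute rather than an element -->
    <xs:attribute name="id" type="xs:string" use="required"/>
  </xs:complexType>

  <xs:complexType name="LineItem">
    <xs:sequence>
      <xs:element name="sku" type="xs:string"/>
      <xs:element name="quantity" type="xs:positiveInteger"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

The role names become element and attribute names, and each UML multiplicity becomes a minOccurs/maxOccurs pair on the corresponding particle.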
UML associations are trickier for XML, which is fundamentally hierarchical. XMI maps associations to XLinks, which is sound but exemplifies the problem with using XMI for WXS design, as XLinks are outside the Schema vocabulary. Within WXS, associations map most naturally to key references. (This is the major issue with Carlson, who maps all associations as compositions, blurring the distinction between association by value and association by reference. Associations that are not explicitly modeled as compositions must be preserved in the schema, so that a single instance with multiple references in an object graph is not spuriously multiplied.)
Another difficulty crops up here. Core UML can describe the
cardinality of the association, can give it a name from each side and
express navigability. What it cannot do is identify the selector and field
components to be used in the WXS.
This information is actually more relational than object-oriented in
nature, and it exposes one of UML's chief weaknesses: identifying key
fields. There is no real home for this information in the UML metamodel
-- in UML identity is strictly implicit -- and yet these paths
will need to be specified to complete a generated W3C XML Schema. This
could be addressed as a tag or a stereotype, and although it's a bit of a
forced fit, I propose a key stereotype of UML attributes, to be presented
via a simple shorthand. Note that this may overlap with the attribute
stereotype, resulting in notation such as «key»@unitID. UML
modeling tools can automate a mapping between this stereotype and the
definition of an xs:key governing the enclosing type.
Also, the association itself will need to identify the ordered list of
referencing fields to generate an xs:keyref. This can be
derived from the associating role name; multiple field names can be packed
into this name as a list, or can be attached as tagged values.
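To make the target of this mapping concrete, here is a minimal sketch of the schema a tool might generate from a «key»@unitID attribute and an association whose role name carries the referencing field. The element names building, unit, and lease, the attribute unitRef, and the constraint names are all hypothetical. Note that WXS attaches identity constraints to an element declaration rather than to a type, so the sketch places them on the element that contains both ends of the association.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="building">
    <xs:complexType>
      <xs:sequence>
        <!-- Class with a «key»@unitID attribute -->
        <xs:element name="unit" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="unitID" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
        <!-- Class that is associated with (not composed of) a unit -->
        <xs:element name="lease" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="unitRef" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>

    <!-- Generated from the key stereotype: selector finds the keyed
         elements, field names the key value within each one -->
    <xs:key name="unitKey">
      <xs:selector xpath="unit"/>
      <xs:field xpath="@unitID"/>
    </xs:key>

    <!-- Generated from the association: the referencing field must
         match some value of unitKey -->
    <xs:keyref name="leaseToUnit" refer="unitKey">
      <xs:selector xpath="lease"/>
      <xs:field xpath="@unitRef"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```

A multi-field key would simply list several xs:field children in order, with the same number of fields in the matching xs:keyref.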
Mappings of UML association cardinality are in fact the primary subject of an earlier XML Schema Clinic article; see "Enforcing Association Cardinality" for a full discussion of implementation strategies.
UML specialization maps more neatly to XML type extension for complex types. Both imply that the derived-type state elements are appended to the base type's state model.
The only trick here is that XML offers another means of complex-type derivation, i.e., restriction. This is another appropriate use of the UML stereotype, and so we define restriction as a stereotype of specialization. In this case the derived UML class will state the changes to the base-class content model; the Schema generator will be expected to merge these changes into the base content model for restatement in the restricted complex type.
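The two derivation styles can be sketched side by side. The type names Vehicle, Truck, and AnnotatedVehicle and their elements are hypothetical, chosen only to show what a generator would have to produce in each case:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Base class -->
  <xs:complexType name="Vehicle">
    <xs:sequence>
      <xs:element name="make" type="xs:string"/>
      <xs:element name="notes" type="xs:string"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Plain UML specialization: new state is appended to the base model -->
  <xs:complexType name="Truck">
    <xs:complexContent>
      <xs:extension base="Vehicle">
        <xs:sequence>
          <xs:element name="payload" type="xs:decimal"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- Specialization stereotyped as restriction: the UML class states
       only that "notes" becomes exactly one; the generator restates the
       whole base content model with that change merged in -->
  <xs:complexType name="AnnotatedVehicle">
    <xs:complexContent>
      <xs:restriction base="Vehicle">
        <xs:sequence>
          <xs:element name="make" type="xs:string"/>
          <xs:element name="notes" type="xs:string"
                      minOccurs="1" maxOccurs="1"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

The extension lists only what is new, while the restriction must repeat every particle it keeps, which is why the merge step falls to the generator rather than the modeler.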
Miscellaneous Schema Information
One key question we've yet to address is where the schema element fits into the UML model. There are options here: either the entire model can be directed to a schema, or in more complex models packages may be used to model XML namespaces. In either case there must be a property that identifies the target namespace URI.
The elementFormDefault and attributeFormDefault attributes of the schema
element truly live outside the UML world view. These must be properties
at the same scope as the target namespace, whether package or model.
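For reference, these package-level or model-level properties would surface on the generated schema element roughly as follows. The namespace URI, the inv prefix, and the Inventory type are placeholders invented for this sketch:

```xml
<!-- Each attribute below would come from a property on the UML package
     (or on the whole model) that maps to this schema document -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:inv="http://example.com/inventory"
           targetNamespace="http://example.com/inventory"
           elementFormDefault="qualified"
           attributeFormDefault="unqualified">

  <xs:element name="inventory" type="inv:Inventory"/>

  <xs:complexType name="Inventory">
    <xs:sequence>
      <xs:element name="item" type="xs:string"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```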
Also, we've thus far assumed that all content models are sequences. To
model a choice, use an {xor} UML constraint; if you need to
model xs:all, either an {unordered} constraint
or a separate stereotype of the UML class would do, but the former is a
better conceptual fit.
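A short sketch of the two non-sequence content models may help. The Payment and Address classes here are hypothetical: the first assumes an {xor} constraint across two compositions, the second an {unordered} constraint on the class's content.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- {xor} across the two roles: exactly one of them may appear -->
  <xs:complexType name="Payment">
    <xs:choice>
      <xs:element name="creditCard" type="xs:string"/>
      <xs:element name="bankTransfer" type="xs:string"/>
    </xs:choice>
  </xs:complexType>

  <!-- {unordered}: children may appear in any order; within xs:all
       each child may occur at most once -->
  <xs:complexType name="Address">
    <xs:all>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
      <xs:element name="postalCode" type="xs:string" minOccurs="0"/>
    </xs:all>
  </xs:complexType>
</xs:schema>
```

The occurrence limit inside xs:all means a model that marks a class {unordered} cannot also give any of its compositions a multiplicity above one.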
One last problem is the distinction between local and global types in WXS. This is actually a more common problem than most we've considered: C++ and Java, among other metamodels, have namespace-partitioning constructs such as nested and inner classes. The UML specification offers a couple of possible notations for nesting one type within another (see section 3.48.2), and most tools have a means of establishing this relationship as well.
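On the schema side, the distinction looks like this. In the sketch below, Address is a global (named) type that any declaration can reference, while the type of preferences is local and anonymous, which is the case a nested UML class would represent. All names are assumptions made for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global type: an ordinary top-level UML class -->
  <xs:complexType name="Address">
    <xs:sequence>
      <xs:element name="city" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <!-- Refers to the global type by name -->
        <xs:element name="address" type="Address"/>
        <!-- Local type: declared in place, has no name, and cannot be
             reused elsewhere; a class nested inside the owning class -->
        <xs:element name="preferences">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="newsletter" type="xs:boolean"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```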
Directions: Behavioral XML?
For those of us who've found the absence of behavior modeling frustrating, it's a relief to realize that there is indeed more to XML than data structures. With a robust profile in hand by which WXS can be expressed as UML, we can turn to more adventurous uses, especially for XML messaging. As with all things XML, data is never far from metadata; the WSDL specification shows off XML's ability to encode method invocations, and it plays a schema-like role in prescribing XML message content.
From the humble beginnings of data-centric XML, WSDL descriptors rise once again to the level of object-oriented encapsulations. Now, suddenly, the full power of UML can be brought to bear. A WSDL portType stereotype can express the semantics for an entire Web service and can be the source for a complex generation of not only WSDL and WXS documents, but also service or client code to support SOAP or HTTP messaging. No specific mapping rules are proposed here, but hopefully the following hypothetical will whet the reader's appetite:

For taking the time to discuss various concepts in this article, I'd like to thank Richard K. Fisher and Jean Pierre LeJacq.
Recommended Reading
- XML Metadata Interchange (XMI) 1.2
- XMI Production for XML Schema
- David Carlson, Modeling XML Applications with UML, Addison Wesley.
- David Carlson, "Modeling XML Vocabularies with UML: Part II", XML.com.
- Migrating from XML DTD to XML-Schema using UML.
- UML to XML Design Rules Project for business messaging
Comment on this Article
- Some comments
2002-09-02 08:12:02 John Arnett
Definitely agree with the use of UML constraints to model XSD (WXS?!) facets. They're a natural fit.
However, I don't think model groups (i.e. all | sequence | choice) can be adequately modelled using association connections. Sequences etc. are "anonymous" containers which need to be modelled as classes. The following two examples illustrate the problems that arise with using associations.
Element MyElement contains a sequence. If the sequence contains multiple complex items, it is easy enough to model each using a separate association connected to the item in question. The many connections all belong to the same sequence.
<xsd:element name='MyElement'>
  <xsd:complexType>
    <xsd:sequence>
      <!-- some content... -->
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
However, what happens if we now add another sequence...
<xsd:element name='MyElement'>
  <xsd:complexType>
    <xsd:sequence>
      <!-- some content... -->
    </xsd:sequence>
    <xsd:sequence>
      <!-- some more content... -->
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
We can continue to add an association for each complex item; however, we start to lose track of which items belong to which sequence. We could of course label each connector "sequence1" or "sequence2", but it all starts to get a bit messy.
In the second example, MyOtherElement has a nested sequence:
<xsd:element name='MyOtherElement'>
  <xsd:complexType>
    <xsd:sequence>
      <!-- some content... -->
      <xsd:sequence>
        <!-- some more content... -->
      </xsd:sequence>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
Again, it's difficult to model this in UML using only association connections - how do we show the source of the innermost sequence?
I encountered these problems whilst writing a stylesheet to transform our "legacy" schemas into XMI (for rendering as UML). The solution I decided upon was to map all model group elements, except the immediate children of complexTypes, to anonymous nested classes labeled with an appropriate stereotype. To avoid unnecessary clutter, the immediate children of complexTypes are represented by adding an appropriate tagged value to the parent.
Note also that tagged values can be used to indicate position in sequences.
- required sequence not represented?
2002-08-30 08:10:47 user2048
In the first example, the child elements of Movie
must be title, review, and rating, in that order.
How is this required order represented in the UML?

