Valid Frustrations
Q: I am having trouble with a DTD not enforcing some rules I'm trying to create.
For example, a fruit_basket element must contain
between 9 and 11 banana elements. I think the following
works:
<!ELEMENT fruit_basket (
(banana, banana, banana, banana, banana, banana, banana,
banana, banana) |
(banana, banana, banana, banana, banana, banana, banana,
banana, banana, banana) |
(banana, banana, banana, banana, banana, banana, banana,
banana, banana, banana, banana))>
Is there a better way to do this?
A: No. Frustrating, isn't it?
What you're working with here is called a content model for
(in this case) the fruit_basket element type. This is a
very simple example; constructing a content model is even worse for,
say, a hypothetical month element in even a simple
calendar application: some months may legally contain 31 days, some
30, and one either 28 or 29, depending on the year.
As you probably know (or can guess from the name), a content model specifies which child elements a given element may contain, in what sequence, and how many of each. It's the "how many" specification that is giving you fits here. The only shortcuts available are the following special characters, which may be appended to a child element name in the content model:
| Character | Meaning |
|---|---|
| (none) | This child must occur exactly once |
| + | This child may occur one or more times |
| ? | This child may occur once, or not at all |
| * | Any number of occurrences of this child is legitimate (the "0 or more" option) |
For instance, you can require that a fruit_basket
element must have at least one banana element like
this:
<!ELEMENT fruit_basket (banana+)>
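The same characters may also be appended to a parenthesized group, which allows a slightly more compact spelling of the "9 to 11" rule. The following is only a sketch, not a real counting mechanism: the nine required banana children still have to be written out one by one. It assumes banana is declared EMPTY. Nesting the two optional bananas keeps the model deterministic; the XML 1.0 spec asks for deterministic content models, so some validating parsers may complain about a choice among several sequences that all begin with the same element.

```xml
<!-- Nine required banana children, then up to two optional ones.
     The optional ones are nested so that a parser never has to
     guess which banana in the model a given element matches. -->
<!ELEMENT fruit_basket (banana, banana, banana, banana, banana,
                        banana, banana, banana, banana,
                        (banana, banana?)?)>
<!-- Assumption for this sketch: banana has no content of its own -->
<!ELEMENT banana EMPTY>
```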
This limitation of DTD content models is one which XML Schema is
designed to fix. I don't have space to provide details of that spec
here, but, in general, an element type's content model is built by
declaring that element type with an xsd:complexType
element; this element contains, usually inside a compositor such as
xsd:sequence, various xsd:element elements, each of which may have a
minOccurs and a maxOccurs attribute. The
values of these attributes are non-negative integers (maxOccurs may
also be "unbounded"), representing respectively the
minimum and maximum number of times which that child element type may
appear within that parent. The default value for both is 1, which is
consistent with DTD syntax.
Thus, a simple XML Schema declaration of the
fruit_basket element type, with your desired number of
banana children, might look like
<xsd:element name="fruit_basket">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="banana" minOccurs="9" maxOccurs="11"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
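The same two attributes would also cover the hypothetical month element mentioned earlier. Here is a minimal sketch of a complete schema document for it, assuming the W3C Recommendation namespace for XML Schema and an empty day element; the names month, day, and dayType are made up for illustration. Note that it only bounds the number of days at 28 to 31; it does not know which month or year it is dealing with.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- A month holds between 28 and 31 day children, in sequence -->
  <xsd:element name="month">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="day" type="dayType"
                     minOccurs="28" maxOccurs="31"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- Hypothetical type: a day with no content and no attributes -->
  <xsd:complexType name="dayType"/>

</xsd:schema>
```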
Using XML Schema may not solve all your problems: the spec is still so new that it's not as widely supported as DTDs. But it at least gets you in the right ballpark.
Example: Suppose a fruit_basket contains three
bananas. Each banana should be numbered, 1
through 3; this can be done as an attribute or an element. But I don't
know how to do this using either method. If there were only one
fruit_basket, I could use ID-type attributes. But those
IDs need to be unique over the entire document, and my document will
contain multiple fruit_baskets. What if each needs to
have a "banana #1"?
A: The answer to this question is the same as the answer to your first: you're asking DTDs to do something they can't do.
What you're after here is some way to constrain the document's
content, not its structure. DTDs absolutely cannot constrain an
element's text (#PCDATA) content. (The XML spec itself loosely
constrains that content: it must fall within certain specified ranges
of Unicode values, and it may not include unescaped markup-significant
characters like < and &.) That leaves
you with the "constrain via attribute values" approach.
You can approximate, yet still be frustratingly far away from, an
answer using an ATTLIST declaration for the banana
element which restricts the attribute to values 1, 2, or 3. For
instance, your DTD might look something like this:
<!ELEMENT fruit_basket (banana*)>
<!ELEMENT banana EMPTY>
<!ATTLIST banana banana_number (1 | 2 | 3) "1" >
Again, though, this isn't a complete (or even very satisfying) solution:
- It does not constrain the bananas in a fruit_basket in a useful way.
- It establishes no relationship between the banana children and their banana_number attribute values. (This DTD allows you to have 25 banana children in a fruit_basket, for instance, each with a banana_number whose value is 1.)

The kinds of problems you're struggling to solve here might be amenable to using XML Schema. But there's another, often overlooked approach to validating document content (of both elements and attributes) which stands completely outside the normal DTD-vs.-XML Schema axis: validate with an XSLT stylesheet.
But I don't want to minimize how much work may be involved,
especially if you aren't already comfortable with XSLT. Still, here's
a stylesheet which tests for both the number of banana
elements and the correspondence between the
banana_number attribute's value and that banana
element's ordinal position within its fruit_basket
parent:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/xhtml" version="1.0">
<!-- Process fruit_basket element(s) -->
<xsl:template match="fruit_basket">
<html>
<body>
<!-- Validate number of banana children in
fruit_basket -->
<xsl:choose>
<!-- Note escaped form of boolean > and <
operators -->
<xsl:when test="count(banana) &gt; 8 and
count(banana) &lt; 12">
<h3># of banana children OK</h3>
</xsl:when>
<xsl:otherwise>
<h3>Whoops! # of banana children is
<xsl:value-of select="count(banana)"/></h3>
</xsl:otherwise>
</xsl:choose>
<!-- Set up table of info about banana children
-->
<table border="1">
<tr>
<th>banana #</th>
<th>banana_number</th>
</tr>
<!-- Process all banana children of
fruit_basket -->
<xsl:apply-templates select="banana"/>
</table>
</body>
</html>
</xsl:template>
<!-- Process banana element(s) -->
<xsl:template match="banana">
<!-- Each banana element goes in its own table row
-->
<tr>
<th><xsl:value-of
select="position()"/></th>
<td>
<!-- Test for banana's position matching
banana_number attribute value-->
<xsl:choose>
<xsl:when test="position() =
@banana_number">
OK
</xsl:when>
<xsl:otherwise>
<strong>Whoops!</strong>...
<xsl:value-of select="@banana_number"/>
</xsl:otherwise>
</xsl:choose>
</td>
</tr>
</xsl:template>
</xsl:stylesheet>
This stylesheet "transforms" the source document into an XHTML
document, displaying the result of the validation process. (If your
XSLT processor supports it, you can use the xsl:message
element to notify the source document's author of the document's
validity, instead of transforming to XHTML.)
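As a rough illustration of the xsl:message alternative, the sketch below runs the same two tests but reports problems as messages rather than building a table. It assumes an XSLT 1.0 processor; where the messages end up (often the console or standard error) depends on the processor, and the wording of the messages is invented for this example.

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- No result document is needed; problems go to xsl:message -->
  <xsl:output method="text"/>

  <xsl:template match="fruit_basket">
    <!-- Complain unless there are 9, 10, or 11 banana children -->
    <xsl:if test="count(banana) &lt; 9 or count(banana) &gt; 11">
      <xsl:message>fruit_basket has <xsl:value-of
        select="count(banana)"/> banana children; expected 9 to 11</xsl:message>
    </xsl:if>
    <xsl:apply-templates select="banana"/>
  </xsl:template>

  <xsl:template match="banana">
    <!-- not(a = b) is also true when banana_number is missing -->
    <xsl:if test="not(position() = @banana_number)">
      <xsl:message>banana at position <xsl:value-of
        select="position()"/> has banana_number "<xsl:value-of
        select="@banana_number"/>"</xsl:message>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Adding terminate="yes" to an xsl:message would stop processing at the first problem, which may or may not be what you want when reporting on a whole document.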
Assume the following simple document:
<fruit_basket>
<banana banana_number="8"/>
<banana banana_number="2"/>
</fruit_basket>
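With this document as its source tree, the stylesheet produces XHTML
which looks like the figure at right when viewed in a browser: a
heading reporting that the number of banana children is 2, followed by
a table whose first row flags the banana_number value 8 and whose
second row reads OK.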
Note that this approach to validation is codified in the Schematron project. It's an extremely powerful (and cool) way to perform almost any "validation" you can think of, without the limitations of either DTDs or XML Schema.
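To give a flavor of it, here is a sketch of how the two banana rules might be written as a Schematron schema. It assumes the ISO Schematron namespace (earlier Schematron versions used a different one), and the message wording is made up. Each assert states a condition that must hold; its text is reported when the test fails.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <!-- Rule 1: between 9 and 11 banana children per fruit_basket -->
    <rule context="fruit_basket">
      <assert test="count(banana) &gt;= 9 and count(banana) &lt;= 11">
        A fruit_basket must contain between 9 and 11 banana elements.
      </assert>
    </rule>
    <!-- Rule 2: banana_number equals the banana's ordinal position,
         computed by counting the banana siblings that precede it -->
    <rule context="banana">
      <assert test="@banana_number = count(preceding-sibling::banana) + 1">
        banana_number must match the banana's position in its fruit_basket.
      </assert>
    </rule>
  </pattern>
</schema>
```

The position is computed from preceding siblings rather than with position(), since what position() returns inside a rule depends on how a given Schematron implementation applies its rules.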
XML.com Copyright © 1998-2006 O'Reilly Media, Inc.