White Space in Element Content

Let's look at our white-space example again, but let's attach a simple DTD to the top of the document:

<!DOCTYPE p [
<!ELEMENT p (#PCDATA|ol)*>
<!ELEMENT ol (li+)>
<!ELEMENT li (#PCDATA)>
]>
<p>Little boys, ingredients for:
  <ol>
    <li>Snips,</li>
    <li>snails,</li>
    <li>puppy dogs' tails.</li>
  </ol>
</p>

If you look at the declaration for the element p, you'll see that it can contain text mixed in among its child elements. That means that the line break and leading spaces before the ol element might well be part of the p's text; the XML processor can't do anything special with them.
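To see the effect in practice, here is a minimal sketch using Python's standard xml.etree.ElementTree module (a choice made for illustration; nothing in the example above depends on it). The names DOC, leading_text and trailing_text are made up for this sketch. It shows that the white space around the ol element is handed to the application as ordinary text of the p.

```python
import xml.etree.ElementTree as ET

# The example document from above, DTD included.
DOC = """<!DOCTYPE p [
<!ELEMENT p (#PCDATA|ol)*>
<!ELEMENT ol (li+)>
<!ELEMENT li (#PCDATA)>
]>
<p>Little boys, ingredients for:
  <ol>
    <li>Snips,</li>
    <li>snails,</li>
    <li>puppy dogs' tails.</li>
  </ol>
</p>"""

p = ET.fromstring(DOC)
ol = p.find("ol")

# Text of p before the <ol> start tag: the sentence, then the
# line break and the two spaces of indentation.
leading_text = p.text

# Text of p after the </ol> end tag: just the final line break.
trailing_text = ol.tail

print(repr(leading_text))   # 'Little boys, ingredients for:\n  '
print(repr(trailing_text))  # '\n'
```

Because p has mixed content, none of this white space can be discarded by the parser; whether it matters is up to the application.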

On the other hand, the declaration for the ol element makes it clear that it's not really intended to contain anything but li elements; thus any white space that appears between them is probably decorative, intended for pretty-printing. An element declared this way, with child elements only and no text, is said to have element content. In this case, a validating XML processor is required to tell the application about the special status of this white space.

Important Note: The application isn't required to do anything in particular because this white space appears in element content; it's just that the information might well be useful in some situations. Note also that only a validating XML processor can be relied on to provide this information, since a non-validating processor need not read the whole DTD.
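As a rough illustration of what an application might do with this information, the sketch below picks out the white space that sits in element content. It uses Python's xml.etree.ElementTree, which does not flag such white space itself, so the set ELEMENT_CONTENT is an assumption filled in by hand from the DTD; it and the function element_content_whitespace are hypothetical names invented for this sketch, not part of any library.

```python
import xml.etree.ElementTree as ET

# The example document from above, DTD included.
DOC = """<!DOCTYPE p [
<!ELEMENT p (#PCDATA|ol)*>
<!ELEMENT ol (li+)>
<!ELEMENT li (#PCDATA)>
]>
<p>Little boys, ingredients for:
  <ol>
    <li>Snips,</li>
    <li>snails,</li>
    <li>puppy dogs' tails.</li>
  </ol>
</p>"""

# Assumption: copied by hand from the DTD. Only ol is declared with
# element content; p and li are allowed to contain text.
ELEMENT_CONTENT = {"ol"}

XML_WHITE_SPACE = " \t\r\n"  # the four white space characters of XML 1.0


def element_content_whitespace(root):
    """Return (tag, text) pairs for white space directly inside
    elements that are listed in ELEMENT_CONTENT."""
    found = []
    for elem in root.iter():
        if elem.tag not in ELEMENT_CONTENT:
            continue
        # Text before the first child, then the text after each child.
        chunks = [elem.text] + [child.tail for child in elem]
        for chunk in chunks:
            if chunk and chunk.strip(XML_WHITE_SPACE) == "":
                found.append((elem.tag, chunk))
    return found


root = ET.fromstring(DOC)
found = element_content_whitespace(root)
for tag, text in found:
    print(tag, repr(text))
```

For the example document this reports four runs of white space, all inside ol. An application that only cares about the list items could safely skip them, while the white space inside p is left alone.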

Copyright © 1998, Tim Bray. All rights reserved.