July 1, 1999

Norman Walsh

XML inherited Document Type Definitions (DTDs) from SGML, where they serve as the schema mechanism. XML Schemas are the first widespread attempt to replace DTDs with something "better".

DTDs can be used to define content models (the valid order and nesting of elements) and, to a limited extent, the datatypes of attributes, but they have a number of obvious limitations:

  • They are written in a different (non-XML) syntax.

  • They have no support for namespaces.

  • They offer only extremely limited datatyping. DTDs can express the datatype of attributes only in terms of explicit enumerations and a few coarse string formats; there's no facility for describing numbers, dates, currency values, and so forth. Furthermore, DTDs have no ability to express the datatype of character data in elements.

  • They have a complex and fragile extension mechanism based on little more than string substitution.

    The worst thing about the DTD extension mechanism (parameter entities) is that it doesn't make relationships explicit. Two elements defined to have the same content model aren't the same thing in any explicit way. Likewise, attributes defined together in a parameter entity and reused aren't logically a group; they're just "coincidentally" a group.
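
A small DTD fragment may make these limitations more concrete. This is only a sketch: the element, attribute, and entity names (para, note, price, common.attrs, para.content) are invented for illustration and are not taken from any real document type. It assumes the declarations live in an external DTD file, since parameter entity references inside declarations are not allowed in the internal subset.

```dtd
<!-- Hypothetical external DTD; every name here is illustrative. -->

<!-- A parameter entity holding a reusable "group" of attributes.
     To the parser it is just a string to paste in. -->
<!ENTITY % common.attrs
  "id    ID       #IMPLIED
   lang  NMTOKEN  #IMPLIED">

<!-- A parameter entity holding a shared content model. -->
<!ENTITY % para.content "(#PCDATA | emphasis)*">

<!-- Both elements expand to the same text, but nothing in the
     DTD records that para and note are related. -->
<!ELEMENT para     %para.content;>
<!ELEMENT note     %para.content;>
<!ELEMENT emphasis (#PCDATA)>

<!ATTLIST para %common.attrs;>
<!ATTLIST note %common.attrs;>

<!-- Attribute typing stops at enumerations and coarse token types.
     The date attribute can only be declared as CDATA, and the
     content of price is plain #PCDATA, not a number. -->
<!ELEMENT price (#PCDATA)>
<!ATTLIST price
  currency (USD | EUR | GBP) #REQUIRED
  date     CDATA             #IMPLIED>
```

After entity expansion, a validating parser sees two unrelated element declarations and two unrelated attribute lists. It would also accept a price element whose content is "cheap" and whose date attribute is "yesterday", because neither constraint can be stated in the DTD.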

XML Schemas overcome these limitations and are much more expressive than DTDs. The additional expressiveness will allow web applications to exchange XML data much more robustly without relying on ad hoc validation tools.

Although XML Schemas are poised to replace DTDs, in the short term DTDs still have a number of advantages:

  • Widespread tool support. All SGML tools and many XML tools can process DTDs.

  • Widespread deployment. A large number of document types are already defined using DTDs: HTML, XHTML, DocBook, TEI, J2008, CALS, etc.

  • Widespread expertise and many years of practical application.

Warts and all, DTDs are well understood by a large community of SGML and XML programmers and consultants.