Customizing the DocBook DTD, Part 2

October 20, 1999

Leonard Muellner and Norman Walsh

Understanding DocBook Structure

DocBook is a large and, at first glance, fairly complex DTD. Much of the apparent complexity is caused by the prolific use of parameter entities. This was an intentional choice on the part of the maintainers, who traded "raw readability" for customizability. This section provides a general overview of the structure of the DTD. After you understand it, DocBook will probably seem much less complicated.

DocBook Modules

DocBook is composed of seven primary modules. These modules decompose the DTD into large, related chunks. Most modifications are restricted to a single chunk.

Figure 1 shows the module structure of DocBook as a flowchart.

Figure 1. Structure of the DocBook DTD

The modules are:

| Module | Description | Possible Modifications |
|---|---|---|
| docbook.dtd | The main driver file. This module declares and references the other top-level modules. | |
| dbhier.mod | The hierarchy. This module declares the elements that provide the hierarchical structure of DocBook (sets, books, chapters, articles, and so on). | Changes to this module alter the top-level structure of the DTD. If you want to write a DocBook-derived DTD with a different structure (something other than a book), but with the same paragraph and inline-level elements, you make most of your changes in this module. |
| dbpool.mod | The information pool. This module declares the elements that describe content (inline elements, bibliographic data, block quotes, sidebars, and so on) but are not part of the large-scale hierarchy of a document. You can incorporate these elements into an entirely different element hierarchy. | The most common reason for changing this module is to add or remove inline elements. |
| dbnotn.mod | The notation declarations. This module declares the notations used by DocBook. | This module can be changed to add or remove notations. |
| dbcent.mod | The character entities. This module declares and references the ISO entity sets used by DocBook. | Changes to this module can add or remove entity sets. |
| dbgenent.mod | The general entities. This is a place where you can customize the general entities available in DocBook instances. | This is the place to add, for example, boilerplate text, logos for institutional identity, or additional notations understood by your local processing system. |
| cals-tbl.dtd | The CALS Table Model. See note. | Most changes to the CALS table model can be accomplished by modifying parameter entities in dbpool.mod; changing this DTD fragment is strongly discouraged. If you want to use a different table model, remove this one and add your own. |
| *.gml | The ISO standard character entity sets. These entity sets are not actually part of the official DocBook distribution, but are referenced by default. | |
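
To make the driver's role a little more concrete, the fragment below sketches the general pattern by which docbook.dtd might declare and then reference one of the other modules, here the information pool. It assumes the SGML DocBook V3.1 distribution; the public identifier and the name of the dbpool.module switch are assumptions that should be checked against your own copy of docbook.dtd.

```dtd
<!-- Sketch only: how the driver pulls in the information pool.
     Identifiers assume DocBook V3.1; verify against your distribution. -->

<!-- A marked-section switch, INCLUDE by default -->
<!ENTITY % dbpool.module "INCLUDE">
<![ %dbpool.module; [

<!-- Declare the module as an external parameter entity ... -->
<!ENTITY % dbpool PUBLIC
  "-//OASIS//ELEMENTS DocBook Information Pool V3.1//EN">

<!-- ... and reference it, which reads dbpool.mod at this point -->
%dbpool;

]]>
```

Because each module is reached through a parameter entity, a customization layer that declares the same entity first can point the driver at a different file without editing docbook.dtd itself.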



The CALS Table Model. CALS is an initiative by the United States Department of Defense to standardize the document types used across branches of the military. The CALS table model, published in MIL-HDBK-28001, was for a long time the most widely supported SGML table model (though one could now argue that, by some definitions of "widely supported," the HTML table model has overtaken it). In any event, it is the table model used by DocBook.

DocBook predates the publication of the OASIS Technical Resolution TR 9503:1995, which defines an industry standard exchange table model. For that reason, DocBook incorporates the full CALS Table Model rather than the exchange subset.

There are some additional modules, initially undefined, that can be inserted at several places for redeclaration. This is described in more detail in the section called "Removing Admonitions from Table Entries."
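
As a rough sketch of how one of these redeclaration modules might be switched on from a customization layer: the fragment below assumes that the V3.1 information pool provides a hook named dbpool.redecl.module that reads an entity named rdbpool (check dbpool.mod in your distribution), and it uses a hypothetical local file, rdbpool.mod, to hold the redeclarations.

```dtd
<!-- Sketch only. Hook names assume DocBook V3.1; the file name
     rdbpool.mod is hypothetical. -->

<!-- Turn on the redeclaration hook inside the information pool -->
<!ENTITY % dbpool.redecl.module "INCLUDE">

<!-- Tell the hook which file to read; it is pulled in at the point
     inside dbpool.mod where the hook is placed -->
<!ENTITY % rdbpool SYSTEM "rdbpool.mod">

<!-- Then read the unmodified DocBook DTD -->
<!ENTITY % DocBookDTD PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
%DocBookDTD;
```

Both declarations have to appear before the reference to the DocBook DTD, because the first declaration of a parameter entity is the one that takes effect.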

DocBook Parameterization

Customization layers are possible because DocBook has been extensively parameterized: you can make almost any change you might want without ever editing the distributed modules. The parameter entities come in several flavors:

| Parameter Entities | Description | Possible Modifications |
|---|---|---|
| %*.class; | Classes group elements of a similar type: for example, all the lists are in %list.class;. | If you want to add a new kind of something (a new kind of list or a new kind of verbatim environment, for example), you generally want to add the name of the new element to the appropriate class. |
| %*.mix; | Mixtures are collections of classes that appear in content models. For example, the content model of the Example element includes %example.mix;. Not every element's content model is a single mixture, but elements in the same class tend to have the same mixture in their content model. | If you want to change the content model of some class of elements (lists or admonitions, perhaps), you generally want to change the definition of the appropriate mixture. |
| %*.module; | The %*.module; parameter entities control marked sections around individual elements and their attribute lists. For example, the element and attribute declarations for Abbrev occur within a marked section delimited by %abbrev.module;. | If you want to remove or redefine an element or its attribute list, you generally want to change its module marked section to IGNORE and possibly add a new definition for it in your customization layer. |
| %*.element; | The %*.element; parameter entities were introduced in DocBook V3.1; they control marked sections around individual element declarations. | |
| %*.attlist; | The %*.attlist; parameter entities were introduced in DocBook V3.1; they control marked sections around individual attribute list declarations. | |
| %*.inclusion;, %*.exclusion; | These parameter entities control the inclusion and exclusion markup in element declarations. | Changing these declarations allows you to make global changes to the inclusions and exclusions in the DTD. |
| %local.*; | The %local.*; parameter entities are a local extension mechanism. You can add markup to most entity declarations simply by declaring the appropriate local parameter entity. | |
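
The parameter entities described above can be combined in a single customization layer. The sketch below is a minimal illustration, not a recommended design: it assumes the SGML DocBook V3.1 DTD, and the file name mydocbook.dtd, the new TaskList element, and the simplified content models are all hypothetical. It adds an element to a class through %local.list.class; and redefines Abbrev by setting %abbrev.module; to IGNORE.

```dtd
<!-- mydocbook.dtd: hypothetical customization layer (sketch only) -->

<!-- 1. Add a new list element to the list class. The leading "|" is
     needed because the value is appended to an existing name group. -->
<!ENTITY % local.list.class "| TaskList">

<!-- 2. Switch off the distributed declarations for Abbrev -->
<!ENTITY % abbrev.module "IGNORE">

<!-- 3. Read the unmodified DocBook DTD. The overrides above win
     because they are declared first. -->
<!ENTITY % DocBookDTD PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
%DocBookDTD;

<!-- 4. Declare the new element (hypothetical content model) -->
<!ELEMENT TaskList - - (ListItem+)>
<!ATTLIST TaskList %common.attrib;>

<!-- 5. Supply a replacement definition for Abbrev, here restricted
     to plain text (an assumption made for illustration) -->
<!ELEMENT Abbrev - - (#PCDATA)>
<!ATTLIST Abbrev %common.attrib;>
```

The overrides come before the reference to the DocBook DTD, while the new element declarations come after it so that they can use entities such as %common.attrib; that the distributed modules define.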