XML.com: XML From the Inside Out

Mastering DocBook Indexes
by Jirka Kosek

Multiple Indexes in a Document

Occasionally we have to deal with documents that contain more than one index. A combination of author and subject indexes is quite common. The stylesheets are ready for this situation. All we have to do is distinguish the index entries by specifying an index identifier in the type attribute.


<para>
  The wealth of modern societies is built upon information
  <indexterm type="subj">
    <primary>information</primary>
  </indexterm>.
  Information theory was developed in the 1940s by Claude
  Shannon.
  <indexterm type="name">
    <primary>Shannon, Claude</primary>
  </indexterm>
</para>

Then we must place two index elements at the end of the document, each denoting one specialized index by its type.


<index type="subj"/>

<index type="name">
<title>Name index</title>
</index>

Generating multiple indexes is turned on by default, but it can be suppressed with the index.on.type parameter. DocBook 4.2 and earlier versions do not support the new type attribute; in that case we can use the universal role attribute for index typing, together with the corresponding index.on.role parameter in the stylesheets.
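
As an illustration, a customization layer can set these parameters explicitly instead of relying on defaults, which may differ between stylesheet releases. The following is only a minimal sketch: the import path (the FO stylesheet of the "current" release) is an assumption, and only the parameter names come from the text above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<!-- Assumed import path; adjust it to your own installation -->
<xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>

<!-- Non-zero: each typed index collects only index terms of the same type.
     Zero: the type attribute is ignored when selecting index terms. -->
<xsl:param name="index.on.type" select="1"/>

<!-- For DocBook 4.2 and earlier, select on the role attribute instead:
<xsl:param name="index.on.role" select="1"/>
-->

</xsl:stylesheet>
```

With the role-based variant, the markup shown earlier would carry role="subj" and role="name" on both the indexterm and index elements in place of the type attribute.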

From the Semantic Markup to the Index

In DocBook you can use dozens of different elements to distinguish between file names, function names, commands, etc. The following paragraph demonstrates how to use semantic markup.


<para>
  The <command>rm</command> command can be very useful, but be
  careful when you are using it. There are several files in
  your system like <filename>/etc/passwd</filename> which
  are quite important.
</para>

Adding semantically distinguished terms to an index is important, since readers often use indexes for quick lookups. To place the terms from the previous example into the index, however, we must use quite a lot of markup.


<para>
  The <command>rm</command>
  <indexterm><primary>rm</primary></indexterm>
  <indexterm>
    <primary>commands</primary>
    <secondary>rm</secondary>
  </indexterm>
  command can be very useful, but be careful when you are
  using it. There are several files in your system like
  <filename>/etc/passwd</filename>
  <indexterm><primary>/etc/passwd</primary></indexterm>
  which are quite important.
</para>

This markup will produce the following output in the index:


- Symbols -
/etc/passwd, 42

- C -
commands,
  rm, 42

- R -
rm, 42

The resulting index is useful, isn't it? But to be honest, no one wants to type all these redundant index terms manually. Fortunately, the mapping from semantic markup to index entries is simple and unambiguous in this situation, so it can be easily automated. The following standalone stylesheet takes an arbitrary DocBook document and adds index entries for each command and filename element.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<!-- By default copy the whole document -->
<xsl:template match="node()|@*">
  <xsl:copy>
    <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
</xsl:template>

<!-- Each command is placed twice into index -->
<xsl:template match="command">
  <!-- Copy original element -->
  <xsl:copy-of select="."/>
  <!-- Create new index entries -->
  <indexterm>
    <primary><xsl:value-of select="."/></primary>
  </indexterm>
  <indexterm>
    <primary>commands</primary>
    <secondary><xsl:value-of select="."/></secondary>
  </indexterm>
</xsl:template>

<!-- Each filename is placed into index -->
<xsl:template match="filename">
  <!-- Copy original element -->
  <xsl:copy-of select="."/>
  <!-- Create new index entry -->
  <indexterm>
    <primary><xsl:value-of select="."/></primary>
  </indexterm>
</xsl:template>

</xsl:stylesheet>

The result of applying this stylesheet to a document is a temporary document with added index entries for all commands and filenames. We can process this temporary document like any other DocBook document. The whole process can be easily automated using make, shell scripting, or a similar technique.
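
One refinement may be worth considering. The templates in the standalone stylesheet match every command and filename, including any that already sit inside an indexterm, where an additional nested index term is unwanted. The variant below is only a sketch and not part of the original samples: it restricts the match with a predicate and trims stray whitespace with normalize-space(). For brevity it emits just the primary entry for both element types.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<!-- Identity template: copy everything not matched below -->
<xsl:template match="node()|@*">
  <xsl:copy>
    <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
</xsl:template>

<!-- Index only commands and filenames that are not
     already part of an existing index term -->
<xsl:template match="command[not(ancestor::indexterm)]
                     | filename[not(ancestor::indexterm)]">
  <!-- Copy original element -->
  <xsl:copy-of select="."/>
  <!-- Create new index entry with normalized whitespace -->
  <indexterm>
    <primary><xsl:value-of select="normalize-space(.)"/></primary>
  </indexterm>
</xsl:template>

</xsl:stylesheet>
```

Elements excluded by the predicate fall through to the identity template and are copied unchanged.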

The DocBook stylesheets also offer a more sophisticated solution. Index terms can be added automatically during normal stylesheet processing, without the need for a temporary file and a second transformation. The idea is implemented on top of the profiling stylesheets. These are special versions of the standard stylesheets that can filter content before the real transformation starts.

This can be used for conditional documents where different parts of a document are presented to different target audiences. The internal implementation of profiling performs a special copying-and-filtering phase before processing. During this phase, a temporary profiled document is created in memory. We can alter this process to add index terms for semantic elements, as these elements are rarely used for profiling.


<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<!-- Import of the original stylesheet -->
<xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/profile-docbook.xsl"/>

<!-- Each command is placed twice into index -->
<xsl:template match="command" mode="profile">
  <!-- Copy original element -->
  <xsl:copy-of select="."/>
  <!-- Create new index entries -->
  <indexterm>
    <primary><xsl:value-of select="."/></primary>
  </indexterm>
  <indexterm>
    <primary>commands</primary>
    <secondary><xsl:value-of select="."/></secondary>
  </indexterm>
</xsl:template>

<!-- Each filename is placed into index -->
<xsl:template match="filename" mode="profile">
  <!-- Copy original element -->
  <xsl:copy-of select="."/>
  <!-- Create new index entry -->
  <indexterm>
    <primary><xsl:value-of select="."/></primary>
  </indexterm>
</xsl:template>

</xsl:stylesheet>
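
Since this customization sits on top of the profiling stylesheet, the ordinary profiling parameters can be set in the same file. As a hedged sketch, assume a document that marks platform-specific paragraphs with the os attribute; the values linux and win and the sample sentences are made up for illustration. With profile.os set as below, the second paragraph should be filtered out during the profiling phase, so only rm would reach the index.

```xml
<!-- In the customization layer, after the xsl:import
     (the value 'linux' is an assumption) -->
<xsl:param name="profile.os" select="'linux'"/>

<!-- In the document (illustrative content) -->
<para os="linux">
  Delete files with <command>rm</command>.
</para>
<para os="win">
  Delete files with <command>del</command>.
</para>
```

Note that the command and filename templates above copy their elements unconditionally, so profiling attributes placed directly on those two elements would not be honored by this customization.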

Conclusion

DocBook, in conjunction with the DocBook XSL stylesheets, offers a comprehensive solution for creating and processing indexes. This article has shown how easily you can create and process indexes in DocBook. The stylesheets are also ready to meet challenging requirements such as internationalized indexes and easy indexing driven by semantic markup.

Related Links

[1] Download samples

[2] DocBook XSL Stylesheets

[3] DocBook XSL: The Complete Guide by Bob Stayton is a must-read for everyone who wants to hack the DocBook XSL stylesheets seriously.

[4] DocBook: The Definitive Guide

[5] XSL-List -- an open forum on XSL.
