Menu

Automated Tree Drawing: XSLT and SVG

September 8, 2004

Jirka Kosek

Trees are a very basic abstraction used in many computer science areas, including XML. If you're writing a document about tree structures you'll often want to show trees as graphics. This is not a problem for a few trees as they can be hand drawn in some graphics editor. But if you need dozens of trees, you would do well to use a compact text syntax for describing trees that can later be turned into nice pictures. In this article I'll show you how to parse simple text notation by means of XSLT and turn it into SVG graphics.

Compact Tree Syntax

Before we delve into details of XSLT implementation we should describe our task. We want to create a transformation that can take a compact tree notation and turn it into an SVG graphic. Our tree syntax will be very simple. Each character will correspond to one node, and subnodes will be closed in parenthesis. For example, the following expression a(bcd(ef)) represents the tree depicted in Figure 1.

Rendered tree a(bcd(ef)) Figure 1. Rendered tree a(bcd(ef)).

 

Our notation is also able to express hedges. A hedge is a sequence of trees or alternatively, “tree without the root node.” Figure 2 shows rendering of a hedge a(bcd)e(fgh).

Rendered hedge a(bcd)e(fgh) Figure 2. Rendered hedge a(bcd)e(fgh).

If the name of a node should be longer than one character, we can put the name inside curly braces. You can see the expression {foo}({bar}{baz}) rendered as a tree in Figure 3.

Rendered tree {foo}({bar}{baz})
Figure 3. Rendered tree {foo}({bar}{baz}).

Solution Overview

Conversion from the compact syntax into the full-fledged SVG image is not trivial, so we will do it in several steps. The first step will parse the textual representation of a tree and will create an XML structure representing the same tree. The following XML fragment is an XML representation of the tree from Figure 1.

<node label="a">

   <node label="b"/>

   <node label="c"/>

   <node label="d">

      <node label="e"/>

      <node label="f"/>

   </node>

</node>

Each node in the tree is represented by a node element. Element nesting corresponds to the hierarchy of the tree. The label of each node is captured in the label attribute.

Creating XML representation from the text string will be the hardest part of the whole task since XSLT does not offer powerful string manipulation functions. But once we get the XML representation of a tree we can process it quite easily with XSLT.

In the next step we'll decorate the XML tree with some layout information. We'll add depth of a node in the tree and width of the attached sub-tree to each node.

<node depth="1" width="4" label="a">

   <node depth="2" width="1" label="b"/>

   <node depth="2" width="1" label="c"/>

   <node depth="2" width="2" label="d">

      <node depth="3" width="1" label="e"/>

      <node depth="3" width="1" label="f"/>

   </node>

</node>

With this layout information it is quite easy to get an SVG image. We must just emit correct SVG constructs and turn the layout information to the real coordinates using several calculations. And with that the requested SVG image of the tree is created.

<svg:svg xmlns:svg = "http://www.w3.org/2000/svg" viewBox = "0 0 80 60">
 <svg:g transform = "translate(0,-5) scale(10)">
  <svg:text x = "4" y = "1.8" style = "text-anchor: middle; font-size: 0.9;">a</svg:text>
  <svg:rect x = "3.1" y = "1" width = "1.8" height = "1" rx = "0.4" ry = "0.4" style = "fill: none; stroke: black; stroke-width: 0.1;"/>
  <svg:line x1 = "4" y1 = "2" x2 = "1" y2 = "3" style = "stroke-width: 0.1; stroke: black;"/>
  <svg:line x1 = "4" y1 = "2" x2 = "3" y2 = "3" style = "stroke-width: 0.1; stroke: black;"/>
  <svg:line x1 = "4" y1 = "2" x2 = "6" y2 = "3" style = "stroke-width: 0.1; stroke: black;"/>
  <svg:text x = "1" y = "3.8" style = "text-anchor: middle; font-size: 0.9;">b</svg:text>   <svg:rect x = "0.1" y = "3" width = "1.8" height = "1" rx = "0.4" ry = "0.4" style = "fill: none; stroke: black; stroke-width: 0.1;"/>
  <svg:text x = "3" y = "3.8" style = "text-anchor: middle; font-size: 0.9;">c</svg:text>
  <svg:rect x = "2.1" y = "3" width = "1.8" height = "1" rx = "0.4" ry = "0.4" style = "fill: none; stroke: black; stroke-width: 0.1;"/>
  <svg:text x = "6" y = "3.8" style = "text-anchor: middle; font-size: 0.9;">d</svg:text>
  <svg:rect x = "5.1" y = "3" width = "1.8" height = "1" rx = "0.4" ry = "0.4" style = "fill: none; stroke: black; stroke-width: 0.1;"/>
  <svg:line x1 = "6" y1 = "4" x2 = "5" y2 = "5" style = "stroke-width: 0.1; stroke: black;"/>
  <svg:line x1 = "6" y1 = "4" x2 = "7" y2 = "5" style = "stroke-width: 0.1; stroke: black;"/>
  <svg:text x = "5" y = "5.8" style = "text-anchor: middle; font-size: 0.9;">e</svg:text>
  <svg:rect x = "4.1" y = "5" width = "1.8" height = "1" rx = "0.4" ry = "0.4" style = "fill: none; stroke: black; stroke-width: 0.1;"/>
  <svg:text x = "7" y = "5.8" style = "text-anchor: middle; font-size: 0.9;">f</svg:text>
  <svg:rect x = "6.1" y = "5" width = "1.8" height = "1" rx = "0.4" ry = "0.4" style = "fill: none; stroke: black; stroke-width: 0.1;"/>
 </svg:g>
</svg:svg>

Parsing the Compact Syntax

Given the minimalist set of string functions in XPath/XSLT, parsing the compact syntax of trees is the most challenging part of our task. Fortunately it is possible to create a parser with a combination of elementary string-manipulation functions and recursive processing. The XSLT template hedge2xml implements the following simple parsing algorithm:

  1. Get the name of a node. (It's the first letter to process or text between curly braces if the first character is '{'.)

  2. If the next token is '(' process text between matching parenthesis by this algorithm and place it inside the element labeled with the name from the first step. Then process the rest of the text after matching ')' by this algorithm.

  3. If the next token is not '(' emit the empty element labeled with the name from the first step and process the rest of the text by this algorithm.

<!--Simple text parser generates tree of nodes from

compact syntax -->

<xsl:template name="hedge2xml">

  <xsl:param name="hedge"/>

  <!-- Extract first token -->

  <xsl:param name="head" select="substring($hedge, 1, 1)"/>



<!-- If there is something left to parse then parse it -->

  <xsl:if test="$hedge != ''">

    <xsl:variable name="next" select="substring($hedge, 2, 1)"/>

    <xsl:variable name="tail" select="substring($hedge, 3)"/>



    <xsl:choose>



<!-- Node name is enclosed between { and }. Grab it into

$head and proceed. -->

      <xsl:when test="$head = '{'">

        <xsl:call-template name="hedge2xml">

<xsl:with-param name="hedge"

select="concat(' ', substring-after(concat($next, $tail), '}'))"/>

<xsl:with-param name="head"

select="substring-before(concat($next, $tail), '}')"/>

        </xsl:call-template>

      </xsl:when>



<!-- End of the sub-level. Proceed with next tokens. -->

      <xsl:when test="$head = ')'">

        <xsl:call-template name="hedge2xml">

          <xsl:with-param name="hedge"

            select="concat($next, $tail)"/>

        </xsl:call-template>

      </xsl:when>



<!-- There is no node name. Emit error message. -->

    <xsl:when test="$head = '('">

      <xsl:message>

        <xsl:text>Unexpected ( found in '</xsl:text>

        <xsl:value-of select="concat($head, $next, $tail)"/>

        <xsl:text>'.&#10;</xsl:text>

        </xsl:message>

      </xsl:when>

      

      <!-- Start of the sub-level. -->

      <xsl:when test="$next = '('">

        <!-- Find the end of the sub-level. -->

        <xsl:variable name="endPos"

                        select="my:closingParenPos($tail)"/>

        <!-- Create wrapper node and process content of the

         sub-level. -->

        <node label="{$head}">

          <xsl:call-template name="hedge2xml">



          <xsl:with-param name="hedge"

                     select="substring($tail, 1, $endPos)"/>

          </xsl:call-template>

        </node>

        <!-- Process content after the sub-level. -->

        <xsl:call-template name="hedge2xml">

          <xsl:with-param name="hedge"

                        select="substring($tail, $endPos)"/>

          </xsl:call-template>

      </xsl:when>



      <!-- Output node and process next nodes. -->

      <xsl:otherwise>

        <node label="{$head}"/>

        <xsl:call-template name="hedge2xml">

          <xsl:with-param name="hedge"

                            select="concat($next, $tail)"/>

        </xsl:call-template>

      </xsl:otherwise>

      

    </xsl:choose>

  </xsl:if>

  

</xsl:template>

I am using a custom function, my:closingParenPos, to find the position of the matching closing parenthesis. I decided to implement it as an EXSLT function because a regularly named template would be too verbose. The function scans the input string for parenthesis and counts the depth of nesting. The function ends when the closing parenthesis is found.

<!-- Find position of the matching right parenthesis -->
<func:function name = "my:closingParenPos">
  <xsl:param name = "text"/>
  <xsl:param name = "depth" select = "1"/>
  <xsl:param name = "pos" select = "1"/>
  <xsl:choose>
    <!-- Found closing ) which is not nested. We are done. -->
    <xsl:when test = "substring($text, 1, 1) = ')' and $depth = 1">
      <func:result select = "$pos"/>
    </xsl:when>
    <!-- Found opening (. Increase nesting depth and continue. -->
    <xsl:when test = "substring($text, 1, 1) = '('">
      <func:result select = "my:closingParenPos(substring($text, 2), $depth + 1, $pos+1)"/>
    </xsl:when>
    <!-- Found closing ) which is nested. Unwrap and continue on a shallower level. -->
    <xsl:when test = "substring($text, 1, 1) = ')'">
      <func:result select = "my:closingParenPos(substring($text, 2), $depth - 1, $pos+1)"/>
    </xsl:when>
    <!-- End of text while nested. Something is wrong with input. -->
    <xsl:when test = "$text = '' and $depth != 0">
      <xsl:message>
        <xsl:text>Warning. Unbalanced parenthesis.&#10;</xsl:text>
      </xsl:message>
    </xsl:when>
    <!-- In all other cases scan rest of the input for parenthesis. -->
    <xsl:otherwise>
      <func:result select = "my:closingParenPos(substring($text, 2), $depth, $pos+1)"/>
    </xsl:otherwise>
  </xsl:choose>
</func:function>

Adding Layout Information

Now that we have a tree structure captured as an XML fragment we must calculate the layout of the individual nodes. In order to do this we must know the depth and width of each node in the tree. The following templates in xml2layout mode copy node elements intact and add width and depth as attributes to each node.

<xsl:template match="node[node]" mode="xml2layout">

  <xsl:param name="depth" select="1"/>

  <xsl:variable name="subTree">

    <xsl:apply-templates select="node" mode="xml2layout">

      <xsl:with-param name="depth" select="$depth+1"/>

    </xsl:apply-templates>

  </xsl:variable>



  <!-- Add layout attributes to the existing node -->

  <node depth="{$depth}" width="{sum(exsl:node-set($subTree)/node/@width)}">

    <!-- Copy original attributes and content -->

    <xsl:copy-of select="@*"/>

    <xsl:copy-of select="$subTree"/>

  </node>



</xsl:template>



<!-- Add layout attributes to leaf nodes -->

<xsl:template match="node" mode="xml2layout">

  <xsl:param name="depth" select="1"/>

  <node depth="{$depth}" width="1">

    <xsl:copy-of select="@*"/>

  </node>

</xsl:template>

Generating SVG

It is surprisingly easy to create SVG graphics from the nodes with added layout information. The depth of a node directly corresponds to the Y-coordinate. The X-coordinate can be calculated from the width of a node — it is the sum of widths of the preceding nodes with the same depth or with the lower depth if there is no preceding node on the same level. This looks like a complicated calculation but in fact it can be expressed by one XPath expression. The following XSLT code takes as input the layout from the previous stage and creates an SVG image from it.

<!-- Magnifying factor -->
<xsl:param name="hedge.scale" select="10"/>

<!-- Convert layout to SVG -->
<xsl:template name="layout2svg">
  <xsl:param name="layout"/>

  <!-- Find depth of the tree -->
  <xsl:variable name = "maxDepth">
    <xsl:for-each select = "$layout//node">
      <xsl:sort select = "@depth" data-type = "number" order = "descending"/>
      <xsl:if test = "position() = 1">
        <xsl:value-of select = "@depth"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:variable>

  <!-- Create SVG wrapper -->
  <svg:svg viewBox = "0 0 {sum($layout/node/@width) * 2 * $hedge.scale} {$maxDepth * 2 * $hedge.scale}">
    <!-- Note that some SVG implementations work better when you set explicit width and height also.
         In that case add following attributes to svg element:
            width="{sum($layout/node/@width)*5}mm"
            depth = "{$maxDepth*5}mm"
            preserveAspectRatio="xMidYMid meet" -->
    <svg:g transform = "translate(0,-{$hedge.scale div 2}) scale({$hedge.scale})">
      <xsl:apply-templates select = "$layout/node" mode = "layout2svg"/>
    </svg:g>
  </svg:svg>
</xsl:template>

<!-- Draw one node -->
<xsl:template match = "node" mode = "layout2svg">
  <!-- Calculate X coordinate -->
  <xsl:variable name="x" select = "(sum(preceding::node[@depth = current()/@depth or (not(node) and @depth &lt;= current()/@depth)]/@width) + (@width div 2)) * 2"/>
  <!-- Calculate Y coordinate -->
  <xsl:variable name = "y" select = "@depth * 2"/>
  <!-- Draw label of node -->
  <svg:text x = "{$x}"
            y = "{$y - 0.2}"
            style = "text-anchor: middle; font-size: 0.9;">
    <xsl:value-of select="@label"/>
  </svg:text>
  <!-- Draw rounded rectangle around label -->
  <svg:rect x = "{$x - 0.9}" y="{$y - 1}" width = "1.8" height="1"
            rx = "0.4" ry = "0.4"
            style = "fill: none; stroke: black; stroke-width: 0.1;"/>
  <!-- Draw connector lines to all sub-nodes -->
  <xsl:for-each select="node">
    <svg:line x1 = "{$x}"
              y1 = "{$y}"
              x2 = "{(sum(preceding::node[@depth = current()/@depth or (not(node) and @depth &lt; = current()/@depth)]/@width) + (@width div 2)) * 2}"
              y2 = "{@depth * 2 - 1}"
              style = "stroke-width: 0.1; stroke: black;"/>
  </xsl:for-each>
  <!-- Draw sub-nodes -->
  <xsl:apply-templates select = "node" mode = "layout2svg"/>
</xsl:template>

Putting the Pieces Together

I have so far described and implemented the process of getting an SVG image from the compact syntax as a series of steps. Now it's time to combine these steps together. The following templates create an SVG image of tree (or hedge) passed as a parameter.

<xsl:template name="hedge2svg">

  <!-- Hedge is passed as parameter -->

  <xsl:param name="hedge"/>



  <!-- Parse text hedge into XML nodes -->

  <xsl:variable name="tree">

    <xsl:call-template name="hedge2xml">

      <xsl:with-param name="hedge" select="$hedge"/>

    </xsl:call-template>

  </xsl:variable>



  <!-- Add layout information to XML nodes -->

  <xsl:variable name="layoutTree">

    <xsl:apply-templates select="exsl:node-set($tree)/node" mode="xml2layout"/>

  </xsl:variable>



  <!-- Turn XML nodes into SVG image -->

  <xsl:call-template name="layout2svg">

    <xsl:with-param name="layout" select="exsl:node-set($layoutTree)"/>

  </xsl:call-template>



</xsl:template>

We use node-set since several passes over the data are needed.

Integration with Existing Stylesheets

Drawing trees is an interesting exercise but my initial motivation was more practical. I plan to use trees as an integral part of a document. For example, I might want to use trees as part of a DocBook document:

<?xml version='1.0' encoding='utf-8'?>

<!DOCTYPE article PUBLIC '-//OASIS//DTD DocBook XML V4.3//EN'

                         'http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd'>

<article lang="en">

<title>DocBook article with hedges and trees</title>



<para>Inline hedges: 

        <phrase role="hedge">a(&epsiv;)</phrase>, 

        <phrase role="hedge">a(x)</phrase> and

        <phrase role="hedge">a(&epsiv;)b(c(&epsiv;)x)</phrase>.</para>



<para>Sample tree: <phrase role="hedge">a(b(pqr)c(xyz)d(mno))</phrase></para>



<para>Sample tree: <phrase role="hedge">a(bcd(ef))</phrase></para>



</article>

In order to draw the content of the phrase element as a hedge, we must create a customization layer that combines stock DocBook stylesheets with our hedge-to-SVG code. We use an instream-foreign-object formatting object to place SVG directly into the flow of formatting objects.

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

                xmlns:fo="http://www.w3.org/1999/XSL/Format"

                version="1.0">

				
<xsl:import href = "http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>
<xsl:include href="hedge2svg.xsl"/>


<xsl:template match="phrase[@role='hedge']">

  <fo:instream-foreign-object alignment-baseline="middle">

    <xsl:call-template name="hedge2svg">

      <xsl:with-param name="hedge" select="."/>

    </xsl:call-template>

  </fo:instream-foreign-object>

</xsl:template>



<xsl:param name="draft.watermark.image" select="''"/>



</xsl:stylesheet>

The result of applying this stylesheet to the sample document is shown in Figure 4.

Rendered DocBook document with hedges and trees Figure 4. Rendered DocBook document with hedges and trees.

Conclusion

The techniques described in this article are useful because compact text syntaxes are much easier to type by hand than XML syntaxes. The examples presented in the article show how to create SVG representations of trees from such compact syntaxes.

Related Links

  1. Download samples.
  2. Understanding the node-set() Function is my article here on xml.com that will make you more familiar with node-set() function.
  3. xml.com articles about SVG.
  4. YAPP XSLT is a lexical scanner and recursive-descent parser generator implemented in XSLT. It can be handy if you want to parse texts based on more complicated grammars than the sample one from this article.
  5. XSL-List – open forum on XSL.