
Controlling Whitespace, Part Two
by Bob DuCharme

Normalizing Space

    

Imagine that your source document has extra whitespace in some places but not in others, and you want to remove it so that the result is consistent. For example, the first employee element in the following document has no extra spaces or carriage returns within its child elements, but the second one has plenty.

<employees>


  <employee hireDate="09/01/1998">
    <last>Herbert</last>
    <first>Johnny</first>
    <salary>95000</salary>
  </employee>

  <employee hireDate="     04/23/1999">
    <last>
Hill
</last>
    <first>

      Phil

</first>
    <salary>100000
</salary>
  </employee>

</employees>

A simple stylesheet to create comma-delimited versions of each employee's data, like this,

<!-- xq546.xsl: converts xq543.xml into xq548.txt -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>

<xsl:template match="employee">

  <xsl:apply-templates select="@hireDate"/>
  <xsl:text>,</xsl:text>
  <xsl:apply-templates select="first"/>
  <xsl:text>,</xsl:text>
  <xsl:apply-templates select="last"/>

</xsl:template> 

</xsl:stylesheet>

creates output that includes all that extra whitespace:



  09/01/1998,Johnny,Herbert

       04/23/1999,

      Phil

,
Hill
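
To see where the stray whitespace comes from, it can help to inspect the raw string values outside of XSLT. The following is a minimal sketch in Python (an assumption of this illustration; the article itself uses only XSLT) that parses a trimmed copy of the second employee element with the standard library's xml.etree.ElementTree and prints each value with repr(), so that the leading spaces and line breaks become visible. The names SOURCE and raw_fields are hypothetical, and the exact indentation inside SOURCE is illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the article's second employee element (whitespace is illustrative).
SOURCE = """<employees>
  <employee hireDate="     04/23/1999">
    <last>
Hill
</last>
    <first>

      Phil

</first>
  </employee>
</employees>"""

employee = ET.fromstring(SOURCE).find("employee")

# The parser keeps the whitespace found inside attribute values and element content,
# so a stylesheet that copies these values verbatim copies the whitespace too.
raw_fields = {
    "hireDate": employee.get("hireDate"),
    "first": employee.findtext("first"),
    "last": employee.findtext("last"),
}

for name, value in raw_fields.items():
    print(name, repr(value))
```

Each printed value still carries its surrounding spaces or line breaks, which is the whitespace that shows up in the result above.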


The normalize-space() function converts each run of whitespace characters (spaces, tabs, carriage returns, and line feeds) into a single space and deletes any leading and trailing whitespace from the string passed to it as an argument. Using it can solve the problem with the stylesheet above:

<!-- xq544.xsl: converts xq543.xml into xq547.txt -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>

<xsl:template match="employee">

  <xsl:value-of select="normalize-space(@hireDate)"/>
  <xsl:text>,</xsl:text>
  <xsl:value-of select="normalize-space(first)"/>
  <xsl:text>,</xsl:text>
  <xsl:value-of select="normalize-space(last)"/>

<!-- Following alternative won't work:
  <xsl:apply-templates select="normalize-space(@hireDate)"/>
  <xsl:text>,</xsl:text>
  <xsl:apply-templates select="normalize-space(first)"/>
  <xsl:text>,</xsl:text>
  <xsl:apply-templates select="normalize-space(last)"/>
-->
</xsl:template> 

</xsl:stylesheet>

Note the commented-out alternative in the second half of the "employee" template rule. We can't just insert the normalize-space() function inside the select attributes of the previous stylesheet's xsl:apply-templates instructions, because this function returns a string and xsl:apply-templates expects a node-set expression as the value of its select attribute. So the template uses xsl:value-of instructions instead, the normalize-space() function works, and the result is formatted consistently:



  09/01/1998,Johnny,Herbert

  04/23/1999,Phil,Hill
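
As a cross-check on what normalize-space() does to these particular values, here is a minimal sketch of the same trim-and-collapse behavior in Python (an assumption of this illustration; it is not part of the article's XSLT, and it is not how an XSLT processor implements the function). The helper names normalize_space and employee_line and the SOURCE string are hypothetical. The sketch treats only space, tab, carriage return, and line feed as whitespace, following the XPath 1.0 definition.

```python
import re
import xml.etree.ElementTree as ET

# Whitespace as defined for XPath 1.0: space, tab, carriage return, line feed.
XPATH_WHITESPACE = " \t\r\n"


def normalize_space(value: str) -> str:
    """Trim both ends, then collapse each internal whitespace run to one space."""
    trimmed = value.strip(XPATH_WHITESPACE)
    return re.sub(r"[ \t\r\n]+", " ", trimmed)


def employee_line(employee: ET.Element) -> str:
    """Build 'hireDate,first,last' from one employee element."""
    # findtext() returns only the text before any child element; for the leaf
    # elements used here that is the same as the element's full string value.
    fields = (
        employee.get("hireDate", ""),
        employee.findtext("first", ""),
        employee.findtext("last", ""),
    )
    return ",".join(normalize_space(field) for field in fields)


# Illustrative copy of the article's source document.
SOURCE = """<employees>
  <employee hireDate="09/01/1998">
    <last>Herbert</last>
    <first>Johnny</first>
    <salary>95000</salary>
  </employee>
  <employee hireDate="     04/23/1999">
    <last>
Hill
</last>
    <first>

      Phil

</first>
    <salary>100000
</salary>
  </employee>
</employees>"""

lines = [employee_line(e) for e in ET.fromstring(SOURCE).findall("employee")]
print("\n".join(lines))
```

The sketch prints one comma-delimited line per employee, with the same field values as the normalized result shown above.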

In next month's final installment of this series on controlling whitespace, we'll look at how an XSLT stylesheet can add tabs to your result document and a built-in feature that automates intelligent indenting of your result documents.


  • Preserving whitespace
    2009-04-09 14:18:08 shark59


    It’s a great article. I used suggestions from it in my code, but ran into a problem.
    When I use my XSLT program to transform one XML file into another one, everything works ok. I can see First Name and Last Name on one line, Address1 on the 2nd line and City, State and Zip on the 3rd line.
    But when we run this XSLT program as a part of the project all the information comes as one long string without spaces.
    What do you think may be the problem?


    Thanks a lot.
    Ilya.


  • Works like a charm
    2004-11-23 12:50:32 WildBlue27

    Thanks a lot for this information. I looked everywhere and the 'normalize-space()' function hadn't been documented anywhere I looked. It's very helpful and contributed to solving a big problem! Thanks!

  • I get a space instead of a line break
    2002-02-12 12:47:07 Meili Huang

    Hi,


    When I apply the xsl to my xml, I got a space instead of a line break. Does anyone know what could be the problem here? My xml parser is 4.0


    Thanks
    Meili

  • Sample problem
    2002-02-11 07:12:07 Antonio del Pozzo

    The article is very interesting. Nevertheless, when I try to execute the samples I get an error that tells me "Reference to undeclared namespace prefix: 'xsl' ". I use MSXML 3.0 on W2K. What's wrong?
    Thanks in advance
    Antonio

    • Sample problem
      2002-02-13 14:00:33 Bob DuCharme

      As I told Meili in a private e-mail, the examples work properly with XSLT processors that completely conform to the XSLT Recommendation such as Xalan and Xerces, but Microsoft's processors don't always conform, and this is the kind of problem that can come up.


      http://www.vbxml.com/xsl/XSLTRef.asp has a good chart showing where each release of the Microsoft processors is and isn't compliant. I don't know if there's anything about the elements used to control whitespace there.


      Bob

  • CDATA
    2001-12-07 19:21:19 Jay Singh

    Bob,
    These are good articles but I need a little more help. Does anyone know of a way to control carriage returns within a CDATA section? Is there a replace or a convert function that could replace a tag within the CDATA section?


    Thanks,
    -JS


    • CDATA
      2002-02-13 14:06:19 Bob DuCharme

      Making something a CDATA section is a way of saying "don't treat this as parsable XML, just pass it along as-is," so I avoid it wherever possible. The XSLT Recommendation offers tricks for adding things to the result tree as CDATA, but doesn't say anything that I could find about dealing with CDATA on the source tree. This didn't surprise me, if the reason for having CDATA is to tell a parser to leave it alone. If you want the parser to get the data and hand it to the XSLT processor in a way that you can get it and manipulate it, it's best to not have it be CDATA. Is this an option?


      Bob