Transforming XML with PHP

June 18, 2003

Bruno Pedro

This article compares two methods of transforming XML in PHP: PEAR's XML_Transformer package and the W3C XML transformation language XSLT. I will first describe the PEAR project and its philosophy, with a focus on its XML transformation techniques. I will then give a brief introduction to XSLT and the way to use it from PHP.

Introduction

PEAR's main goal is to become a repository for PHP extensions and libraries. Its members try to standardize the way developers write portable and re-usable code [MaiaAIP01].

PEAR offers PHP developers a wide variety of ready-to-use packages. Most PEAR packages are subclasses of the standard base classes [MaiaADLP01]. One of these packages is XML_Transformer, which was created to help you transform existing XML files with the help of PHP code.

XSLT stands for "Extensible Stylesheet Language Transformations" and is a W3C Recommendation. As most readers know, it is a powerful transformation language for converting XML into other XML, HTML, or plain text [Holman00].

While you need PEAR to use XML_Transformer, XSL transformations can be processed by PHP itself. PHP offers XSLT functionality through a built-in extension, making it easy to incorporate transformation features into existing code.

As you can see, both technologies can transform XML files. But which technology best fits the needs of a PHP developer? Let's take a closer look at each one to find out.

PEAR::XML_Transformer

XML_Transformer lets you map PHP functionality to specified XML tags, and it offers several ways of doing so. You can map a specific tag, a complete XML namespace, or only a specific tag within a given namespace. These methods will be described later.

The way XML_Transformer implements this functionality can be explained easily: it associates an opening tag with one PHP callback and the matching closing tag with another. It's very similar to the way element handlers are registered for PHP's xml_parse() function.
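For comparison, here is a minimal sketch of that callback model using PHP's own expat functions rather than XML_Transformer. The input string and the $log array are made up for illustration; the point is only that one callback fires for each opening tag and another for each closing tag.

```php
<?php
// Records the order in which the callbacks fire.
$log = array();

$parser = xml_parser_create();

// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

xml_set_element_handler(
    $parser,
    // Called for every opening tag.
    function ($parser, $name, $attributes) use (&$log) {
        $log[] = "open:$name";
    },
    // Called for every closing tag.
    function ($parser, $name) use (&$log) {
        $log[] = "close:$name";
    }
);

xml_parse($parser, '<mydoc><sometag /></mydoc>', true);

echo implode(' ', $log), "\n";
// prints: open:mydoc open:sometag close:sometag close:mydoc
```

XML_Transformer builds on the same idea, but lets the callbacks return replacement output instead of only observing the document.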

How it works

You start the transformation engine by creating an XML_Transformer object. The constructor accepts an array of parameters that will change the behavior of the transformer. The most important ones are the case folding options and the recursive operation option.

Case folding lets you change the case of XML element names and attributes. You can set the target case to either upper or lower case. This can be accomplished by setting the caseFolding option to true and setting the caseFoldingTo option to either CASE_LOWER or CASE_UPPER.

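<?php
require_once 'XML/Transformer.php';

// Fold all element and attribute names to lower case.
// CASE_LOWER is PHP's built-in constant, not a string.
$myTransformer = new XML_Transformer(
    array(
        'caseFolding'   => true,
        'caseFoldingTo' => CASE_LOWER
    )
);
?>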
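To make the effect of case folding concrete, the following sketch folds an element name and its attribute names by hand. The fold_case() helper is hypothetical and is not part of XML_Transformer; it only assumes that folding applies to names and leaves attribute values alone.

```php
<?php
// Hypothetical helper: folds an element name and its attribute names
// to the requested case. Attribute values are left untouched.
function fold_case($element, array $attributes, $foldTo = CASE_LOWER)
{
    $element = ($foldTo === CASE_UPPER)
        ? strtoupper($element)
        : strtolower($element);

    // array_change_key_case() accepts the same CASE_* constants.
    return array($element, array_change_key_case($attributes, $foldTo));
}

list($name, $attrs) = fold_case('MyTag', array('Href' => 'index.html'));

echo $name, ' ', key($attrs), "\n";
// prints: mytag href
```

Folding names this way means your handlers can match an element without caring how the document author capitalized it.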

XML Namespaces

XML_Transformer's XML namespace support is based on qualified name prefixes rather than namespace URIs. This lack of URI support has to do with the underlying XML parser, expat. PHP support for XML parsing has been available since version 3.0.6, whereas support for expat's namespace features has only been available since version 4.0.5.

What XML_Transformer considers a namespace is simply a qualified name prefix: the prefix that is sometimes used when addressing namespaces. Instead of transforming documents written in the following way:

<?xml version="1.0"?>
<mydoc>
    <start xmlns:myns="http://my/namespace">
        <sometag />
    </start>
</mydoc>

you should feed XML_Transformer documents like this:

<?xml version="1.0"?>
<mydoc>
    <myns:start>
        <myns:sometag />
    </myns:start>
</mydoc>
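Treating a namespace as nothing more than a prefix amounts to splitting each qualified name at its first colon. The sketch below shows that split; split_qname() is a hypothetical helper written for this illustration, not a function provided by XML_Transformer.

```php
<?php
// Hypothetical helper: splits a qualified name such as "myns:start"
// into array(prefix, local name). Unprefixed names get an empty prefix.
function split_qname($qname)
{
    $pos = strpos($qname, ':');

    if ($pos === false) {
        return array('', $qname);
    }

    return array(substr($qname, 0, $pos), substr($qname, $pos + 1));
}

list($prefix, $local) = split_qname('myns:start');
echo $prefix, ' / ', $local, "\n";
// prints: myns / start
```

Note that no URI is consulted at any point, which is why the second document above can be handled even though it never declares the myns prefix.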

The overloadNamespace() method overloads an XML namespace prefix and binds all its elements to a PHP object. The object must provide the startElement() and endElement() methods. If you specify the &MAIN or null namespace prefix, XML_Transformer maps XML elements that don't belong to any namespace.

<?php
require_once 'XML/Transformer.php';

class My_Transformer_Object {

    function startElement($element, $attributes)
    {
        // Your code here.
    }

    function endElement($element, $cdata)
    {
        // Your code here.
    }
}

$myTransformer = new XML_Transformer();
$myTransformerObject = new My_Transformer_Object();
$myTransformer->overloadNamespace(
    'myprefix',
    &$myTransformerObject
);
?>

XML_Transformer provides an easier way to map namespaces: the XML_Transformer_Namespace class. It lets you map an element's opening and closing tags to start_ELEMENTNAME($attributes) and end_ELEMENTNAME($cdata) methods, where ELEMENTNAME is replaced by the name of the XML element to be mapped.

<?php
require_once 'XML/Transformer.php';
require_once 'XML/Transformer/Namespace.php';

class My_Transformer_Namespace extends XML_Transformer_Namespace {

    function start_myelement($attributes)
    {
        // This method is mapped to the
        // 'myelement' opening tag.
    }

    function end_myelement($cdata)
    {
        // This method is mapped to the
        // 'myelement' closing tag.
    }
}

$myTransformer = new XML_Transformer();
$myTransformerNamespace = new My_Transformer_Namespace();
$myTransformer->overloadNamespace(
    'myprefix',
    &$myTransformerNamespace
);
?>
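The start_ELEMENTNAME and end_ELEMENTNAME convention can be pictured as a small dispatcher sitting on top of startElement() and endElement(). The sketch below is not the PEAR implementation; the Sketch_Namespace and Sketch_Bold classes are hypothetical, and the assumption that handlers return replacement strings is made only for the sake of the example.

```php
<?php
// Hypothetical base class showing the dispatch idea only.
class Sketch_Namespace
{
    public function startElement($element, $attributes)
    {
        $method = 'start_' . $element;

        // Elements without a handler produce no opening output.
        return method_exists($this, $method)
            ? $this->$method($attributes)
            : '';
    }

    public function endElement($element, $cdata)
    {
        $method = 'end_' . $element;

        // Elements without a handler pass their content through.
        return method_exists($this, $method)
            ? $this->$method($cdata)
            : $cdata;
    }
}

// Maps a hypothetical <bold> element to HTML <b>.
class Sketch_Bold extends Sketch_Namespace
{
    public function start_bold($attributes)
    {
        return '<b>';
    }

    public function end_bold($cdata)
    {
        return $cdata . '</b>';
    }
}

$handler = new Sketch_Bold();

echo $handler->startElement('bold', array()),
     $handler->endElement('bold', 'text'), "\n";
// prints: <b>text</b>
```

The benefit of the convention is that supporting a new element means adding two methods rather than extending a growing switch statement.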

These examples demonstrate the true power and versatility of the XML_Transformer package. You can manipulate XML files very easily using only PHP code. Of course, you'll need a midlevel knowledge of PEAR if you want to develop anything serious.

XSLT

XSLT is a stylesheet language that transforms XML documents by using a "transformation specification". This specification is a set of rules that match elements. These rules describe the output of each element, based on its contents [Ray01].

The major difference between XSLT and other transformation engines is that XSLT walks the XML tree, applying rules recursively. This method increases the control you have over the transformation process, as there's no need to track context.
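The idea of recursive rule application can be sketched in plain PHP. This is only an analogy, not how an XSLT processor is implemented: the nested-array tree, the $rules table and the apply_rules() function are all invented for this example, and real XSLT rules match with patterns rather than by exact element name.

```php
<?php
// Hypothetical document tree: a node is array(name, children);
// plain strings are text nodes.
$tree = array('doc', array(
    array('title', array('Hello')),
    array('para', array('Some ', array('em', array('text')))),
));

// One rule per element name. Each rule wraps whatever its children produce.
$rules = array(
    'doc'   => function ($inner) { return '<html>' . $inner . '</html>'; },
    'title' => function ($inner) { return '<h1>' . $inner . '</h1>'; },
    'para'  => function ($inner) { return '<p>' . $inner . '</p>'; },
    'em'    => function ($inner) { return '<i>' . $inner . '</i>'; },
);

// Walks a list of nodes, applying the matching rule to each element.
function apply_rules(array $nodes, array $rules)
{
    $out = '';

    foreach ($nodes as $node) {
        if (is_string($node)) {
            $out .= $node; // text is copied through unchanged
            continue;
        }

        list($name, $children) = $node;

        // Transform the children first, then let the rule wrap the result.
        $inner = apply_rules($children, $rules);
        $out  .= call_user_func($rules[$name], $inner);
    }

    return $out;
}

echo apply_rules(array($tree), $rules), "\n";
// prints: <html><h1>Hello</h1><p>Some <i>text</i></p></html>
```

No rule needs to know where in the document it was invoked; the recursion carries that context, which is the property the paragraph above describes.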

PEAR also features the XML_XSLT_Wrapper, the goal of which is to provide an interface to XSL transformations. It looks very promising, but it's still in alpha state, so I'll stick to PHP native support.

PHP now comes with a built-in XSLT extension. This extension is based on Gingerall's Sablotron engine and the expat XML parser. If you plan to use these features in your projects, you can check whether the extension is available by calling the phpinfo() function.

Using XSLT from within PHP

To start using XSLT directly from PHP, you will need an XSLT file and the XML document that you wish to transform.

<?php
$xh = xslt_create();

$myResult = xslt_process(
    $xh,
    'myContent.xml',
    'myTransformation.xsl'
);

xslt_free($xh);
?>

The xslt_process() function accepts three more optional parameters: the result container file name, the array of arguments to the XSLT processor, and the array of parameters to the stylesheet. The following example illustrates these parameters by passing a parameter to the stylesheet.

<?php
$xh = xslt_create();

$args = array();
$params = array('foo' => 'bar');

$myResult = xslt_process(
    $xh,
    'myContent.xml',
    'myTransformation.xsl',
    null,
    $args,
    $params
);

xslt_free($xh);
?>

XSLT is very easy to use from within PHP. All processing code is inside your XSLT file. You can also transform dynamic XML content without the need to read it from an external file. The PHP manual offers a more detailed explanation of this and other features.

XSLT's transformation capabilities rely on an external language. To maintain a large project's transformations, you'll need to keep numerous external files. The advantage is that these files can be manipulated by a non-programmer.

Conclusion

While PEAR::XML_Transformer gives you greater flexibility through the use of PHP, XSLT is easier for non-programmers to use. XML_Transformer's approach lets you associate an XML element's opening and closing tags with specific functions. XSLT's transformation is tightly coupled with the XML tree.

If you plan to build your own set of namespaces and associated PHP libraries, then I think XML_Transformer is the way to go. If you want to give other people the ability to create custom transformations, then I recommend XSLT.

Bibliography