Transforming XML with PHP
This article compares two methods of transforming XML in PHP: PEAR's XML_Transformer package and the W3C XML transformation language XSLT. I will first describe the PEAR project and its philosophy, with a focus on its XML transformation techniques. I will then give a brief introduction to XSLT and the way to use it from PHP.
PEAR's main goal is to become a repository for PHP extensions and libraries. Its members try to standardize the way developers write portable and re-usable code [MaiaAIP01].
PEAR offers a wide variety of packages ready for PHP
developers to use. Most PEAR packages are subclasses of the standard base
classes [MaiaADLP01]. One of these
packages is XML_Transformer, which was created
to help you transform existing XML files with PHP code.
XSLT stands for "Extensible Stylesheet Language Transformations" and is a W3C Recommendation. As most readers know, it is a powerful transformation language for converting XML into other XML, HTML, or plain text [Holman00].
While you need PEAR to use XML_Transformer, XSL
transformations can be processed by PHP itself. PHP offers XSLT
functionality through a bundled extension, making it easy to incorporate
transformation features into existing code.
As you can see, both technologies can transform XML files. But which technology best fits the needs of a PHP developer? Let's take a closer look at each one to find out.
XML_Transformer lets you map PHP functionality to
specified XML tags. It offers several ways to map them:
you can map a specific tag, a complete XML namespace, or only a
specific tag within a given namespace. These methods are described
below.
The way XML_Transformer implements this functionality is
easy to explain: it associates an opening tag with one PHP
callback and the closing tag with another. It's very similar
to the way handlers work with PHP's xml_parse() function.
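To make the callback model concrete, here is a minimal sketch that uses PHP's own expat-based XML functions rather than XML_Transformer itself. The class name Sketch_Handler and the rule that turns b into strong are invented for illustration, and attributes are simply dropped to keep the example short.

```php
<?php
// Illustrative only: this is NOT XML_Transformer, just the same
// open-tag/close-tag callback idea built on PHP's XML parser functions.
class Sketch_Handler
{
    public $output = '';

    // Called for every opening tag (attributes are ignored in this sketch).
    function startElement($parser, $name, $attributes)
    {
        $this->output .= ($name === 'b') ? '<strong>' : '<' . $name . '>';
    }

    // Called for every closing tag.
    function endElement($parser, $name)
    {
        $this->output .= ($name === 'b') ? '</strong>' : '</' . $name . '>';
    }

    // Called for the text between tags.
    function characterData($parser, $data)
    {
        $this->output .= $data;
    }
}

$handler = new Sketch_Handler();
$parser  = xml_parser_create();

// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler(
    $parser,
    array($handler, 'startElement'),
    array($handler, 'endElement')
);
xml_set_character_data_handler($parser, array($handler, 'characterData'));

xml_parse($parser, '<p>Hello <b>world</b></p>', true);

echo $handler->output; // <p>Hello <strong>world</strong></p>
```

XML_Transformer builds on the same idea, but lets you register the callbacks per tag or per namespace prefix instead of once for the whole document.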
You start the transformation engine by creating an
XML_Transformer object. The constructor accepts an array
of parameters that will change the behavior of the transformer. The
most important ones are the case folding options and the recursive
operation option.
Case folding lets you change the case of XML element names and
attributes. You can set the target case to either upper or lower case.
This can be accomplished by setting the caseFolding option
to true and setting the caseFoldingTo option to either
CASE_LOWER or CASE_UPPER.
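<?php
require_once 'XML/Transformer.php';
// Fold all element and attribute names to lower case.
// CASE_LOWER is a PHP constant, so it must not be quoted.
$myTransformer = new XML_Transformer(
    array(
        'caseFolding'   => true,
        'caseFoldingTo' => CASE_LOWER
    )
);
?>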
XML_Transformer's XML namespace support is based on
qualified name prefixes rather than namespace URIs. This lack of URI
support has to do with the underlying XML parser, expat. PHP support for XML
parsing has been available since version 3.0.6, whereas support for
expat's namespace features has only been available since version
4.0.5.
What XML_Transformer considers a namespace is simply a
qualified name prefix: the prefix that is sometimes used when addressing
namespaces. Instead of transforming documents written in the following
way:
<?xml version="1.0"?>
<mydoc>
    <start xmlns:myns="http://my/namespace">
        <sometag />
    </start>
</mydoc>
you should feed XML_Transformer documents like this:
<?xml version="1.0"?>
<mydoc>
    <myns:start>
        <myns:sometag />
    </myns:start>
</mydoc>
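Treating the prefix as the namespace amounts to splitting each qualified name at the colon. The helper below is a hypothetical sketch of that idea, not code taken from the package; the function name and the use of an empty string for unprefixed elements are assumptions made for the example.

```php
<?php
// Hypothetical helper: split a qualified name such as "myns:start"
// into its prefix and its local element name.
function splitQualifiedName($qname)
{
    $pos = strpos($qname, ':');
    if ($pos === false) {
        // No prefix: the element belongs to no "namespace" in this sense.
        return array('', $qname);
    }
    return array(substr($qname, 0, $pos), substr($qname, $pos + 1));
}

list($prefix, $local) = splitQualifiedName('myns:start');
echo $prefix . ' / ' . $local; // myns / start
```

Because only the prefix is examined, two documents that bind different prefixes to the same namespace URI would be treated as using different namespaces.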
The overloadNamespace() method overloads an XML namespace
prefix and binds all its elements to a PHP object. The object must
provide the startElement() and endElement()
methods. If you specify the &MAIN or null
namespace prefix, XML_Transformer maps XML elements that
don't belong to any namespace.
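<?php
require_once 'XML/Transformer.php';
class My_Transformer_Object {
    // Called for each opening tag that uses the overloaded prefix.
    function startElement($element, $attributes)
    {
        // Your code here.
    }
    // Called for each closing tag, with the element's character data.
    function endElement($element, $cdata)
    {
        // Your code here.
    }
}
$myTransformer = new XML_Transformer();
$myTransformerObject = new My_Transformer_Object();
// Call-time pass-by-reference (&$object) is deprecated, so pass the object directly.
$myTransformer->overloadNamespace(
    'myprefix',
    $myTransformerObject
);
?>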
XML_Transformer provides an easier way to map
namespaces: the XML_Transformer_Namespace class. Extending it lets you map
an element's opening and closing tags to methods named
start_ELEMENTNAME($attributes) and
end_ELEMENTNAME($cdata), where ELEMENTNAME is
replaced by the name of the XML element to be mapped.
<?php
require_once 'XML/Transformer.php';
require_once 'XML/Transformer/Namespace.php';
class My_Transformer_Namespace extends XML_Transformer_Namespace {
    function start_myelement($attributes)
    {
        //
        // This method is mapped to the
        // 'myelement' opening tag.
        //
    }
    function end_myelement($cdata)
    {
        //
        // This method is mapped to the
        // 'myelement' closing tag.
        //
    }
}
$myTransformer = new XML_Transformer();
$myTransformerNamespace = new My_Transformer_Namespace();
$myTransformer->overloadNamespace(
    'myprefix',
    $myTransformerNamespace
);
?>
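The dispatching that such a base class performs can be imagined roughly as follows. This is a hypothetical sketch and not the package's actual source: the class names Sketch_Namespace and Sketch_Greeting are invented, and the fallback behavior for unmapped elements is an assumption.

```php
<?php
// Hypothetical dispatcher: forwards element events to
// start_NAME() / end_NAME() methods when a subclass defines them.
class Sketch_Namespace
{
    function startElement($element, $attributes)
    {
        $method = 'start_' . $element;
        if (method_exists($this, $method)) {
            return $this->$method($attributes);
        }
        return ''; // Assumed fallback: emit nothing for unmapped elements.
    }

    function endElement($element, $cdata)
    {
        $method = 'end_' . $element;
        if (method_exists($this, $method)) {
            return $this->$method($cdata);
        }
        return $cdata; // Assumed fallback: pass the content through.
    }
}

class Sketch_Greeting extends Sketch_Namespace
{
    function start_greeting($attributes)
    {
        return '<p class="greeting">';
    }

    function end_greeting($cdata)
    {
        return $cdata . '</p>';
    }
}

$ns = new Sketch_Greeting();
echo $ns->startElement('greeting', array());
echo $ns->endElement('greeting', 'Hello');
// Prints: <p class="greeting">Hello</p>
```

The appeal of this design is that adding support for a new element only requires adding a pair of methods; no registration code has to change.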
These examples demonstrate the power and versatility of the
XML_Transformer package. You can manipulate XML files very
easily using only PHP code. Of course, you'll need an intermediate
knowledge of PEAR if you want to develop anything serious.
XSLT is a stylesheet language that transforms XML documents by using a "transformation specification". This specification is a set of rules that match elements. These rules describe the output of each element, based on its contents [Ray01].
The major difference between XSLT and other transformation engines is that XSLT crawls through the XML tree, applying rules recursively. This method increases the control you have over the transformation process, as there's no need to track context.
PEAR also features the
XML_XSLT_Wrapper, the goal of which is to provide an
interface to XSL transformations. It looks very promising, but it's still
in alpha state, so I'll stick to PHP native support.
PHP now comes with a built-in XSLT extension. This
extension is based on Gingerall's Sablotron engine and the expat
XML parser. If you plan to use these features in your projects, you can
check whether the extension is available by calling the
phpinfo() function.
To start using XSLT directly from PHP, you will need an XSLT file and the XML document that you wish to transform.
<?php
// Create a processor handle, run the transformation, then free the handle.
$xh = xslt_create();
$myResult = xslt_process(
    $xh,
    'myContent.xml',
    'myTransformation.xsl'
);
xslt_free($xh);
?>
The xslt_process() function accepts three more optional
parameters: the result container file name, the array of arguments to the
XSLT processor, and the array of parameters to the stylesheet. The
following example illustrates these by passing a parameter to
the stylesheet.
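<?php
$xh = xslt_create();
$args = array();                 // No argument buffers are needed here.
$params = array('foo' => 'bar'); // Sets the stylesheet parameter "foo".
$myResult = xslt_process(
    $xh,
    'myContent.xml',
    'myTransformation.xsl',
    null,                        // No result file: the output is returned.
    $args,
    $params
);
xslt_free($xh);
?>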
XSLT is very easy to use from within PHP. All processing code is inside your XSLT file. You can also transform dynamic XML content without the need to read it from an external file. The PHP manual offers a more detailed explanation of this and other features.
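As a sketch of transforming in-memory content, the xslt extension lets you pass documents as named argument buffers and refer to them with the arg: scheme, as described in the PHP manual. The XML and stylesheet strings below are made-up sample data, and the function_exists() guard is only there so the script degrades gracefully where the extension is not installed.

```php
<?php
// Sample data invented for this sketch.
$xml = '<?xml version="1.0"?><greeting>Hello</greeting>';
$xsl = '<?xml version="1.0"?>'
     . '<xsl:stylesheet version="1.0"'
     . ' xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
     . '<xsl:output method="html"/>'
     . '<xsl:template match="/greeting">'
     . '<p><xsl:value-of select="."/></p>'
     . '</xsl:template>'
     . '</xsl:stylesheet>';

// Named buffers that the processor can address as arg:/_xml and arg:/_xsl.
$arguments = array(
    '/_xml' => $xml,
    '/_xsl' => $xsl
);

$result = false;
if (function_exists('xslt_create')) {
    $xh = xslt_create();
    // Passing null as the result container returns the output as a string.
    $result = xslt_process($xh, 'arg:/_xml', 'arg:/_xsl', null, $arguments);
    xslt_free($xh);
}

if ($result === false) {
    echo 'Transformation not available or failed.';
} else {
    echo $result;
}
```

The same stylesheet also shows the rule-based style described earlier: the single template matches the greeting element and describes the output for it.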
XSLT's transformation capacities rely on an external language. To maintain a large project's transformations you'll need to keep numerous external files. The advantage is that these files can be manipulated by a non-programmer.
While PEAR::XML_Transformer gives you greater flexibility
through the use of PHP, XSLT is easier for non-programmers to use.
XML_Transformer's approach lets you associate an XML
element's opening and closing tags with specific functions. XSLT's
transformation is tightly coupled with the XML tree.
If you plan to build your own set of namespaces and associated PHP
libraries, then I think XML_Transformer is the way to go.
If you want to give other people the ability to create custom
transformations, then I recommend XSLT.
Bibliography
XML.com Copyright © 1998-2006 O'Reilly Media, Inc.