XML.com: XML From the Inside Out

XML::LibXML - An XML::Parser Alternative
by Kip Hampton

Writing

To create an XML document programmatically with XML::LibXML you simply use the provided DOM interface:

use strict;
use XML::LibXML;

my $doc = XML::LibXML::Document->new();
my $root = $doc->createElement('html');
$doc->setDocumentElement($root);
my $body = $doc->createElement('body');
$root->appendChild($body);

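# %camelid_links is assumed to be populated already: each key maps to a
# hash reference with 'url' and 'description' entries.
foreach my $item (keys (%camelid_links)) {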
   my $link = $doc->createElement('a');
   $link->setAttribute('href', $camelid_links{$item}->{url});
   my $text = XML::LibXML::Text->new($camelid_links{$item}->{description});
   $link->appendChild($text);
   $body->appendChild($link);
}
print $doc->toString;

An important difference between XML::LibXML and XML::DOM is that libxml2's object model conforms to the W3C DOM Level 2 interface, which is better able to cope with documents containing XML Namespaces. So, where XML::DOM is limited to:

@nodeset = $doc->getElementsByTagName($element_name);

and

$node = $doc->createElement($element_name);

XML::LibXML also provides:

@nodeset = $doc->getElementsByTagNameNS($namespace_uri, $element_name);

and

$node = $doc->createElementNS($namespace_uri, $element_name);
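
To make the difference concrete, here is a minimal sketch of the namespace-aware calls working together. The two namespace URIs, the element names, and the text values are invented for illustration and do not come from the sample code; the point is only that two elements sharing the local name "label" can be told apart by namespace URI.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical namespace URIs, invented for this illustration only.
my $links_ns = 'http://example.com/ns/links';
my $meta_ns  = 'http://example.com/ns/meta';

my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $root = $doc->createElementNS($links_ns, 'links');
$doc->setDocumentElement($root);

# Same local name "label", two different namespaces.
my $plain_label = $doc->createElementNS($links_ns, 'label');
$plain_label->appendTextNode('Llama Ranch');
$root->appendChild($plain_label);

my $meta_label = $doc->createElementNS($meta_ns, 'm:label');
$meta_label->appendTextNode('internal-label');
$root->appendChild($meta_label);

# The namespace-aware lookup selects by URI plus local name,
# regardless of which prefix the document happens to use.
my @link_labels = $doc->getElementsByTagNameNS($links_ns, 'label');
my @meta_labels = $doc->getElementsByTagNameNS($meta_ns,  'label');

printf "labels in the links namespace: %d\n", scalar @link_labels;
printf "labels in the meta namespace:  %d\n", scalar @meta_labels;
print $doc->toString(1);
```

Because the lookup keys on the URI rather than the prefix, the same query keeps working if another author serializes the document with a different prefix for the same namespace.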

The Joy of SAX

We've seen the DOM and XPath goodness that XML::LibXML provides, but the story does not end there. The libxml2 library also offers a SAX interface that can be used to create DOM trees from SAX events or generate SAX events from DOM trees.

The following example builds a DOM tree from SAX events. A custom driver, CamelDriver, subclasses XML::SAX::Base and fires the events; XML::LibXML::SAX::Builder handles those events and assembles the tree.

use XML::LibXML;
use XML::LibXML::SAX::Builder;

my $builder = XML::LibXML::SAX::Builder->new();
my $driver = CamelDriver->new(Handler => $builder);
my $doc = $driver->parse(%camelid_links);

# doc is an XML::LibXML::Document object
print $doc->toString;

package CamelDriver;
use base qw(XML::SAX::Base);

sub parse {
  my $self = shift;
  my %links = @_;
  $self->SUPER::start_document;
  $self->SUPER::start_element({Name => 'html'});
  $self->SUPER::start_element({Name => 'body'});

  foreach my $item (keys (%links)) {
    $self->SUPER::start_element({Name => 'a',
                                   Attributes => {
                                     'href' => $links{$item}->{url}
                                               }
                                });
    $self->SUPER::characters({Data => $links{$item}->{description}});
    $self->SUPER::end_element({Name => 'a'});
  }

  $self->SUPER::end_element({Name => 'body'});
  $self->SUPER::end_element({Name => 'html'});
  # the Builder's end_document hands back the finished DOM tree
  return $self->SUPER::end_document;

}
1;

You can also generate SAX events from an existing DOM tree using XML::LibXML::SAX::Generator. In the following snippet, the DOM tree created by parsing the file camelids.xml is handed to XML::LibXML::SAX::Generator's generate() method, which in turn calls the event handlers in XML::Handler::XMLWriter to print the document to STDOUT.

use strict;
use XML::LibXML;
use XML::LibXML::SAX::Generator;
use XML::Handler::XMLWriter;

my $file = 'files/camelids.xml';
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file($file);
my $handler = XML::Handler::XMLWriter->new();
my $driver = XML::LibXML::SAX::Generator->new(Handler => $handler);

# generate SAX events that are captured
# by a SAX Handler or Filter.
$driver->generate($doc);
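
Any SAX handler can sit on the receiving end, not just a writer. The sketch below feeds the events into a small handler that records element names. ElementLogger and the inline document are hypothetical, made up for this illustration. It also uses XML::LibXML::SAX::Parser, which offers a generate() method as well; as far as I recall, current XML::LibXML documentation steers users toward that class rather than the Generator, so treat the class choice as an assumption to check against your installed version.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::SAX::Parser;

# Hypothetical handler: records the name of every element it is told about.
{
    package ElementLogger;
    use base qw(XML::SAX::Base);

    sub start_element {
        my ($self, $element) = @_;
        push @{ $self->{seen} }, $element->{Name};
        # pass the event along in case another handler is chained behind us
        return $self->SUPER::start_element($element);
    }
}

# A small inline document stands in for files/camelids.xml.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<html><body><a href="http://example.com/llama">Llama</a><a href="http://example.com/alpaca">Alpaca</a></body></html>
XML

my $handler = ElementLogger->new();
my $driver  = XML::LibXML::SAX::Parser->new(Handler => $handler);

# walk the DOM tree and fire one SAX event per node
$driver->generate($doc);

my @seen = @{ $handler->{seen} || [] };
print join(' ', @seen), "\n";
```

The handler never sees a string of XML; it only receives events produced by walking the tree that already sits in memory.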

This ability to accept and emit SAX events is especially useful in light of this column's recent discussions of generating SAX events from non-XML data and writing SAX filter chains. You could, for example, use a SAX driver written in Perl to emit events based on the results of a database query and build a DOM object from those events. That DOM object could then be transformed in C-space for display using XSLT and the mind-numbingly fast libxslt library (which expects libxml2 DOM objects). Finally, you could emit SAX events from the transformed DOM tree and pass them through custom SAX filters for the finishing touches, all without once having to serialize the document to a string for re-parsing. Wow.
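
A rough sketch of that pipeline follows, assuming XML::LibXSLT is installed alongside XML::LibXML. The RowDriver and ElementLogger classes, the hard-coded rows standing in for a database result, and the stylesheet are all hypothetical; only the overall shape (SAX events in, DOM, XSLT, SAX events out) comes from the description above.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::SAX::Builder;
use XML::LibXML::SAX::Parser;
use XML::LibXSLT;

{
    package RowDriver;    # hypothetical driver: turns plain rows into SAX events
    use base qw(XML::SAX::Base);

    sub parse_rows {
        my ($self, @rows) = @_;
        $self->start_document({});
        $self->start_element({ Name => 'rows', Attributes => {} });
        for my $row (@rows) {
            $self->start_element({ Name => 'row', Attributes => {} });
            $self->characters({ Data => $row });
            $self->end_element({ Name => 'row' });
        }
        $self->end_element({ Name => 'rows' });
        $self->end_document({});
        return;
    }
}

{
    package ElementLogger;    # hypothetical handler for the final stage
    use base qw(XML::SAX::Base);

    sub start_element {
        my ($self, $element) = @_;
        push @{ $self->{seen} }, $element->{Name};
        return $self->SUPER::start_element($element);
    }
}

# 1. SAX events in, DOM out. The list stands in for a database result.
my $builder = XML::LibXML::SAX::Builder->new();
RowDriver->new(Handler => $builder)->parse_rows('llama', 'alpaca', 'vicuna');
my $doc = $builder->result();

# 2. Transform the DOM with libxslt; no serialization in between.
my $style_doc = XML::LibXML->load_xml(string => <<'XSL');
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/rows">
    <ul><xsl:for-each select="row"><li><xsl:value-of select="."/></li></xsl:for-each></ul>
  </xsl:template>
</xsl:stylesheet>
XSL
my $stylesheet = XML::LibXSLT->new()->parse_stylesheet($style_doc);
my $result     = $stylesheet->transform($doc);

# 3. DOM in, SAX events out, ready for further filters.
my $logger = ElementLogger->new();
XML::LibXML::SAX::Parser->new(Handler => $logger)->generate($result);

my @seen = @{ $logger->{seen} || [] };
print join(' ', @seen), "\n";
```

In a real chain the last handler would more likely be a SAX filter or writer than a logger; the logger is used here only so that the flow of events is visible.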

Conclusions

As we have seen, XML::LibXML offers a fast, updated approach to XML processing that may be superior to the first-generation XML::Parser in many cases. Do not misunderstand: XML::Parser and its dependents are still quite useful and well-supported, and they are not likely to go away any time soon. But XML::Parser is not the only game in town, and given the added flexibility that XML::LibXML provides, I would strongly encourage you to give it a closer look before beginning your next Perl/XML project.


