XML.com: XML From the Inside Out

XSH, An XML Editing Shell
by Kip Hampton

Creating and Editing XML Documents

We can begin by creating a new XML document:

xsh :> create mynews news-channels

This creates a new document with the ID mynews and the top-level element news-channels, then changes the context to the root of the new document. Let's have a look:

xsh mynews:/> ls
<?xml version="1.0" encoding="utf-8"?>
<news-channels/>
xsh mynews:/>

So far, so good. Now let's add an element to the news-channels element.

xsh mynews:/> cd news-channels
xsh mynews:/news-channels> add element channel into .

We use the add command to add a channel element into the current context element, which is represented by a period character, and we can verify the result by listing the current context:

xsh mynews:/news-channels> ls
<news-channels><channel/></news-channels>

Found 1 node(s).
xsh mynews:/news-channels>

Note that the first argument for the add command must be the type of node being added (element, in this case).

Suppose we need to add a name attribute to the new channel element, as well as an rss-url child element.

xsh mynews:/news-channels> add attribute "name='seepan uploads'" into 
     ./channel[1]
xsh mynews:/news-channels> add element rss-url into ./channel[1]

Next, we'll add the URL of the CPAN RSS file as a text node of the rss-url element:


xsh mynews:/news-channels> add text "http://search.cpan.org/recent.rdf" 
     into ./channel[1]/rss-url

Let's add another channel element:

xsh mynews:/news-channels> add element channel before //channel[1]
xsh mynews:/news-channels> add attribute "name='perl news'" 
    into ./channel[1]
xsh mynews:/news-channels> add element rss-url into ./channel[1]
xsh mynews:/news-channels> add text "http://www.perl.com/pace/perlnews.rdf"
     into ./channel[1]/rss-url

We used the before location expression as the third argument to the add command, specifying the first channel element as the evaluation context. This inserts the new channel into the list as the preceding sibling of the previously created channel.
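
For readers more used to a DOM-style API, the difference between the into and before locations can be sketched outside of xsh. The following is only an illustration, assuming Python's standard xml.etree.ElementTree module as a stand-in; it is not how xsh works internally, and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Rough equivalent of: create mynews news-channels
root = ET.Element("news-channels")

# "add element channel into ." appends a child to the context element
cpan = ET.SubElement(root, "channel")
# "add attribute ... into ./channel[1]" (typo kept on purpose, as in the session)
cpan.set("name", "seepan uploads")
# "add element rss-url" followed by "add text ..." into that element
ET.SubElement(cpan, "rss-url").text = "http://search.cpan.org/recent.rdf"

# "add element channel before //channel[1]" inserts a preceding sibling
perl = ET.Element("channel", {"name": "perl news"})
ET.SubElement(perl, "rss-url").text = "http://www.perl.com/pace/perlnews.rdf"
first = root.find("channel")
root.insert(list(root).index(first), perl)

names = [channel.get("name") for channel in root.findall("channel")]
print(names)  # ['perl news', 'seepan uploads']
```

The point of the sketch is the last step: into adds the new node as a child of the target, while before places it next to the target, ahead of it in document order.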

Again, we can verify this by listing all the channels in the document:

xsh mynews:/news-channels> ls //channel
<channel name="perl news"><rss-url>http://www.perl.com/pace/perlnews.rdf
     </rss-url></channel>
<channel name="seepan uploads"><rss-url>http://search.cpan.org/recent.rdf
     </rss-url></channel>

Found 2 node(s).
xsh mynews:/news-channels>

Careful readers will have noticed the "seepan" typo -- we can fix this using map, which applies a block of Perl code to nodes returned by the subsequent XPath expression:

xsh mynews:/news-channels> map { $_ = 'cpan uploads' } //channel[2]/@name
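
The idea behind map (run a piece of code against every node that an expression selects) can be approximated in other environments too. Below is a minimal sketch, assuming Python's standard xml.etree.ElementTree; map_attribute is a hypothetical helper written for this example, not an xsh or Python built-in. ElementTree's limited XPath support cannot select attribute nodes directly, so the helper takes the element path and the attribute name separately.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<news-channels>"
    '<channel name="perl news"/>'
    '<channel name="seepan uploads"/>'
    "</news-channels>"
)

def map_attribute(root, path, attr, func):
    """Apply func to the value of attr on every element matching path."""
    for node in root.findall(path):
        node.set(attr, func(node.get(attr)))

# Counterpart of: map { $_ = 'cpan uploads' } //channel[2]/@name
map_attribute(doc, "./channel[2]", "name", lambda _old: "cpan uploads")

fixed = [channel.get("name") for channel in doc.findall("channel")]
print(fixed)  # ['perl news', 'cpan uploads']
```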

Here's a view of the full contents of our new document, obtained by listing the document's root:

xsh mynews:/news-channels> ls /
<?xml version="1.0" encoding="utf-8"?>
<news-channels>
  <channel name="perl news">
    <rss-url>http://www.perl.com/pace/perlnews.rdf</rss-url>
  </channel>
  <channel name="cpan uploads">
    <rss-url>http://search.cpan.org/recent.rdf</rss-url>
  </channel>
</news-channels>

Found 1 node(s).

Our new document is a bit simplistic, to be sure. But our goal here is just to demonstrate the basics of editing documents with xsh. What we've learned so far can be applied to the most complex XML documents.

To finish up, let's save our new document to disk and quit the shell:

xsh mynews:/news-channels> saveas mynews files/perl_channels.xml
mynews=new_document2.xml --> files/perl_channels.xml (utf-8)
saved mynews=files/perl_channels.xml as files/perl_channels.xml 
in utf-8 encoding
xsh mynews:/news-channels>
xsh mynews:/news-channels> exit
[user@host user]$

xsh Scripting

No shell would be complete without the ability to perform automated or scripted tasks. As a final example, let's create an xsh script, which uses the data contained in the perl_channels.xml document we just created, to fetch all the current Perl news items from all the channels into a single XML document:

quiet;
open sources=files/perl_channels.xml;
create merge news-items;
$i = 0;
foreach sources://rss-url {
    $name = string(.);
    open $i=$name;
    xcopy $i://item into merge:/news-items;
    close $i;
    $i=$i+1;
};

close sources;
saveas merge files/headlines.xml;
close merge;

Looking closer at this script we see that it loads the perl_channels.xml document, iterates over all of its <rss-url> elements, fetches each document from the Web using the open command to grab the URL, and copies all of each channel's <item> elements into a new document. The new document is then saved to disk as headlines.xml before exiting.
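
The same merge logic can be sketched in a general-purpose language to make the data flow explicit. This is an illustration only, assuming Python's standard xml.etree.ElementTree; the feed identifiers, the sample feed bodies, and the merge_items helper are all hypothetical, and the feeds are kept in memory so the example runs without network access. Real RSS 1.0 feeds also put item elements in a namespace, which this sketch ignores.

```python
import copy
import xml.etree.ElementTree as ET

# Stand-in for perl_channels.xml, with placeholder feed identifiers
SOURCES = """<news-channels>
  <channel name="perl news"><rss-url>feed-a</rss-url></channel>
  <channel name="cpan uploads"><rss-url>feed-b</rss-url></channel>
</news-channels>"""

# Made-up feed bodies; a real script would fetch each URL instead
FEEDS = {
    "feed-a": "<rdf><item><title>A1</title></item><item><title>A2</title></item></rdf>",
    "feed-b": "<rdf><item><title>B1</title></item></rdf>",
}

def merge_items(sources_xml, fetch):
    """Copy every item from every listed feed into one news-items element."""
    merged = ET.Element("news-items")                  # create merge news-items
    for url_node in ET.fromstring(sources_xml).iter("rss-url"):  # foreach sources://rss-url
        url = (url_node.text or "").strip()            # $name = string(.)
        feed = ET.fromstring(fetch(url))               # open $i=$name
        for item in feed.iter("item"):                 # xcopy $i://item into merge:/news-items
            merged.append(copy.deepcopy(item))
    return merged

merged = merge_items(SOURCES, FEEDS.__getitem__)
titles = [item.findtext("title") for item in merged]
print(titles)  # ['A1', 'A2', 'B1']
```

Passing the fetch function in as an argument keeps the merge step separate from the network step, which is roughly the role the open command plays in the xsh script.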

Starting to see why an XML editing shell isn't such a crazy idea? I know I am.

Going Further

I've offered a glimpse of the ease and power that xsh provides, but there are many more commands and features available. For example,

xslt doc1 some_stylesheet.xsl doc2

transforms the document with the ID doc1 using the XSLT stylesheet some_stylesheet.xsl and stores the result in a new document with the ID doc2.

Similarly, the command

xupdate myxupdate doc1

alters the content of the doc1 document using the rules contained in the XUpdate document stored in myxupdate.

For a complete list of commands, type help command at the xsh prompt, or help commandname for detailed usage of a specific command.

Conclusions

I was initially skeptical about the notion of an "XML editing shell". At first glance, it seemed to me to be pushing the file path/XPath metaphor a bit too far; surely it's little more than a technical curiosity? But I was very wrong, and I don't mind admitting it. XML::XSH is an astonishingly powerful tool that has quickly become part of my daily XML work. I highly recommend it.

Resources