XML.com: XML From the Inside Out

Getting Started with XOM
by Michael Fitzgerald

Serializing Output

You can use the Serializer class to encode output, format it, or send it to a file, among other things. The Time.java program shows you how to do this.


import java.io.FileOutputStream;
import java.io.IOException;
import nu.xom.Attribute;
import nu.xom.Builder;
import nu.xom.Element;
import nu.xom.Document;
import nu.xom.Serializer;
import nu.xom.ParseException;

public class Time {

    public static void main(String[] args)
        throws IOException, ParseException {

        Builder builder = new Builder();
        Document doc = builder.build("inst.xml");
        Element root = doc.getRootElement();
        Element utc = new Element("utc");
        Attribute att = new Attribute("offset", "-08:00");
        utc.addAttribute(att);
        root.insertChild(0, "\n ");
        root.insertChild(1, utc);
        root.removeChild(4);
        root.removeChild(4);

        Element time = new Element("time");    
        Element hr = new Element("hour");
        time.appendChild(hr);
        hr.appendChild("10");
        Element min = new Element("minute");
        time.appendChild(min);
        min.appendChild("17");
        Element sec = new Element("second");
        time.appendChild(sec);
        sec.appendChild("33");
        Element zone = new Element("zone");
        time.appendChild(zone);
        zone.appendChild("PST");
        root.appendChild(time);

        FileOutputStream out = new FileOutputStream("inst-new.xml");
        Serializer ser = new Serializer(out, "ISO-8859-1");
        ser.setIndent(1);
        ser.write(doc);

    }

}

The program creates a time element with four children (hour, minute, second, and zone), five elements in all, and appends time after the last remaining child of instant, which happens to be the date element (the old time element having been removed). It also imports FileOutputStream and uses it to create the output file, inst-new.xml. The Serializer constructor takes an output stream and a character encoding (here ISO-8859-1). Serializer also supports UTF-8, UTF-16, ISO-10646-UCS-2, and ISO-8859-2 through ISO-8859-16. The setIndent() method puts each child element on its own line, indented by one space per level of nesting. The write() method writes the document to the file inst-new.xml:


<?xml version="1.0" encoding="ISO-8859-1"?>
<instant> 
 <utc offset="-08:00"/> 
 <date month="December" day="1" year="2002"/> 
 <time>
  <hour>10</hour>
  <minute>17</minute>
  <second>33</second>
  <zone>PST</zone>
 </time>
</instant>
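
To experiment with the encoding and indent settings without an input file, a sketch like the following can help. It assumes the XOM 1.x API; the class name IndentSketch and the serialize() helper are made up for illustration, and the document is built in memory rather than read from inst.xml.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import nu.xom.Document;
import nu.xom.Element;
import nu.xom.Serializer;

public class IndentSketch {

    // Hypothetical helper: serializes a small hand-built document
    // using the given character encoding and indent width.
    static String serialize(String encoding, int indent) throws IOException {
        Element root = new Element("instant");
        Element time = new Element("time");
        Element hr = new Element("hour");
        hr.appendChild("10");
        time.appendChild(hr);
        root.appendChild(time);
        Document doc = new Document(root);

        // Capture the bytes in memory instead of writing a file.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Serializer ser = new Serializer(out, encoding);
        ser.setIndent(indent);
        ser.write(doc);
        return out.toString(encoding);
    }

    public static void main(String[] args) throws IOException {
        System.out.print(serialize("ISO-8859-1", 1));
    }
}
```

Changing the arguments to serialize() is a quick way to compare, say, an indent of 1 with an indent of 3, or to see the encoding name change in the XML declaration.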

Without Serializer, the output of the new elements would appear without indentation, as in time2.xml (see Time2.java):


<?xml version="1.0"?>
<instant>
 <utc offset="-08:00" />
 <date month="December" day="1" year="2002" />
<time><hour>10</hour><minute>17</minute><second>33</second>
<zone>PST</zone></time></instant>
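
Time2.java is not reproduced here, so how it writes its output is not known from this page. One plausible route, assumed for this sketch, is the toXML() method that XOM 1.x provides on Document and Element, which returns the markup as a string without adding any indentation. The class name ToXmlSketch is hypothetical.

```java
import nu.xom.Document;
import nu.xom.Element;

public class ToXmlSketch {

    // Builds <instant><time><hour>10</hour></time></instant> in memory.
    static Document sample() {
        Element root = new Element("instant");
        Element time = new Element("time");
        Element hr = new Element("hour");
        hr.appendChild("10");
        time.appendChild(hr);
        root.appendChild(time);
        return new Document(root);
    }

    public static void main(String[] args) {
        // No Serializer involved: new elements run together on one line.
        System.out.print(sample().toXML());
    }
}
```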

You could also send the new XML document to standard output instead of a file (see the Serializer constructor in Time3.java).
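
Because the Serializer constructor accepts any OutputStream, switching to standard output mostly comes down to passing System.out. The sketch below assumes the XOM 1.x API and may differ from what Time3.java actually does; StdoutSketch, sample(), and writeTo() are names invented for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import nu.xom.Attribute;
import nu.xom.Document;
import nu.xom.Element;
import nu.xom.Serializer;

public class StdoutSketch {

    // Builds a tiny document: <instant><utc offset="-08:00"/></instant>
    static Document sample() {
        Element root = new Element("instant");
        Element utc = new Element("utc");
        utc.addAttribute(new Attribute("offset", "-08:00"));
        root.appendChild(utc);
        return new Document(root);
    }

    // Writes doc to any stream; pass System.out to print to the console.
    static void writeTo(Document doc, OutputStream out) throws IOException {
        Serializer ser = new Serializer(out, "ISO-8859-1");
        ser.setIndent(1);
        ser.write(doc);
        ser.flush();
    }

    public static void main(String[] args) throws IOException {
        writeTo(sample(), System.out);
    }
}
```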

One More Program

This last program, Final.java, adds several other common structures to the XML document:


import java.io.FileOutputStream;
import java.io.IOException;
import nu.xom.Attribute;
import nu.xom.Builder;
import nu.xom.Comment;
import nu.xom.DocType;
import nu.xom.Element;
import nu.xom.Document;
import nu.xom.Serializer;
import nu.xom.Text;
import nu.xom.ProcessingInstruction;
import nu.xom.ParseException;

public class Final {

    public static void main(String[] args)
        throws IOException, ParseException {

        Builder builder = new Builder();
        Document doc = builder.build("inst.xml");
        Element root = doc.getRootElement();

        DocType dtd = new DocType("instant", "final.dtd");
        ProcessingInstruction pi =
            new ProcessingInstruction("xml-stylesheet",
                "href=\"final.xsl\" type=\"text/xsl\"");
        doc.insertChild(0, dtd);
        doc.insertChild(1, pi);

        Element utc = new Element("utc", "http://www.wyeast.net/utc");
        Comment gmt = new Comment(" Greenwich Mean Time ");
        Attribute att = new Attribute("offset", "-08:00");
        utc.addAttribute(att);
        root.insertChild(0, "\n ");
        root.insertChild(1, gmt);
        root.insertChild(2, "\n ");
        root.insertChild(3, utc);
        root.removeChild(6);
        root.removeChild(6);

        Element time = new Element("time");    
        Element hr = new Element("hour");
        time.appendChild(hr);
        Text h = new Text("11");
        h.setData("10");
        hr.appendChild(h);
        Element min = new Element("minute");
        time.appendChild(min);
        min.appendChild("17");
        Element sec = new Element("second");
        time.appendChild(sec);
        sec.appendChild("33");
        Element zone = new Element("zone", "urn:wyeast-net:utc");
        zone.setNamespaceURI("http://www.wyeast.net/utc");
        time.appendChild(zone);
        zone.appendChild("PST");
        root.appendChild(time);

        FileOutputStream out = new FileOutputStream("final.xml");
        Serializer ser = new Serializer(out, "UTF-8");
        ser.setIndent(3);
        ser.write(doc);

    } 
}

This program creates both a document type declaration and a processing instruction, then inserts them into the prolog of final.xml. A namespace is declared for the utc element, and a comment is inserted just above it. The content of the hour element is created as a Text node and then changed with the setData() method of Text. The zone element is given one namespace URI in its Element constructor, which is then replaced with the setNamespaceURI() method.

Here is the file that this program outputs:


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE instant SYSTEM "final.dtd">
<?xml-stylesheet href="final.xsl" type="text/xsl"?>
<instant>
   <!-- Greenwich Mean Time -->
   <utc xmlns="http://www.wyeast.net/utc" offset="-08:00"/>
   <date month="December" day="1" year="2002"/> 
   <time>
      <hour>10</hour>
      <minute>17</minute>
      <second>33</second>
      <zone xmlns="http://www.wyeast.net/utc">PST</zone>
   </time>
</instant>
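
The individual steps in Final.java can be tried in isolation. The following is a minimal sketch assuming the XOM 1.x API, which differs in places from the listing above: insertChild() takes the node first and the position second, and the text of a Text node is changed with setValue() rather than setData(). The class and method names (FinalSketch, hour(), zone(), withProlog()) are made up for this example.

```java
import nu.xom.DocType;
import nu.xom.Document;
import nu.xom.Element;
import nu.xom.ProcessingInstruction;
import nu.xom.Text;

public class FinalSketch {

    // Builds <hour> from a Text node, then changes the text in place.
    static Element hour() {
        Element hr = new Element("hour");
        Text h = new Text("11");
        h.setValue("10");   // XOM 1.x name; the listing above calls setData()
        hr.appendChild(h);
        return hr;
    }

    // Creates <zone> in one namespace, then moves it to another.
    static Element zone() {
        Element zone = new Element("zone", "urn:wyeast-net:utc");
        zone.setNamespaceURI("http://www.wyeast.net/utc");
        zone.appendChild("PST");
        return zone;
    }

    // Puts a DOCTYPE and a processing instruction ahead of the root element.
    static Document withProlog() {
        Document doc = new Document(new Element("instant"));
        doc.insertChild(new DocType("instant", "final.dtd"), 0);
        doc.insertChild(new ProcessingInstruction("xml-stylesheet",
            "href=\"final.xsl\" type=\"text/xsl\""), 1);
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(hour().toXML());
        System.out.println(zone().toXML());
        System.out.println(withProlog().toXML());
    }
}
```

Only the second namespace URI survives on zone, which matches the xmlns value shown in final.xml.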

Wrapping Up

It's worth noting that XOM avoids convenience methods like the plague, but it is flexible enough to allow users to write their own methods in subclasses. XOM also has several other packages that I haven't discussed in this article: nu.xom.canonical, a serializer for outputting canonical XML; nu.xom.xslt, which supports XSLT transformations with TrAX-aware processors such as Saxon; and nu.xom.xinclude, an implementation of XML Inclusions.
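
As an illustration of writing your own methods in a subclass, here is a minimal sketch assuming the XOM 1.x API. ConvenientElement and its appendTextElement() method are hypothetical and not part of XOM; a fuller subclass would probably also need to deal with copying and with how Builder creates elements.

```java
import nu.xom.Element;

public class ConvenientElement extends Element {

    public ConvenientElement(String name) {
        super(name);
    }

    // Hypothetical convenience method: creates <name>text</name>,
    // appends it to this element, and returns the new child.
    public Element appendTextElement(String name, String text) {
        Element child = new Element(name);
        child.appendChild(text);
        appendChild(child);
        return child;
    }

    public static void main(String[] args) {
        ConvenientElement time = new ConvenientElement("time");
        time.appendTextElement("hour", "10");
        time.appendTextElement("minute", "17");
        System.out.println(time.toXML());
    }
}
```

This collapses the three-line create, append, and fill pattern used for hour, minute, second, and zone in the programs above into one call per child.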

I've found XOM to be simple and straightforward. It offers me a lot of functionality without much fuss. If you have any suggestions for XOM's development, contribute them by subscribing to the XOM-interest mailing list.


