XML.com: XML From the Inside Out

From JDOM to XmlDocument

April 03, 2002

The Microsoft .NET framework is becoming well known for its integration of XML into nearly all data-manipulation tasks. In the first article in this series, I walked through the process of porting a simple Java application using SAX to one using .NET's XmlReader. I concluded that there are advantages and disadvantages to each language's way of doing things, but pointed out that if you are not a fan of forward-only, event-based XML parsing, neither one will really suit your fancy. This article focuses on porting whole-document XML programs from Java to C#.

The Exercise

I begin with one of those standard small programs that everyone has written at some point to learn about XML. I've written a Java program that uses JDOM to read and write an XML file representing a catalog of compact discs. Of course, this program is of little practical use, given how easy it is to find similar free and open source applications; however, it represents a fairly simple problem domain, and it also allows me to show off the diversity of my CD collection.

So, to begin, here is the source listing for CDCatalog.java. I have not bothered to create any sort of DTD or schema at this time because it's a very simple document, and validation is not necessary.


package com.xml;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;
import org.jdom.output.XMLOutputter;

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.List;

public class CDCatalog {

    File file = null;
    Document document = null;

    public static void main(String args[]) {
        // both the file name (args[0]) and the action (args[1]) are required
        if (args.length > 1) {
            String xmlFile = args[0];
            CDCatalog catalog = new CDCatalog(xmlFile);

            String action = args[1];

            try {
                if (args.length == 5) {
                    String title = args[2];
                    String artist = args[3];
                    String label = args[4];
                    if (action.equals("add")) {
                        catalog.addCD(title, artist, 
                            label);
                    } else if (action.equals("delete")) {
                        catalog.deleteCD(title, 
                            artist, label);
                    }
                }

                // save the changed catalog
                catalog.save();
            } catch (Exception e) {
                e.printStackTrace(System.err);
            }
        }
    }

    public CDCatalog(String fileName) {
        try {
            file = new File(fileName);
            if (file.exists()) {
                loadDocument(file);
            } else {
                createDocument();
            }
        } catch (Exception e) {
            e.printStackTrace(System.err);
        }
    }

    private void loadDocument(File file) 
    throws JDOMException {
        SAXBuilder builder = new SAXBuilder();
        document = builder.build(file);
    }

    private void createDocument() {
        Element root = new Element("CDCatalog");
        document = new Document(root);
    }

    public void addCD(String title, String artist, 
        String label) {
        Element cd = new Element("CD");
        cd.setAttribute("title",title);
        cd.setAttribute("artist",artist);
        cd.setAttribute("label",label);

        document.getRootElement().getChildren().add(cd);
    }

    public void deleteCD(String title, String artist,  
        String label) {
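        List cds = document.getRootElement().getChildren();
        // Walk the live list backwards so that detaching an element
        // does not shift the index of the ones still to be checked.
        for (int i = cds.size() - 1; i >= 0; i--) {
            Element next = (Element)cds.get(i);
            // getAttributeValue() returns null for a missing attribute;
            // comparing from the argument side avoids a NullPointerException.
            if (title.equals(next.getAttributeValue("title")) &&
                artist.equals(next.getAttributeValue("artist")) &&
                label.equals(next.getAttributeValue("label"))) {
                next.detach();
            }
        }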
    }

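    public void save() throws IOException {
        XMLOutputter outputter = new XMLOutputter();
        // Write bytes rather than characters so that the outputter itself
        // encodes the document as UTF-8, matching the XML declaration;
        // a FileWriter would use the platform default encoding instead.
        java.io.FileOutputStream out = new java.io.FileOutputStream(file);
        try {
            outputter.output(document, out);
        } finally {
            out.close();
        }
    }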
}

I deliberately used JDOM for this program rather than DOM; since the target of our port is a real DOM, I wanted to add the twist of starting off with an API that, while DOM-like, is not really DOM.
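To make that difference concrete, here is a rough sketch of how the createDocument() and addCD() steps might look against the W3C DOM that ships with the JDK. The class name DomCatalogSketch and its static helper methods are my own illustration, not part of CDCatalog.java. Note that in DOM you ask the document to create each node, whereas JDOM lets you construct an Element on its own.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical W3C DOM counterpart to parts of CDCatalog (illustration only).
class DomCatalogSketch {

    // Equivalent of createDocument(): an empty document with a CDCatalog root.
    static Document createDocument() {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            document.appendChild(document.createElement("CDCatalog"));
            return document;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Equivalent of addCD(): the document is the factory for its own nodes.
    static Element addCD(Document document, String title, String artist,
        String label) {
        Element cd = document.createElement("CD");
        cd.setAttribute("title", title);
        cd.setAttribute("artist", artist);
        cd.setAttribute("label", label);
        document.getDocumentElement().appendChild(cd);
        return cd;
    }

    public static void main(String[] args) {
        Document document = createDocument();
        addCD(document, "Dummy", "Portishead", "Go!");
        // report how many child nodes the root now has
        System.out.println(
            document.getDocumentElement().getChildNodes().getLength());
    }
}
```

The shape is close to the JDOM version, but the children come back as a NodeList rather than a java.util.List, which is one of the differences a port to a real DOM has to absorb.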

Compiling and running it a few times, we get this XML file (reformatted to be more human-readable):


C:\>java com.xml.CDCatalog CDCatalog.xml add "Dummy" 
  "Portishead" "Go!"

<?xml version="1.0" encoding="UTF-8"?>
<CDCatalog>
  <CD title="Dummy" artist="Portishead" label="Go!" />
  <CD title="Caribe Atomico" artist="Aterciopelados" 
    label="BMG" />
  <CD title="New Favorite" artist="Alison Kraus + 
    Union Station" label="Rounder" />
  <CD title="Soon As I'm On Top Of Things" artist=
    "Zoe Mulford" label="MP3.com" />
  <CD title="Japanese Melodies" artist="Yo-Yo Ma" 
    label="CBS" />
  <CD title="In This House, On This Morning" artist=
    "Wynton Marsalis Septet" label="Columbia" />
</CDCatalog>

Everything's a Node

XmlDocument is the .NET object that presents an XML document as a tree. Much like JDOM's Document class, XmlDocument gives you random access to any node in the XML tree. Unlike Document, however, XmlDocument is itself a subclass of XmlNode. I didn't talk about XmlNode in the previous article, so let's take a look at it and some of its members now. If you are already familiar with DOM, this will explain how .NET implements it; if not, this should serve as a basic introduction, with the caveat that we're talking specifically about the .NET implementation.

The System.Xml assembly is somewhat monolithic; everything you might find in an XML document is a subclass of XmlNode. Besides XmlDocument, this includes the document type (XmlDocumentType), elements (XmlElement), attributes (XmlAttribute), CDATA (XmlCDataSection), even the humble entity reference (XmlEntityReference). And, as you can tell, the object names are very descriptive.

As an experienced object-oriented developer, you know that this makes for some very nicely polymorphic code. All of XmlNode's subclasses inherit, override, or overload several important members. Among the properties that you can access are the following.

  • Name - the name of the node
  • NamespaceURI - the namespace of the node
  • NodeType - the type of the node
  • OwnerDocument - the document in which the node appears
  • Prefix - the namespace prefix of the node
  • Value - the value of the node
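If you are coming from Java, these members have close counterparts on the W3C DOM's org.w3c.dom.Node interface, exposed as getter methods rather than properties. The sketch below is only a point of comparison, not .NET code; the class name NodePropertiesSketch and the urn:example:cds namespace are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper showing the Java DOM getters that line up with
// XmlNode's Name, NamespaceURI, NodeType, Prefix, and Value.
class NodePropertiesSketch {

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();
            // without this, namespace URIs and prefixes are not reported
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Works for any kind of node, because every node type implements Node.
    static String describe(Node node) {
        return "name=" + node.getNodeName()
            + " namespace=" + node.getNamespaceURI()
            + " type=" + node.getNodeType()
            + " prefix=" + node.getPrefix()
            + " value=" + node.getNodeValue();
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<c:CD xmlns:c=\"urn:example:cds\" title=\"Dummy\"/>");
        Node cd = doc.getDocumentElement();
        Node title = cd.getAttributes().getNamedItem("title");
        System.out.println(describe(cd));
        System.out.println(describe(title));
        // the counterpart of OwnerDocument
        System.out.println(title.getOwnerDocument() == doc);
    }
}
```

The same describe() method accepts an element and an attribute alike, which is the kind of polymorphism a common node base type buys you.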

I said all of these members are properties. It may at first seem like a violation of object-oriented encapsulation, but you may access any of these properties directly if you want to change, for example, the value of a node. Once you realize that getting or setting a C# property actually involves calling an implicit accessor method, however, any objections to this technique should disappear. These accessor methods are generated automatically through the use of a special syntax, which you'll see if you look up XmlNode.Value, for example, in the .NET framework SDK reference:

public virtual string Value {get; set;}

An explanation of the mechanism at work here is beyond the scope of this article; suffice it to say, it works.

There are exceptions to this willy-nilly access to C# properties, of course; you cannot set the NodeType of an XmlNode, because it already is what it is. Also, you can set some properties of some node types and not others; for example, while you can set the Value of an XmlAttribute, attempting to set the Value of an XmlElement will cause an InvalidOperationException to be thrown, because an element cannot have a value (though it can certainly have child elements, attributes, and other types of nodes).
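Java's W3C DOM draws the same line between nodes that carry a value and nodes that do not, although it reacts differently: the DOM specification says that setting the value of a node whose value is defined to be null has no effect, so no exception is raised. A small sketch for comparison follows; the class NodeValueSketch and its method names are made up for this example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical comparison: attribute values are writable, element values
// are not (illustration only, using the JDK's W3C DOM).
class NodeValueSketch {

    // A detached CD element with a single title attribute.
    static Element newCd(String title) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element cd = doc.createElement("CD");
            cd.setAttribute("title", title);
            return cd;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Setting an attribute's value works and is visible through the element.
    static String renameTitle(Element cd, String newTitle) {
        Attr title = cd.getAttributeNode("title");
        title.setValue(newTitle);
        return cd.getAttribute("title");
    }

    // Setting an element's value is ignored; the value stays null.
    static String tryToSetElementValue(Element cd, String value) {
        cd.setNodeValue(value);
        return cd.getNodeValue();
    }

    public static void main(String[] args) {
        Element cd = newCd("Dummy");
        System.out.println(renameTitle(cd, "Dummy (remastered)"));
        System.out.println(tryToSetElementValue(cd, "ignored"));
    }
}
```

Whether a silent no-op or an exception is the friendlier behavior is a matter of taste; the point is that in both APIs the kind of node determines which members are meaningful.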

Additionally, some nodes may be marked as read-only. You can check the IsReadOnly property of any XmlNode to verify whether it can be changed. Setting some properties of a read-only node will cause an ArgumentException to be thrown. In those cases where the node is read-only, all you can really do is remove it from one location in the tree and insert it elsewhere.

XmlNodes have methods as well as properties. Some of the ones you'll use a lot are AppendChild() and InsertAfter(), both of whose names are fairly descriptive.
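For comparison, the Java W3C DOM offers appendChild() and insertBefore() but no insertAfter(). A Java programmer can get the same effect with a one-line helper, sketched below under that assumption; InsertAfterSketch and its methods are hypothetical names, and the argument order (new node first, reference node second) simply mirrors InsertAfter().

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper emulating an insert-after operation in the Java DOM.
class InsertAfterSketch {

    static Document newCatalog() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("CDCatalog"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    static Element cd(Document doc, String title) {
        Element cd = doc.createElement("CD");
        cd.setAttribute("title", title);
        return cd;
    }

    // insertBefore() with a null reference node appends, so this also
    // works when refChild is the last child of its parent.
    static Node insertAfter(Node newChild, Node refChild) {
        return refChild.getParentNode()
            .insertBefore(newChild, refChild.getNextSibling());
    }

    // Titles of the root's CD children, in document order.
    static List<String> titles(Document doc) {
        List<String> result = new ArrayList<String>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            result.add(((Element) children.item(i)).getAttribute("title"));
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = newCatalog();
        Element root = doc.getDocumentElement();
        Element first = cd(doc, "Dummy");
        root.appendChild(first);
        root.appendChild(cd(doc, "New Favorite"));
        insertAfter(cd(doc, "Caribe Atomico"), first);
        System.out.println(titles(doc));
    }
}
```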

That's enough description for now; let's dive into the code.
