
Learning C# XML
In his opening keynote at the IDEAlliance XML 2001 Conference in Orlando, Florida, in December, James Clark said: "Just because it comes from Microsoft, it's not necessarily bad". With that in mind, I decided to explore what C# has to offer to the Java-XML community.
I've been watching the continuing Microsoft story with a vague combination of intrigue and apprehension. You almost certainly know by now that, due to an awkward combination of hubris and court orders, Microsoft has stopped shipping any Java implementation with Windows, choosing instead to hitch its wagon to a star of its own making, C#.
As a consumer, I'm not sure whether I like Microsoft's business practices. As a software developer, however, I'm interested in learning new languages and technologies. I've read enough to see that C# is enough like Java to make an interesting porting project. Even if I never write another line of C# code, there is certainly a lot to be learned from how Microsoft has integrated XML into its .NET platform.
In this series I'll be porting a few small XML applications, which I've hypothetically written in Java, to C# in order to see if I can improve my Java programming.
The Exercises
The first Java application I'll port to C#, which I call RSSReader, does something that most XML programmers have done at some point: read in an RSS stream using SAX and convert it to HTML. For our purposes, I'll expect to be reading an RSS 1.0 stream using JAXP and writing out an HTML stream using java.io classes. We'll see that this example ports nicely to the C# XmlReader class.
Future examples will convert JDOM code to the C# XmlDocument and XmlNode classes, and will experiment with ports from an XML data-binding framework to ADO.NET. There's a lot to ADO.NET, and I'll discuss some of that as well.
Here's our first Java program, RSSReader, which was adapted from Sun's JAXP Tutorial. I've stripped out some of the error handling and such for the sake of simplicity.
package com.xml;

import java.io.*;
import java.util.Stack;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;

public class RSSReader extends DefaultHandler {

    static private Writer out;
    static String lineEnd = System.getProperty("line.separator");

    Stack stack = new Stack();   // names of the currently open elements
    StringBuffer value = null;   // text of the element being read
    String title = null;
    String link = null;
    String desc = null;

    public static void main(String args []) {
        // create an instance of RSSReader
        DefaultHandler handler = new RSSReader();
        try {
            // Set up output stream
            out = new OutputStreamWriter(System.out, "UTF8");
            // get a SAX parser from the factory
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            // parse the document from the parameter
            saxParser.parse(args[0], handler);
        } catch (Exception t) {
            System.err.println(t.getClass().getName());
            t.printStackTrace(System.err);
        }
    }

    public void startDocument() throws SAXException {
        emit("<html>" + lineEnd);
    }

    public void endDocument() throws SAXException {
        emit("</html>" + lineEnd);
    }

    public void startElement(String namespaceURI, String sName,
                             String qName, Attributes attrs) throws SAXException {
        String eName = sName; // element name
        if ("".equals(eName)) eName = qName; // namespaceAware = false
        stack.push(eName);
        value = new StringBuffer();
    }

    public void endElement(String namespaceURI, String sName, String qName)
            throws SAXException {
        // after the pop, stack.peek() is the parent of the element just closed
        String eName = (String)stack.pop();
        if (eName.equals("title") && stack.peek().equals("channel")) {
            emit(" <head>" + lineEnd);
            emit(" <title>" + value + "</title>" + lineEnd);
            emit(" </head>" + lineEnd);
            emit(" <body>" + lineEnd);
        } else if (eName.equals("title") &&
                   stack.peek().equals("item")) {
            title = null == value ? "" : value.toString();
        } else if (eName.equals("link") &&
                   stack.peek().equals("item")) {
            link = null == value ? "" : value.toString();
        } else if (eName.equals("description") &&
                   stack.peek().equals("item")) {
            desc = null == value ? "" : value.toString();
        } else if (eName.equals("item")) {
            emit(" <p><a href=\"" + link + "\">" +
                 title + "</a><br>" + lineEnd);
            emit(" " + desc + "</p>" + lineEnd);
        } else if (eName.equals("channel")) {
            emit(" </body>" + lineEnd);
        }
        value = null;
    }

    public void characters(char buf [], int offset, int len)
            throws SAXException {
        // value is null between an end tag and the next start tag, so
        // whitespace arriving there must be skipped rather than appended
        if (value != null) {
            value.append(buf, offset, len);
        }
    }

    private static void emit(String s) throws SAXException {
        try {
            out.write(s);
            out.flush();
        } catch (IOException e) {
            throw new SAXException("I/O error", e);
        }
    }
}
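The part of RSSReader most worth understanding before porting it is the element stack: once the closing element has been popped, the top of the stack names its parent, which is how a channel title is told apart from an item title. Here is a minimal sketch of just that idea. The class name TitleContextDemo, the titles helper, and the inline sample feed are all made up for illustration, and the sketch assumes a Java version with generics, which is newer than the listing above.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of RSSReader.
class TitleContextDemo extends DefaultHandler {

    // Made-up, minimal RSS 1.0-style feed used only for this illustration.
    static final String SAMPLE_FEED =
        "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
        + " xmlns=\"http://purl.org/rss/1.0/\">"
        + "<channel><title>Example Feed</title></channel>"
        + "<item><title>First Story</title></item>"
        + "</rdf:RDF>";

    private final Deque<String> open = new ArrayDeque<String>(); // open elements
    private final StringBuilder text = new StringBuilder();      // current text
    final List<String> seen = new ArrayList<String>();           // results

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        open.push(qName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        text.append(buf, offset, len);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop(); // the top of the stack is now the parent element
        if (qName.equals("title")) {
            seen.add(open.peek() + " title: " + text);
        }
        text.setLength(0);
    }

    // Parses an in-memory document and reports each title with its parent.
    static List<String> titles(String xml) throws Exception {
        TitleContextDemo handler = new TitleContextDemo();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.seen;
    }

    public static void main(String[] args) throws Exception {
        for (String line : titles(SAMPLE_FEED)) {
            System.out.println(line);
        }
    }
}
```

Run against the sample feed, this should print "channel title: Example Feed" followed by "item title: First Story". RSSReader itself, of course, reads whatever URL or file is named on the command line.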
Compiling and running this program, we get the following results when we try to read XMLhack.com's RSS 1.0 feed (some lines have been wrapped for legibility):
C:\> java com.xml.RSSReader http://xmlhack.com/rss.php
<html>
 <head>
 <title>xmlhack</title>
 </head>
 <body>
 <p><a href="http://www.xmlhack.com/read.php?item=1511">Activity around the Dublin Core</a><br>
 The Dublin Core Metadata Initiative (DCMI) has seen a recent spate of activity, Recent publications include The Namespace Policy for the Dublin Core Metadata Initiative,
Expressing Simple Dublin Core in RDF/XML,
and Expressing Qualified Dublin Core in RDF/XML.</p>
...
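Before moving to C#, it may help to see what changes in kind. SAX pushes events at our handler, whereas XmlReader is a forward-only reader that the program pulls from in a loop of its own. As a rough Java-side preview of that style, here is a sketch using StAX (javax.xml.stream), a pull API that joined the Java platform after the JAXP release used above. It is my own illustration, not the port itself: the class name PullTitlesSketch, the itemTitles helper, and the sample feed are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration of pull parsing; not the article's code.
class PullTitlesSketch {

    // Made-up, minimal RSS 1.0-style feed used only for this illustration.
    static final String SAMPLE_FEED =
        "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
        + " xmlns=\"http://purl.org/rss/1.0/\">"
        + "<channel><title>Example Feed</title></channel>"
        + "<item><title>First Story</title></item>"
        + "<item><title>Second Story</title></item>"
        + "</rdf:RDF>";

    // Returns the title of every item, one per line.
    static String itemTitles(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
            .createXMLStreamReader(new StringReader(xml));
        StringBuilder out = new StringBuilder();
        boolean inItem = false; // replaces the SAX handler's element stack
        while (reader.hasNext()) {
            int event = reader.next(); // we ask for the next event ourselves
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                if (name.equals("item")) {
                    inItem = true;
                } else if (inItem && name.equals("title")) {
                    // reads the element's text and stops on its end tag
                    out.append(reader.getElementText()).append('\n');
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                       && reader.getLocalName().equals("item")) {
                inItem = false;
            }
        }
        reader.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(itemTitles(SAMPLE_FEED));
    }
}
```

The state that the SAX handler spreads across callbacks and instance fields lives here in local variables inside one loop, which is the shape to expect when reading with XmlReader.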