org.brownell.xml
Class DomBuilder

java.lang.Object
  |
  +--org.brownell.xml.DomBuilder

public class DomBuilder
extends java.lang.Object

Builds a DOM document from the output of a SAX 2.0 (or SAX 1.0) parser, using a defaulted or specified DOM implementation and parser. For example, a validating XML parser can be connected with an XML DOM; or an HTML parser could be connected to an HTML (or non-HTML) DOM.

Note that some of the information exposed through DOM is not of general interest, and does not truly relate to the core semantic model of XML as consisting of elements, attributes, text, and processing instructions. By default, this builder only exposes those core node types, and will not create any "extra" nodes, such as those for comments or ignorable whitespace. This behavior may be changed by using the setSavingExtraNodes() method. The most useful of such nodes are probably comments, which are used in some legacy environments, such as HTML/XHTML, to wrap content such as inlined CSS style directives or scripting code.
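The distinction between core and "extra" nodes can be sketched with nothing but the SAX2 and DOM classes that ship with the JDK. The following is a minimal illustration, not the DomBuilder source: the class name SaxToDomSketch and its savingExtraNodes field are hypothetical, and it assumes a SAX2 XMLReader that supports the standard lexical-handler property.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical sketch, not the DomBuilder source: turns SAX2 events into DOM nodes.
final class SaxToDomSketch extends DefaultHandler2 {
    private final Document doc;
    private final boolean savingExtraNodes;               // assumed analogue of setSavingExtraNodes()
    private final Deque<Node> open = new ArrayDeque<>();  // innermost open node is on top
    private boolean inDtd;                                // comments inside the DTD are never saved

    SaxToDomSketch(Document doc, boolean savingExtraNodes) {
        this.doc = doc;
        this.savingExtraNodes = savingExtraNodes;
        open.push(doc);
    }

    // Core node types: elements, attributes, text, processing instructions.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Element element = doc.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            element.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        open.peek().appendChild(element);
        open.push(element);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may split one run of text across calls, so adjacent text nodes can result.
        if (length > 0 && open.size() > 1) {
            open.peek().appendChild(doc.createTextNode(new String(ch, start, length)));
        }
    }

    @Override
    public void processingInstruction(String target, String data) {
        open.peek().appendChild(doc.createProcessingInstruction(target, data));
    }

    // "Extra" node types: only saved when the flag is set.
    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        if (savingExtraNodes && length > 0 && open.size() > 1) {
            open.peek().appendChild(doc.createTextNode(new String(ch, start, length)));
        }
    }

    @Override
    public void comment(char[] ch, int start, int length) {
        if (savingExtraNodes && !inDtd) {
            open.peek().appendChild(doc.createComment(new String(ch, start, length)));
        }
    }

    @Override
    public void startDTD(String name, String publicId, String systemId) {
        inDtd = true;
    }

    @Override
    public void endDTD() {
        inDtd = false;
    }

    // Parses a string with the JDK's default SAX2 parser and DOM implementation.
    static Document parse(String xml, boolean savingExtraNodes) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        SaxToDomSketch handler = new SaxToDomSketch(doc, savingExtraNodes);
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return doc;
    }
}
```

With the flag off, a comment in the input leaves no trace in the tree; with it on, the comment appears as a Comment node in document order. CDATA sections are not distinguished in this sketch and always arrive as ordinary text.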

As a rule, if you ignore "extra" nodes and all of the incomplete DOM Level 1 DTD functionality, and provide a SAX2 parser, the main portability issue your code may have is that some nonvalidating SAX parsers will report ignorable whitespace characters as normal character data.
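To see how whitespace is reported, the two SAX callbacks can be counted separately. This is a small diagnostic sketch using the JDK's default nonvalidating parser; the class name WhitespaceReportSketch and its fields are hypothetical and are not part of this package.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical diagnostic: counts whitespace by the callback that delivered it.
final class WhitespaceReportSketch extends DefaultHandler {
    int ignorable;       // characters delivered through ignorableWhitespace()
    int whitespaceText;  // whitespace characters delivered through characters()

    @Override
    public void characters(char[] ch, int start, int length) {
        for (int i = start; i < start + length; i++) {
            if (Character.isWhitespace(ch[i])) {
                whitespaceText++;
            }
        }
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        ignorable += length;
    }

    // Parses the string with a nonvalidating parser and returns the counts.
    static WhitespaceReportSketch scan(String xml) throws Exception {
        WhitespaceReportSketch report = new WhitespaceReportSketch();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(report);
        reader.parse(new InputSource(new StringReader(xml)));
        return report;
    }
}
```

When a document has no DTD, the parser cannot know which whitespace is ignorable, so indentation between elements always arrives through characters(). Code that must behave the same across parsers may therefore need to treat whitespace-only text in element content as potentially ignorable.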


The DOM implementation class used is specified either as a parameter to a construction method, or is derived from the value of the org.brownell.xml.DomBuilder.Document system property. That class must provide a default constructor, which creates an object conforming to the DOM Level 1 core "Document" API.
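The lookup described above can be sketched with plain reflection. This is an assumption about the general technique, not the DomBuilder source: the class DocumentLoaderSketch and its methods are hypothetical, and only the system property name comes from the text above. It uses the current reflection API, which can also raise NoSuchMethodException when no default constructor exists.

```java
import java.util.Objects;
import org.w3c.dom.Document;

// Hypothetical sketch of property-driven loading; not the DomBuilder source.
final class DocumentLoaderSketch {
    // Property name as given in the class description.
    static final String PROPERTY = "org.brownell.xml.DomBuilder.Document";

    // Loads the named class and instantiates it through its default constructor.
    static Document newDocument(String className) throws ReflectiveOperationException {
        Objects.requireNonNull(className, "DOM document class name"); // NullPointerException
        Class<?> type = Class.forName(className);                      // ClassNotFoundException
        Object instance = type.getDeclaredConstructor().newInstance(); // InstantiationException,
                                                                       // IllegalAccessException, ...
        return (Document) instance;                                    // ClassCastException
    }

    // Uses the system property when no class name is passed explicitly.
    static Document newDefaultDocument() throws ReflectiveOperationException {
        return newDocument(System.getProperty(PROPERTY));
    }

    // Reports which failure mode a given class name triggers, or "ok".
    static String describeFailure(String className) {
        try {
            newDocument(className);
            return "ok";
        } catch (ReflectiveOperationException | RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }
}
```

A class name that is missing from the class path fails at the Class.forName step, while a class that loads and instantiates but is not a DOM Document (java.lang.Object, for instance) fails only at the final cast.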

The SAX parser used is either provided as a constructor parameter, or is obtained from the standard SAX ParserFactory. In the latter case, that parser should be an XML parser, for reasons of portability. The parser used should be a SAX 2.0 parser; see below for details of how the SAX 1.0 API does not provide information needed to support DOM correctly (it's more than just hiding data for "extra" nodes).

By providing a parser directly, the caller can ensure that it has been properly configured. For example, it might be set up to validate. The caller may set up the ErrorHandler, Locale, and EntityResolver. This builder will assign the SAX 1.0 Document and DTD handlers of the parser, and for a SAX 2.0 parser will also assign Lexical and Decl handlers if those are supported by the parser. Other handlers, except for the error handler and the entity resolver, are reserved for future use by this builder.
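The kind of configuration a caller might do before handing a parser over can be sketched as follows. This is illustrative only: it uses the SAX2 XMLReader that ships with the JDK rather than a SAX 1.0 Parser (the SAX 1.0 interface offers setErrorHandler and setEntityResolver under the same names, plus setLocale), and the class ConfiguredParserSketch and its helpers are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical sketch: a parser set up to validate, with caller-owned handlers.
final class ConfiguredParserSketch {

    // Records recoverable errors (validity errors included) and rethrows fatal ones.
    static final class CollectingErrorHandler implements ErrorHandler {
        final List<String> errors = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            // Warnings are ignored in this sketch.
        }

        @Override
        public void error(SAXParseException e) {
            errors.add(e.getMessage());
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXParseException {
            throw e;
        }
    }

    // Returns a validating reader with the error handler and entity resolver installed.
    static XMLReader newValidatingReader(ErrorHandler errorHandler) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setErrorHandler(errorHandler);
        // Resolve every external entity to empty content instead of fetching it.
        reader.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        return reader;
    }

    // Parses the string and returns the validity errors that were reported.
    static List<String> validate(String xml) throws Exception {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        newValidatingReader(handler).parse(new InputSource(new StringReader(xml)));
        return handler.errors;
    }
}
```

The content and DTD handlers are deliberately left unset here, since the text above says the builder assigns those itself.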


Because of missing functionality in the DOM Level 1 APIs, the following DTD-related functionality can't be supported by software, such as this builder, which does not use proprietary extensions to the DOM APIs.

Some other functionality is not available through SAX, even using the SAX2 parser APIs, and so can't be provided here. Some other functionality (some affecting correctness of DOM data models) is only available in those SAX2 APIs, and so can't be provided when a SAX1 parser is in use.


Using proprietary DOM builder APIs may well be faster than using this class and public APIs, because those proprietary APIs can eliminate some of the conversion costs (e.g. character arrays to strings, and often back again to character arrays) incurred by this class, as well as avoid repeating certain integrity tests (such as ensuring that names are legal XML 1.0 names).

At this time, only XML 1.0 conformance is relied on; documents using XML Namespaces are treated just like any other XML 1.0 document, and no checks are made for conformance with the optional namespace specification. Such additional checks would include: rejecting entity, notation, and processing instruction names with colons; rejecting ID, IDREF(S), NOTATION, and ENTITY(IES) attribute values with colons; rejecting certain legal XML 1.0 names, such as ::: or :foo:bar; requiring namespace prefixes to be declared; and ensuring that two values for one attribute aren't provided (using different namespace prefixes).
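Two of the colon-related checks listed above are simple enough to sketch. These helpers are hypothetical and are not provided by this class; they look only at where colons appear and do not verify that the remaining characters are legal name characters.

```java
// Hypothetical helpers for the colon-related namespace checks; not part of DomBuilder.
final class NamespaceNameChecks {

    // Entity, notation, and processing instruction names, and ID-like attribute
    // values, must not contain a colon under the namespace specification.
    static boolean isColonFree(String name) {
        return name.indexOf(':') < 0;
    }

    // Element and attribute names must look like "local" or "prefix:local",
    // with exactly zero or one colon and no empty part.
    static boolean hasQualifiedNameShape(String name) {
        int colon = name.indexOf(':');
        if (colon < 0) {
            return !name.isEmpty();
        }
        boolean prefixPresent = colon > 0;
        boolean localPartPresent = colon < name.length() - 1;
        boolean singleColon = name.indexOf(':', colon + 1) < 0;
        return prefixPresent && localPartPresent && singleColon;
    }
}
```

Both ::: and :foo:bar fail the shape check even though they are legal XML 1.0 names. The remaining checks (declared prefixes, and duplicate attributes under different prefixes) need the in-scope namespace declarations and are not covered by this sketch.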

Version:
1.0 (30 June 1999)
Author:
David Brownell (db@post.harvard.edu)

Constructor Summary
DomBuilder()
          Constructs a builder using the default DOM document class and the default SAX parser.
DomBuilder(org.xml.sax.Parser parser)
          Constructs a builder using the default DOM document class and the specified SAX parser.
DomBuilder(java.lang.String DOMDocumentClassName)
          Constructs a builder using the specified DOM document class and the default SAX parser.
DomBuilder(java.lang.String DOMDocumentClassName, org.xml.sax.Parser parser)
          Constructs a builder using the specified DOM document class and the specified SAX (2.0 or 1.0) parser.
 
Method Summary
static org.w3c.dom.Document createDocument(org.xml.sax.InputSource input)
          Convenience routine, which uses the default DOM and parser (as described above) to parse the specified (XML) document into a DOM document tree.
static org.w3c.dom.Document createDocument(java.lang.String uri)
          Convenience routine, which uses the default DOM and parser (as described above) to parse the specified (XML) document into a DOM document tree.
 boolean isSavingExtraNodes()
          Returns true if the builder is saving "extra" nodes, and false (the default) otherwise.
 org.w3c.dom.Document parse(org.xml.sax.InputSource input)
          Parses the document provided, returning its contents as a DOM document.
 org.w3c.dom.Document parse(java.lang.String uri)
          Parses the specified document, returning its contents as a DOM document.
 void setSavingExtraNodes(boolean flag)
          Controls whether the builder will save "extra" nodes.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DomBuilder

public DomBuilder()
           throws java.lang.ClassNotFoundException,
                  java.lang.NullPointerException,
                  java.lang.IllegalAccessException,
                  java.lang.InstantiationException,
                  java.lang.ClassCastException
Constructs a builder using the default DOM document class and the default SAX parser.
Throws:
java.lang.ClassNotFoundException - The default DOM implementation or SAX parser class could not be found. Check your class path and the system property values.
java.lang.NullPointerException - The default DOM implementation or SAX parser name was null. Check the system property values.
java.lang.IllegalAccessException - The default DOM implementation or SAX parser class, or its default constructor, is not accessible (for example, it is not public).
java.lang.InstantiationException - The DOM implementation class or SAX parser class was successfully loaded, but could not be instantiated. Perhaps it has no default constructor.
java.lang.ClassCastException - The default DOM implementation class was loaded and instantiated, but it does not implement the DOM Document interface. Or the default SAX parser class was loaded and instantiated, but it does not implement the SAX 1.0 parser interface.

DomBuilder

public DomBuilder(java.lang.String DOMDocumentClassName)
           throws java.lang.ClassNotFoundException,
                  java.lang.NullPointerException,
                  java.lang.IllegalAccessException,
                  java.lang.InstantiationException,
                  java.lang.ClassCastException
Constructs a builder using the specified DOM document class and the default SAX parser.
Parameters:
DOMDocumentClassName - The name of the class implementing the kind of DOM document to be returned.
Throws:
java.lang.ClassNotFoundException - The specified DOM implementation or default SAX parser class could not be found. Check your class path and the values of those strings.
java.lang.NullPointerException - The specified DOM implementation or default SAX parser name was null. Check the values of those strings.
java.lang.IllegalAccessException - The specified DOM implementation or default SAX parser class, or its default constructor, is not accessible (for example, it is not public).
java.lang.InstantiationException - The DOM implementation class or SAX parser class was successfully loaded, but could not be instantiated. Perhaps it has no default constructor.
java.lang.ClassCastException - The specified DOM implementation class was loaded and instantiated, but it does not implement the DOM Document interface. Or the default SAX parser class was loaded and instantiated, but it does not implement the SAX 1.0 parser interface.

DomBuilder

public DomBuilder(org.xml.sax.Parser parser)
           throws java.lang.ClassNotFoundException,
                  java.lang.NullPointerException,
                  java.lang.IllegalAccessException,
                  java.lang.InstantiationException,
                  java.lang.ClassCastException
Constructs a builder using the default DOM document class and the specified SAX parser.
Parameters:
parser - The SAX parser to be used; it may be partially configured, as described above.
Throws:
java.lang.ClassNotFoundException - The default DOM implementation class could not be found. Check your class path and the appropriate system property.
java.lang.NullPointerException - The default DOM implementation class name was null. Check the appropriate system property value.
java.lang.IllegalAccessException - The default DOM implementation class, or its default constructor, is not accessible (for example, it is not public).
java.lang.InstantiationException - The DOM implementation class was successfully loaded, but could not be instantiated. Perhaps it has no default constructor.
java.lang.ClassCastException - The default DOM implementation class was loaded and instantiated, but it does not implement the DOM Document interface.

DomBuilder

public DomBuilder(java.lang.String DOMDocumentClassName,
                  org.xml.sax.Parser parser)
           throws java.lang.ClassNotFoundException,
                  java.lang.NullPointerException,
                  java.lang.IllegalAccessException,
                  java.lang.InstantiationException,
                  java.lang.ClassCastException
Constructs a builder using the specified DOM document class and the specified SAX (2.0 or 1.0) parser.
Parameters:
DOMDocumentClassName - The name of the class implementing the kind of DOM document to be returned.
parser - The SAX parser to be used; it may be partially configured, as described above.
Throws:
java.lang.ClassNotFoundException - The specified DOM implementation class could not be found. Check your class path.
java.lang.NullPointerException - The specified DOM implementation class name was null.
java.lang.IllegalAccessException - The specified DOM implementation class, or its default constructor, is not accessible (for example, it is not public).
java.lang.InstantiationException - The DOM implementation class was successfully loaded, but could not be instantiated. Perhaps it has no default constructor.
java.lang.ClassCastException - The specified DOM implementation class was loaded and instantiated, but it does not implement the DOM Document interface.

Method Detail

createDocument

public static org.w3c.dom.Document createDocument(java.lang.String uri)
                                           throws org.xml.sax.SAXException,
                                                  java.io.IOException,
                                                  org.w3c.dom.DOMException,
                                                  java.lang.IllegalAccessException,
                                                  java.lang.InstantiationException,
                                                  java.lang.ClassNotFoundException
Convenience routine, which uses the default DOM and parser (as described above) to parse the specified (XML) document into a DOM document tree.
Parameters:
uri - Identifies the resource to be parsed.
Throws:
java.lang.ClassNotFoundException - The default DOM implementation or SAX parser class could not be found. Check your class path and the system property values.
java.lang.NullPointerException - The default DOM implementation or SAX parser name was null. Check the system property values.
java.lang.IllegalAccessException - The default DOM implementation or SAX parser class, or its default constructor, is not accessible (for example, it is not public).
java.lang.InstantiationException - The DOM implementation class or SAX parser class was successfully loaded, but could not be instantiated. Perhaps it has no default constructor.
java.lang.ClassCastException - The default DOM implementation class was loaded and instantiated, but it does not implement the DOM Document interface. Or the default SAX parser class was loaded and instantiated, but it does not implement the SAX 1.0 parser interface.

createDocument

public static org.w3c.dom.Document createDocument(org.xml.sax.InputSource input)
                                           throws org.xml.sax.SAXException,
                                                  java.io.IOException,
                                                  org.w3c.dom.DOMException,
                                                  java.lang.IllegalAccessException,
                                                  java.lang.InstantiationException,
                                                  java.lang.ClassNotFoundException
Convenience routine, which uses the default DOM and parser (as described above) to parse the specified (XML) document into a DOM document tree.
Parameters:
input - Provided to the SAX parser as input.
Throws:
java.lang.ClassNotFoundException - The default DOM implementation or SAX parser class could not be found. Check your class path and the system property values.
java.lang.NullPointerException - The default DOM implementation or SAX parser name was null. Check the system property values.
java.lang.IllegalAccessException - The default DOM implementation or SAX parser class, or its default constructor, is not accessible (for example, it is not public).
java.lang.InstantiationException - The DOM implementation class or SAX parser class was successfully loaded, but could not be instantiated. Perhaps it has no default constructor.
java.lang.ClassCastException - The default DOM implementation class was loaded and instantiated, but it does not implement the DOM Document interface. Or the default SAX parser class was loaded and instantiated, but it does not implement the SAX 1.0 parser interface.

isSavingExtraNodes

public boolean isSavingExtraNodes()
Returns true if the builder is saving "extra" nodes, and false (the default) otherwise. "Extra" nodes are defined to be ignorable whitespace, comments, and the use of CDATA nodes instead of normal text nodes. (Entity Reference nodes are also "extra", but can't be exposed in any case since the DOM doesn't expose an API that permits them to be constructed.)
See Also:
setSavingExtraNodes(boolean)

setSavingExtraNodes

public void setSavingExtraNodes(boolean flag)
Controls whether the builder will save "extra" nodes.
Parameters:
flag - True iff extra nodes should be saved; false otherwise.
See Also:
isSavingExtraNodes()

parse

public org.w3c.dom.Document parse(java.lang.String uri)
                           throws org.xml.sax.SAXException,
                                  java.io.IOException,
                                  org.w3c.dom.DOMException
Parses the specified document, returning its contents as a DOM document. This uses the SAX parser and DOM implementation which were specified in the constructor to this builder.
Parameters:
uri - Identifies the resource to be parsed.
Throws:
org.xml.sax.SAXException - As reported by the parser (in which case it is often a SAXParseException) or this builder.
java.io.IOException - As reported by the parser.
org.w3c.dom.DOMException - As reported by the DOM; always indicates a bug in either the DOM or the SAX parser.

parse

public org.w3c.dom.Document parse(org.xml.sax.InputSource input)
                           throws org.xml.sax.SAXException,
                                  java.io.IOException,
                                  org.w3c.dom.DOMException
Parses the document provided, returning its contents as a DOM document. This uses the SAX parser and DOM implementation which were specified in the constructor to this builder.
Parameters:
input - Provided to the SAX parser as input.
Throws:
org.xml.sax.SAXException - As reported by the parser (in which case it is often a SAXParseException) or this builder.
java.io.IOException - As reported by the parser.
org.w3c.dom.DOMException - As reported by the DOM; always indicates a bug in either the DOM or the SAX parser.
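
Callers of the parse methods will usually want to tell a SAXParseException, which carries a location, apart from other SAXExceptions. The pattern can be sketched against the JDK's own SAX parser; the class ParseFailureSketch is hypothetical and does not call DomBuilder itself, though the same catch ordering would apply around its parse methods.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch of handling the exceptions a SAX-driven parse can report.
final class ParseFailureSketch {

    // Returns "ok", or a short description of why the parse failed.
    static String check(String xml) throws IOException, ParserConfigurationException {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // DefaultHandler rethrows fatal errors and stays silent otherwise.
            reader.setErrorHandler(new DefaultHandler());
            reader.parse(new InputSource(new StringReader(xml)));
            return "ok";
        } catch (SAXParseException e) {
            // Catch the subclass first: it knows where in the input the problem is.
            return "parse error at line " + e.getLineNumber();
        } catch (SAXException e) {
            return "SAX failure: " + e.getMessage();
        }
    }
}
```

Because SAXParseException extends SAXException, its catch clause has to come first; reversing the order does not compile.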