XML.com: XML From the Inside Out

XML Pipelining with Ant

January 28, 2003

Ant is an extensible, open-source build tool written in Java and sponsored by Apache's Jakarta project. Ant has developed into something more than just a build tool, however. It has gone beyond its predecessor make (and make's kin) to become a framework for performing a much larger variety of operations in a single step, not just compiling code or cleaning up after a build.

Ant's build files are written in XML, and Ant takes advantage of XML in a variety of ways. In my opinion, Ant is a suitable if not ideal framework for XML pipelining -- that is, a framework for performing a variety of XML processing, in the desired order and in one fell swoop. I say ideal because Ant is open, somewhat mature, reasonably stable, readily available, widely known and used, easily extensible, and already amenable to XML processing. What else could you ask for?

In this article, I'll discuss the XML structures in an Ant build file, named build.xml by default, talk about some common XML-related tasks that Ant can perform, and then finish up with an example of XML pipelining.

I assume that you already know something of Ant and have probably used it. I plan to review the basics of the tool, but I also suggest that you read Tony Coates's recent XML.com article ("Running Multiple XSLT Engines with Ant"). Along with an interesting approach to processing multiple XSLT stylesheets with multiple engines, Tony's article also provides good introductory material on Ant.

To get the examples in this piece to work, you'll of course need a recent version of Java on your system. You'll also need to download and install Ant version 1.5.1 (or later) binaries. Because you'll be using a new task that validates with RELAX NG schemas, you'll also need to download and install James Clark's Jing. All the example files discussed in this article are available for download in a ZIP archive and have been tested on the Windows XP Professional platform running Java 2 v1.4.

You can refer to Ant's HTML manual either online or, after installing Ant locally, by bringing up docs/manual/index.html in a browser.

Where Is Ant's DTD?

One of the first things I noticed about Ant was that it didn't have an explicit DTD available in the archives I downloaded, either the binary or source archive. I wanted to see Ant's DTD so I could figure out what went into a build file. Then I discovered the antstructure task. This task in essence extracts a DTD from Ant's source code.

The following snippet is a simple Ant build file that uses the antstructure task (build-dtd.xml in the example archive):

<?xml version="1.0"?>

<project default="dtd">
 <target name="dtd">
  <antstructure output="ant.dtd"/>
 </target>
</project>

Here's a quick review of some basics. The document begins with an optional XML declaration. The root element of an Ant build file is <project>. It has several possible attributes, but only one is required: default. This attribute names the default target for the project, and in this case the only target, dtd. A target represents a way to achieve an expected outcome from an operation, such as a set of compiled Java classes or, in the case of antstructure, a DTD.

The <target> element is a child of <project> and must have a name attribute. Here, the value of this attribute matches the value of the default attribute of <project>. When a build file has more than one target, default names exactly one of them: the <target> whose name attribute has the same value. The <target> element also has several other attributes, such as depends (which will come to light in later examples).
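To make the relationship between default, name, and depends concrete, here is a minimal sketch of a build file with two targets. It is not part of the example archive: the target name report and the echo message are my own illustrative choices, while antstructure and echo are standard Ant tasks.

```xml
<?xml version="1.0"?>

<!-- Hypothetical build file: "report" and its message are illustrative only -->
<project default="report">

 <!-- Same target as in build-dtd.xml -->
 <target name="dtd">
  <antstructure output="ant.dtd"/>
 </target>

 <!-- depends names the target(s) that must run before this one -->
 <target name="report" depends="dtd">
  <echo message="Wrote ant.dtd"/>
 </target>

</project>
```

Run without a target argument, Ant picks report because default names it, and runs dtd first because report depends on it. Naming a target on the command line (for example, dtd) runs that target instead of the default.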

The <antstructure> task element is empty. One of its four possible attributes is output, which names the file that will contain the DTD the task produces. By default this file is written to the current directory (strictly speaking, to the project's base directory, which defaults to the directory containing the build file). If you add a basedir attribute to <project>, you can specify a different output directory as its value, such as:

<project default="dtd" basedir="c:/temp">

Now give it a try. The following command presupposes that Ant's bin directory is in the path environment variable, that your working directory is C:\Java\Ant, and that you have unzipped the example archive there:

C:\Java\Ant>ant -f build-dtd.xml

Ant assumes that the build file is named build.xml. If it isn't, you need to use the -f option (or the synonyms -file or -buildfile), followed by a filename. You should see output from this command like this:

Buildfile: build-dtd.xml

dtd:

BUILD SUCCESSFUL
Total time: 2 seconds

The output lists the build filename, the target name dtd, and whether the build was successful. The target produces the file ant.dtd in your current directory. This DTD is straightforward (only three parameter entities), but is quite long (nearly 4000 lines). With this DTD available now, you can see for yourself how a build file is put together. For any element name in the DTD, you are likely to find a corresponding entry in the Ant manual.

At first I wondered how Ant validates build files. The answer lies in the source code, where it is clear that Ant validates build files in its own application-specific way rather than in a general-purpose way. (If you want to see how Ant does this, a good place to start looking is in the Java source of the class org.apache.tools.ant.helper.ProjectHelperImpl.) Ant is in effect self-validating and avoids the use of namespaces.

Validating an XML Document

Ant has a task for validating XML documents called xmlvalidate. By default Ant validates with Xerces version 2.2.0. Consider the small XML document date.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE date SYSTEM "date.dtd">

<date>2003-01-31T00:00:01</date>

And its equally small DTD date.dtd:

<!ELEMENT date (#PCDATA)>

You can validate date.xml with the build file build-valid.xml by using the xmlvalidate task:

<?xml version="1.0"?>

<project default="valid">
 <target name="valid">
  <xmlvalidate file="date.xml"/>
 </target>
</project>

The attribute file specifies the document to validate. Issuing the command

C:\Java\Ant>ant -f build-valid.xml

produces the following output, if successful:

Buildfile: build-valid.xml

valid:
[xmlvalidate] 1 file(s) have been successfully validated.

BUILD SUCCESSFUL
Total time: 2 seconds

In Ant, types are elements that help tasks do their work, for example by describing groups of files. Using the fileset type as a child of xmlvalidate, you can validate a series of XML documents, as shown in build-fileset.xml:

<?xml version="1.0"?>

<project default="valid">
 <target name="valid">
  <xmlvalidate>
   <fileset file="date*.xml"/>
  </xmlvalidate>
 </target>
</project>

The file attribute of fileset allows you to specify a series of files with wildcards. If you run this build file, you will see that Ant validates six XML documents in one step (all XML documents in the current directory beginning with the name date).
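As I read the Ant manual, the file attribute of fileset is described as a shortcut for a single file, so the more conventional way to select files by pattern is to pair dir with includes. The following is a minimal sketch of that alternative, assuming the documents sit in the same directory as the build file; it is my own variation and not a file from the example archive.

```xml
<?xml version="1.0"?>

<!-- Sketch only: a variation on build-fileset.xml, not part of the archive -->
<project default="valid">
 <target name="valid">
  <xmlvalidate>
   <!-- dir is resolved against the project's base directory;
        includes takes one pattern or a comma-separated list of patterns -->
   <fileset dir="." includes="date*.xml"/>
  </xmlvalidate>
 </target>
</project>
```

This form selects the same files as the wildcard in build-fileset.xml, and it leaves room to add an excludes attribute later if some of the matching documents should be skipped.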

The xmlvalidate task has several other features worth mentioning:

  • An attribute of lenient="true" means that the task will only do well-formedness checking.
  • The classname and classpathref attributes allow you to specify a different XML parser than the default and where to find it.
  • The child element <dtd> lets you indicate a formal public identifier (publicId) attribute as well as the local whereabouts (location attribute) of a DTD.
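The features in this list can be sketched in a single build file. Treat the following as an illustration under stated assumptions rather than a tested recipe: the target names, the document date-public.xml, and the public identifier are hypothetical, and org.apache.xerces.parsers.SAXParser is simply the Xerces SAX parser class, assumed here to be on Ant's classpath.

```xml
<?xml version="1.0"?>

<!-- Hypothetical build file illustrating lenient, classname, and <dtd> -->
<project default="wellformed">

 <!-- lenient="true": check well-formedness only, no DTD validation -->
 <target name="wellformed">
  <xmlvalidate file="date.xml" lenient="true"/>
 </target>

 <!-- classname picks the parser class explicitly; the nested <dtd> maps
      a public identifier to a local copy of the DTD.
      date-public.xml and the publicId below are made-up examples. -->
 <target name="local-dtd">
  <xmlvalidate file="date-public.xml"
               classname="org.apache.xerces.parsers.SAXParser">
   <dtd publicId="-//Example//DTD Date 1.0//EN" location="date.dtd"/>
  </xmlvalidate>
 </target>

</project>
```

For the second target to do anything useful, date-public.xml would need a DOCTYPE declaration that uses the same public identifier. If the parser class lives in a separate JAR, classpathref can point at a path defined elsewhere in the build file.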
