Overriding Concerns
November 26, 2003
Q: How do I merge two XML source trees into one?
I've tried so many things that it's driving me crazy. I want to merge two XML files. The two files to be merged are named in a third file (call it merge.xml), which looks like this:
<?xml version="1.0"?>
<merge>
  <appxml>testapp.xml</appxml>
  <userxml>user.xml</userxml>
</merge>
And here are the two files to which merge.xml refers. First, testapp.xml:
<app name="testapp" lifetime="900">
  <mainmenu>
    <menu id="1" caption="test"/>
    <menu id="2" caption="another test"/>
  </mainmenu>
  <forms>
    <testform autosize="1"/>
    <testform2 autosize="0"/>
  </forms>
</app>
And here is user.xml:
<app lifetime="100">
  <mainmenu>
    <menu id="2" caption="my test"/>
    <menu id="3" caption="my menu"/>
  </mainmenu>
  <forms>
    <testform2 autosize="1"/>
  </forms>
</app>
The result must be:
<app name="testapp" lifetime="100">
  <mainmenu>
    <menu id="1" caption="test"/>
    <menu id="2" caption="my test"/>
    <menu id="3" caption="my menu"/>
  </mainmenu>
  <forms>
    <testform autosize="1"/>
    <testform2 autosize="1"/>
  </forms>
</app>
I'm using merge.xml as the source tree for the transformation. Basically I use the
string-values of the appxml and userxml elements as input to the
document() function, like this:
<xsl:template match="merge">
  <xsl:variable name="app_xml" select="string(appxml)" />
  <xsl:variable name="user_xml" select="string(userxml)" />
  <xsl:call-template name="domerge">
    <xsl:with-param name="app_nodes" select="document($app_xml)" />
    <xsl:with-param name="user_nodes" select="document($user_xml)" />
  </xsl:call-template>
</xsl:template>
As you can see, I want to process the nodes with a named template called
domerge. But what the heck should domerge contain?
A: Although you didn't say as much in your question, the basic problem is how to use the contents of user.xml to override, in the result tree, those of testapp.xml. That is, testapp.xml establishes rules for how some application is to behave, and user.xml permits some or all of those rules to be overridden.
This is one of my favorite uses for simple XML. You've got an additional twist -- the third file, which specifies which files to use -- but the basic approach is the same.
To start with, here's a basic domerge named template:
<xsl:template name="domerge">
  <xsl:param name="app_nodes" />
  <xsl:param name="user_nodes" />
  <app>
    <mainmenu>
      <xsl:copy-of select="$app_nodes//menu" />
      <xsl:copy-of select="$user_nodes//menu" />
    </mainmenu>
    <forms>
      <xsl:copy-of select="$app_nodes//forms/*" />
      <xsl:copy-of select="$user_nodes//forms/*" />
    </forms>
  </app>
</xsl:template>
This doesn't do everything you need, but it gets you part of the way there. It begins by
declaring the two parameters app_nodes and user_nodes, whose values you're supplying in the
xsl:call-template element you've already constructed. Then it establishes the basic structure
of the result tree -- an app root element, with mainmenu and forms child elements. Within each
of those two children, it instantiates copies of the corresponding portions of testapp.xml and
user.xml. So far, the result tree from the stylesheet looks like this:
<app>
  <mainmenu>
    <menu id="1" caption="test"/>
    <menu id="2" caption="another test"/>
    <menu id="2" caption="my test"/>
    <menu id="3" caption="my menu"/>
  </mainmenu>
  <forms>
    <testform autosize="1"/>
    <testform2 autosize="0"/>
    <testform2 autosize="1"/>
  </forms>
</app>
The problems with this result tree are two-fold. First, it doesn't yet include any
attributes for the app element. Second, the elements from testapp.xml which are overridden by
user.xml -- the menu element with id="2" and caption="another test", and the testform2 element
with autosize="0" -- shouldn't be appearing at all.
Let's start with those attributes for the app element, name and lifetime. (Your sample code
doesn't indicate that name can be overridden in user.xml, but I assume it can.) What you need
to do is build each attribute using the one in testapp.xml unless the same attribute appears
in user.xml. Here's one approach; the new code is the pair of xsl:attribute elements:
<app>
  <xsl:attribute name="name">
    <xsl:choose>
      <xsl:when test="$user_nodes/app/@name">
        <xsl:value-of select="$user_nodes/app/@name"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$app_nodes/app/@name"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:attribute>
  <xsl:attribute name="lifetime">
    <xsl:choose>
      <xsl:when test="$user_nodes/app/@lifetime">
        <xsl:value-of select="$user_nodes/app/@lifetime"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$app_nodes/app/@lifetime"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:attribute>
  [etc. as above]
</app>
What I've added here is a pair of xsl:attribute elements, which instantiate the name and
lifetime attributes in the result tree and then assign their values (using an xsl:choose block
for each attribute) depending on whether those attributes appear in user.xml. As desired, the
start tag of the result tree's app element now looks like this:
<app name="testapp" lifetime="100">
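A more compact way to get the same start tag is sketched below. It is an alternative offered for comparison, not the approach used in the rest of this answer. It relies on the XSLT 1.0 rule that adding an attribute to an element replaces any attribute of the same name that was added earlier, and it assumes that every attribute on either app element should be carried over, which goes a little beyond what the sample files show. The fragment belongs inside the domerge template, ahead of any child elements.

```xml
<app>
  <!-- Copy every attribute of the app element in testapp.xml first. -->
  <xsl:copy-of select="$app_nodes/app/@*"/>
  <!-- Then copy the attributes from user.xml. Any attribute with the same
       name (here, lifetime) replaces the one copied a moment ago. -->
  <xsl:copy-of select="$user_nodes/app/@*"/>
  <!-- mainmenu and forms follow here, as above -->
</app>
```

Both xsl:copy-of instructions have to come before the first child element is created, since attributes can't be added to an element once it has children.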
Fixing the other problem with the named template so far -- that the elements from testapp.xml
which were overridden by user.xml are still showing up in the result tree -- will be a little
trickier. The problem is that the plain-old xsl:copy-of elements are too indiscriminate. For
processing the menu elements, you can do something like this:
<app>
  [etc. as above]
  <mainmenu>
    <xsl:for-each select="$app_nodes//menu">
      <xsl:choose>
        <xsl:when test="$user_nodes//menu/@id[.=current()/@id]"/>
        <xsl:otherwise>
          <xsl:copy-of select="." />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
    <xsl:copy-of select="$user_nodes//menu" />
  </mainmenu>
  [etc. as above]
</app>
Here the xsl:copy-of for all the testapp.xml menu elements has been replaced by an
xsl:for-each which examines each of those menu elements in turn. If the current menu element's
id attribute is matched by one in the user.xml tree, the template does nothing; otherwise, it
instantiates in the result (via a simple xsl:copy-of) a copy of the current menu element
(again, from the testapp.xml file). Note also that the xsl:copy-of for the menu elements in
user.xml hasn't been changed at all; all of those menu elements go straight into the result.
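The same filtering can also be sketched without the loop, by putting the test in a predicate on a single xsl:copy-of. This is an alternative for comparison, and it assumes (as the loop does) that menu elements are matched purely by their id attributes. Inside the predicate the context node is already the testapp.xml menu element being tested, so current() isn't needed in this form.

```xml
<mainmenu>
  <!-- Keep only the testapp.xml menus whose id matches no menu id in
       user.xml. Comparing @id with a node-set is true if any one of the
       user.xml ids is equal, so not(...) means "no match at all". -->
  <xsl:copy-of select="$app_nodes//menu[not(@id = $user_nodes//menu/@id)]"/>
  <!-- Every menu from user.xml goes straight into the result. -->
  <xsl:copy-of select="$user_nodes//menu"/>
</mainmenu>
```

Either way, the surviving testapp.xml menus come first and the user.xml menus follow, which happens to give the id order 1, 2, 3 for the sample files.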
Handling the form elements -- whether overridden by user.xml or not -- is similar to the
solution for the menu elements. (There aren't really any form elements as such in either
testapp.xml or user.xml; instead, there are testform, testform2, etc. elements. I'm referring
to them as form elements just as a sort of collective shorthand.) But whereas menu elements
were "matched" (or not) between the two input files by way of their id attributes' values, the
key to matching form elements is their element names. For example, testapp.xml has testform
and testform2 elements; user.xml, only a testform2. So the structure for processing these
elements is similar to that for processing the menu elements, but with a different test
attribute in the xsl:when element:
<app>
  [etc. as above]
  <forms>
    <xsl:for-each select="$app_nodes//forms/*">
      <xsl:choose>
        <xsl:when test="$user_nodes//forms/*[name()=name(current())]"/>
        <xsl:otherwise>
          <xsl:copy-of select="." />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
    <xsl:copy-of select="$user_nodes//forms/*" />
  </forms>
</app>
As before, the assumption is that all form elements from user.xml get
transcribed straight to the result tree; it's only the testapp.xml forms which
need to be tested for inclusion.
One additional note about the xsl:when test attributes for the menu and form elements: they
both use the current() function to refer to the node in testapp.xml currently being processed
by their respective xsl:for-each loops. Inside an XPath predicate, this is often necessary,
because the context node at that point is often different from the current node. For example,
the context node in each of these two predicates is a candidate node from the user.xml file
(an id attribute in the menu test, an element in the form test), not the respective element
from testapp.xml -- and it's the latter which needs to be tested before copying to the result
tree.
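To make the difference concrete, here is a small illustrative fragment (not part of the solution) that places two tests side by side inside the same loop. The predicate is written on the menu element rather than on its id attribute, purely to keep the comparison easy to read.

```xml
<xsl:for-each select="$app_nodes//menu">
  <!-- Inside the predicate, "." is the user.xml menu being tested, so this
       compares each user.xml id with itself. It is true whenever user.xml
       contains any menu with an id attribute, whatever the current node is. -->
  <xsl:if test="$user_nodes//menu[@id = ./@id]">
    <!-- reached for every testapp.xml menu: not what is wanted -->
  </xsl:if>

  <!-- current() still refers to the testapp.xml menu selected by the
       xsl:for-each, so this is true only when user.xml has a menu with
       the same id as that one. -->
  <xsl:if test="$user_nodes//menu[@id = current()/@id]">
    <!-- reached only for testapp.xml menus that user.xml overrides -->
  </xsl:if>
</xsl:for-each>
```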
For reference, here's the final domerge named template:
<xsl:template name="domerge">
  <xsl:param name="app_nodes" />
  <xsl:param name="user_nodes" />
  <app>
    <!-- app attributes: the value in user.xml wins when it is present -->
    <xsl:attribute name="name">
      <xsl:choose>
        <xsl:when test="$user_nodes/app/@name">
          <xsl:value-of select="$user_nodes/app/@name"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$app_nodes/app/@name"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:attribute>
    <xsl:attribute name="lifetime">
      <xsl:choose>
        <xsl:when test="$user_nodes/app/@lifetime">
          <xsl:value-of select="$user_nodes/app/@lifetime"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$app_nodes/app/@lifetime"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:attribute>
    <mainmenu>
      <!-- testapp.xml menus: skip any whose id also appears in user.xml -->
      <xsl:for-each select="$app_nodes//menu">
        <xsl:choose>
          <xsl:when test="$user_nodes//menu/@id[.=current()/@id]"/>
          <xsl:otherwise>
            <xsl:copy-of select="." />
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
      <xsl:copy-of select="$user_nodes//menu" />
    </mainmenu>
    <forms>
      <!-- testapp.xml forms: skip any whose element name also appears in user.xml -->
      <xsl:for-each select="$app_nodes//forms/*">
        <xsl:choose>
          <xsl:when test="$user_nodes//forms/*[name()=name(current())]"/>
          <xsl:otherwise>
            <xsl:copy-of select="." />
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
      <xsl:copy-of select="$user_nodes//forms/*" />
    </forms>
  </app>
</xsl:template>
The result tree matches your desired output.
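For anyone who wants to run the whole thing, here is a sketch of a complete stylesheet to apply to merge.xml. It is a compact variant, not the template listed above: it copies the app attributes wholesale (in XSLT 1.0, a later attribute with the same name replaces an earlier one), filters the menu elements with a predicate, and uses xsl:if in place of the xsl:choose with an empty xsl:when. The xsl:output settings are an assumption added for readable output.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: indented XML output, for readability only. -->
  <xsl:output method="xml" indent="yes"/>

  <!-- Entry point, as in the question: merge.xml names the two files. -->
  <xsl:template match="merge">
    <xsl:call-template name="domerge">
      <xsl:with-param name="app_nodes" select="document(string(appxml))"/>
      <xsl:with-param name="user_nodes" select="document(string(userxml))"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="domerge">
    <xsl:param name="app_nodes"/>
    <xsl:param name="user_nodes"/>
    <app>
      <!-- testapp.xml attributes first; user.xml attributes replace
           any with the same name. -->
      <xsl:copy-of select="$app_nodes/app/@*"/>
      <xsl:copy-of select="$user_nodes/app/@*"/>
      <mainmenu>
        <!-- testapp.xml menus whose id is not reused in user.xml -->
        <xsl:copy-of
            select="$app_nodes//menu[not(@id = $user_nodes//menu/@id)]"/>
        <xsl:copy-of select="$user_nodes//menu"/>
      </mainmenu>
      <forms>
        <!-- testapp.xml forms whose element name is not reused in user.xml -->
        <xsl:for-each select="$app_nodes//forms/*">
          <xsl:if test="not($user_nodes//forms/*[name() = name(current())])">
            <xsl:copy-of select="."/>
          </xsl:if>
        </xsl:for-each>
        <xsl:copy-of select="$user_nodes//forms/*"/>
      </forms>
    </app>
  </xsl:template>

</xsl:stylesheet>
```

Because document() is given a string here, relative names such as testapp.xml are resolved against the location of the stylesheet, so the simplest setup is to keep the stylesheet and both input files in the same directory. The order in which a processor writes out the name and lifetime attributes may vary; attribute order isn't significant in XML.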