Under the Hood: Oracle Berkeley DB XML
Storing an XML document in a relational database has its limitations. XML's hierarchical structures of elements and element attributes do not necessarily map well to relational database structures, which is where an embeddable (non-relational) XML database has its advantages over a relational database.
Oracle databases (since Oracle 9i database R2) provide the Oracle XML DB feature to store and query XML documents in an XMLType data-type column, but you still need a DBA to manage the Oracle database. On the other hand, Oracle Berkley DB XML is an embeddable XML database for storing and retrieving XML documents, one which provides efficient querying of XML documents using XQuery. Oracle Berkley DB XML is built on the embeddable Oracle Berkley DB database and inherits all the features of the database. For a comparison of Oracle Berkeley DB and relational database systems refer to A Comparison of Oracle Berkeley DB and Relational Database Management Systems.
Oracle Berkeley DB XML database stores XML data in a container.
A container is managed with a
XmlManager object. XML
documents may be stored in a Oracle Berkeley DB XML database as
whole documents or a set of nodes. It's recommended to store whole
documents if the documents are relatively small and loading
performance is more important than query performance. Storing in
nodes is recommended if query performance is more important than
loading performance and documents are relatively large. BDB XML
supports XQuery 1.0 and XPath 2.0 specifications to query an XML
document in the database. XML documents in a BDB XML database may
be modified, updated and deleted.
Installing Oracle Berkley DB XML
You need to download Oracle Berkley DB XML 2.3.10 or a later edition. Download the Oracle Berkeley DB XML Windows Installer .msi file. To install Oracle BDB XML, double-click on the dbxml-2.3.10.msi file. The Installer gets started. Click on Next. Accept the Open Source License and click on Next. Select the default installation folder, C:\Program Files\Oracle\Berkeley DB XML 2.3.10, and click on Next. Select the default Berkley DB XML features and click on Next.
Berkeley DB XML requires some environment variables
PATH) to be modified.
Select the "Set the values in the environment variables" checkbox
and click on Next. Click on the Install button to install Berkeley
DB XML. Berkeley DB XML 2.3.10 gets installed. Click on the Finish
button. The Oracle Berkeley DB XML jar files get added to the
CLASSPATH environment variable. The following
bin directory gets added to the
C:\Program Files\Oracle\Berkeley DB XML 2.3.10\bin
Berkeley DB XML binaries are not available for UNIX/POSIX systems. To build Berkeley DB XML on UNIX/POSIX refer Building Berkeley DB XML for UNIX/POSIX. Berkley DB XML may be accessed from a command line shell or with a Java API. You will learn about each of these methods in the following sections.
Using the Command Shell
dbxml shell may be started with the command
shown below from a command line.
The command shell gets started and the following prompt gets displayed.
Berkley DB XML commands may be specified at the
dbxml command prompt. Some of the commonly used
commands are discussed in Table 1.
|append||Appends an object of type element, attribute, comment, or processing instruction to node/s selected by a query expression.|
|createContainer||Creates a container. The first argument specifies the container name. The second argument specifies the container type. Default value is 'in', which creates a node storage container with node indexes. Value 'd' creates a Wholedoc container. Value 'n' creates a node storage container. Value 'id' creates a document storage container with node indexes. The third argument specifies if documents are to be validated on insertion.|
|getDocuments||Gets all the documents in the default container, or, if the docName is specified, gets the specified document.|
|insertAfter||Inserts new content after the nodes specified by the query expression. The new content may be element, attribute, comment, or PI. Content type is specified with <objectType> argument and content value is specified with <objectContent> argument.|
|insertBefore||Similar to insertAfter except that the content is inserted before the specified nodes.|
|openContainer||Opens a container and sets the container as the default container. The validate/novalidate option specifies if documents are to be validated on insertion.|
|Prints the most recent results to stdout. Results may be printed to a file by specifying path to a file. A specified number of results may be output with the 'n' argument.|
|putDocument||Adds an XML document to the default container. The <namePrefix> argument specifies the document name. The <string> parameter specifies the XML document string. The third argument specifies the content of the second argument. If the value is 'f' the string specifies a file name. If value is 's' (the default), the string is the XML document to be added. If value is 'q', the string specifies an XQuery expression.|
|query||Evaluates an XQuery expression.|
|removeContainer||Removes a container. The .dbxml file corresponding to the container gets deleted.|
|removeDocument||Removes a document from the default container.|
|removeNodes||Removes the nodes specified by the XQuery expression.|
|renameNodes||Renames the nodes specified by the XQuery expression.|
|run||Runs the specified file as a script.|
|updateNodes||Replaces the content of the nodes specified by query expression with new content.|
The first thing that you will need to do once the database
itself is set up is to create a container to store the XML
documents. For instance, to create a container called
catalog for storing catalog entries for journals, use
A node storage container with nodes indexed gets created and opened as shown in Figure 1.
Figure 1. Creating a Container
The BDB XML database is just a
.dbxml file created
in the directory in which the
dbxml command was
Adding XML Documents
Next, you’ll want to add XML documents to the BDB XML
database using the
putDocument command. For example,
to add an XML document that represents a catalog entry to the
database, use the following:
dbxml>putDocument catalog1 '<catalog title="Oracle Magazine" publisher="Oracle Publishing"> <journal date="March-April 2006"> <article> <title>Using Bind Variables</title> <author> Steve Muench </author> </article> </journal> </catalog>' s
Single quotes are used for any command parameters that span multiple lines or contain spaces. The character 's' used to terminate the command indicates that the XML document is added using a string. The XML document gets added; the output from the command is shown in Figure 2.
Figure 2. Adding a Document
Similarly add catalog 'catalog2'.
dbxml>putDocument catalog2 '<catalog title="Oracle Magazine" publisher="Oracle Publishing"> <journal date="May-June 2006"> <article> <title>From Application Express to XE</title> <author>David A. Kelly </author> </article> </journal> </catalog>' s
The XML documents added may be retrieved with the following command.
The output from the
getDocuments command is as
2 documents found.
The documents retrieved may be output to stdout with the
Querying XML Documents with XQuery
Once you have created the appropriate documents, you’ll be
able to query them using XQuery, an XPath based SQL-like language
for XML. XQuery queries retrieve subsets of data similar to SELECT
statement in SQL. Each query has two parts; the first part
identifies the set of documents to be queried using an XQuery
navigation function such as
collection() function is used
to navigate to a container. For example, container
'container1.dbxml' may be navigated to with the following XQuery
Multiple containers may be queried using the '|' operator. For
container2.dbxml may be queried using the following
doc() function may be used to navigate to a
specified XML document in a container. The
function takes a base URI (
dbxml: by default), a
container name and an XML document name in the XQuery expression.
For example, XML document
container1.dbxml may be navigated to with
the following expression.
As an example of a query select all the article titles in the
catalog.dbxml container. If Berkeley DB XML 2.3.8 is
used, before you may query an XML document in a container you need
to set the base URI.
dbxml> setBaseUri dbxml:/
The XQuery Expression to select all the titles is as follows.
dbxml>query 'collection("catalog.dbxml")/ catalog/journal/article/title/text()'
To output the results of the query use the
The results of the query are as follows.
Using Bind Variables
From Application Express to XE
As another example retrieve the author of the article with title 'Using Bind Variables'. The XQuery expression for retrieving the author is as follows.
dbxml>query 'collection ("catalog.dbxml")/catalog/journal/article[title= "Using Bind Variables"]/author/text()'
Output the results with the
Modifying XML Documents
The Oracle BDB XML command shell provides various commands such
insertBefore to modify an XML document in the
database. For example, to append the 'section' attribute to the
'article' elements in the XML documents in the
catalog.dbxml container, use the
dbxml>append './catalog/journal/article' 'attribute' 'section' 'Developer'
append command in the previous example won't
append an attribute if the context has not been set. The query
expression for modifying nodes should be relative; it should
navigate from the context item (".") rather than evaluating an
expression with the
function. The context is set prior to the modifications with the
query command. For example, set the context to the
catalog.dbxml container. You will use the
catalog1 container context for the subsequent
dbxml> query 'collection("catalog.dbxml")'
Having set the context you may run an
command. As an example add after the
article node in
catalog1 XML document another
dbxml shell command to append an
article node is as follows.
dbxml> append './catalog/journal[article/title= "Using Bind Variables"]' 'element' 'article' '<title>Commanding ASM</title> <author>Arup Nanda</author>'
Output the modified
catalog1 document with the
The output from the
<catalog title="Oracle Magazine" publisher="Oracle Publishing"> <journal date="March-April 2006"> <article> <title>Using Bind Variables </title> <author> Steve Muench </author> </article> <article> <title> Commanding ASM</title> <author> Arup Nanda</author> </article> </journal> </catalog>
Replacing/Deleting XML Documents
Using the Oracle BDB XML command shell, nodes in an XML document in the database may be replaced, renamed, and removed. For example replace the article title 'Using Bind Variables' with 'Introduction to Bind Variables'. The shell command to replace the title is as follows.
dbxml>updateNodes './catalog/journal/article[title= "Using Bind Variables"]/title' 'Introduction to Bind Variables'
As another example, rename the
journal nodes to
magazine. The shell command to rename
journal nodes to
magazine is as
dbxml>renameNodes './catalog/journal' 'magazine'
Next, remove the
magazine node with date attribute
March-April 2006. The shell command to remove the
magazine node is as follows.
dbxml>removeNodes './catalog/magazine [@date="March-April 2006"]'
An XML document may be deleted from the database with the
removeDocument command. For example, remove the
A database container may be removed with the
removeContainer command. For example, remove the
Using the Berkley DB XML API
In BDB XML documents are stored in containers. Containers are of
Wholedoc container and
Wholedoc container stores the complete
XML document without any modifications to line breaks or
whitespaces. In a
Node container XML documents are
stored as nodes. Each element in an XML document is stored as a
node in the database along with the element attributes, attribute
values and text nodes. BDB XML also stores information about
reassembling an XML document from the nodes stored in the database.
Node container is the preferred type and is the
default type. A comparison of
Node container and
Wholedoc container is discussed in Table 2.
|Container Type||Storage Mode||Query Performance||Load Performance||Application|
|Node Container||XML document stored in nodes.||Faster to query||Lower load performance.||Use Node container if faster query performance is required. Use Node container if document size is more than 1MB.|
|Wholedoc Container||Whole XML document stored.||Lower query performance, because complete document has to be navigated.||Faster document loading, because an XML document does not have to be deconstructed into nodes.||Use Wholedoc container if load performance is more important than query performance. Use Wholedoc container if document is relatively small and requires to be frequently retrieved.|
A container is represented with class
settings for a container are specified in the
XmlContainerConfig object. An
object is used to manage many of the objects used in a BDB XML
application, including managing a
preparing and running XQuery queries.
the main class in a BDB XML application. In this section you will
create a container, add XML documents to the container, query the
XML documents with XQuery, modify the XML documents, and update the
XML documents, all with the BDB XML API in the
com.sleepycat.dbxml package. First, create a
XmlManager object and set the default container type
XmlManager xmlManager = new XmlManager(); xmlManager.setDefaultContainerType (XmlContainer.NodeContainer);
Next, create a container,
container is the BDB XML database.
XmlContainer xmlContainer = xmlManager.createContainer("catalog.dbxml");
Pages: 1, 2