Introduction to dbXML
by Kimbro Staken
November 28, 2001
What it Offers
The dbXML Core has been under development for a little more than a year. The current version is 1.0 beta 4 with a 1.0 final release expected to appear shortly. Full source code is available from the dbXML site.
Most of the basic native XML database features are covered, including:
- storage of collections of XML documents,
- multi-threaded database engine optimized for XML data,
- schema independent semi-structured data store,
- pre-parsed compressed document storage,
- XPath query engine,
- collection indexes to improve query performance,
- XML:DB XUpdate implementation for updates,
- XML:DB Java API implementation for building applications, and
- complete command line management tools.
Proper transaction support is the major missing feature right now; it will appear in the 1.5 release.
In order to get the most from this article you might want to download dbXML and follow the installation instructions (UNIX or Windows) to get it running.
The Basic Model
The main idea behind dbXML is to provide a simple way to store and manage large numbers of XML documents. This is accomplished by storing documents in collections, where each individual document is stored in a compressed pre-parsed form. Because documents don't have to be re-parsed on every access, this significantly improves the speed attainable when working with XML data. The dbXML engine is optimized for small XML documents of up to about 50K each. The server can store larger documents, but it isn't an ideal scenario.
Storing documents in collections provides an easy mechanism for querying and manipulating the documents as a set. If you wanted to draw a parallel to a relational database, you could consider a collection roughly equivalent to a table and each document in a collection equivalent to a row in that table. One major difference, besides the obvious use of XML, is that in dbXML the schema of what can be stored in a collection is not constrained. This means you have tremendous flexibility in what document types can be stored in a collection. If you want, you can even mix documents of completely different schemas in the same collection. There probably isn't much benefit in doing that, but there is benefit in being able to store and query documents that are similar, but not exactly the same in structure. Consider, for instance, a product catalog where each product type needs specialized data. In this case all products will have some common data and also some specialized data. In dbXML you could store all products together and then either query the common data as a set or restrict your query to a particular product type and query on the product-specific data.
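To make the product catalog idea concrete, here is a minimal sketch that uses only the JDK's built-in XPath support rather than dbXML itself. The product documents, element names, and the MixedCollectionSketch class are all hypothetical; the point is only to show one query over the shared data and one query restricted to a single product type.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class MixedCollectionSketch {

    // Hypothetical catalog standing in for a collection: every product has
    // <name> and <price>, but books add <isbn> and shirts add <size>.
    static final String[] DOCS = {
        "<product type=\"book\"><name>XML Basics</name><price>30</price><isbn>123</isbn></product>",
        "<product type=\"shirt\"><name>Logo Tee</name><price>15</price><size>L</size></product>",
        "<product type=\"book\"><name>XPath Notes</name><price>20</price><isbn>456</isbn></product>"
    };

    // Runs one XPath expression against each document in turn and
    // collects the text of every matching node.
    public static List<String> query(String expr) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String> hits = new ArrayList<>();
        try {
            for (String doc : DOCS) {
                NodeList nodes = (NodeList) xpath.evaluate(
                        expr, new InputSource(new StringReader(doc)), XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++) {
                    hits.add(nodes.item(i).getTextContent());
                }
            }
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Common data: works for every product type.
        System.out.println(query("/product/name"));
        // Type-specific data: only books carry an <isbn>.
        System.out.println(query("/product[@type='book']/isbn"));
    }
}
```

A real collection would be queried through the database rather than by looping over strings, but the XPath expressions would look much the same.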
Working from the Command Line
The dbXML server comes with a nice set of command line tools that allow you to perform all the basic administration functions that you would expect. To get a feel for how these work, let's look at a few examples. I won't explain the details of these commands, but it's likely to be clear what they are doing. A more detailed explanation of their usage can be found in the dbXML users guide.
For all the commands we'll assume myaddress.xml contains this simple document.
<address id="1">
  <name>
    <first>John</first>
    <last>Smith</last>
  </name>
</address>
Using the command line tools we can:
create a collection:
dbxmladmin add_collection -c /db -n addresses
add a document:
dbxmladmin add_document -c /db/addresses -n myaddress -f myaddress.xml
retrieve a document:
dbxmladmin retrieve_document -c /db/addresses -n myaddress
create an index on the id attribute:
dbxmladmin add_indexer -c /db/addresses -n id_idx -p @id
run an XPath query:
dbxmladmin xpath -c /db/addresses -q "/address[@id = 1]"
The basic pattern is to run the dbxmladmin command, tell it what operation you want, what collection context it should be executed in (-c switch), and any operation specific arguments. The gory details on all the possible operations are available in the dbXML command line tools reference.
Developing Applications
While having a nice set of administration tools is important, the real value of dbXML comes when developing custom applications. For this we use the Java XML:DB API. This API is intended to enable the development of portable XML database applications and can be considered the equivalent of JDBC or ODBC for relational databases. The XML:DB API is fairly simple to use and gives you a fair amount of flexibility when developing applications. To get a flavor of what the API offers let's take a look at a simple program that works with the collection we created earlier.
import org.xmldb.api.base.*;
import org.xmldb.api.modules.*;
import org.xmldb.api.*;

public class Example1 {
    public static void main(String[] args) throws Exception {
        Collection col = null;
        try {
            // Load the dbXML driver and register it with the XML:DB API.
            String driver = "org.dbxml.client.xmldb.DatabaseImpl";
            Class c = Class.forName(driver);
            Database database = (Database) c.newInstance();
            DatabaseManager.registerDatabase(database);

            // Open the collection created earlier from the command line.
            col = DatabaseManager.getCollection("xmldb:dbxml:///db/addresses");

            // Run the XPath query against every document in the collection.
            String xpath = "/address[@id = 1]";
            XPathQueryService service =
                (XPathQueryService) col.getService("XPathQueryService", "1.0");
            ResourceSet resultSet = service.query(xpath);

            // Print each matching resource.
            ResourceIterator results = resultSet.getIterator();
            while (results.hasMoreResources()) {
                Resource res = results.nextResource();
                System.out.println((String) res.getContent());
            }
        }
        catch (XMLDBException e) {
            System.err.println("XML:DB Exception occurred " + e.errorCode + " " +
                e.getMessage());
        }
        finally {
            // Always release the collection, even if the query failed.
            if (col != null) {
                col.close();
            }
        }
    }
}
This program simply creates a connection to the dbXML server, performs a basic XPath query, and then prints the results. In dbXML, queries can be executed against a collection of documents or a single document. In this case we're querying the entire collection. To query a single document you change the query() method to queryResource() and provide the dbXML id of the document to query.
If you're storing large numbers of documents it might be useful to index your collection to improve query response. In this case we indexed the id attribute in an earlier example, so the XPath engine should be quite speedy. Of course you'd need more than one document in your collection for this to be of any real value.
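The article doesn't describe how dbXML builds its indexes, so the following is only a generic sketch of why a value index helps: it maps each indexed value (here, the text of an id attribute) straight to the documents that contain it, so a lookup doesn't have to scan every document. The ValueIndexSketch class and the second document name are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ValueIndexSketch {

    // Indexed value (e.g. the text of @id) -> names of documents holding it.
    private final Map<String, List<String>> byValue = new HashMap<>();

    // Called once per document when it is stored or the index is built.
    public void add(String docName, String value) {
        byValue.computeIfAbsent(value, v -> new ArrayList<>()).add(docName);
    }

    // A single hash lookup instead of a pass over the whole collection.
    public List<String> lookup(String value) {
        return byValue.getOrDefault(value, Collections.emptyList());
    }

    public static void main(String[] args) {
        ValueIndexSketch idIdx = new ValueIndexSketch();
        idIdx.add("myaddress", "1");
        idIdx.add("otheraddress", "2"); // hypothetical second document
        System.out.println(idIdx.lookup("1"));
    }
}
```

With only one or two documents the map buys nothing, which is the same caveat the paragraph above makes about small collections.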
Another noteworthy feature of dbXML is support for XML:DB XUpdate. XUpdate provides a simple update language for XML documents. This allows you to declaratively specify what changes should be made without worrying about the details of how the database makes the changes. For instance, using our sample document, if we wanted to change John's last name to "Herman" we could use this XUpdate document:
<xupdate:modifications version="1.0"
    xmlns:xupdate="http://www.xmldb.org/xupdate">
  <xupdate:update select="/address[@id = 1]/name/last">Herman</xupdate:update>
</xupdate:modifications>
This makes it easy to make changes to XML documents and in the case of dbXML you can use XUpdate to make changes to entire collections of XML documents. You can execute XUpdate modifications through the XML:DB API or using the dbXML command line tools. More XUpdate examples are available in the XUpdate use cases document.
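To see what an update like this amounts to, the sketch below performs the equivalent change by hand with the JDK's DOM and XPath classes. It is not an XUpdate processor and doesn't use dbXML; the UpdateEffectSketch class and its applyUpdate helper are hypothetical names, and the code only illustrates the effect of selecting nodes with an XPath expression and replacing their text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class UpdateEffectSketch {

    // The sample address document used throughout the article.
    public static final String SAMPLE =
        "<address id=\"1\"><name><first>John</first><last>Smith</last></name></address>";

    // Parses the document, selects nodes with the given XPath expression,
    // and replaces the text content of each selected node.
    public static Document applyUpdate(String xml, String select, String newText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList targets = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(select, doc, XPathConstants.NODESET);
            for (int i = 0; i < targets.getLength(); i++) {
                targets.item(i).setTextContent(newText);
            }
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Update failed", e);
        }
    }

    public static void main(String[] args) {
        Document updated = applyUpdate(SAMPLE, "/address[@id = 1]/name/last", "Herman");
        System.out.println(updated.getElementsByTagName("last").item(0).getTextContent());
    }
}
```

The appeal of XUpdate is that the select expression and the new value live in a declarative document, so none of this parsing and node handling has to be written in application code.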
If you're interested in developing applications for dbXML, much more detail is available in the dbXML developers guide.
Wrapping Up
In this article I've just skimmed the surface of dbXML functionality. I encourage you to find out whether it could be something useful to you. Like all native XML databases dbXML is just a tool. It will be right for some jobs and completely wrong for others, and like all tools the best way to find out if it works is to try it.
This is an exciting time for dbXML; it's on the verge of an initial production release and will soon be receiving a new name and a new home. Development of dbXML is coming under the stewardship of the Apache Software Foundation XML sub-project, and dbXML will be renamed Xindice in the process. The project has come a long way, and now is the best time to get involved to help shape the future of open source native XML database technology.
Comment on this Article
- general questions about dbXML
2003-12-15 12:08:17 Paul Willis [Reply]
Hi,
I am collecting info for a project in need of NXD support. The following questions I could not find answers to elsewhere. Thanks in advance for your support.
Concurrency Support:
1. Does dbXML support concurrent connections/access?
2. Is dbXML “thread-safe”? (multiple lightweight processes)
3. Is dbXML “JVM-safe”? (multiple heavyweight processes)
4. Does dbXML support locking of content?
5. What level of locking is supported? (e.g. None, Database, Collection, Document, Node/Element)
Access Control:
1. What access control schemes are supported? (e.g. read, write, modify, none)
2. At what level is the access control applied? (e.g database, collection, document, node/element)
3. How is the access control scheme managed? (e.g GUI, command line, API)
Paul W.
NGIT
- XPath query with Xindice-XMLRPC and perl
2002-06-20 02:33:22 Koos Fourie [Reply]
(Sorry for this long entry, this is probably not even the right forum to bring this up, but I'm desperate)
I've managed to connect to Xindice via XMLRPC with perl scripts such as following:
use XMLRPC::Lite;
print XMLRPC::Lite
-> proxy('http://localhost:4080/')
-> call('db.getDocument', '/db/addressbook', "123020")
-> result;
but I can't get the db.queryCollection method (using XPath) to work, as I don't know how to pass a hashtable through perl to Xindice-XMLRPC
thus, the following code doesn't work:
use XMLRPC::Lite;
print XMLRPC::Lite
-> proxy('http://localhost:4080/')
-> call('db.queryDocument', '/db/addressbook', 'XPath', '/item', 'new Hashtable\(\)')
-> result;
(where item is the name of the root node of documents in the collection)
This produces a warning:
Use of uninitialized value in print at query.pl line 2
I get my expected result through the xindice command line call:
xindice xpath -c /db/addressbook -q /item
so I expect my xpath query is correct. I'm obviously making some stupid mistake in trying to pass the empty hashtable (I don't even know what that is), and I don't have enough experience with java and perl to solve this, and not enou. Help me. Please.
- XPath query with Xindice-XMLRPC and perl
2002-06-24 02:08:11 Daniel Kröger [Reply]
Hello vleislollie (wow, this is a strange name),
try using
{X => "http://www.xmldb.org/xpath"} instead of your 'new Hashtable\(\)'........and no single quotes.
This should work.
ZeSolo
- ZeSolo for President !!!
2002-06-25 01:54:28 Koos Fourie [Reply]
Thanks ZeSolo! It works like a charm.
I was tinkering with Perl2UCS.pm in order to get some kind of connection to Java, but it seemed kinda stupid, seeing as perl already has a hashtable class. Also, you need to download a Netware client and development kit just to get to the module. Ignorance kills bandwidth.
- ASP or PHP?
2002-01-21 08:42:31 Karl Bauer [Reply]
Hi!
Maybe it is a stupid question, but:
Is it possible to access dbXML by ASP or PHP? My Java isn't very well ;-)
if yes, can you please post a little example??
THX!
- ASP or PHP?
2002-04-28 04:41:53 Daniel Kröger [Reply]
Hi,
it is possible to access Xindice from other languages using an available XML-RPC plugin (http://xindice-xmlrpc.sourceforge.net/).
The library has been tested with the following client languages.
Java
PHP
Perl
Applescript (Mac OS X)
Bye
ZeSolo
- ASP or PHP?
2002-06-14 02:52:42 Shane Dempsey [Reply]
HI,
Can somebody tell me what the PHP commands to access the Xindice database are? I have downloaded the XML-RPC plugin for PHP but cannot find any examples that use PHP to query a Xindice database.
Regards
Shane
- on macos x
2001-11-30 06:56:43 marcelo alfaro [Reply]
great!! i just got the binaries, configured them as explained (set two environment variables) and it works!! i have a question however: does the XPath query only work for the attributes of the root element of the document? the child attributes seem not to be considered when you query for them in a collection. example:
<product_list>
<product id="100">....</product>
<product id="102">....</product>
<product id="103">....</product>
</product_list>
if you query /product_list/product[@id="101"] get a 'no match' response...
anyway a great product.
- DBXML FEATURES....
2005-10-22 02:43:22 shuch [Reply]
Hi all,
I am trying to learn NXDs and I am new to dbXML. I am learning it from the site http://www.dbxml.com/docs/programmer-main.html. It's a very good site, but is it possible to get some explanation of the commands given there? For example, for the dbXML security command there is a command like USER only; I want to see how to write it EXACTLY on the command prompt. I also want to know how to put data inside dbXML using the command prompt only. So, basically, if someone knows any site or has any documents from which I can learn how to write the different commands, that would be very BENEFICIAL for me.
Thanks in advance.
- on macos x
2001-12-02 01:25:34 Kimbro Staken [Reply]
Seems the problem may be that there is no id = 101 in the file. :-) Other than that, I tested your example on Mac OS X and it works fine. BTW, we do most of our development on Mac OS X now.
Kimbro
- XQuery in XMLdb
2001-11-29 01:10:15 Antoine Quint [Reply]
This all sounds damn good. Querying with XPath is already quite handy, and the XUpdate looks like it is doing some of what XQuery can do, but is there a schedule for an XQuery implementation just yet? Thanks for providing an open project like this one.
Antoine
- XQuery in XMLdb
2001-12-02 01:27:57 Kimbro Staken [Reply]
There isn't any schedule yet, but it certainly will come up once the Apache Xindice project gets into full swing. Of course we'll need developer help to get XQuery going.
Kimbro
