XML.com: XML From the Inside Out
oreilly.comSafari Bookshelf.Conferences.

advertisement

Storing XML in Relational Databases
by Igor Dayen | Pages: 1, 2, 3, 4

Sybase Adaptive Server

SQL to XML Mapping

Sybase employs an XML document type, ResultSet, to describe both XML document metadata (element name, type, size, etc.) and actual row data. Here is excerpt from a hypothetical FxTradeSet.xml:

<?xml version="1.0"?>    
<!DOCTYPE ResultSet SYSTEM "ResultSet.dtd">    
<ResultSet>    
    <ResultSetMetaData>    
        <ColumnMetaData    
        ...
        getColumnLabel="CURRENCY1"    
        getColumnName="CURRENCY1"    
        getColumnType="12"    
         ... />    
   ...    
   </ResultSetMetaData>    
   <ResultSetData>    
        <Row>    
           <Column name="CURRENCY1">GBP</Column>    
        ...    
        </Row>    
    </ResultSetData>    
</ResultSet>    

The ResultSet DTD does not appear to permit definition of the nested elements.

Extracting XML from the database

The Java class ResultSetXml has a constructor which takes an SQL query as an argument, thereafter the getXmlLText method extracts an XML document from result set:

jcs.xml.resultset.ResultSetXml     rsx = new jcs.xml.resultset.ResultSetXml    
           ("Select * from FxTrade", <other parameters>);    
FileUtil.string2File ("FxTradeSet.xml", rsx.getXmlText());    

Storing XML in the database

The ResultSetXml class constructor can also take an XML document as an argument. Then the method toSqlScript generates sequences of SQL statements for insert/update into a specified table from a result set.

String    xmlString = FileUtil.file2string ("FxTradeSet.xml");    
jcs.xml.resultset.ResultSetXml     rsx = new jcs.xml.resultset.ResultSetXml    
           (xmlString);    
String     sqlString  = rsx.toSqlScript ("FxTrade", <other parameters>)    

Summary

The extraction and storage of XML documents are symmetrical in nature. Storing does not seem to permit modifying more than one table. Extraction transforms the results of an SQL query into a document with a flat structure.

Vendor Comparison

Vendor

Mapping rules

Single table / Multiple tables

Means of transformation

Symmetrical extraction / storing

Oracle

Implicitly; by constructing object-relational data model

Multiple

Designated Java classes

Symmetrical, if XML document and object-relational model match

IBM

Data Access Definition file

Multiple

Designated stored procedures

Symmetrical

Microsoft

SQL extension; row set function

Multiple for extraction;

Single for storing

By using SQL construct FOR XML and row set OPENXML

Asymmetrical

Sybase

Result Set DTD

Single; query may encompass multiple tables

By using Java classes

Symmetrical

Common features between vendors:

  1. XML persistence is achieved on an ad hoc basis, i.e., there's no general facility for storing arbitrary XML documents); special mapping is required if the XML document uses a new grammar;
  2. storage often requires data preprocessing, such as number/date reformatting with respect to the locale being used; XSLT can be utilized to perform XML data massaging.

An Alternative Strategy

Can the task of XML document storage in a database be split into stages? First, store XML in generic database structures without applying any mapping rules; and, second, applying a framework for subsequent data processing? The advantage of this approach would be to have a universal acquisition agent for virtually any XML document. The remainder of this article proposes just such a solution.

In this strategy, XML documents are saved as normalized tree structures -- raw tree data -- for further application processing. The framework for further data processing is then a set of database objects (stored procedures), similar to DOM APIs, for navigating data by employing traditional SQL processing.

There are advantages to using a framework approach as opposed to full automation: the population of database tables is normally controlled by application logic distributed between "pure" application code, transaction processing, database layer, and stored procedures, not to mention that some tables may reside in remote databases.

Additionally, in a heterogeneous database environment, having a unified tool to perform the same task for different databases has obvious advantages.

The author's solution, ObjectCentric Solutions x-Persistence Utility, implements the strategy discussed here.

Raw tree structures are defined by means of the following tables.

1) Tree Node Definition    

CREATE TABLE NodeDef  (
   nodeID       udd_id         NOT NULL, // unique node id    
   dimID        udd_id         NOT NULL, // tree category: XML    
   name         udd_name       NOT NULL, // node name    
   value        udd_paramvalue NULL,     // node value    
   value_type   udd_type       NULL,     // value type    
   power        udd_power      NULL,     // node tree level    
   isRoot       udd_logical    NOT NULL, // top node flag    
   descriptor   udd_descriptor NULL,     // DOM Node type     
   color        udd_color      NULL      // informal data    
)    

2) Parent-Child relationships between nodes    

CREATE TABLE NodeLinks    (    
   parentID     udd_id     NOT NULL,   // parent node    
   childID      udd_id     NOT NULL    // child node    
)    

Example usage

Storing an XML document in a database is a matter of invoking the XML2SQL program:

XML2SQL  <XML document filename>

Extracting an XML document from a database as a tree structure is implemented using the following stored procedures:

  • get_tree_roots <tree category> -- extract all document root ids for a given forest
  • get_tree <root id> -- extract all nodes for a given root (document)
  • get_children <node, level> -- extract all children of a certain level for a given node

Implementation details:

  • Current platform coverage: Sybase, MS SQL Server.
  • Platforms being evaluated: Oracle, DB2, and PostgreSQL.
  • The utility is built on top of the Xerces XML Parser.
  • Database installation involves the addition of just a few tables, and stored procedures to navigate the database tables as tree structures.

The x-Persistence Utility is currently used as the core of an XML-based data acquisition agent in Open Finance Laboratory, a customizable integration platform for a range of financial applications such as portfolio management, asset-liability management, risk management, etc. Open Finance Laboratory adopted FpML as a descriptive language for Interest Rate Derivatives products. x-Persistence Utility is currently used as a FpML/FRA data acquisition agent.

References



1 to 20 of 20
  1. xml to oracle
    2010-04-07 06:49:58 w_ing
  2. SportsFeed Xml Processor
    2008-01-22 02:18:47 twostepted
  3. Please rewrite and update this article!
    2006-06-14 16:42:42 craign1
  4. Importing a Data Array from XML to DB
    2005-11-10 18:25:41 bdrudolph
  5. HELP! newbie needing to do something tricky...
    2005-04-08 15:08:16 RacerZack
  6. how to measure performance of tow diff xml databases
    2005-01-03 08:43:46 amolee
  7. XML document to be decomposed into a collection of xml subdocuments, using XSLT ?
    2003-08-26 09:36:11 punam verma
  8. Storing XML in Relational Databases
    2003-08-21 06:07:40 Barry Schaeffer
  9. how to transfer data from xml document to database tables?
    2003-03-25 09:38:43 manas mallik
  10. Problem: Large Tables mapped to XML Documents result in large XML Documents. Any solutions for this problem?
    2002-11-21 05:46:12 stefan fischerfeier
  11. How to store XML data into relational database
    2002-09-18 04:05:12 cvinoth Chidambara
  12. Please state version of Oracle
    2001-07-24 13:25:32 Mark Drake
  13. Do you have any rough perofrmance comparation among these DBs ?
    2001-07-12 16:34:24 jun wang
  14. obvious problem with this approach
    2001-07-03 10:58:16 K Vainstein
  15. XML/SQL components
    2001-06-28 13:51:52 Cory Casanave
  16. What about XML Extensibility?
    2001-06-25 12:33:53 Vittorio Viarengo
  17. Generic approaches to storing XML in RDBs
    2001-06-22 05:25:53 Richard Edwards
  18. Oracle Internet File System
    2001-06-22 03:41:25 Peter Bull
  19. XTM Topic Maps in relational database
    2001-06-20 21:28:32 Jack Park
  20. XMLDB
    2001-06-20 20:42:00 Pete Ryland
1 to 20 of 20