XML.com: XML From the Inside Out
oreilly.comSafari Bookshelf.Conferences.

advertisement

Storing XML in Relational Databases
by Igor Dayen | Pages: 1, 2, 3, 4

IBM DB2 XML Extender

SQL to XML Mapping

IBM's XML Extender provides two access and storage methods for using DB2 as an XML repository:

  1. XML column: stores and retrieves entire XML documents as DB2 column data
  2. XML collection: decomposes XML documents into a collection of relational tables, or composes XML documents from a collection of relational tables.

DTDs are stored in the DTD repository, a DB2 table called DTD_REF; it's schema name is "db2xml". Each DTD in the DTD_REF table has a unique ID. The mapping between the database tables and the structure of the XML document is defined by means of a Data Access Definition (DAD) file. DAD refers to a processed document DTD, thus providing a bridge between an XML document, its DTD, and mapping rules onto database tables.

Here's an example DAD.

<?xml version="1.0"?>
<!DOCTYPE    DAD    SYSTEM    "dad.dtd">    
<DAD>    
    <dtdid>FXTRADE.DTD</dtdid>    
    <validation>YES</validation>    
    <Xcollection>    
        <prolog>?xml version="1.0"?</prolog>    
        <doctype>!DOCTYPE     FXTRADE       FXTRADE.DTD </doctype>    
        <root_node>    
   <element_node     name="FXTRADE">    
       <RDB_node>    
           <table name="FXTRADE"/>    
           <table name="ACCOUNT" key="ID"/>    
           <condition>    
      FXTRADE.ACCOUNT=ACCOUNT.ID    
           </condition>    
       </RDB_node>    
       <element_node name="CURRENCY1">    
               <text_node>    
           <RDB_node>    
                 <table name="FXTRADE"/>    
                 <column name="CURRENCY1" type="CHAR(3)"/>    
           </RDB_node>    
                </text_node>
       </element_node>
       <element_node name="CURRENCY2">
               <text_node>
           <RDB_node>
                 <table name="FXTRADE"/>
                 <column name="CURRENCY2" type="CHAR(3)"/>
           </RDB_node>
              </text_node>
       </element_node>
       <element_node name="AMOUNT">
               <text_node>
           <RDB_node>
                 <table name="FXTRADE"/>
                 <column name="AMOUNT" type="DECIMAL(18,2)"/>
           </RDB_node>
               </text_node>
       </element_node>
       <element_node name="SETTLEMENT">
             <text_node>
           <RDB_node>
                 <table name="FXTRADE"/>
                 <column name="SETTLEMENT" type="DATE"/>
           </RDB_node>
              </text_node>
       </element_node>
       <element_node name="ACCOUNT"> 
           <element_node name="BANKCODE">
               <text_node>
                   <RDB_node>
                       <table name="ACCOUNT"/>
                       <column name="BANKCODE"
                           type="VARCHAR(100)"/>
                   </RDB_node>
               </text_node>
           </element_node>
           <element_node name="BANKACCT">
               <text_node>
                   <RDB_node>
                        <table name="ACCOUNT"/>
                        <column name="BANKACCT"
                            type="VARCHAR(100)"/> 
                    </RDB_node>
               </text_node>
           </element_node>
       </element_node> <!--end of  Account element-->
   </element_node>    <!-- end of  FxTrade element -->
        </root_node>
    </Xcollection>
</DAD> 

DAD defines a mapping between XML elements and relational database columns by employing element_node to RDB_node associations. The top-level element_node FXTRADE is defined as a join between tables FXTRADE and ACCOUNT, with field ID in ACCOUNT table as a primary key. Child element CURRENCY1 is mapped onto field CURRENCY1 in the table FXTRADE, and so on. Note that the ACCOUNT element does not include any RDB_node -- this is not required since the link between ACCOUNT and FXTRADE tables has been defined earlier. Child elements of ACCOUNT, BANCCODE, and BANKACCT respectively are defined in the corresponding columns of ACCOUNT table. Atomic XML elements are tagged in DAD as text_node. In the example above all elements, except FXTRADE and ACCOUNT, are atomic.

Extracting XML from the database

XML document composition and decomposition are handled by stored procedures: stored procedure dxxGenXML() extracts XML documents from a database; stored procedure dxxShredXML() stores XML documents in a database.

The main input parameters of dxxGenXML() are

  • DAD: stored in form of DB2 CLOB data type;
  • result table name: the constructed XML document is forwarded to this table.

Other input parameters specify the maximum number of rows returned, and <RDB_node> <condition> element override options. The output parameters include the actual number of returned rows, return code, and return message.

An extensive example of stored procedures within a C program can be found in "IBM DB2 Universal Database XML Extender Administration and Programming, Version 7".

Storing XML in the database

Putting an XML document into a database is accomplished by dxxShredXML(), a stored procedure.

The input parameters of dxxShredXML() are

  • DAD: stored in the form of DB2 CLOB data type;
  • input XML document: stored in the form of DB2 XMLCLOB data type.

Its output parameters are a return code and return message.

Summary

The XML-SQL mapping rules are specified by means of a Data Access Definition (DAD) file, which is an XML document. The DB2 XML Extender administration facilities include a means to construct DAD files for each persistent DTD.

Future enhancements will include a new XML-SQL transformation syntax, which will employ an XML Transforms language, a subset of W3C XSLT.

Pages: 1, 2, 3, 4

Next Pagearrow