Storing XML in Relational Databases
by Igor Dayen
|
Pages: 1, 2, 3, 4
Sybase Adaptive Server
SQL to XML Mapping
Sybase employs an XML document type, ResultSet, to describe both XML document metadata (element name, type, size, etc.) and actual row data. Here is excerpt from a hypothetical FxTradeSet.xml:
<?xml version="1.0"?>
<!DOCTYPE ResultSet SYSTEM "ResultSet.dtd">
<ResultSet>
<ResultSetMetaData>
<ColumnMetaData
...
getColumnLabel="CURRENCY1"
getColumnName="CURRENCY1"
getColumnType="12"
... />
...
</ResultSetMetaData>
<ResultSetData>
<Row>
<Column name="CURRENCY1">GBP</Column>
...
</Row>
</ResultSetData>
</ResultSet>
|
The ResultSet DTD does not appear to permit definition of the nested elements.
Extracting XML from the database
The Java class ResultSetXml has a constructor which
takes an SQL query as an argument, thereafter the
getXmlLText method extracts an XML document from result
set:
jcs.xml.resultset.ResultSetXml rsx = new jcs.xml.resultset.ResultSetXml
("Select * from FxTrade", <other parameters>);
FileUtil.string2File ("FxTradeSet.xml", rsx.getXmlText());
|
Storing XML in the database
The ResultSetXml class constructor can also take an
XML document as an argument. Then the method toSqlScript
generates sequences of SQL statements for insert/update into a
specified table from a result set.
String xmlString = FileUtil.file2string ("FxTradeSet.xml");
jcs.xml.resultset.ResultSetXml rsx = new jcs.xml.resultset.ResultSetXml
(xmlString);
String sqlString = rsx.toSqlScript ("FxTrade", <other parameters>)
|
Summary
The extraction and storage of XML documents are symmetrical in nature. Storing does not seem to permit modifying more than one table. Extraction transforms the results of an SQL query into a document with a flat structure.
Vendor Comparison
|
Vendor |
Mapping rules |
Single table / Multiple tables |
Means of transformation |
Symmetrical extraction / storing |
|
Oracle |
Implicitly; by constructing object-relational data model |
Multiple |
Designated Java classes |
Symmetrical, if XML document and object-relational model match |
|
IBM |
Data Access Definition file |
Multiple |
Designated stored procedures |
Symmetrical |
|
Microsoft |
SQL extension; row set function |
Multiple for extraction; Single for storing |
By using SQL construct FOR XML and row set OPENXML |
Asymmetrical |
|
Sybase |
Result Set DTD |
Single; query may encompass multiple tables |
By using Java classes |
Symmetrical |
Common features between vendors:
- XML persistence is achieved on an ad hoc basis, i.e., there's no general facility for storing arbitrary XML documents); special mapping is required if the XML document uses a new grammar;
- storage often requires data preprocessing, such as number/date reformatting with respect to the locale being used; XSLT can be utilized to perform XML data massaging.
An Alternative Strategy
Can the task of XML document storage in a database be split into stages? First, store XML in generic database structures without applying any mapping rules; and, second, applying a framework for subsequent data processing? The advantage of this approach would be to have a universal acquisition agent for virtually any XML document. The remainder of this article proposes just such a solution.
In this strategy, XML documents are saved as normalized tree structures -- raw tree data -- for further application processing. The framework for further data processing is then a set of database objects (stored procedures), similar to DOM APIs, for navigating data by employing traditional SQL processing.
There are advantages to using a framework approach as opposed to full automation: the population of database tables is normally controlled by application logic distributed between "pure" application code, transaction processing, database layer, and stored procedures, not to mention that some tables may reside in remote databases.
Additionally, in a heterogeneous database environment, having a unified tool to perform the same task for different databases has obvious advantages.
The author's solution, ObjectCentric Solutions x-Persistence Utility, implements the strategy discussed here.
Raw tree structures are defined by means of the following tables.
1) Tree Node Definition CREATE TABLE NodeDef ( nodeID udd_id NOT NULL, // unique node id dimID udd_id NOT NULL, // tree category: XML name udd_name NOT NULL, // node name value udd_paramvalue NULL, // node value value_type udd_type NULL, // value type power udd_power NULL, // node tree level isRoot udd_logical NOT NULL, // top node flag descriptor udd_descriptor NULL, // DOM Node type color udd_color NULL // informal data ) 2) Parent-Child relationships between nodes CREATE TABLE NodeLinks ( parentID udd_id NOT NULL, // parent node childID udd_id NOT NULL // child node ) |
Example usage
Storing an XML document in a database is a matter of invoking the XML2SQL program:
XML2SQL <XML document filename>
Extracting an XML document from a database as a tree structure is implemented using the following stored procedures:
get_tree_roots<tree category> -- extract all document root ids for a given forestget_tree<root id> -- extract all nodes for a given root (document)get_children<node, level> -- extract all children of a certain level for a given node
Implementation details:
- Current platform coverage: Sybase, MS SQL Server.
- Platforms being evaluated: Oracle, DB2, and PostgreSQL.
- The utility is built on top of the Xerces XML Parser.
- Database installation involves the addition of just a few tables, and stored procedures to navigate the database tables as tree structures.
The x-Persistence Utility is currently used as the core of an XML-based data acquisition agent in Open Finance Laboratory, a customizable integration platform for a range of financial applications such as portfolio management, asset-liability management, risk management, etc. Open Finance Laboratory adopted FpML as a descriptive language for Interest Rate Derivatives products. x-Persistence Utility is currently used as a FpML/FRA data acquisition agent.
References
- Oracle XML-SQL Utility, http://otn.oracle.com/tech/xml/oracle_xsu
- IBM DB2 XML Extender, www.ibm.com/software/data/db2/extenders/xmlext
- XML Perspective. In control with FOR XML Explicit. SQL Server Magazine, http://msdn.mcrosoft.com/library/periodic/period01/xmlExplicit.htm
- Writing XML Using OPENXML
- OPENXML
- Technology Preview: x-Persistence Utility, ObjectCentric Solutions, Inc., 2001, www.objcentric.com
- xml to oracle
2010-04-07 06:49:58 w_ing - SportsFeed Xml Processor
2008-01-22 02:18:47 twostepted - Please rewrite and update this article!
2006-06-14 16:42:42 craign1 - Importing a Data Array from XML to DB
2005-11-10 18:25:41 bdrudolph - Importing a Data Array from XML to DB
2005-11-10 18:28:03 bdrudolph - Importing a Data Array from XML to DB
2005-12-04 11:14:16 Igor Dayen - Storing .xml files into database
2007-08-21 04:01:35 santoshlocuz - Storing .xml files into database
2007-08-26 01:18:02 John1980 - Importing a Data Array from XML to DB
2007-08-21 04:00:59 santoshlocuz - Importing a Data Array from XML to DB
2007-08-21 19:26:35 Igor Dayen - Importing a Data Array from XML to DB
2005-12-04 11:06:38 Igor Dayen - HELP! newbie needing to do something tricky...
2005-04-08 15:08:16 RacerZack - HELP! newbie needing to do something tricky...
2005-04-09 09:55:03 Igor Dayen - how to measure performance of tow diff xml databases
2005-01-03 08:43:46 amolee - how to measure performance of tow diff xml databases
2005-02-28 19:34:40 Igor Dayen - how to measure performance of tow diff xml databases
2005-01-03 20:27:21 Igor Dayen - how to measure performance of tow diff xml databases
2005-02-27 14:40:36 amolee - XML document to be decomposed into a collection of xml subdocuments, using XSLT ?
2003-08-26 09:36:11 punam verma - XML document to be decomposed into a collection of xml subdocuments, using XSLT ?
2003-08-28 15:40:23 punam verma - XML document to be decomposed into a collection of xml subdocuments, using XSLT ?
2003-08-26 19:42:06 Igor Dayen - Storing XML in Relational Databases
2003-08-21 06:07:40 Barry Schaeffer - Storing XML in Relational Databases
2003-08-23 08:17:52 Igor Dayen - Storing XML in Relational Databases
2004-08-23 23:38:02 query - Storing XML in Relational Databases
2004-08-24 19:09:57 Igor Dayen - how to transfer data from xml document to database tables?
2003-03-25 09:38:43 manas mallik - how to transfer data from xml document to oracle database tables?
2004-08-20 05:14:40 Deepthi9 - how to transfer data from xml document to oracle database tables?
2004-08-21 10:42:49 Igor Dayen - how to transfer data from xml document to database tables?
2003-07-14 19:58:21 Igor Dayen - Problem: Large Tables mapped to XML Documents result in large XML Documents. Any solutions for this problem?
2002-11-21 05:46:12 stefan fischerfeier - How to store XML data into relational database
2002-09-18 04:05:12 cvinoth Chidambara - How to store XML data into relational database
2006-10-13 07:34:06 pvaidya - How to store XML data into relational database
2003-07-14 13:56:18 dave Mack - How to store XML data into relational database
2003-09-29 00:51:29 thandar win - How to store XML data into relational database
2007-10-17 04:33:48 picchii - Please state version of Oracle
2001-07-24 13:25:32 Mark Drake - Please state version of Oracle
2001-07-25 06:32:21 Igor Dayen - Do you have any rough perofrmance comparation among these DBs ?
2001-07-12 16:34:24 jun wang - Do you have any rough perofrmance comparation among these DBs ?
2001-07-18 21:14:05 Igor Dayen - obvious problem with this approach
2001-07-03 10:58:16 K Vainstein - obvious problem with this approach
2001-07-18 20:53:23 Igor Dayen - XML/SQL components
2001-06-28 13:51:52 Cory Casanave - What about XML Extensibility?
2001-06-25 12:33:53 Vittorio Viarengo - What about XML Extensibility?
2001-06-26 08:53:18 jack smith - What about XML Extensibility?
2001-06-26 04:33:25 Martin Vanderjagt - What about XML Extensibility?
2001-06-26 11:54:55 Vittorio Viarengo - Generic approaches to storing XML in RDBs
2001-06-22 05:25:53 Richard Edwards - Oracle Internet File System
2001-06-22 03:41:25 Peter Bull - XTM Topic Maps in relational database
2001-06-20 21:28:32 Jack Park - XMLDB
2001-06-20 20:42:00 Pete Ryland