This is a very good article that gives more insight into present developments. However, it would be more useful if it explained how a 'schema' and 'data' are represented in raw form. For example, we know what data and index files are stored in an ISAM structure.
http://www.informatik.uni-trier.de/~ley/db/conf/vldb/vldb2001.html#CooperSFHS01
The article by
Brian Cooper, Neal Sample, Michael J. Franklin, Gísli R. Hjaltason, Moshe Shadmon:
A Fast Index for Semistructured Data. 341-350
or a link to the PDF:
http://www.vldb.org/conf/2001/P341.pdf
It was very useful for me and my exams :)