Structured Writing, Structured Search
by Jon Udell
Since I couldn't parameterize the query string, I left it as a placeholder (match="query") and looked to DOM manipulation as a way to reach in and change it. Here's what I came up with:
<html>
<head>
<link rel="stylesheet" type="text/css" href="style.css"/>
<script>
var xslurl = 't.xsl';
var xmlurl = 'ml.xml';
function transform(queryText)
{
  // Dispatch to a browser-specific implementation based on navigator.appName.
  var appName = navigator.appName;
  var appVersion = navigator.appVersion;
  if (appName == 'Netscape')
  {
    MOZtransform(queryText);
    return;
  }
  if (appName == 'Microsoft Internet Explorer')
  {
    IEtransform(queryText);
    return;
  }
  alert('unsupported: ' + appName + ', ' + appVersion);
}
function MOZtransform(queryText)
{
  var xsl;
  var xml;
  try
  {
    // Load the stylesheet synchronously into an XML DOM.
    xsl = document.implementation.createDocument("", "xslt", null);
    xsl.async = false;
    xsl.load(xslurl);
    // The second template is the match="query" placeholder;
    // point it at the XPath query instead.
    var queryTemplate = xsl.getElementsByTagName('template')[1];
    queryTemplate.setAttribute('match', queryText);
  }
  catch(e)
  {
    alert('error: modify xsl: ' + e.message);
  }
  try
  {
    // Load the SlideML data into a second XML DOM.
    xml = document.implementation.createDocument("", "xml", null);
    xml.async = false;
    xml.load(xmlurl);
  }
  catch(e)
  {
    alert('error: load xml: ' + e.message);
  }
  try
  {
    // Apply the modified stylesheet and put the results in the first DIV.
    var xslp = new XSLTProcessor();
    xslp.importStylesheet(xsl);
    var results = xslp.transformToFragment(xml, document);
    var resultDiv = document.getElementsByTagName('div')[0];
    resultDiv.innerHTML = '';
    resultDiv.appendChild(results);
    // Echo the query into the input box so it can be edited.
    document.queryBox.q.value = queryText;
  }
  catch(e)
  {
    alert('error: do xslt: ' + e.message);
  }
}
function IEtransform(queryText)
{
  var xsl;
  var xml;
  try
  {
    // Load the stylesheet synchronously into a free-threaded MSXML DOM,
    // which is what XSLTemplate requires.
    xsl = new ActiveXObject("MSXML2.FreeThreadedDOMDocument");
    xsl.async = false;
    xsl.load(xslurl);
    // Find the match="query" placeholder and point it at the XPath query.
    var xsldoc = xsl.documentElement;
    var nodelist = xsldoc.selectNodes('//*[@match="query"]');
    var queryTemplate = nodelist.item(0);
    queryTemplate.setAttribute('match', queryText);
  }
  catch(e)
  {
    alert('error: modify xsl: ' + e.description);
  }
  try
  {
    // Load the SlideML data into a second XML DOM.
    xml = new ActiveXObject("MSXML.DOMDocument");
    xml.async = false;
    xml.load(xmlurl);
  }
  catch(e)
  {
    alert('error: load xml: ' + e.description);
  }
  try
  {
    // Apply the modified stylesheet and put the results in the first DIV.
    var templ = new ActiveXObject("MSXML2.XSLTemplate");
    templ.stylesheet = xsl;
    var xslp = templ.createProcessor();
    xslp.input = xml;
    xslp.transform();
    var results = xslp.output;
    var resultDiv = document.getElementsByTagName('div')[0];
    resultDiv.innerHTML = results;
    // Echo the query into the input box so it can be edited.
    document.queryBox.q.value = queryText;
  }
  catch(e)
  {
    alert('error: do xslt: ' + e.description);
  }
}
</script>
</head>
<body>
<table>
  <tr>
    <td>choose xpath query from list</td>
    <td>enter or modify xpath query</td>
  </tr>
  <tr><td>
    <form name="queryList" method="post">
      <select name="q"
        onChange="javascript:transform(document.queryList.q.value)">
        <option value="/">choose your query</option>
        <option value="//s:title[contains( . , 'SlideML')]">
          slide titles containing 'SlideML'</option>
        <option value="//img">
          image references</option>
        <option value="//img[contains(@src, 'zope')]">
          image references containing 'zope'</option>
        <option value="//p[contains(. , 'OpenOffice')]">
          paragraphs containing 'OpenOffice'</option>
        <option value="//*[@class='code']">
          elements with class='code'</option>
        <option value="//*[@class='code' and contains(@id, 'python')]">
          elements with class='code' and id containing 'python'</option>
        <option value="//a[contains(@href , 'bray')]">
          links with URL containing 'bray'</option>
        <option value="//a[contains(./text() , 'bray')]">
          links with text containing 'bray'</option>
        <option value="//a[contains( translate (
          text(), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
          'abcdefghijklmnopqrstuvwxyz'), 'bray')]">
          links with text containing 'bray', case-insensitive</option>
      </select>
    </form>
  </td><td>
    <form name="queryBox" method="post"
      action="javascript:transform(document.queryBox.q.value)">
      <input name="q" size="60">
    </form>
  </td></tr></table>
<div class="results">
</div>
</body>
</html>
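The last entry in the query list shows the least obvious pattern: XPath 1.0 has no case-insensitive comparison, so the query folds case with translate() before calling contains(). As a rough sketch, a small helper could generate that predicate rather than having it typed by hand. The name containsIgnoreCase is hypothetical and is not part of the page above, and the helper only folds the 26 ASCII letters, just as the hand-written query does.

```javascript
var UPPER = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
var LOWER = 'abcdefghijklmnopqrstuvwxyz';

// Hypothetical helper: build an XPath 1.0 test that is true when the string
// value of nodeExpr contains needle, ignoring ASCII case.
function containsIgnoreCase(nodeExpr, needle) {
  // XPath 1.0 string literals have no escape syntax, so this sketch simply
  // refuses needles that would break out of the single-quoted literal.
  if (needle.indexOf("'") !== -1) {
    throw new Error('needle must not contain a single quote');
  }
  return "contains(translate(" + nodeExpr + ", '" + UPPER + "', '" + LOWER +
         "'), '" + needle.toLowerCase() + "')";
}

// Rebuild the last query in the list: links whose text contains 'bray'.
var query = '//a[' + containsIgnoreCase('text()', 'Bray') + ']';
```

The resulting string can be handed to transform() like any other entry in the list.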
In slightly different ways, MSIE and Mozilla are following the same recipe:
1. Load the stylesheet into an XML DOM.
2. Find the <xsl:template match="query"> element.
3. Reset the value of its match attribute to the XPath string obtained from one or the other of the UI widgets.
4. Load the package of SlideML data into another XML DOM.
5. Create an XSLT processor.
6. Apply the modified XSLT to the SlideML data.
7. Replace a DIV element with the search results.
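The two branches handle the second and third steps of the recipe (find the placeholder template, reset its match attribute) differently: IEtransform finds the placeholder by its attribute value, while MOZtransform takes the second template by position. The following is a minimal sketch, not part of the page above, of how the lookup could go by attribute value using only W3C DOM calls. The helper name setQueryMatch is hypothetical, and it assumes the stylesheet DOM implements getElementsByTagNameNS, so it is aimed at the Mozilla branch.

```javascript
// Namespace URI shared by all XSLT 1.0 stylesheet elements.
var XSL_NS = 'http://www.w3.org/1999/XSL/Transform';

// Hypothetical helper: find the <xsl:template match="query"> placeholder by
// its attribute value and point it at a new XPath expression.
// Returns true if the placeholder was found and changed, false otherwise.
function setQueryMatch(xslDoc, queryText) {
  var templates = xslDoc.getElementsByTagNameNS(XSL_NS, 'template');
  for (var i = 0; i < templates.length; i++) {
    if (templates[i].getAttribute('match') === 'query') {
      templates[i].setAttribute('match', queryText);
      return true;
    }
  }
  return false;
}
```

In MOZtransform this could stand in for the two lines that index into getElementsByTagName('template'). Because the stylesheet is reloaded on every call, the placeholder is always there to be found, and a false return signals a stylesheet that lacks it.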
Using XPath Search
From a user's point of view, XPath query strings are pretty darned geeky. I'm hopeless with them myself unless I have examples in front of me. I find that having a list of examples available in the context of my own live data, and synchronizing it to an input box in which examples can be modified, leads me to discover and record more useful patterns. A subtler thing happens too. As you're writing the XHTML, the search possibilities begin to guide your choices.
For example, I chose a very simple markup strategy for the slideshow. Rather than go with complex outlining, I decided that I really only needed two levels of indentation. I attached those levels to <p> and <div>. For purposes of indentation, it didn't matter whether I wrote like this:
<p>...</p>
<div>...</div>
<div>...</div>
Or like this:
<p>
  <div>...</div>
  <div>...</div>
</p>
I chose the latter style because I sensed that I wanted a <p> to enclose a complete thought. That was a somewhat abstract notion, but it suddenly became crystal clear when I made a simple change to the XSLT stylesheet. The change was from
<xsl:value-of select="."/>
to
<xsl:copy-of select="."/>
In other words, instead of simply dumping the text of the found element -- which is what search engines almost universally do, since they can't rely on the markup in the text they find -- this engine returns well-formed fragments. Images display as images, links as proper links, tables as tables, and when the query says "find a paragraph that contains" the result is the complete XHTML paragraph element, rendered as it is in its original context.
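The difference between the two instructions can be illustrated outside XSLT. The sketch below is only an analogy in plain JavaScript: it uses a toy tree of objects rather than a real DOM, and the sample paragraph, the href value, and the function names stringValue, serialize, and escapeXml are all made up for illustration. stringValue mimics what xsl:value-of produces (the text with the markup thrown away), and serialize shows roughly what survives when xsl:copy-of copies the node itself.

```javascript
// A toy element tree standing in for a found XHTML paragraph.
var found = {
  name: 'p',
  attrs: {},
  children: [
    'See ',
    { name: 'a', attrs: { href: 'slides.html' }, children: ['the slides'] },
    ' for more.'
  ]
};

// Escape the characters that are special in XML text and attribute values.
function escapeXml(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;')
          .replace(/>/g, '&gt;').replace(/"/g, '&quot;');
}

// Like xsl:value-of: concatenate all descendant text, dropping the markup.
function stringValue(node) {
  if (typeof node === 'string') return node;
  return node.children.map(stringValue).join('');
}

// Like xsl:copy-of: keep the element structure, shown here as markup.
function serialize(node) {
  if (typeof node === 'string') return escapeXml(node);
  var attrs = Object.keys(node.attrs).map(function (k) {
    return ' ' + k + '="' + escapeXml(node.attrs[k]) + '"';
  }).join('');
  return '<' + node.name + attrs + '>' +
         node.children.map(serialize).join('') +
         '</' + node.name + '>';
}

var asText = stringValue(found);  // 'See the slides for more.'
var asMarkup = serialize(found);  // '<p>See <a href="slides.html">the slides</a> for more.</p>'
```

With the text-only form the link is lost. With the copied form the browser can render it as a link inside its paragraph.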
Sooner or later, I'll be using a real XML database to enjoy this level of control over the XHTML content I post to my weblog and that others post to theirs. With a little luck, I won't have to provide that service myself. Somebody will build one that latches onto my XHTML feed and others. Meanwhile, being lazy and having some RAM to spare, I'll probably see how far I can push this serverless approach.
Comments on this article
- Great; SlideML and Namespaces; Wysiwyg Editor
2003-06-13 17:30:22 Roger Fischer [Reply]
Hi Jon,
Thanks for exploring SlideML in detail, we very much appreciate it.
As for the namespaces: you only need three:
- SlideML Namespace
- XHTML Namespace (default namespace)
- Dublin Core Namespace
The other three are only necessary if you really need them. If you don't use XLink or do any other "crazy" thing, you won't need them.
***
A Wysiwyg XML/XHTML Editor
Christian has started working together with Conor (of EDom/Mozile). The aim: extend EDom and get the Bitflux Editor NG ready by the end of July (with Schema and all).
If you have any good suggestions for what we should do, let us know. Write to us at flux@bitflux.ch
Best and thanks again
Roger
- disaggregating content
2003-06-12 16:20:45 Lucas Fletcher [Reply]
The great thing about XHTML + XPath is it disaggregates content at the page level in the same way that RESTful web services disaggregate content at the URI level. XPointer was supposed to do this, but all you really need to do is add an xpath parameter to the querystring of an XHTML doc.
I still think structuring content with XML is better than XHTML in general. (Bold and italics markup, and perhaps even the occasional XHTML island, will still need to be in the XML, of course.) True, XHTML structuring requires less of the creator, so I can see why you are hopeful that it is a way creators will actually produce something somewhat structured. But I think the majority of us need structure imposed beforehand, and don't have the discipline to create it on the fly, even if it is less work. Sort of like the benefits of iambic pentameter over free verse...
With XML structuring, XSLT transformations need to happen on the client for things to really open up to 3rd parties, so I'm glad to see you promoting it.
Winfield
http://dealersinnotions.com
