Introducing OpenSearch
by Uche Ogbuji
URL Templates
Take another look at the Url element in Listing 1, which tells the search client how to query the search engine. The template attribute looks like a URL, but the parts within curly braces are parameters the client fills in to specialize the search. The OpenSearch spec defines about half a dozen parameter names, such as searchTerms, each with an established purpose. The searchTerms parameter is a placeholder for the search criteria (e.g., john+doe); startPage allows the client to specify a page within the result set. More on result pages in a later section. To cover meanings not provided by the standard parameters, you can use a parameter in the form of an XML QName in a foreign namespace. Notice the question mark after the startPage parameter in Listing 1. It means the search client is free to omit a value for this parameter (i.e., to substitute an empty string for it). The searchTerms parameter has no question mark, so the search client must provide a non-empty value for it.
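The substitution rules above can be sketched in a few lines of Python. This is only an illustration, not code from the OpenSearch spec: the function name expand_template is made up, and the template string is an assumption inferred from the rel="self" link in Listing 3 rather than copied from Listing 1.

```python
import re
from urllib.parse import quote_plus

# Matches {name} and {name?}; the trailing "?" marks an optional parameter.
_PARAM = re.compile(r"\{([^{}?]+)(\?)?\}")


def expand_template(template, values):
    """Fill an OpenSearch-style URL template from a dict of parameter values.

    Optional parameters with no value become empty strings; a required
    parameter with no value raises ValueError.
    """
    def substitute(match):
        name, optional = match.group(1), match.group(2)
        value = values.get(name, "")
        if value == "" and not optional:
            raise ValueError(f"missing required parameter: {name}")
        # quote_plus turns spaces into "+", as in the john+doe example.
        return quote_plus(str(value))

    return _PARAM.sub(substitute, template)


# Assumed template, modeled on the self link in Listing 3.
TEMPLATE = "http://search.xml.com/?q={searchTerms}&p={startPage?}&format=atom"

url = expand_template(TEMPLATE, {"searchTerms": "atom store python"})
```

Leaving out startPage is fine here because the template marks it optional; leaving out searchTerms raises an error, which mirrors the requirement that the client supply a non-empty value.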
Result Format
Once again, a central idea of OpenSearch is that search results come as web feeds. The supported formats are RSS 2.0 and Atom 1.0. In this article I stick to the latter, which is also my personal recommendation. Each search result corresponds to an Atom entry, with the usual entry syntax and semantics. There is, however, some interesting specialization at the feed element level. Listing 3 is a sample OpenSearch query result with a single result item.
Listing 3: OpenSearch Atom search result
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:os="http://a9.com/-/spec/opensearch/1.1/">
<id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
<title>XML.com search: atom store python</title>
<link rel="self" type="application/atom+xml"
href="http://search.xml.com/?q=atom+store+python&amp;p=&amp;format=atom"/>
<link rel="alternate" type="text/html"
href="http://search.xml.com/?q=atom+store+python&amp;p="/>
<link rel="search" type="application/opensearchdescription+xml"
href="http://example.com/opensearchdescription.xml"/>
<os:totalResults>1</os:totalResults>
<os:startIndex>1</os:startIndex>
<os:itemsPerPage>10</os:itemsPerPage>
<os:Query role="request" searchTerms="atom store python" startPage=""/>
<updated>2007-07-07T12:00:00Z</updated>
<author>
<name>XML.com</name>
<email>admin@xml.com</email>
</author>
<rights>All content Copyright 2007, O'Reilly and Associates</rights>
<entry>
<title>XML.com: Implementing the Atom Publishing Protocol</title>
<!-- Note: following URL modified for article formatting reasons -->
<link href="http://www.xml.com/pub/a/2006/07/19/implementing-app-python-wsgi.html"/>
<id>http://www.xml.com/2006/07/19/implementing-app-python-wsgi</id>
<updated>2006-07-19T15:00:00Z</updated>
<content type="text">
Joe Gregorio's latest Restful Web column implements the Atom Publishing Protocol as a
Python web service using WSGI.
</content>
<author>
<name>Joe Gregorio</name>
</author>
</entry>
</feed>
The links with rel="self" and rel="alternate" follow the usual Atom semantics. The rel="search" link is a convention added by OpenSearch for auto-discovery. Accessing this URL should return a search engine description document like Listing 1. Notice the application/opensearchdescription+xml media type, which OpenSearch proposes for description documents.
You can also use special link relations for paging search results. If a search would produce thousands of results, neither the client nor the service provider is likely to want to pile them all into a single result feed document, especially since most search engines put the hits of most likely interest in the early pages. The Feed Paging and Archiving (aka Feed History) extension to Atom provides a simple mechanism for breaking down large virtual feeds into pages or sections in such cases. It's currently an IETF Internet Draft, but will probably be adopted as a standard soon. OpenSearch adopts its conventions for paging search results. An OpenSearch response might be one of a series of feeds, each of which represents a subset of the total results and includes links with relations such as first, last, prev, and next to tell the search client how to navigate through the results.
Listing 3 also shows the elements totalResults, startIndex, and itemsPerPage in the OpenSearch namespace, which provide additional contextual metadata for search results. The common URL parameter startPage allows the search client to jump to a specific result page, and the count parameter controls the number of result items per page.
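To make the paging metadata concrete, here is a hedged Python sketch of how a client might read it from a result feed using the standard library's ElementTree. The sample feed, its hrefs, and the function name read_paging_info are all invented for illustration; only the element names and namespaces come from Listing 3.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "os": "http://a9.com/-/spec/opensearch/1.1/",
}

# Hypothetical first page of a three-page result set; the hrefs are invented.
SAMPLE_FEED = """\
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:os="http://a9.com/-/spec/opensearch/1.1/">
  <title>XML.com search: atom</title>
  <link rel="self" href="http://search.xml.com/?q=atom&amp;p=1&amp;format=atom"/>
  <link rel="first" href="http://search.xml.com/?q=atom&amp;p=1&amp;format=atom"/>
  <link rel="next" href="http://search.xml.com/?q=atom&amp;p=2&amp;format=atom"/>
  <link rel="last" href="http://search.xml.com/?q=atom&amp;p=3&amp;format=atom"/>
  <os:totalResults>25</os:totalResults>
  <os:startIndex>1</os:startIndex>
  <os:itemsPerPage>10</os:itemsPerPage>
</feed>
"""


def read_paging_info(feed_xml):
    """Return (counts, links) extracted from an OpenSearch Atom result feed."""
    root = ET.fromstring(feed_xml)
    # The three OpenSearch elements sit directly under the feed element.
    counts = {
        name: int(root.findtext(f"os:{name}", namespaces=NS))
        for name in ("totalResults", "startIndex", "itemsPerPage")
    }
    # Map each link relation to its href; Atom treats a missing rel as "alternate".
    links = {
        link.get("rel", "alternate"): link.get("href")
        for link in root.findall("atom:link", NS)
    }
    return counts, links


counts, links = read_paging_info(SAMPLE_FEED)
```

A client that wants the following page would look up links.get("next") and fetch that URL. On the first page there is no prev link, so the lookup for "prev" returns None.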
OpenSearch provides a few conventions to work with HTML web pages, including meta tags for auto-discovery of description documents and the totalResults, startIndex, and itemsPerPage information about search results.
Searching, Agile Web Style
OpenSearch really just provides the framework of a query mechanism to complement the Atom Protocol. It defines enough semantics to tell you how to express simple full-text searches. You can extend it for more specialized queries by adding your own extension parameters in URL templates. For example, you might want to specify a parameter to limit searching to a specific element in Atom feeds, using a template like http://search.xml.com/?q={searchTerms}&f={x:restrictField?} (you'd have to define a namespace for the "x" prefix). This would allow the search client to search for "xslt" within summaries by requesting http://search.xml.com/?q=xslt&f=atom:summary.
By keeping it simple, OpenSearch complements other related technologies very well, and adheres to solid Agile Web principles. There is a long and growing list of OpenSearch tools and search engines, so there is a good chance this specification will guide how we approach search and query for Web 2.0.
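A client that meets an extension parameter first has to notice it. As a rough sketch in Python (the TemplateParam type and the template_parameters function are hypothetical names, not part of any OpenSearch library), the following lists the parameters in a template and separates any namespace prefix from the local name, so the client can tell which prefixes it needs to resolve against the description document.

```python
import re
from typing import NamedTuple, Optional


class TemplateParam(NamedTuple):
    prefix: Optional[str]  # namespace prefix such as "x", or None for core names
    name: str              # local name, e.g. "searchTerms"
    optional: bool         # True when the parameter ends with "?"


# {name}, {name?}, {prefix:name} or {prefix:name?}
_PARAM = re.compile(r"\{(?:([^{}:?]+):)?([^{}:?]+)(\?)?\}")


def template_parameters(template):
    """List the parameters of a URL template in order of appearance."""
    return [
        TemplateParam(m.group(1), m.group(2), m.group(3) is not None)
        for m in _PARAM.finditer(template)
    ]


params = template_parameters(
    "http://search.xml.com/?q={searchTerms}&f={x:restrictField?}"
)
extensions = [p for p in params if p.prefix is not None]
```

For the template from the example above, extensions holds the single x:restrictField parameter, marked optional, so a client that does not understand it can safely substitute an empty string.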