
The Next Web?

March 15, 2006

Simon St. Laurent

The only things more annoying than the broken tools of today are the better tools of tomorrow which aren't here yet. Technologists often pass quickly through cycles of delight with a new toy and frustration with its limitations, looking for the next new thing as soon as they've figured out the last old thing.

Since the Web first appeared, developers have talked about what's wrong with it. (Heck, I thought HTML seemed pitifully weak compared to the hypertext work I'd been able to do in HyperCard.) Critics examined the roots of HTML structure and semantics, and the HTTP protocol on which the Web rides. Their complaints led to bursts of innovation, bursts which seemed finally to slow down as the bubble burst, Netscape left the browser field, and Microsoft let Internet Explorer stagnate.

It sometimes seems like widely popular web-standards innovation halted around 2000, and the last few years have been a period of very slow catch-up. Various visions of a new Web, a better Web, have come and gone, leaving behind useful parts but not yet transforming the Web. Are we on the edge of the next big thing? It may make sense to look at the last few big things, comparing their visions with what's happening today.

The XML Web

The incredible snarl of HTML (and then JavaScript) bothered many people in the SGML community, from which HTML had borrowed its angle brackets. HTML was the bad child whose lousy habits had catapulted it to fortune, corrupting others along the way. Despite the snarls, some saw promise. Yuri Rubinsky's dream of bringing together the Web and SGML communities bore fruit in XML, often called "SGML for the Web" in its early days.

XML has occasionally found its way to the Web, but it's hard to remember now that once upon a time, XML was supposed to be directly on the Web, the files people loaded and manipulated. Jon Bosak and Tim Bray wrote in the May 1999 Scientific American about the benefits of changing from an HTML-based Web to an XML-based one:

As programming legend Brian Kernighan once noted, the problem with "What You See Is What You Get" is that what you see is all you've got ....

The solution, in theory, is very simple: use tags that say what the information is, not what it looks like ...

As XML spreads, the Web should become noticeably more responsive ... the structural and semantic information that can be added with XML allows these devices to do a great deal of processing on the spot. That not only will take a big load off Web servers but also should reduce network traffic dramatically.

Bosak and Bray went on to discuss the three key aspects of the XML standards family that would make this work: structure (XML), style (XSL), and linking (XLink). XLink in particular promised a huge advance over the relatively simple hypertext options HTML provided, offering much richer choices for readers and authors.
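
To make "tags that say what the information is" concrete: where HTML might mark up a stock quote purely for display (<b>IBM</b> <i>82.50</i>), an XML version--with a vocabulary invented here just for illustration--describes the data and leaves display to a stylesheet:

    <!-- Descriptive markup: the tags say what the data is; a stylesheet
         (CSS or XSL) decides how to show it. Element names are made up. -->
    <stockQuote>
      <ticker>IBM</ticker>
      <price currency="USD">82.50</price>
    </stockQuote>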

That sounded great, and I really thought I saw it happening. When I wrote the first edition of XML: A Primer, I expected my audience to be web developers looking to get started on replacing the HTML Web with the XML Web. In the second edition, I went further into that, but suddenly got a lot of complaints. XML book buyers weren't interested in the next generation of the Web. They were programmers, who just wanted to get data from point A to point B with fewer incompatibilities.

The particular XML Web described by Bosak and Bray never happened. (It still could, but hasn't.) Browser implementors, even when they supported XML, have never supported more than the tiniest subset of XLink, and XSLT support arrived slowly. Web developers, mystified by the emphasis on structure, the lack of support for XLink, and the different approach XSLT took to styling, never made much use of even the subset of XML Web functionality that browser makers did implement.

Personally, I hoped for an XML+CSS+XLink Web--leaving out the XSLT that was so alien to web design--but despite some promising steps forward in CSS2, it never came close to happening. Browser vendors never concluded a market existed beyond me, and XLink never got implemented far enough to handle things like show="embed" for images.
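
For the record, here's roughly what I was hoping for, sketched with XLink's attribute-based syntax on a made-up element: show="embed" with actuate="onLoad" asks the browser to render the linked resource in place, much the way img works in HTML today.

    <figure xmlns:xlink="http://www.w3.org/1999/xlink"
            xlink:type="simple"
            xlink:href="sales-chart.svg"
            xlink:show="embed"
            xlink:actuate="onLoad">
      Quarterly sales, 2005
    </figure>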

The XML Web never happened. It might yet, though without XLink, it seems unlikely to offer much advantage over HTML.

The Semantic Web

In the next phase of the conversation, XML never yielded all that wonderful improvement for searches, but that's OK, since that was meant to be one of the things the Semantic Web would deliver. Why rely on tag names when you could have a whole infrastructure of knowledge representation formalisms (RDF and OWL) that could tell you exactly where to find what you need? There could be vast collections of metadata at your fingertips, ready for slicing, dicing, and analysis.

Tim Berners-Lee, creator of the World Wide Web and chief promoter of the Semantic Web, explained the vision like this:

While Web pages are not generally written for machines, there is a vast amount of data in them, such as stock quotes and many parts of online catalogues, with well-defined semantics. I take as evidence of the desperate need for the Semantic Web the many recent screen-scraping products, such as those used by the brokers, to retrieve the normal Web pages and extract the original data. What a waste: Clearly there is a need to be able to publish and read data directly.

Most databases in daily use are relational databases—databases with columns of information that relate to each other, such as the temperature, barometric pressure, and location entries in a weather database. The relationships between the columns are the semantics—the meaning—of the data. These data are ripe for publication as a semantic web page. For this to happen, we need a common language that allows computers to represent and share data, just as HTML allows computers to represent and share hypertext. The consortium is developing such a language, the Resource Description Framework (RDF), which, not surprisingly, is based on XML. In fact it is just XML with some tips about which bits are data and how to find the meaning of the data. RDF can be used in files on and off the Web. It can also be embedded in regular HTML Web pages. The RDF specification is relatively basic, and is already a W3C Recommendation. What we need now is a practical plan for deploying it. (Weaving the Web, 1999, p. 181)
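
As a rough sketch of what Berners-Lee describes, one row of that weather database might be published in RDF/XML along these lines; the property names and URLs here are invented for illustration:

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:w="http://example.com/weather#">
      <!-- Each child element makes a statement about the resource
           named in rdf:about -->
      <rdf:Description rdf:about="http://example.com/observations/ithaca">
        <w:location>Ithaca, New York</w:location>
        <w:temperature>-4</w:temperature>
        <w:barometricPressure>1021</w:barometricPressure>
      </rdf:Description>
    </rdf:RDF>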

While development of RDF, OWL, and related standards, as well as software to support those standards, continues, no one has yet found a "practical plan for deploying it," at least not in the broad way Berners-Lee proposed. Mixing XML with HTML is a much less complicated proposition, but web developers never found that especially appealing either. Semantic Web technologies and projects--consider RSS 1.0, FOAF, and DOAP--are definitely designed to address real problems, but they haven't yet come together to create anything like the Semantic Web vision.

The Services Web

As the Semantic Web story was taking off, a different group of developers (also largely under the aegis of the W3C) saw another set of possibilities for getting those stock quotes to those hungry brokers. Their vision still combined XML and the Web, but freed the notion of the Web from the notion of a web browser. Web Services initially combined the protocol side of the web equation, HTTP, with XML. The trio of SOAP, WSDL, and UDDI would let developers create, define, and share their mechanisms for exchanging data among computers.
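
In practice, a SOAP request is an XML envelope usually carried over HTTP POST, WSDL describes the available operations and their messages, and UDDI was to be the directory for finding them. A sketch of the envelope for a hypothetical GetQuote operation looks something like this:

    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Body>
        <!-- The operation and its namespace are made up for illustration -->
        <GetQuote xmlns="http://example.com/stockservice">
          <symbol>IBM</symbol>
        </GetQuote>
      </soap:Body>
    </soap:Envelope>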

As it turned out, while web services have proven popular for Enterprise Application Integration (EAI), Service-Oriented Architecture (SOA), and a wide variety of other business-oriented projects, they've had very little impact on the traditional Web. Some web companies do expose their data to outsiders through SOAP-based APIs, but the vision of a large market of open services accessible to anyone who needs data or processing has largely faded. Instead, web services have become a replacement for CORBA and similar architectures.

Services haven't vanished, though perhaps it's unfortunate that SOAP-based services got the title of "Web Services" simply for their use--or some would say abuse--of HTTP. Another architecture for web services, REST, is quite deliberately built on the traditional web browser/web server model, allowing for much easier integration with things like human visitors exploring a service through a web browser. Despite its greater compatibility with traditional web models, though, REST hasn't become an instant business success either.
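
The difference shows up right on the wire. A REST-style service would expose that same quote as an ordinary resource with its own URL (hypothetical here), retrievable with a plain HTTP GET--exactly the kind of request a browser, a bookmark, or a few lines of script can make:

    GET /stocks/IBM HTTP/1.1
    Host: quotes.example.com
    Accept: text/xml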

The Next XHTML

While grand visions of an XML- or RDF-enriched Web competed with SOAP-based services for attention, the HTML community, both at the W3C and elsewhere, had some ideas of its own.

XHTML 1.0, recasting HTML as an XML vocabulary, was the first small step. The vast majority of web developers haven't noticed XHTML, though the acronym is becoming more common as new editions of HTML books roll off the presses with an 'X' appended to their titles. XHTML eases the pain of the screen-scraping Berners-Lee complained of earlier. It also offers a key advantage for advanced developers using CSS and Dynamic HTML: a clear set of document structures on which to build.
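
The change is easy to miss: a minimal XHTML 1.0 page looks like ordinary HTML, except that it declares itself as XML, every element is closed, and element names are lowercase, so any XML parser can read it. A small sketch:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head><title>A minimal XHTML page</title></head>
      <body>
        <p>Well-formed markup, fair game for XML tools.</p>
      </body>
    </html>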

XHTML 1.1 attempted to modularize HTML into smaller, reusable components. For many XML developers, XHTML 1.1 served primarily to demonstrate how DTDs and W3C XML Schemas both lacked clean mechanisms for modularization, but it did at least open the way for smaller versions of XHTML, notably XHTML Basic, aimed at mobile devices with limited bandwidth and processing power. (Actually, a lot of those phones have more power and bandwidth than the computer/modem combination on which I first surfed the Web.)

While all of this was rearranging the furniture that already existed in HTML, there were a few other initiatives which promise to enrich the set of choices. Scalable Vector Graphics (SVG) promises manipulable graphics to complement the images we have now, while XForms offers a new generation of vastly more flexible web-based forms.

Currently, XHTML 2.0 promises to rearrange XHTML, finally cleaning up features that have been marked for deletion for years and developing a new path forward for browser vendors and developers. Unfortunately, it hasn't been that simple. XHTML 2.0 collided with the failed dreams of XLink a few years ago, and now faces an uprising from browser vendors who'd rather make smaller changes through the invitation-only decision-making process of the WHATWG.

As much as I'd like to see XHTML of some form happen, once again we're missing a "practical plan for deploying it." Who's going to migrate? What drives that migration into mainstream web development?

AJAX and Web 2.0

I first started writing during the heat of the browser wars. A cool new feature would appear in one browser or another, and everyone rushed to copy it, where possible. The last few years have been very different, with Internet Explorer moldering away as Firefox, Opera, and a small host of other new browsers focused on cool new features that didn't make big changes to the HTML itself. The basic functionality stayed the same, but over the last year or so--greatly helped by the final demise of Netscape 4.0--developers have made huge strides using tools that were designed and implemented in the late 1990s.

While all of these other proposals for the Next Web require substantial new infrastructure, AJAX instead does a wonderful job of applying a technology that was frequently derided in the early days of XML: Dynamic HTML, which combined JavaScript and HTML. AJAX even frequently combines that technology with XML for data, and once again relies on the Web's foundation protocol, HTTP, for hypertext transfer. Even within the strict bounds of web browsers' JavaScript sandboxes, where programs are allowed to contact only the server from which they came, it turns out that there is an incredible amount that developers can do. AJAX frees web pages from the old model of regular trips from the client to the server and back.
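
At the heart of the technique is the XMLHttpRequest object that browsers have carried since the late 1990s. A minimal sketch, with a made-up URL and element ID, might look like this:

    // Create the request object; older Internet Explorer needs ActiveX.
    var req = window.XMLHttpRequest
        ? new XMLHttpRequest()
        : new ActiveXObject("Microsoft.XMLHTTP");

    // The sandbox only allows requests back to the originating server.
    req.open("GET", "/quotes.xml", true);
    req.onreadystatechange = function () {
        if (req.readyState === 4 && req.status === 200) {
            // Pull a value out of the returned XML and update the page
            // in place, with no full round trip for a new page.
            var price = req.responseXML.getElementsByTagName("price")[0];
            document.getElementById("quote").innerHTML =
                price ? price.firstChild.nodeValue : "no data";
        }
    };
    req.send(null);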

As it turns out, the server-side support for AJAX can also frequently act as an interface to the server that other programs can use, without any need for a browser. Developers who craft smart APIs on their servers for use by AJAX-based web pages can then expose those APIs to other developers, getting the benefit of better interfaces both for users who consume the data through a web browser and for those running their own custom programs. Depending on how closely the developer models AJAX transactions on traditional web HTTP transactions, these services can even look a lot like the REST approach proposed earlier for web services.

The success of AJAX, and the business models it opens, have driven talk of Web 2.0. The definition of Web 2.0 varies depending on who you talk to, and there's some clear pushback against one of the more explicit efforts to define Web 2.0. I've suggested that Web 2.0 is what happens when you cleanly separate web client logic from web server logic, but that's also probably too broad a definition to be useful.

Whatever the version number of the Web may be today, the AJAX resurgence has a lesson. After waiting for all of those promised better tools to arrive, developers seem to have looked at the parts they already had available and chosen the ones they could use today. It can be annoyingly hard work, but the results are impressive.