Second Generation Web Services
In the early days of the Internet, it was common for enlightened businesses to connect to it using SMTP, NNTP, and FTP clients and servers to deliver messages, text files, executables, and source code. The Internet became a more fundamental tool when businesses started to integrate their corporate information (both public and private) into the emerging Web framework. The Internet became popular when it shifted from a focus on transactional protocols to a focus on data objects and the links between them.
The technologies that characterized the early Web framework were HTML, GIF/JPEG, HTTP, and URIs. This combination of standardized formats, a single application protocol, and a single universal namespace was incredibly powerful. Using these technologies, corporations integrated their diverse online publishing systems into something much more compelling than any one of them could have built.
Once organizations converged on common formats, the HTTP protocol, and a single addressing scheme, the Web became more than a set of Web sites. It became the world's most diverse and powerful information system. Organizations built links between their information and other people's. Amazing third-party applications also wove the information together; examples include Google, Yahoo, Babelfish, and Robin Cover's XML citations.
First generation web services are like first generation Internet connections. They are not integrated with each other and are not designed so that third parties can easily integrate them in a uniform way. I think that the next generation will be more like the integrated Web that arose for online publishing and human-computer interactions. In fact, I believe that second generation web services will actually build much more heavily on the architecture that made the Web work, using the holy trinity: standardized formats (XML vocabularies), a standardized application protocol, and a single URI namespace.
This next generation of web services will likely adhere to an architectural style called REST, the underlying architectural model of the current Web. It stands for "representational state transfer". Roy Fielding of eBuilt created the name in his PhD dissertation. Recently, Mark Baker of Planetfred has been a leading advocate of this architecture.
The Current Generation
SOAP was originally intended to be a cross-Internet form of DCOM or CORBA. The name of an early SOAP-like technology was "WebBroker" -- Web-based object broker. It made perfect sense to model an inter-application protocol on DCOM, CORBA, RMI etc. because they were the current models for solving inter-application interoperability problems.
These technologies achieved only limited success before they were adapted for the Web. Some believe that the problem was that Microsoft and the OMG supporters could not get along. I disagree. There is a deeper issue. RPC models are great for closed-world problems. A closed-world problem is one where you know all of the users, you can share a data model with them, and you can all communicate directly as to your needs. Evolution is comparatively easy in such an environment: you just tell everybody that the RPC API is going to change on such and such a date and perhaps you have some changeover period to avoid downtime. When you want to integrate a new system you do so by building a point-to-point integration.
On the other hand, when your user base is too large to communicate with coherently, you need a different strategy. You need a pre-arranged framework that allows for evolution on both the client and server sides. You need to depend less on a shared, global understanding of the rights and responsibilities of a participant. You need to put in hooks where compliant clients and servers can innovate without contacting you. You need to leave in explicit mechanisms for interoperating with systems that do not have the same API. RPC protocols are usually poorly suited for this kind of evolution. Changing interfaces tends to be extremely difficult. Integrating services typically takes complicated software "glue".
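One possible hook of this kind is a "must-ignore" convention, in which a client reads the parts of a message it understands and skips the rest. The sketch below is only an illustration of that idea; the `<order>` vocabulary, the `KNOWN_FIELDS` set, and the `read_order` helper are all hypothetical and are not taken from any real service.

```python
import xml.etree.ElementTree as ET

# Fields this hypothetical client was written to understand.
KNOWN_FIELDS = {"id", "status"}


def read_order(xml_text):
    """Return the known fields of an order document, skipping unknown elements."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root if child.tag in KNOWN_FIELDS}


# The message as the server originally produced it (assumed vocabulary).
old_message = "<order><id>42</id><status>shipped</status></order>"

# The same message after the server added an element without telling anyone.
new_message = (
    "<order><id>42</id><status>shipped</status>"
    "<carrier>ACME Freight</carrier></order>"
)

# The old client keeps working against the evolved server.
print(read_order(old_message))
print(read_order(new_message))
```

Under this convention the server can extend the format on its own schedule, and newer clients can start reading the extra element whenever they are ready.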
I believe this is the reason no enterprise has ever successfully unified all of its systems with DCOM, CORBA, or RMI.
Now we come to the crux of the problem: SOAP RPC is DCOM for the Internet.
There are many problems that can be solved with an RPC methodology. But I believe that the biggest, hairiest problems will require a model that allows for independent evolution of clients, servers, and intermediaries. It is important, then, for us to study the only distributed applications to ever scale to the size of the Internet.
The Archetypal Scalable Application
The two most massively scalable, radically interoperable, distributed applications in the world today are the Web and email. What makes these two so scalable and interoperable? They depend on standardized, extensible message formats (HTML and MIME). They depend on standardized, extensible application protocols (HTTP and SMTP). But I believe that the most important thing is that each has a standardized, extensible, global addressing scheme.
There's an old real estate joke that the only three things which make a property valuable are location, location, and location. The same is true in the world of XML web services. Properly implemented, XML web services allow you to assign addresses to data objects so that they may be located for sharing or modification.
In particular, the Web's central concept is a single unifying namespace of URIs. URIs allow the dense web of links that makes the Web worth using. They bind the Web into a single mega-application.
URIs identify resources. Resources are conceptual objects. Representations of them are delivered across the web in HTTP messages. These ideas are so simple and yet they are profoundly powerful and demonstrably successful. URIs are extremely loosely coupled. You can even pass a URI from one "system" to another using a piece of paper and OCR. URIs are late bound. They do not declare what can or should be done with the information they reference. It is because they are so radically "loose" and "late" that they scale to the level of the Web.
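The distinction between a resource and its representations can be sketched in a few lines. This is a toy, in-memory illustration rather than a real HTTP server; the URI, the order data, and the `represent` function are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical URI naming a conceptual object (an order).
ORDER_URI = "http://example.com/orders/42"

# The state behind the URI. Clients never see this directly.
RESOURCES = {
    ORDER_URI: {"id": "42", "status": "shipped"},
}


def represent(uri, media_type):
    """Build one representation of the resource named by `uri`."""
    state = RESOURCES[uri]
    if media_type == "application/xml":
        root = ET.Element("order")
        for key, value in state.items():
            ET.SubElement(root, key).text = value
        return ET.tostring(root, encoding="unicode")
    if media_type == "text/plain":
        return f"Order {state['id']}: {state['status']}"
    raise ValueError(f"no representation available for {media_type}")


# The same URI yields different representations of the same resource.
print(represent(ORDER_URI, "application/xml"))
print(represent(ORDER_URI, "text/plain"))
```

The URI itself is just a string. Nothing in it says which representation will come back or what the receiver should do with it, which is the sense in which it is late bound.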
Unfortunately, most of us do not think of web services in these terms. Rather we think of them in terms of remote procedure calls between endpoints that represent software components. That's CORBA, DCOM thinking. Web thinking is organizing around URIs for resources.
Claim: The next generation of web services will use individual data objects as endpoints. Software component boundaries will be invisible and irrelevant.
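To make the contrast concrete, here is a minimal in-memory sketch of the two styles. Both classes, the method names `createOrder` and `getOrderStatus`, and the example URI are hypothetical; the status codes follow common HTTP usage but no network is involved.

```python
class RpcEndpoint:
    """RPC style: one address, with custom operations named inside each message."""

    def __init__(self):
        self._orders = {}

    def call(self, method, *args):
        # Every new capability means a new method name clients must learn.
        if method == "createOrder":
            order_id, status = args
            self._orders[order_id] = status
            return True
        if method == "getOrderStatus":
            (order_id,) = args
            return self._orders[order_id]
        raise ValueError(f"unknown method: {method}")


class ResourceSpace:
    """Resource style: each data object has its own URI and the same few verbs."""

    def __init__(self):
        self._representations = {}

    def put(self, uri, representation):
        created = uri not in self._representations
        self._representations[uri] = representation
        return 201 if created else 200

    def get(self, uri):
        if uri not in self._representations:
            return 404, None
        return 200, self._representations[uri]

    def delete(self, uri):
        if uri not in self._representations:
            return 404
        del self._representations[uri]
        return 204


# The data object, not a software component, is the endpoint.
web = ResourceSpace()
order_uri = "http://example.com/orders/42"
web.put(order_uri, "<order><status>shipped</status></order>")
status, body = web.get(order_uri)
print(status, body)
```

In the second style a third party that knows only the URI and the uniform verbs can link to, fetch, or replace the order without knowing which component sits behind it.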