
From P2P to Web Services: Addressing and Coordination

April 7, 2004

Andy Oram

Organizations are facing new technological challenges, often finding them perplexing or even insoluble, as they modernize their use of the Internet and intranets. The common element these problems share is that their solutions go beyond technology. These problems require a social infrastructure, a framework that determines whether or not technological change succeeds. This article summarizes what researchers and standards committees are doing in tentative attempts to create that infrastructure.

After making great strides in the 1990s to install LANs, web servers, virtual private networks, and other facets of the TCP/IP revolution, system administrators notice that:

  • mobile users are logging in from all over the continent;
  • employees are attaching wireless devices to your networks at heaven knows what access point from minute to minute;
  • people are exchanging sensitive data over instant messaging, an outrageously insecure protocol by default, and one that additionally replicates many of the problems of email such as viruses;
  • employees are collaborating with people outside the company, using email or more sophisticated collaboration tools, freely sharing company data in ways you can't control;
  • people are running servers on their PCs for the first time and thus exposing services to the network--theoretically opening their systems to compromise--through technologies such as Rendezvous.

These problems have to be solved at a fundamental level, involving changes in business requirements, training, and organizational communication patterns. In short, they require a social infrastructure, which makes them harder to solve. Computer and software vendors sell technical solutions for these problems; the products are probably good ones. But they are not enough.

Perhaps we can gain some insight by looking at the ways in which peer-to-peer technology was received and accommodated a few years ago. No one ever offered a great definition of "peer-to-peer". Sometimes the term is used to cover only file-sharing systems (Napster, Gnutella, and Kazaa), despite the fact that peer-to-peer researchers and implementers were looking beyond file-sharing. Other people define peer-to-peer so broadly that it includes email. When used appropriately, "peer-to-peer" covers grid computing, the new generation of collaborative tools (for example, Groove), and new types of distributed databases and distributed filesystems (for example, OceanStore).

For our purposes here, it's fine to think of peer-to-peer as any networking technology where crucial responsibility lies at the end-points. This definition includes all the issues I mentioned earlier. It may also characterize some aspects of Microsoft's Office 2003 suite.

In fact, definitional inadequacies aside, peer-to-peer isn't really a set of technologies as much as it is a set of problems. And now the problems of peer-to-peer are the problems we all face. Peer-to-peer exposed the weaknesses in the current implementation of the Internet; it was an avant-garde. And while few peer-to-peer technologies have been adopted thus far, I expect they will be adopted in a decade or so, because the problems of social infrastructure must now be solved.

The challenges of and lessons from peer-to-peer fall under one of three categories: addressing, coordination, and trust. I discuss the first two of these in the present article, taking up the third in a future article.

Addressing

Most of us don't run applications that require personal, persistent addresses. Suppose you have a great sale to offer customers and want to promote it through a web service. SOAP offers a way to expose the information to your customers, who can query you for promotions through a SOAP call:

<s:Envelope xmlns:s="http://www.w3.org/2001/06/soap-envelope">
    <s:Body>
       <p:QueryPromotions xmlns:p="urn:Promotions">
          <category>travel</category>
          <expiration>2003-10-31</expiration>
       </p:QueryPromotions>
    </s:Body>
</s:Envelope>

But why wait for users to think about querying you? Perhaps this promotion lasts only one week and you want to reach out to loyal customers in time. You want push technology. Companies do this now through email. And you're stuck with email; you can't do push through web services. The problem is that there is no persistent address where a user can be reached by way of a web service. Web services are asymmetric: users can query a server, but the server can't query users. It would also be good if a user could make a web service request and then disconnect, letting you send the results to the user at a later time. You're going to have to use email for that too. Web services are synchronous: the sender has to wait for the reply.
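
To make the asymmetry concrete, imagine a hypothetical reply-to header added to the promotions query shown earlier; the header and its element names are invented for illustration and are not taken from any standard. Even if such a header existed, the roaming user would have nothing durable to put in it:

<s:Envelope xmlns:s="http://www.w3.org/2001/06/soap-envelope">
    <s:Header>
       <n:ReplyTo xmlns:n="urn:Notify">
          <!-- What persistent, reachable address could a roaming user supply here? -->
          <n:Address>...</n:Address>
       </n:ReplyTo>
    </s:Header>
    <s:Body>
       <p:QueryPromotions xmlns:p="urn:Promotions">
          <category>travel</category>
       </p:QueryPromotions>
    </s:Body>
</s:Envelope>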

The Real Problem

The two situations I've just described are related. I could restate these points by saying that our current social infrastructure provides only one persistent address for a user, an email address, and that it cannot currently be used easily for other protocols. And even email lacks adequate persistence; just look at the thousands of MediaOne subscribers who had to change to AT&T accounts and then change again to Comcast.

But the venerable email address, for lack of anything better, is being used as a unique identifier. The underlying problem is that web clients lack a robust return address. In theory, I could send you an IP address, properly encrypted and signed to prevent spoofing. And a protocol could be developed for you to send the results of your long-running operation to the IP address when you're done. But there is no guarantee I'll be at that IP address; I may have logged off long ago and my neighbor, who may be my sworn enemy and work for a competing company, may have logged in and received my old address.

When the addressing problem, which is related to resource discovery, is raised, some people say, "Implement IPv6, thus providing enough addresses for every device to be manufactured over the next several hundred years. Give every device an address; and, while you're at it, eliminate Network Address Translation and DHCP, and the problem is solved." No, it's not. People are not tied to individual devices. We go to work, we go home, we log in through PDAs and telephones. I am not my computer device.

Furthermore, I need to change addresses as I move. If all of us could use the Internet using the same IP address whether we were in Boston, Montreal, or Helsinki, Internet routing would bog down and become unmanageable. IPv6 does not provide for the use of addresses in different geographic locations; there is only an extension called Mobile IP [http://www.ietf.org/html.charters/mobileip-charter.html], an extra layer designed for cellular phone networks. Implementing IPv6 and eliminating NAT have benefits, but they don't remove the addressing problem.

P2P Faces the Problem

The peer-to-peer movement had to face the problem of addressing head-on because people at individual PCs had to be reached in a wide variety of environments. One of the cleverest ways to solve the addressing problem is to design applications so that user addresses don't matter. This is the solution chosen by Gnutella and many related file-sharing systems: you just broadcast that you want a certain file. The request passes from your system to a few systems you know and then to a few systems each of them knows; eventually some system comes back saying, "Here is the file." Another way of saying this is that the addressing problem is moved from the user to the desired resource. Individual users are free from the addressing problem.

It's interesting how many applications can function with anonymity. As we have seen, the Web requires the client to identify the server, but the server does not have to identify the client (except to obtain a temporary IP address); the server is happy to display its home page to anyone. Once someone wants to view sensitive data or buy something, the server will put up a password dialog box or require a credit card; that's a more advanced situation.

On the other hand, anonymity is currently being allowed in many places where it's creating trouble, largely because of the rise of wireless networks and the risk of drive-by intruders. For instance, corporate file servers routinely put up public shares that anyone on the network can read; it's assumed that everybody behind the firewall can be trusted. This was always a bad assumption, but if the network adds a wireless hub, the administrator has to worry constantly about who's sneaking up to it and snooping around. In a similar fashion, many corporate mail servers accept mail from anyone on the LAN. That problem has been highly publicized because intruders have been using such open servers to send unsolicited bulk email. So more and more, we are discovering the need to assign persistent identities to users. In the case of wireless, organizations are doing so by making users log in before using the network.

Sender Policy Framework, which has been in the news a lot as email software designers and ISPs call for its adoption, works on a slightly different level. It doesn't identify end users. Instead, it provides checks to ensure that mail messages correctly identify the hubs and relays through which they pass. This is more of a routing issue than an addressing issue; the basic form of addressing (DNS) is no different when SPF is used.

Solutions to the addressing problem fall into a few categories. The first solution, the one just described for wireless LANs, is simply to make users authenticate themselves with a central repository that contains their identities and to record their addresses for a single session. This is what most instant messaging systems do. It was also the quick-and-dirty solution chosen by Napster, which is why it could easily be dismantled for vicarious and contributory copyright infringement while modern Gnutella-based file-sharing systems cannot.

This dependence on central servers scales well. AOL Instant Messenger shows that such a system can serve millions of users. Still the system suffers from a flat namespace (once someone chooses the name John, no one else can use it), and it puts control in the hands of the people who run the servers.

The second solution is used in Apple's Rendezvous (an implementation of the Zero Configuration Networking or Zeroconf standard) and in many other network systems meant for LANs, including some Microsoft domains. Each would-be peer announces the address or name it wants, and if it hasn't been claimed already by another peer, it is assigned to the newcomer and recorded by each of the other peers. This solution requires all participants to be on the same LAN, for several reasons: it depends on broadcasts, it doesn't scale up to huge numbers, and it's open to many attacks if a peer isn't well-behaved.

The most robust and scalable solution in current use, the Domain Name System, was created twenty years ago. DNS was extended long ago with special records (MX records) to support email, which I mentioned as the one form of persistent address in our social infrastructure. DNS makes it easier to maintain a network of mail servers. It would be interesting to see whether more generalized support in DNS could give other services persistent addresses, or lead to a general form of addressing that could be used by many applications and protocols.

Such support was actually added about five years ago, in the form of SRV records as specified in RFC 2782. These records can specify any well-known service and provide the information needed to reach it in a flexible manner. SRV records have not been widely adopted and are not being pushed by the IETF, but they do see significant use: by Apple's Rendezvous, by Kerberos, and by Microsoft's Active Directory.

The Jabber instant messaging service, an XML message-passing system that is not yet highly popular but whose protocol was officially standardized by the IETF as the Extensible Messaging and Presence Protocol (XMPP), partly solves the addressing problem by depending on DNS and by suggesting that each user run his or her own server. Doing so is not required, but, where practiced, it automatically gives each user an address.
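
A Jabber address looks like an email address (user@domain), and any server that can resolve the domain through DNS can route a message to it. A minimal message stanza, with invented addresses, looks something like this:

<message from="andy@oram.example" to="ellen@example.com" type="chat">
   <body>The travel promotion ends Friday.</body>
</message>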

Domain names are perhaps the best solution in the ideal sense but the worst in their practical implementation. They remain relatively heavy-weight and present many barriers to the average user, partly thanks to the original implementation of the system and partly thanks to persistent intervention by large corporations and their legal representatives. Particularly in the global top-level domains like COM and ORG, this intervention has been effective in keeping most individuals from taking advantage of the persistent names offered by DNS.

The supply of domain names is artificially limited, so much so that a whole business has grown up around notifying someone when his desired name becomes available. VeriSign fought with registrars over who gets to dominate this activity, which adds no value to society. Compared to the cost of actually administering DNS servers, prices of domain names in the popular top-level domains amount to information highway robbery. Even if you get past these barriers and obtain your own domain name, you cannot consider it safe unless you also invest thousands of dollars to obtain a national trademark. Furthermore, registration requires you to make your contact information public, an anti-privacy measure that renders the system inappropriate for individuals.

When you turn to country domains, the situation is much more user-friendly. But registering is still too much trouble and too expensive for most people; compare its difficulty to the convenience of getting a login account on one of the major instant messaging services.

Researchers have been searching for years for a distributed system of addressing and resource discovery. The more heavyweight peer-to-peer systems, such as Chord and Tapestry, both still experimental, provide their own addressing and routing systems.

Each node that joins one of these systems is assigned a unique, random identifier. Certain nodes know how to reach others with similar numbers. When trying to reach another node, you start by choosing one you know whose first few digits match the identifier of the node you want. The system is a lot like standard Internet routing at the IP layer.

Thus, if you want to reach 12345 and you have two choices, 12862 and 12347, you choose 12347 because more of the initial digits match. 12347 requires fewer hops to get to 12345. This kind of system is intriguing, but we don't know yet how practical it is.

Much of the p2p network research was subsidized by the music industry, for which we should offer our sincere thanks. Without that subsidy, how could researchers collect statistics over months and years based on participation by literally tens of millions of nodes? It was sheer genius to offer popular recordings to sign up users, and the world will benefit from the testbeds that these systems provide. The music industry botched the PR, of course. So did universities, who also subsidized the research by providing large amounts of bandwidth, but tried to strangle the traffic once they noticed it.

Coordination

The problem with the Web is that it offers too many choices. Things have grown increasingly complex since 1991, after all. A key change is that XHTML documents now often rely on third-party resources -- for example, W3C specifications and DTDs -- to tell browsers what to do. Bringing in a third party makes the rendering process much more complicated.

And HTML is just the start of our coordination problems. XML and web services have opened up the opportunity for a great many languages and application-specific vocabularies, forcing sites to work at finding agreement on what language they're speaking.

P2P Solutions

The P2P movement proved to be a good environment for the intense exploration of coordination issues. P2P application developers had to invent new kinds of coordination from scratch. As always happens when new opportunities are recognized, a period of chaos followed where everybody went off inventing her own way of achieving her goals. The P2P movement never moved beyond this period of chaos. The developers failed to coalesce around either a Peer-to-Peer Working Group led by Intel or the JXTA [http://www.jxta.org/] protocols introduced by Sun.

Another form of coordination that occupied a lot of people's attention was the problem of classifying or typing resources. Such classification was central to many P2P applications because end-users were responsible for contributing data, and no one could compare, tally, or even find the data unless some common system of metadata was offered. Web services depend on classifications done by many standards organizations, as we'll see. That's part of the social infrastructure that supports them.

SAML

So let's turn to a technology where the problem of coordination gets really complicated, one of the most heralded advances in web services, heavily promoted by Sun and other companies in the Liberty Alliance: the Security Assertion Markup Language (SAML). The key word here is "assertion". The assertion is the unit of data that allows coordination.

SAML allows a client and server to appeal to a third party for any kind of access decision. For instance, a user's browser sends a digital signature to the site where the user wants to do business. The server at this business site, which in SAML is called the destination, sends the signature to a trusted third party that knows everybody and their digital signatures.

This server, known as a SAML authority, sends back an XML-encoded statement called an authentication assertion, which might look like:

This person is Ellen Radolfsky.
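
In actual SAML syntax, such an authentication assertion might look roughly like the following sketch. The element names follow the SAML 1.1 assertion vocabulary, but the issuer, assertion ID, timestamps, and the user's address are invented for illustration:

<saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:1.0:assertion"
    MajorVersion="1" MinorVersion="1"
    AssertionID="a75adf55-01d7-40cc-929f-dbd8372ebdfc"
    Issuer="https://authority.example.com"
    IssueInstant="2004-04-07T09:00:00Z">
   <saml:AuthenticationStatement
       AuthenticationMethod="urn:oasis:names:tc:SAML:1.0:am:X509-PKI"
       AuthenticationInstant="2004-04-07T09:00:00Z">
      <saml:Subject>
         <saml:NameIdentifier
             Format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress"
             >ellen.radolfsky@example.com</saml:NameIdentifier>
      </saml:Subject>
   </saml:AuthenticationStatement>
</saml:Assertion>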

In this simple use of SAML, the server and client depend on a third party for signature verification. This requires a much more developed social infrastructure, even aside from the issues of trust. The business server has to set up a contract with the authority to check signatures. And Ellen Radolfsky has to contact the authority to prove her identity and get a signature.

Documents about SAML don't say much about all this activity, nor do talks I've heard on SAML, but Section 4.1 of SAML's Bindings and Profiles specification makes an oblique reference to it by admitting that the specification rests on the following assumption:

The user...has authenticated to a source site by some means outside the scope of SAML.

If you read specifications for technologies you're considering, you have to look very carefully for passages like this one. "Outside the scope" means "this is really important but we're not doing it for you." Nor do I think SAML should do this. Kerberos has been used for decades to authenticate users and servers. It's proven itself over and over, it's been incorporated into Microsoft's domain technology, it will reportedly become the authentication component used by Microsoft's Passport service, and it deserves to be extended to web services. SAML makes that possible, but it doesn't remove the responsibility from people to set up the social infrastructure that makes authentication work. Vendors may give you the impression that you can install a product that understands SAML, put a couple of programmers to work developing applications, and enjoy the benefits; but the organization has a lot more work to do.

Coordination gets even more complicated as we explore the SAML specification further. The authentication assertion is only one of three types defined in the SAML specification. Next there's an authorization decision assertion, which might be something like:

This person can view our company's financial plans.
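
Expressed in SAML 1.1 syntax, such an authorization decision statement might look roughly like this sketch; the resource URL and the subject are illustrative:

<saml:AuthorizationDecisionStatement
    xmlns:saml="urn:oasis:names:tc:SAML:1.0:assertion"
    Resource="https://destination.example.com/financial-plans"
    Decision="Permit">
   <saml:Subject>
      <saml:NameIdentifier>ellen.radolfsky@example.com</saml:NameIdentifier>
   </saml:Subject>
   <saml:Action Namespace="urn:oasis:names:tc:SAML:1.0:action:rwedc">Read</saml:Action>
</saml:AuthorizationDecisionStatement>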

Even more coordination is needed for an authorization decision assertion than for the authentication assertion. The coordination you need to enable an authorization decision assertion includes making a list of sensitive operations, as well as defining your own protocol so your destination site and your authority know what they're talking about. This simple and useful assertion can't become a resource for web sites until someone who's maintaining the destination site says, "I have some sensitive financial plans to protect" and the authority agrees to start tracking who has a right to see financial plans. The assertion can only be implemented as a result of lots of meetings between people representing the two organizations.

Finally, we have the attribute assertion. Attributes have been a product feature for years, often stored in directories such as Active Directory to represent things like contact information and job titles. The attribute assertion allows any kind of such directory information to be made available to web services through SAML. But because of its open-ended flexibility, an attribute assertion could also be something like this:

This person calls customer service a lot.
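
As a SAML attribute statement, that kind of claim might be expressed roughly as follows; the attribute name and namespace are illustrative, since the vocabulary is whatever the organizations involved agree on:

<saml:AttributeStatement xmlns:saml="urn:oasis:names:tc:SAML:1.0:assertion">
   <saml:Subject>
      <saml:NameIdentifier>ellen.radolfsky@example.com</saml:NameIdentifier>
   </saml:Subject>
   <saml:Attribute AttributeName="customer-service-contact-rate"
       AttributeNamespace="urn:example:crm">
      <saml:AttributeValue>high</saml:AttributeValue>
   </saml:Attribute>
</saml:AttributeStatement>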

I made this example up. But its appearance in corporate databases would be quite plausible. Companies are increasingly aware of customer service costs and are putting systems in place to warn their employees about high-maintenance customers, as well as customers who should be favored because they provide high profits. Already IBM's Autonomic Computing vision explicitly includes the provision of customer service with relatively better or worse response time to different classes of customers. I firmly expect that authentication and authorization services such as SAML will be pressed into the service of discriminating among customers.

And this leaves us with the social question of whether we want automated decisions being made that affect individuals' lives in potentially major ways. Such questions are not the focus of this article. But if we discuss social infrastructure, we should be aware of social questions concerning that infrastructure. Security standards that use attributes, such as SAML and the open Shibboleth project, associated with Internet2, speak earnestly and probably with sincerity about the importance of preserving privacy. But the key lies in what information is stored by sites using the standards, and how these sites design the information's use.

When I hear talks or read papers about SAML, and many other current standards, I think of the changing of the guard, which you can see at Buckingham Palace in London, the Houses of Parliament in Ottawa, and a few other places. When you watch the changing of the guard, you are impressed by the precision marching, the excellence of the brass band, the crisp swordplay. It's easy to think that the purpose of the guard is to carry on this show for you. But the true purpose of the guard is to protect the Houses of Parliament or wherever else they are deployed, a serious task in these days of terrorism. When evaluating the guard, you have to look beyond the display.

Similarly, when we consider the infrastructure you need to make web services security work, SAML is the brass band, the swordplay. Doing it right is important because vulnerabilities at any point are dangerous. But the real work is being done behind the scenes by Kerberos, by the authority that manages people's identities or authorizations, and by the relationship that you and your clients maintain with this authority.

SAML's achievement is to make web services, with all the interoperability benefits and standards they bring, available to carry out organizational relationships that are well-established, well-formalized, and tightly integrated. But how many organizations have relationships that are sufficiently well-established, well-formalized, and tightly integrated? Creating these relationships is where they have to focus their efforts if they plan to deploy SAML. In other words, potential partners need to get their social infrastructures in tune before they can benefit from SAML. I believe this is part of what some consultants call "enterprise readiness" for web services.

The basic third-party authentication model in SAML is probably robust, because it's an extension of what people currently do in everyday computing environments. Lots of sites are parts of Microsoft domains or other networks that allow single sign-on. But for any application, SAML requires a vast amount of social infrastructure.

UDDI and ebXML

UDDI and ebXML have, arguably, even more ambitious goals than SAML, and they have correspondingly greater requirements for social infrastructure. The goal of these specifications is to streamline and automate commerce, so that you can search for a product or service, negotiate terms, and conclude the deal in a structured way over the Internet. A business needing a certain part for a machine could start up a web service request asking who has parts of a particular size and composition; the web service could then choose a part based on cost, geographical location, or some combination of other criteria and even contact the lucky vendor directly. You could even tell the registry: "Here's a business I know of; find other businesses like it" and then choose one with which to carry out a transaction.

The standards attempting to provide this service are UDDI (Universal Description, Discovery, and Integration) and ebXML. They were invented independently by different consortiums and never managed to converge. Because the data they maintain and the relationships among businesses they promote are subtly different, merger may not be desirable. UDDI is developed under the auspices of OASIS, which also takes responsibility for most ebXML components.

I haven't heard much about UDDI recently. ebXML, by contrast, is reported to be growing in use. It's made the biggest inroads in South Korea, where large businesses push it, supported by the government there. Let's take a look at each standard and the social infrastructure it assumes.

About some pieces of information, UDDI is quite explicit and specific. It provides fields where a business can offer its address, phone number, email contact, and so forth. But what about the criteria you search for? UDDI doesn't specify that, and there's no reason for it to do so. It relies on existing specifications in the outside world. For instance, there is an official U.S. government standard that classifies every type of business in North America; it's called the North American Industry Classification System (NAICS). Individual products are also specified through the United Nations Standard Products and Services Code (UNSPSC).

These standards make it convenient to find the cheapest source for commodities such as pen refillers (an example I took from the UNSPSC site). For something more nuanced or less commoditized, more information may be required. UDDI provides a field called a tModel that has an open-ended definition, where any kind of information can be stored. But for the tModel to be useful, the companies using it have to come together, agree on categories, and classify their products reliably.
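
A rough sketch of how a UDDI entry might use a NAICS code in its categoryBag follows; the keys, the service name, and the particular NAICS value are illustrative rather than copied from a real registry:

<businessService serviceKey="..." businessKey="...">
   <name>Discount travel promotions</name>
   <categoryBag>
      <keyedReference
          tModelKey="uuid:C0B9FE13-179F-413D-8A5B-5004DB8E5BB2"
          keyName="ntis-gov:naics:1997"
          keyValue="561599"/>
   </categoryBag>
</businessService>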

Both UDDI and the ebXML frameworks have ambitions far beyond a virtual yellow pages. UDDI provides pointers to web services, with the notion that after you find a company you want to deal with, your computer application can form a relationship without human intervention.

The parts of the transaction that take place after a successful search are handled in ebXML by a Collaboration Protocol Profile (CPP) and a Collaboration Protocol Agreement (CPA). This can be considered a standardization of Electronic Data Interchange (EDI). The vendor offers a Collaboration Protocol Profile, in response to which the buyer offers a Collaboration Protocol Agreement that is supposed to conform to the profile. Together they allow purchasers and vendors to automate parts of their activity.
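
To give a feel for how much prior agreement a CPP encodes, here is a heavily simplified fragment. The element names are drawn from the ebXML CPP/CPA vocabulary, but the party, the role, and the referenced process-specification document are invented, and a real profile also spells out transports, delivery channels, certificates, and much more:

<CollaborationProtocolProfile
    xmlns="http://www.oasis-open.org/committees/ebxml-cppa/schema/cpp-cpa-2_0.xsd"
    xmlns:xlink="http://www.w3.org/1999/xlink">
   <PartyInfo partyName="Example Parts Supplier">
      <PartyId type="urn:duns">123456789</PartyId>
      <CollaborationRole>
         <ProcessSpecification name="OrderMachineParts" version="1.0"
             xlink:type="simple"
             xlink:href="http://example.com/processes/OrderMachineParts.xml"/>
         <Role name="Seller"
             xlink:type="simple"
             xlink:href="http://example.com/processes/OrderMachineParts.xml#seller"/>
      </CollaborationRole>
   </PartyInfo>
</CollaborationProtocolProfile>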

The work of developing a Collaboration Protocol Profile is enormous. It can be created by a consortium of vendors, created by a major customer and imposed on vendors, or developed by committee in some other manner. In any case, it's a masterpiece of human coordination. And the Collaboration Protocol Profile is as much a legal document as a technical one.

This brings us to the third issue, trust, which I will discuss in the next article.