From P2P to Web Services: Trust
April 14, 2004
In last week's article ("From P2P to Web Services: Identification and Addressing"), I examined the ways in which the development of web services might learn some lessons from the peer-to-peer phenomenon of a few years ago. I focused on identification and addressing. In this article I conclude my examination by focusing on trust.
I like the idea that an application can search for a business in UDDI and then automatically connect to that business to conduct a deeper search. That's a P2P activity: each business can be responsible for maintaining its own information and can provide a fuller and more up-to-date response than a central repository is likely to do. In fact, products and services have already been marketed that offer distributed searches, without the benefit of web services.
But proponents of the ebXML framework, and the literature on UDDI, call for more than search. They explicitly advocate a model where your application completes the search, makes a choice, and automatically enters into a business arrangement with the winner. It's seamless; it takes the human out of the loop.
Some early proponents of ebXML thought it could lead to what Doug Bunting, an XML Standards Architect at Sun Microsystems, calls "instantaneous commerce"; others have been more restrained in their expectations. Once again, if you're just refilling your pens, you might not encounter trouble automating everything. But it's now recognized that, if you're doing anything more complex, your application cannot muster enough judgment to make the choice without a human being. No IT manager wants to implement an application that automatically chooses a vendor, only to see the product turn out to be a failure and then have to answer to the accounting department when it asks, "Who chose that vendor in the first place?"
ebXML allows the simple, pen-refiller sort of purchases to go through more quickly and can also serve as a grease to speed up delivery in cases where parties need a more formal legal agreement. Additional, upcoming standards may further streamline the negotiations between the lawyers and managers on each side. But none of them solve the problem of trusting whom you deal with.
P2P researchers recognized the benefit of capturing trust in a measure called reputation. The most familiar instance of reputation online can be found in Internet auction sites such as eBay. There's fraud, but it's kept down to a tolerable level. It would be oversimplifying to attribute the trust on eBay to its feedback mechanism. The main mechanisms holding back fraud rely firmly on real-world enforcement, such as those provided by credit card companies. That's a social infrastructure. And even that is probably not enough. eBay's success relies, like so much in life, on the fundamental decency of the average person. Most people are honest.
Unfortunately, reports in the press indicate that sophisticated abuses of Internet auction sites are increasing, just like credit card fraud, identity theft, and unsolicited email. Reputation systems are nowhere near ironclad enough to resist deliberate subversion. Someone can carry out many trivial transactions to establish a good reputation, and then cheat on a transaction that really matters. Someone can get his friends to submit dozens of testimonials regarding his good name. Someone can even combine the two attacks, carrying out trivial transactions with his friends so they are allowed into the system to submit ratings (as on eBay, where ratings are accepted only from people who carry out transactions).
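To make the attack concrete, here's a toy sketch of a naive feedback counter being gamed (the names and numbers are invented; real auction sites use more elaborate scoring, but the weakness is the same):

```python
# A toy reputation system illustrating why raw feedback counts are easy
# to game: many trivial transactions (or colluding friends) inflate a
# score that a single large fraud then exploits.

from collections import defaultdict

class NaiveReputation:
    """Counts net positive feedback, ignoring transaction value or rater identity."""
    def __init__(self):
        self.scores = defaultdict(int)

    def record(self, seller, rating):
        self.scores[seller] += 1 if rating == "positive" else -1

rep = NaiveReputation()
# An attacker runs 50 one-dollar transactions with friends...
for _ in range(50):
    rep.record("mallory", "positive")
# ...then defrauds one buyer on a large sale.
rep.record("mallory", "negative")
# The score still looks excellent: 49 net positives.
print(rep.scores["mallory"])  # 49
```

Nothing in the score distinguishes fifty trivial sales from fifty substantial ones, which is exactly the opening the attacks above exploit.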
The flip side of getting bogus ratings is the problem of not getting enough ratings. Most people don't want to take the trouble to submit a rating, so it's tempting to create incentives for them to post ratings. But these incentives can become perverse incentives, if people try to win them by submitting ratings on things about which they don't really know anything.
Most problems of reputation systems boil down to this: who rates the raters? How can you trust ratings? Even when people are honest, their natural differences in temperament throw off the ratings. Researchers have introduced meta-rating systems, but the more they try to control for these differences, the more complex and unwieldy their systems become, with no demonstrable resolution to the original problems.
Reputation is easier to solve when it involves a matter of taste. People who bought one book from Amazon were also likely to buy a certain other book; that's a small but useful fact. If you enjoy many of the same movies as someone else, you're likely to enjoy the next movie that this person rates highly. That's the system that underlay a service called Movie Critic, an early collaborative filtering system on the Web promoted by O'Reilly & Associates. Collaborative filtering is useful, but it deals with taste more than with trust. Furthermore, such systems depend on aggregating many data points and therefore on lots of transactions taking place.
How do we trust the people and organizations that we hear about for the first time, often located thousands of miles away? There is a small, intense community of researchers who deal with reputation online, tirelessly advancing and combining research on such issues as proofs of work, group dynamics, digital cash, game theory, and so on. It's a heady mix and a wonderful field to watch, but it has made very little progress beyond what you see in eBay.
The problems of reputation and distributed systems are made more complicated when the requirements for the systems embody some contradictory goals, such as preserving anonymity while preventing denial-of-service attacks, ensuring that everyone's input is tallied correctly, and limiting the number of times people can submit ratings. Incidentally, these problems are commonly found with electronic voting systems (particularly ones that allow remote voting), and explain why their use is inherently risky no matter how ideally they are designed and implemented.
Distributed monitoring systems currently depend on cooperating processes: the authority that collects statistics and dispenses commands must trust the other systems to provide accurate statistics and to act on those commands. There is little or no provision for processes that are not trusted.
Reputation researchers admit they don't have the answers, but they do have one existing system they admire -- Advogato. This is a system for coordinating contributions to free software from multiple programmers who may not know one another. In the reputation part of the Advogato system, each programmer assigns a degree of trust to other programmers she's worked with or whose code she's evaluated. Furthermore, to the degree you trust one programmer, the system assumes you have some trust for the programmers she trusts, that is, the trust ratings are transitive.
In real life, trust might not be transitive. I might entrust my teenage kids with access to my telephone or my Internet account, but not entrust the same resources to their friends. Advogato recognizes that trust weakens as it goes through intermediaries.
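A simplified sketch of that attenuation follows. (Advogato's actual metric is a network-flow computation over the whole trust graph; this decay model is my own illustration of the narrower point that trust weakens through intermediaries, with made-up numbers.)

```python
# Attenuated transitive trust: multiply direct trust ratings along a
# chain, discounting every hop after the first. A simplification, not
# Advogato's real algorithm.

def chain_trust(ratings, decay=0.5):
    """ratings: direct trust values (0..1) along a chain me -> A -> B -> ...
    Each hop beyond the first is attenuated by `decay`."""
    trust = 1.0
    for i, r in enumerate(ratings):
        trust *= r * (decay if i > 0 else 1.0)
    return trust

# I trust my colleague 0.9 directly; she trusts a stranger 0.9.
# My derived trust in the stranger is weaker than either direct rating.
print(chain_trust([0.9]))       # 0.9
print(chain_trust([0.9, 0.9]))  # 0.405
```

Each added intermediary multiplies in another discount, so trust decays quickly along long chains -- which matches intuition about friends of friends of friends.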
One key feature of Advogato likely disqualifies it as a model for global commerce and commercial web services: it depends on a chain of personal contacts. Because your trust extends step by step through the system, it can grow only incrementally. Introducing someone new to the system requires her to build up trust relationships with individuals. This is not scalable; it is not suitable for millions of people or businesses.
Now a raft of "social networking" sites are trying to scale trust relationships. But it remains to be seen whether people really benefit from forming relationships on these networks with people they don't know. I know that I get friendly with a lot of people because they're interesting for one specific reason, but that doesn't mean they're fit for some other purpose defined by another person who is looking for a friend or colleague. In fact, I don't vouch that the people I know are fit for anything at all. In a nutshell, social networking faces a problem of metadata. How do you formalize relationships? How do you say that something relevant to one relationship is relevant to a different relationship with a different person?
Some P2P applications can deal with the problem of reputation in the same way as some deal with the problem of addressing, namely, by ignoring it. If a system includes massive redundancy, as with file-sharing networks, bad actors don't matter. They're considered damage that the system can route around. And a user who receives bad data can throw it away and try again. SETI@Home, an early model for grid computing, uncovers users who submit false data by sending the same data to multiple users and checking whether their results match.
Trust applies not only to the data you get back, but the data you give out. When grid computing enters the corporation, most companies keep it internal in order to protect their data. Even so, new problems of trust arise. Consider a scenario suggested by security researcher William McCarty of Azusa Pacific University. Imagine that chunks of information are sent to Joyce's computer in sales. Joyce, who has been trusted up to now with sales information, is now also being entrusted with any other data being processed on her computer as part of the grid. Furthermore, if a malicious intruder should get access to the program running on the grid, he may end up with access to sales data.
Back to the Security Basics
Where do trust and reputation fit in modern computing? The computer security field has built up an understanding that before you deploy any technology that connects you to others, you should determine what you're expecting of them and whether they're trustworthy. The simplest security system is where two people trust each other. This is called pairwise trust. It can be illustrated by encrypted email.
Alice sends Bob an email encrypted with Bob's public key. Bob sends Alice an email encrypted with Alice's public key. They don't need to trust any third party. However, this works only if they've previously exchanged keys through an outside channel. It's not safe for Alice to just send Bob her public key over email because anybody could pose as Alice and send Bob a public key, then correspond with Bob indefinitely until Bob discovers the ruse. So unless Alice and Bob have exchanged keys in person or corresponded through postal mail, they're going to depend on a third party.
Now there's a slight weakening of trust. Alice and Bob depend on the third party being secure, so that no one can substitute the wrong key, and on the third party being honest. Nevertheless, this scenario is fairly trustworthy and is used all the time. Alice or Bob can buy an encryption key fairly cheaply from VeriSign. So long as the correspondent's mailer knows how to contact VeriSign, the email is guaranteed to be from the address it claims to be from.
Jabber provides several possibilities for encrypting instant messaging. You can use SSL, which is an uncertain form of security because it's not end-to-end. You can also sign or encrypt elements of the conversation, just as with email. Jabber also presents the same start-up problem as email: unless you use a third party, you need some outside way to exchange keys so each side knows it's communicating with the right person.
Unlike the relatively lightweight approach of email encryption and Jabber, the encrypted instant messaging announced in June by America Online in conjunction with VeriSign is considerably more heavyweight. It provides certified digital signatures as well as encryption, which is good, but this design choice has three important consequences. First, it's available only to companies who sign up for VeriSign's services, not to individuals. Second, it costs money, though ten dollars a month for a business is a pretty trivial price. Third, it may perpetuate AOL's lock-in on instant messaging, putting up another barrier to third parties who want to offer competing services that can handle the AOL instant messaging protocol.
When email or instant messaging uses a third party, it doesn't need to be a commercial authority. A colleague can vouch for Alice and Bob, and a chain of colleagues can be set up this way -- the famous web of trust used in PGP. Each time you add another person, the web gets a little weaker. The third-party solution also lies behind SSL -- where a certificate authority is used to prevent man-in-the-middle attacks -- and behind Kerberos.
The third party solution is safest in a contained environment. But putting a third-party solution like Kerberos behind SAML allows the solution to provide single sign-on for Web sites. Single sign-on is kind of a Holy Grail of web services, so far as the user experience goes. Single sign-on can be accomplished by authenticating the user at the first site she logs in to, or by letting sites pass credentials to some trusted third party. It's the responsibility of such a third party to authenticate the user and offer her a token that can be passed to all the other various sites that she goes to, whether it's to buy something or set up a vacation or use government services.
Single sign-on certainly introduces new risks, but users already create similar risks informally. You know how many people use the same password to log in to a dozen different sites. It's not considered a good idea, but they do it out of ignorance or laziness. (Sites that require a fixed form of input, such as a credit card number, compound the problem by institutionalizing the bad practice.) Someone who breaks the password on one site can make a good guess that it will let him in to other sites as well. So single sign-on is probably more secure than the status quo.
Before we go further and look at the technologies, it's very important to look at the social infrastructure that will enable single sign-on. It's a big responsibility to authenticate someone: it involves making sure to check her identity carefully before giving out a secret key, and to use robust algorithms and protocols to verify the secret key when she logs in, and to lock up identifying information carefully against intrusion. A travel agent probably doesn't want that kind of responsibility, nor does a university or other likely destinations. This sort of function is likely to be centralized and contracted out. That's why Microsoft was hoping to make big bucks off of Passport. Much of the work now being done by standards committees, such as the Liberty Alliance, is precisely in harmonizing the security of different parties who want to collaborate so they can trust each other.
Whenever you participate in some kind of token-passing scheme in which you trust the authentication performed by another site, you have to check out how well that site is run. As you know, a tiny flaw or a failure to upgrade fast enough can leave a system compromised. And if you accept authentications from another site, you have to worry about two sites: your own and the other one you trust.
Single sign-on is an improvement over the insecure situation we have now, where users fall into the habit of using the same password everywhere. Fundamentally, given the difficulties of letting users join a system securely -- verifying them when they sign up and get their secret keys -- and the difficulties of doing authentication in a robust manner, it's a good process to leave up to the experts. Someone who does it for a business is probably going to do it better than you.
Liberty Alliance documents talk about circles of trust, where organizations and people in the inner circle are entrusted with more activities than those in outer circles. This is subtly different from the web of trust. The web of trust is restricted to signature verification; a circle of trust covers potentially any online transaction and recognizes that users trust different parties to different extents.
I feel a bit queasy about automating trust relationships, even when relationships among organizations are strong and long-lasting. Manual interactions create many barriers to subversion that automated processes strip away. You might not pay much attention if another site is sending your system a message telling you to trust a transaction that's worth just five dollars. You might not realize that your system has received 10,000 similar requests for five dollars in the past fifteen minutes. Vigilance is called for.
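Some of that vigilance can itself be automated: even when each request is individually trivial, track the aggregate over a recent window. A sketch (the thresholds are illustrative, not a recommendation):

```python
# Sliding-window watcher for the "10,000 five-dollar requests" problem:
# accept small automated requests individually, but flag them when their
# recent total crosses a limit, so a human can intervene.

from collections import deque

class RateWatcher:
    def __init__(self, window_seconds=900, max_total=1000.0):
        self.window = window_seconds
        self.max_total = max_total
        self.events = deque()  # (timestamp, amount) of accepted requests

    def allow(self, now, amount):
        """Accept the request unless recent requests already add up too high."""
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0][0] > self.window:
            self.events.popleft()
        if sum(a for _, a in self.events) + amount > self.max_total:
            return False  # escalate to a human instead of paying
        self.events.append((now, amount))
        return True

w = RateWatcher()
# 250 five-dollar requests arrive, one per second.
results = [w.allow(t, 5.00) for t in range(250)]
print(results[0], results[-1])  # True False -- the flood eventually trips the limit
```

The first 200 requests sail through; once the window's total reaches the limit, everything after is refused until the old events age out.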
I support the concepts behind SAML up to this point: it allows a trusted third party to do single sign-on. But it goes much further, and I expect we'll wait a long time before we see sites implement it for these larger purposes. As I pointed out, the social infrastructure required to implement authorization decisions or attributes is much greater than for the tried-and-true Kerberos style of authentication. And SAML doesn't stop with trusted third parties either. It adds fourth parties, fifth parties, and so on. Extending the trust model is part of a vision its designers call federated commerce.
In federated commerce scenarios, coordination and trust problems mushroom. Web services depend on good business practices that all these different companies can follow. Technology can help to encode the practices, but technology cannot create or enforce them, only people can. And the buyer has to trust all the companies in the chain. As Scott Berinato wrote in The State of IT Security 2003, "When you connect your network with a partner, you're also connecting to your partner's partners. Yet only 22 percent of respondents demand that partners practice safe business."
A system where two users have no pre-existing relationship or agreement, but depend upon agreements they've worked out with a third party, is called brokered or indirect trust in Liberty Alliance specifications. A system where a user enters into a negotiation with someone she doesn't know, but who is part of a general class of businesses such as hotels, is called community trust. The latter is also basically a form of dependence on a trusted third party, but the trusted third party is whoever let the hotel into the community.
Trust and information sharing between companies have taken place since commerce began. But the speed and ease of communication nowadays are putting a lot of strain on traditional corporate boundaries. When setting up an extranet or even a mailing list, who should be invited to join? How much should be shared on the forum?
Standards bodies are recognizing these problems, and in recent months they've been taking steps to inform businesses about the risks of partnerships, educate them about what they should do to protect themselves and their partners, and create new specifications that start to take the social infrastructure into account.
I'll start by briefly mentioning a new stack of protocols driven largely by Microsoft, IBM, and VeriSign: WS-Security, WS-Policy, WS-Trust, WS-Privacy, WS-SecureConversation, WS-Federation, WS-Authorization.
These standards, most of them in early discussion phases, strive to tame the issue I've been complaining about: trust and its delegation. The first of them, WS-Security, has been submitted to OASIS for approval. If we build real-world trust relationships that are strong enough to be formalized, these standards may help web services exploit that achievement. But in the meantime, they're just piling complexity on complexity in a growing series of specifications.
To show a contrasting direction, I'll mention some intriguing explorations in the area of trust to which I've been alerted by Paul Madsen, who has done some work on web service specifications and works for a company called EnTrust, which offers products that deal with identity management and access control for web sites. He's been working on trust mechanisms for the Global Grid Forum, a group that designs open standards for distributed computing, and has also been following work by the Liberty Alliance in that area.
One focus of the Global Grid Forum work adds nuance to the acceptance of a certificate or encryption key. Usually, key acceptance is a binary decision. It matches or it doesn't. And if your site uses a third party for authentication, you just accept whatever their decision is.
But in the real world, some keys and certificates are better than others. In fact, certificate authorities define policies that say what they did to validate the identity of the person who was granted the certificate, and the purposes for which a certificate can safely be used. The process of handing out certificates, and the information recorded by certificate authorities while doing it, is a part of the social infrastructure.
The Global Grid Forum and Liberty Alliance are trying to make some of this information part of a server's decision whether to trust an assertion it gets from a third party. The Global Grid Forum calls this qualified installation of keys -- you might install the key you get and you might not, depending on the information you get about how it was granted and how well it was protected. This work lies on the meeting ground between technology and law. While specifying a context, an authority can indicate what the limits are to its liability. The negotiation can leave each side clear what it has to take responsibility for, should things go wrong.
The Liberty Alliance issued a specification this past April for something they call an Authentication Context. I find this specification unusual because it's not about a tool or a software library. Rather, it's about what organizations do or the social infrastructure. As with the qualified installation of keys just mentioned, here we see a new interest in data that's about the transaction, the context for the transaction. This consists of information that an authority can pass along with its authentication decision, some elements of which are
- physical protection of certificates (site construction, etc.);
- technical protection of certificates (computer security);
- operational protection of certificates (audits, etc.);
- authentication method (password, smart card, etc.); and
- identification (how does the authority know the individual?).
The authority can tell you whether the authentication was based on a password, smart card, or other mechanism. It can tell you what security measures were used to protect the information from corruption. You can negotiate what context to accept and decide how much to trust the user based on the context that's passed to you.
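On the relying side, consuming such a context amounts to mapping what the authority reports to how much you will allow. A sketch of that decision (the field names and the policy are invented for illustration; they are not Liberty Alliance syntax):

```python
# Using an authentication context: the relying site inspects how the
# user was authenticated and how well the authority protects its
# operation, then grants a correspondingly sized transaction limit.

def transaction_limit(context):
    """Decide a spending limit from a reported authentication context."""
    method = context.get("authentication_method")
    audited = context.get("operational_audits", False)
    if method == "smart_card" and audited:
        return 10_000  # strong method, audited authority
    if method == "password":
        return 100     # weak method: small transactions only
    return 0           # unknown or missing context: trust nothing

print(transaction_limit({"authentication_method": "smart_card",
                         "operational_audits": True}))           # 10000
print(transaction_limit({"authentication_method": "password"}))  # 100
```

The point is that acceptance stops being binary: the same assertion from the same authority can be worth more or less depending on the context it carries.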
But as stated by Eve Maler of Sun Microsystems, trust is something created between people, not between computers that just act as blind agents. In that spirit, this past July, the Liberty Alliance went even further and issued business guidelines that organizations using web services should follow. These touch on familiar practices such as accreditation, audits, privacy policies, and so forth.
A company called the PingID Network already tries to provide standards for such things as liability and quality assurance, so that businesses feel safe working long distance. PingID takes as its model the process by which banks set up global ATM networks.
Standardizing business practices can help the most robust practices spread further and faster. A PingID white paper even claims that robust practices in areas such as tracking and revocation of information will reduce the social scourge of identity theft. However, I worry that standardized practices may also put something of a brake on the valuable spontaneity and democratic interaction that drives new ideas.
The Modest and the Grand Promise
Our exploration of contemporary networking problems today culminates in another big question. Ultimately, we have to consider the visions that motivate the new web services standards. They offer what I call a modest promise and a grand promise. The modest promise is that web services will automate stupid drudgery that is currently wasting a lot of people's time and costing businesses money. I support this modest vision. I have a sense that web services will enable people to do much more with handheld devices and will be empowering to individuals.
But I haven't bought in yet to the grand promise. The grand promise is the automated search followed by placing an order in UDDI or ebXML. The grand promise is also the chain of travel sites combined in a single query employing SAML or the WS protocols. These require coordination and trust, which can't be stretched as far on the Web as the proponents like to think.
When you visit a web site, you have no basis for knowing whether anything they claim or anything they promise is true. If you get a web-based rating, as on eBay, you still cannot be absolutely sure. The best you can do is build up an independent, real-life relationship with someone and build your opinion of the online site on that person's opinion. And the more attenuated the relationships are, the less effective that kind of transitive trust is.
Why is this important? Why are all these Web standards being created in the face of an intractable problem of trust? Because, or so I think, of another of the grand promises of web services: to promote globalization, world trade, commerce without walls.
In May 2003, XML leader Jon Bosak aired hopes that XML would help in "saving the world" by making it easier for people to do global business. The vision of affluent Westerners supporting rural African communities by purchasing their fabrics, or Latin American farmers by slurping up their coffee, all without a middleman, is appealing politically. But I think it will be harder than the proponents of the global marketplace believe. We need to trust the people we're going to trade with, especially if they're located more than a twenty-minute drive from us, and we currently have zero mechanisms for developing and reporting trust outside of the traditional grapevine and corporate branding.
The only way to create an economy at this point is to create human-to-human connections where it is possible to build up trust, not machine-to-machine connections where no trust is intrinsically present. Standards and technological delivery mechanisms can help at certain points; to quote Bunting, "automation reduces errors, enables the humans to focus on strategic decisions, or otherwise reduces costs and time-to-delivery." But they will leave lots of work for the human-to-human connection.
It's significant that the federated identity model discussed by the Liberty Alliance allows different sites to combine the information they have on a user, but only after the user explicitly gives permission. To take the user out of the loop would be deadly to privacy, so the Liberty Alliance model doesn't automate the linking of information. The requirement for user consent creates a hard and fast limitation on automation.
Technology enables. As it grows in power and gives power to those who use it, the burden rests on its users to use it sanely. We see illustrative failures of sanity in enabling technology for weaponry: individuals massacre schoolchildren; armed gangs terrorize backwaters in poorly governed countries; terrorists spread mayhem everywhere; governments ignore world opinion and spread lies to justify mass destruction.
As Internet technology enables individuals to work together more tightly and more freely, it enables intrusion and cheating. Research has only begun to assemble the pieces that, we hope, will also enable defensive measures and group action to identify and exclude badly behaved members.
So, to sum up, what do I see in the near future? We need to do something about addressing. We need an easy system that assigns each person a persistent address that is useful for all applications; ideally the system would be distributed and resistant to attack or capture. Current Web Services specifications work on the problem of trust without solving the problem of addressing--that is, the problem as I've stated it, the problem of finding someone several hours later.
As for the other issues, coordination and trust, we need modest expectations. We need to form communities within industries or other areas of interest and develop both standards and trust.
It's heartening to see the Liberty Alliance and the Global Grid Forum talking about problems of trust; they are reminding organizations to invest in processes and in research about policies that are expensive, but are ultimately more important than the technologies adopted to streamline their use. Still, the experience of earlier researchers in trust and reputation is a warning that these issues are extremely complicated, as well as hard to formalize. This is a step-by-step process that won't conquer the world--at least not any time soon.