From P2P to Web Services: Trust
In last week's article ("From P2P to Web Services: Identification and Addressing"), I examined the ways in which the development of web services might learn some lessons from the peer-to-peer phenomenon of a few years ago. I focused on identification and addressing. In this article I conclude my examination by focusing on trust.
I like the idea that an application can search for a business in UDDI and then automatically connect to that business to conduct a deeper search. That's a P2P activity: each business can be responsible for maintaining its own information and can provide a fuller and more up-to-date response than a central repository is likely to do. In fact, products and services have already been marketed that offer distributed searches, without the benefit of web services.
But proponents of the ebXML framework, and the literature on UDDI, call for more than search. They explicitly advocate a model where your application completes the search, makes a choice, and automatically enters into a business arrangement with the winner. It's seamless; it takes the human out of the loop.
Some early proponents of ebXML thought it could lead to what Doug Bunting, an XML Standards Architect at Sun Microsystems, calls "instantaneous commerce"; others have been more restrained in their expectations. Once again, if you're just refilling your pens, you might not encounter trouble automating everything. But it's now recognized that, if you're doing anything more complex, your application cannot muster enough judgment to make the choice without a human being. No IT manager wants to implement an application that automatically chooses a vendor, only to find that the product turns out to be a failure and then have to answer to the accounting department when it asks, "Who chose that vendor in the first place?"
ebXML allows the simple, pen-refiller sort of purchases to go through more quickly and can also serve as a grease to speed up delivery in cases where parties need a more formal legal agreement. Additional upcoming standards may further streamline the negotiations between the lawyers and managers on each side. But none of them solves the problem of trusting the parties you deal with.
P2P researchers recognized the benefit of capturing trust in a measure called reputation. The most familiar instance of reputation online can be found in Internet auction sites such as eBay. There's fraud, but it's kept down to a tolerable level. It would be oversimplifying to attribute the trust on eBay to its feedback mechanism. The main mechanisms holding back fraud rely firmly on real-world enforcement, such as those provided by credit card companies. That's a social infrastructure. And even that is probably not enough. eBay's success relies, like so much in life, on the fundamental decency of the average person. Most people are honest.
Unfortunately, reports in the press indicate that sophisticated abuses of Internet auction sites are increasing, just like credit card fraud, identity theft, and unsolicited email. Reputation systems are nowhere near ironclad enough to resist deliberate subversion. Someone can carry out many trivial transactions to establish a good reputation, and then cheat on a transaction that really matters. Someone can get his friends to submit dozens of testimonials regarding his good name. Someone can even combine the two attacks, carrying out trivial transactions with his friends so they are allowed into the system to submit ratings (as on eBay, where ratings are accepted only from people who carry out transactions).
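A toy sketch (every rating here is invented) shows how cheap the first attack is against a reputation system that simply averages ratings:

```python
# Hypothetical sketch: gaming a naive averaging reputation system.
# All sellers, ratings, and numbers are invented for illustration.

def reputation(ratings):
    """Naive reputation: the mean of all submitted ratings (1-5 scale)."""
    return sum(ratings) / len(ratings)

# An honest seller with a mixed track record.
honest = [5, 4, 5, 3, 4]

# A cheater pads the record with dozens of trivial, colluding
# transactions, then defrauds one real buyer.
cheater = [5] * 30 + [1]

print(round(reputation(honest), 2))   # 4.2
print(round(reputation(cheater), 2))  # 4.87 -- the fraud barely registers
```

Weighting ratings by transaction size, or by the rater's own reputation, raises the cost of this attack but does not eliminate it.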
The flip side of getting bogus ratings is the problem of not getting enough ratings. Most people don't want to take the trouble to submit a rating, so it's tempting to create incentives for them to post ratings. But these incentives can become perverse incentives, if people try to win them by submitting ratings on things about which they don't really know anything.
Most problems of reputation systems boil down to this: who rates the raters? How can you trust ratings? Even when people are honest, their natural differences in temperament throw off the ratings. Researchers have introduced meta-rating systems, but the more they try to control for these differences, the more complex and unwieldy their systems become, with no demonstrable resolution to the original problems.
Reputation is easier to solve when it involves a matter of taste. People who bought one book from Amazon were also likely to buy a certain other book; that's a small but useful fact. If you enjoy many of the same movies as someone else, you're likely to enjoy the next movie that this person rates highly. That's the system that underlay a service called Movie Critic, an early collaborative filtering system on the Web promoted by O'Reilly & Associates. Collaborative filtering is useful, but it deals with taste more than with trust. Furthermore, it depends on aggregating many data points and therefore on lots of transactions taking place.
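The idea can be sketched in a few lines: measure how similar two users' tastes are from the items they both rated, then weight other users' opinions by that similarity. The data, names, and scoring below are all hypothetical.

```python
# Minimal collaborative-filtering sketch with invented ratings.
from math import sqrt

ratings = {  # user -> {movie: rating on a 1-5 scale}
    "ann": {"brazil": 5, "alien": 4, "clue": 2},
    "bob": {"brazil": 5, "alien": 5, "clue": 1, "heat": 4},
    "cal": {"brazil": 1, "alien": 2, "clue": 5},
}

def similarity(a, b):
    """Cosine similarity over the movies both users rated."""
    shared = set(ratings[a]) & set(ratings[b])
    if not shared:
        return 0.0
    dot = sum(ratings[a][m] * ratings[b][m] for m in shared)
    na = sqrt(sum(ratings[a][m] ** 2 for m in shared))
    nb = sqrt(sum(ratings[b][m] ** 2 for m in shared))
    return dot / (na * nb)

def predict(user, movie):
    """Weight other users' ratings of `movie` by similarity to `user`."""
    pairs = [(similarity(user, u), r[movie])
             for u, r in ratings.items() if u != user and movie in r]
    total = sum(w for w, _ in pairs)
    return sum(w * r for w, r in pairs) / total if total else None

print(predict("ann", "heat"))  # 4.0 -- ann's taste resembles bob's
```

Note the dependence on data volume: with only one rater of "heat", the prediction is just that rater's opinion, which is why these systems need lots of transactions before they say anything useful.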
How do we trust the people and organizations that we hear about for the first time, often located thousands of miles away? There is a small, intense community of researchers who deal with reputation online, tirelessly advancing and combining research on such issues as proofs of work, group dynamics, digital cash, game theory, and so on. It's a heady mix and a wonderful field to watch, but it has made very little progress beyond what you see in eBay.
The problems of reputation and distributed systems are made more complicated when the requirements for the systems embody some contradictory goals, such as preserving anonymity while preventing denial-of-service attacks, ensuring that everyone's input is tallied correctly, and limiting the number of times people can submit ratings. Incidentally, these problems are commonly found with electronic voting systems (particularly ones that allow remote voting), and explain why their use is inherently risky no matter how ideally they are designed and implemented.
Distributed monitoring systems currently depend on cooperating processes: the authority that collects statistics and dispenses commands must trust the other systems to provide accurate statistics and to act on those commands. There is little or no provision for processes that are not trusted.
Reputation researchers admit they don't have the answers, but they do have one existing system they admire -- Advogato. This is a system for coordinating contributions to free software from multiple programmers who may not know one another. In the reputation part of the Advogato system, each programmer assigns a degree of trust to other programmers she's worked with or whose code she's evaluated. Furthermore, to the degree you trust one programmer, the system assumes you have some trust for the programmers she trusts, that is, the trust ratings are transitive.
In real life, trust might not be transitive. I might entrust my teenage kids with access to my telephone or my Internet account, but not entrust the same resources to their friends. Advogato recognizes that trust weakens as it goes through intermediaries.
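That attenuation can be sketched by multiplying trust weights along a chain of introductions. This is a deliberate simplification (the actual Advogato metric computes a network flow over the whole graph, not single-path products), and the graph and weights are invented:

```python
# Hedged sketch of attenuated transitive trust. Direct trust is a
# weight in [0, 1]; trust through intermediaries is the product of
# the weights along the path, so it weakens with every hop.

trust = {  # invented example graph: truster -> {trustee: weight}
    "me":     {"kid": 0.9},
    "kid":    {"friend": 0.8},
    "friend": {"stranger": 0.7},
}

def path_trust(path):
    """Multiply edge weights along a chain of introductions."""
    total = 1.0
    for a, b in zip(path, path[1:]):
        total *= trust.get(a, {}).get(b, 0.0)
    return total

print(round(path_trust(["me", "kid"]), 2))                        # 0.9
print(round(path_trust(["me", "kid", "friend"]), 2))              # 0.72
print(round(path_trust(["me", "kid", "friend", "stranger"]), 2))  # 0.5
```

Anyone with no chain of introductions at all gets a trust of zero, which is exactly the scalability problem discussed next.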
One key feature of Advogato likely disqualifies it as a model for global commerce and commercial web services: it depends on a chain of personal contacts. Because your trust extends step by step through the system, it can grow only incrementally. Introducing someone new to the system requires her to build up trust relationships with individuals. This is not scalable; it is not suitable for millions of people or businesses.
Now a raft of "social networking" sites try to scale trust relationships. But it remains to be seen whether people really benefit from forming relationships with people they don't know on the social networks. I know that I get friendly with a lot of people because they're interesting for one specific reason, but that doesn't mean they're fit for some other purpose defined by another person who is looking for a friend or colleague. In fact, I don't vouch that the people I know are fit for anything at all. In a nutshell, social networking faces a problem of metadata. How do you formalize relationships? How do you say that something relevant to one relationship is relevant to a different relationship with a different person?
Some P2P applications can deal with the problem of reputation in the same way as some deal with the problem of addressing, namely, by ignoring it. If a system includes massive redundancy, as with file-sharing networks, bad actors don't matter. They're considered damage that the system can route around. And a user who receives bad data can throw it away and try again. SETI@Home, an early model for grid computing, uncovers users who submit false data by sending the same data to multiple users and checking whether their results match.
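The verification-by-redundancy idea can be sketched in a few lines (the quorum rule and the numbers are invented for illustration):

```python
# Sketch of result verification through redundancy, in the spirit of
# SETI@Home: issue each work unit to several volunteers and accept a
# result only when enough independent copies agree.
from collections import Counter

def accept(results, quorum=2):
    """Return the majority result if `quorum` copies agree, else None."""
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None

print(accept([42, 42, 42]))  # 42 -- all volunteers agree
print(accept([42, 42, 99]))  # 42 -- the lone bad actor is outvoted
print(accept([42, 99, 7]))   # None -- no quorum; reissue the work unit
```

The cost of this approach is that every work unit is computed several times, which is affordable only when volunteer cycles are essentially free.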
Trust applies not only to the data you get back, but the data you give out. When grid computing enters the corporation, most companies keep it internal in order to protect their data. Even so, new problems of trust arise. Consider a scenario suggested by security researcher William McCarty of Azusa Pacific University. Imagine that chunks of information are sent to Joyce's computer in sales. Joyce, who has been trusted up to now with sales information, is now also being entrusted with any other data being processed on her computer as part of the grid. Furthermore, if a malicious intruder should get access to the program running on the grid, he may end up with access to sales data.
Back to the Security Basics
Where do trust and reputation fit in modern computing? The computer security field has built up an understanding that before you deploy any technology that connects you to others, you should determine what you're expecting of them and whether they're trustworthy. The simplest security system is where two people trust each other. This is called pairwise trust. It can be illustrated by encrypted email.
Alice sends Bob an email encrypted with Bob's public key. Bob sends Alice an email encrypted with Alice's public key. They don't need to trust any third party. However, this works only if they've previously exchanged keys through an outside channel. It's not safe for Alice to just send Bob her public key over email because anybody could pose as Alice and send Bob a public key, then correspond with Bob indefinitely until Bob discovers the ruse. So unless Alice and Bob have exchanged keys in person or corresponded through postal mail, they're going to depend on a third party.
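A toy illustration of the danger, using textbook RSA with deliberately tiny, insecure numbers (none of these keys are real, and actual mailers use far more elaborate schemes): if Alice mails her public key to Bob, whoever controls the channel can substitute a key of his own, and Bob has no way to tell.

```python
# Textbook RSA with toy keys -- NOT secure, for illustration only.

def encrypt(m, pub):             # c = m^e mod n
    e, n = pub
    return pow(m, e, n)

def decrypt(c, priv):            # m = c^d mod n
    d, n = priv
    return pow(c, d, n)

alice_pub,   alice_priv   = (17, 3233), (2753, 3233)  # p=61, q=53
mallory_pub, mallory_priv = (13, 4757), (1777, 4757)  # p=67, q=71

# Alice emails Bob "her" public key -- but Mallory controls the
# channel and swaps in her own. Bob cannot tell the difference.
key_bob_received = mallory_pub

secret = 65                      # Bob's reply, encoded as a number
c = encrypt(secret, key_bob_received)
print(decrypt(c, mallory_priv))  # 65 -- Mallory reads mail meant for Alice
```

Nothing in the ciphertext reveals the substitution, which is why the keys must be exchanged or vouched for through some channel the attacker doesn't control.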
Now there's a slight weakening of trust. Alice and Bob depend on the third party being secure, so that no one can substitute the wrong key, and on the third party being honest. Nevertheless, this scenario is fairly trustworthy and is used all the time. Alice or Bob can buy an encryption key fairly cheaply from VeriSign. So long as the correspondent's mailer knows how to contact VeriSign, the email is guaranteed to be from the address it claims to be from.
Jabber provides several possibilities for encrypting instant messaging. You can use SSL, which is an uncertain form of security because it's not end-to-end. You can also sign and encrypt elements of the conversation with public-key cryptography, just as with email. Jabber also presents the same start-up problem as email: unless you use a third party, you need some outside way to exchange keys so each side knows it's communicating with the right person.
Unlike the relatively lightweight approach of email encryption and Jabber, the encrypted instant messaging announced in June by America Online in conjunction with VeriSign is considerably more heavyweight. It provides certified digital signatures as well as encryption, which is good, but this design choice has three important consequences. First, it's available only to companies who sign up for VeriSign's services, not to individuals. Second, it costs money, though ten dollars a month for a business is a pretty trivial price. Third, it may perpetuate AOL's lock-in on instant messaging, putting up another barrier to third parties who want to offer competing services that can handle the AOL instant messaging protocol.
When email or instant messaging uses a third party, it doesn't need to be a commercial authority. A colleague can vouch for Alice and Bob, and a chain of colleagues can be built up this way -- the famous web of trust used in PGP. Each time you add another person, the web gets a little weaker. The third-party solution also lies behind SSL -- where a certificate authority is used to prevent man-in-the-middle attacks -- and behind Kerberos.
The third-party solution is safest in a contained environment. But putting a solution like Kerberos behind SAML allows it to provide single sign-on for Web sites. Single sign-on is something of a Holy Grail of web services, so far as the user experience goes. It can be accomplished by authenticating the user at the first site she logs in to, or by letting sites pass credentials to some trusted third party. It's the responsibility of such a third party to authenticate the user and offer her a token that can be passed to all the various sites she goes to, whether to buy something, set up a vacation, or use government services.
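The token-passing idea can be sketched with a shared-secret signature. Everything here -- the key, the token format, the claim layout -- is invented for illustration; real systems use SAML assertions or Kerberos tickets rather than this bare HMAC:

```python
# Hedged sketch of single sign-on tokens: a trusted authority
# authenticates the user once and signs a claim; member sites verify
# the signature instead of ever handling the user's password.
import base64
import hashlib
import hmac
import json

SHARED_KEY = b"secret shared between the authority and member sites"

def issue_token(user):
    """The authentication authority signs a claim about the user."""
    claim = base64.b64encode(json.dumps({"user": user}).encode()).decode()
    sig = hmac.new(SHARED_KEY, claim.encode(), hashlib.sha256).hexdigest()
    return claim + "." + sig

def verify_token(token):
    """Any member site checks the signature; no password reaches it."""
    claim, sig = token.rsplit(".", 1)
    expected = hmac.new(SHARED_KEY, claim.encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None              # forged or tampered token
    return json.loads(base64.b64decode(claim))["user"]

token = issue_token("alice")
print(verify_token(token))        # alice
print(verify_token("x" + token))  # None -- tampering breaks the signature
```

Notice that every member site must hold the shared key and must be run competently, which is exactly the point made below about having to worry about more sites than your own.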
Single sign-on certainly introduces new risks, but users already create similar risks informally. You know how many people use the same password to log in to a dozen different sites. It's not considered a good idea, but they do it out of ignorance or laziness. (Sites that require a fixed form of input, such as a credit card number, compound the problem by institutionalizing the bad practice.) Someone who breaks the password on one site can make a good guess that it will let him in to other sites as well. So single sign-on is probably more secure than the status quo.
Before we go further and look at the technologies, it's very important to look at the social infrastructure that will enable single sign-on. It's a big responsibility to authenticate someone: it involves checking her identity carefully before giving out a secret key, using robust algorithms and protocols to verify the secret key when she logs in, and locking up identifying information carefully against intrusion. A travel agent probably doesn't want that kind of responsibility, nor does a university or other likely destinations. This sort of function is likely to be centralized and contracted out. That's why Microsoft was hoping to make big bucks off of Passport. Much of the work now being done by standards committees, such as the Liberty Alliance, is precisely in harmonizing the security of different parties who want to collaborate so they can trust each other.
Whenever you participate in some kind of token-passing scheme in which you trust the authentication performed by another site, you have to check out how well that site is run. As you know, a tiny flaw or a failure to upgrade fast enough can leave a system compromised. And if you accept authentications from another site, you have to worry about two sites: your own and the other one you trust.
Single sign-on is an improvement over the insecure situation we have now, where users fall into the habit of using the same password everywhere. Fundamentally, given the difficulties of letting users join a system securely -- verifying them when they sign up and get their secret keys -- and the difficulties of doing authentication in a robust manner, it's a good process to leave up to the experts. Someone who does it for a business is probably going to do it better than you.