XML.com: XML From the Inside Out

Web Disservices: Microsoft's Misstep

September 24, 2003

Microsoft recently announced Microsoft.com Web Services, through which developers can get programmatic access to Microsoft.com services. Microsoft has big plans to expand the program to include querying MSDN, MS Technet, and MS Support sites. For now, however, the only service available is getting information on top downloads from Microsoft.com. But this is enough to get a taste of the architecture and for us to dig in and see how it might be improved.

Microsoft.com Web Services are based on SOAP. To call a method you need to post a SOAP message, with particular HTTP headers, to a specific URL on the server; the server responds with a SOAP envelope containing a SOAP body containing your answer. For example, the service defines a GetTopDownloads method, which takes three parameters and returns information on the top downloads from Microsoft.com. The request looks like this:

POST /mscomservice/mscom.asmx HTTP/1.0
Host: ws.microsoft.com
Content-type: text/xml
SOAPAction: "http://www.microsoft.com/GetTopDownloads"

<soap:Envelope
  xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:wsu="http://schemas.xmlsoap.org/ws/2002/07/utility"
  xmlns:wsse="http://schemas.xmlsoap.org/ws/2002/07/secext">
  <soap:Header>
    <wsse:Security>
      <wsse:UsernameToken>
        <wsse:Username>DEVELOPERTOKEN</wsse:Username>
        <wsse:Password Type="wsse:PasswordDigest">PASSWORDDIGEST</wsse:Password>
        <wsse:Nonce>NONCE</wsse:Nonce>
        <wsu:Created>2003-09-08T05:52:36Z</wsu:Created>
        <wsu:Expires>2003-09-08T05:55:36Z</wsu:Expires>
      </wsse:UsernameToken>
    </wsse:Security>
  </soap:Header>
  <soap:Body>
    <GetTopDownloads xmlns="http://www.microsoft.com">
      <topType>Recent</topType>
      <topN>25</topN>
      <cultureID>en-US</cultureID>
    </GetTopDownloads>
  </soap:Body>
</soap:Envelope>

There are three important pieces of information here.

  1. The method name. This is stored in two places: in an HTTP header called SOAPAction, and as the name of the element within the <soap:Body>.

  2. The method arguments. These are stored as elements within the <soap:Body>: topType, topN, and cultureID. The acceptable values and ranges for these arguments are documented.

  3. The authentication information. Not just anyone can make calls to Microsoft.com Web Services. Well, actually, that's not true. Anyone can, but first you need to sign up at Microsoft.com for a developer token and a PIN. These, along with some other calculated information (documented here), go in the <soap:Header>.

The rest is always the same. The server and the URL to post to are always the same (the server dispatches on the SOAPAction header instead). The namespaces used in the SOAP envelope are always the same. It all looks complicated, but it just boils down to method name, arguments, and credentials.
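To make the "method name, arguments, and credentials" point concrete, here is a minimal sketch in Python that builds the request from just those three inputs and fills in the boilerplate itself. The build_request function and the keys of its credentials dictionary are hypothetical names chosen for this illustration; they are not part of any Microsoft or SOAP library. ElementTree chooses its own prefix for the http://www.microsoft.com namespace instead of declaring it as the default, which is equivalent XML.

```python
import xml.etree.ElementTree as ET

# Namespaces copied from the request shown above.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
WSU_NS = "http://schemas.xmlsoap.org/ws/2002/07/utility"
WSSE_NS = "http://schemas.xmlsoap.org/ws/2002/07/secext"
MS_NS = "http://www.microsoft.com"

# Ask ElementTree to serialize with the same prefixes as the article.
for prefix, uri in (("soap", SOAP_NS), ("wsu", WSU_NS), ("wsse", WSSE_NS)):
    ET.register_namespace(prefix, uri)


def build_request(method, args, credentials):
    """Return (http_headers, xml_body) for one call.

    Hypothetical helper: `credentials` is assumed to be a dict with the
    keys username, digest, nonce, created, and expires.
    """
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")

    # Credentials go in the <soap:Header>.
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    security = ET.SubElement(header, f"{{{WSSE_NS}}}Security")
    token = ET.SubElement(security, f"{{{WSSE_NS}}}UsernameToken")
    ET.SubElement(token, f"{{{WSSE_NS}}}Username").text = credentials["username"]
    password = ET.SubElement(
        token, f"{{{WSSE_NS}}}Password", Type="wsse:PasswordDigest"
    )
    password.text = credentials["digest"]
    ET.SubElement(token, f"{{{WSSE_NS}}}Nonce").text = credentials["nonce"]
    ET.SubElement(token, f"{{{WSU_NS}}}Created").text = credentials["created"]
    ET.SubElement(token, f"{{{WSU_NS}}}Expires").text = credentials["expires"]

    # The method name and its arguments go in the <soap:Body>.
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{MS_NS}}}{method}")
    for name, value in args.items():
        ET.SubElement(call, f"{{{MS_NS}}}{name}").text = str(value)

    # The method name appears a second time, in the SOAPAction header.
    headers = {
        "Content-Type": "text/xml",
        "SOAPAction": f'"{MS_NS}/{method}"',
    }
    return headers, ET.tostring(envelope, encoding="unicode")


headers, xml_body = build_request(
    "GetTopDownloads",
    {"topType": "Recent", "topN": 25, "cultureID": "en-US"},
    {
        "username": "DEVELOPERTOKEN",
        "digest": "PASSWORDDIGEST",
        "nonce": "NONCE",
        "created": "2003-09-08T05:52:36Z",
        "expires": "2003-09-08T05:55:36Z",
    },
)
print(headers["SOAPAction"])
print(xml_body)
```

Everything that varies between calls is in the three arguments; the URL, namespaces, and envelope structure never change.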

Here's another example, this time to get a version string that describes the version of Microsoft.com Web Services that is running on the server:

POST /mscomservice/mscom.asmx HTTP/1.0
Host: ws.microsoft.com
Content-type: text/xml
SOAPAction: "http://www.microsoft.com/GetVersion"

<soap:Envelope
  xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:wsu="http://schemas.xmlsoap.org/ws/2002/07/utility"
  xmlns:wsse="http://schemas.xmlsoap.org/ws/2002/07/secext">
  <soap:Header>
    <wsse:Security>
      <wsse:UsernameToken>
        <wsse:Username>DEVELOPERTOKEN</wsse:Username>
        <wsse:Password Type="wsse:PasswordDigest">PASSWORDDIGEST</wsse:Password>
        <wsse:Nonce>NONCE</wsse:Nonce>
        <wsu:Created>2003-09-08T05:52:36Z</wsu:Created>
        <wsu:Expires>2003-09-08T05:55:36Z</wsu:Expires>
      </wsse:UsernameToken>
    </wsse:Security>
  </soap:Header>
  <soap:Body>
    <GetVersion xmlns="http://www.microsoft.com">
    </GetVersion>
  </soap:Body>
</soap:Envelope>

Only two things have changed: the SOAPAction header and the first child of <soap:Body> are now GetVersion. This method takes no parameters, so the argument elements are simply omitted.

Here's the response from the GetVersion request:

HTTP/1.1 200 OK
Content-Type: text/xml; charset=utf-8

<soap:Envelope
  xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:Body>
    <GetVersionResponse xmlns="http://www.microsoft.com">
      <GetVersionResult>Microsoft.Com Platform Services 1.0 Beta</GetVersionResult>
    </GetVersionResponse>
  </soap:Body>
</soap:Envelope>

Again, the namespaces and <soap:Envelope> are boilerplate; the real result is in the <GetVersionResult> element, a child of the <GetVersionResponse> element, which is in turn a child of <soap:Body>. In this case we've called a GetVersion method with no parameters, and the method has returned a single string: "Microsoft.Com Platform Services 1.0 Beta".
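Digging the answer out of that envelope takes only a few lines with a namespace-aware parser. The following is a minimal sketch using Python's standard library; extract_result is a hypothetical helper name, and the assumption that every method follows the same MethodResponse/MethodResult naming is generalized from this one example rather than taken from Microsoft's documentation.

```python
import xml.etree.ElementTree as ET

# The GetVersion response body shown above.
RESPONSE = """<soap:Envelope
  xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:Body>
    <GetVersionResponse xmlns="http://www.microsoft.com">
      <GetVersionResult>Microsoft.Com Platform Services 1.0 Beta</GetVersionResult>
    </GetVersionResponse>
  </soap:Body>
</soap:Envelope>"""

# Prefixes used in the search path below; "ms" is a local name for the
# default namespace that the response declares on <GetVersionResponse>.
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "ms": "http://www.microsoft.com",
}


def extract_result(xml_text, method):
    """Return the text of <{method}Result> inside <soap:Body>."""
    root = ET.fromstring(xml_text)
    node = root.find(f"soap:Body/ms:{method}Response/ms:{method}Result", NS)
    if node is None:
        raise ValueError(f"no {method}Result element in response")
    return node.text


print(extract_result(RESPONSE, "GetVersion"))
# prints: Microsoft.Com Platform Services 1.0 Beta
```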

Whew.

The tools will save us

I know what you're thinking. Either you're thinking, "that's a lot of boilerplate just to call a method and get a string back". Or you're thinking, "well, I don't care about what goes over the wire because I'll never see it. I'll just point my Web-Services-Enabled IDE to a WSDL file and call a method in a wrapper class that takes no arguments and returns a string." Both of these thoughts are true: SOAP-based web services are verbose, and there are tools that hide all of this messiness from you. For instance, Microsoft provides a WSDL file which is understood by savvy IDEs like Microsoft's own Visual Studio .NET. They also provide a Web Services Enhancements Service Pack for Visual Studio .NET which handles generating the required authentication elements from your developer token and PIN. And after spending thousands of dollars for development tools and updating all your service packs, you can indeed call a remote method with no parameters and get back a string, in about 5 lines of code, and you will never see any of this XML.

Of course, no other development environment that I know of (Python, Perl, or Java) has libraries that can generate the required authentication credentials, so it gets a little more complicated. And even once you manually generate the information, some of them (I tested Python specifically) make it difficult to add those credentials in the appropriate place in the <soap:Header> where Microsoft's server will look for them.
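For a sense of what "manually generate" involves, here is a rough sketch of the digest calculation in Python. It assumes the formula from the WS-Security UsernameToken definition, Base64(SHA-1(nonce + created + password)), and a three-minute lifetime matching the Created and Expires values in the examples above. Whether Microsoft's service expects exactly this input, and how the PIN maps to the password, is an assumption here; Microsoft's own documentation is the authority. The username_token function and its return keys are hypothetical names.

```python
import base64
import hashlib
import os
from datetime import datetime, timedelta, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def username_token(username, password, lifetime_seconds=180, now=None, nonce=None):
    """Compute the values for a <wsse:UsernameToken>.

    Assumes the WS-Security UsernameToken formula:
        digest = Base64(SHA-1(nonce + created + password))
    `now` must be a timezone-aware datetime; `nonce` is raw bytes.
    Both default to fresh values and exist mainly to make testing easy.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    nonce = nonce if nonce is not None else os.urandom(16)

    created = now.strftime(TIMESTAMP_FORMAT)
    expires = (now + timedelta(seconds=lifetime_seconds)).strftime(TIMESTAMP_FORMAT)

    # Hash the raw nonce bytes, then the timestamp, then the password.
    raw = nonce + created.encode("utf-8") + password.encode("utf-8")
    digest = hashlib.sha1(raw).digest()

    return {
        "username": username,
        "digest": base64.b64encode(digest).decode("ascii"),
        "nonce": base64.b64encode(nonce).decode("ascii"),  # sent Base64-encoded
        "created": created,
        "expires": expires,
    }


token = username_token("DEVELOPERTOKEN", "my-pin")  # placeholder credentials
for key, value in token.items():
    print(f"{key}: {value}")
```

The calculation itself is short. The harder part, as noted, is persuading a SOAP library to place the result where the server expects it.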

So I think this web services architecture is harder than it has to be for clients, unless you are blessed with perfect tools, in which case nothing is ever hard for you and you probably don't need to read web services articles like this. But let's look at this from Microsoft's point of view, from the server side. There are also a number of problems:

  • Every call uses HTTP POST. This seriously reduces the possibilities for any sort of caching; POSTs are never supposed to be cached by the server, the client, or by any intermediary proxies.

  • Every call POSTs to the same URL. This means that the web services producer cannot rely on standard web server logging to provide much useful information about usage. It would provide IP address and timestamp, but no information about the web services call itself. Another logging utility would need to be written that was application-specific (or at least SOAP-aware) in order to capture the SOAPAction header and extract the method arguments out of the <soap:Body>.

  • The server doesn't support gzip compression. This is actually independent of using POST or using a particular URL scheme. POST responses can be gzip-compressed, and SOAP messages actually compress very well. Microsoft just isn't doing it. It's a separate question whether your SOAP library knows enough about HTTP to send the appropriate Accept-encoding: header and decompress the response properly.

  • Microsoft is using an insecure authentication scheme. It does not send passwords in the clear, but it does require both the client and server to store plain text passwords somewhere. There are numerous authentication mechanisms that do not require this; even HTTP digest authentication allows both sides to store hashes of the password, instead of the password itself.

  • Also, the authentication is entirely one-way; the server can tell that the client knows the password, but the client can't tell that the server knows it. A malicious proxy could capture web services calls from the client and answer them itself, and the client would have no way of knowing that the results were not from Microsoft.com.
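The compression point is easy to check for yourself. The sketch below, in Python, gzips the GetVersion response shown earlier and then decodes it the way a client would after sending Accept-Encoding: gzip and receiving a response marked Content-Encoding: gzip. The function names compress_body and read_body are hypothetical, and no network call is made; this only illustrates the encoding step that the server is skipping.

```python
import gzip

# The GetVersion response body from earlier in the article.
SOAP_RESPONSE = b"""<soap:Envelope
  xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:Body>
    <GetVersionResponse xmlns="http://www.microsoft.com">
      <GetVersionResult>Microsoft.Com Platform Services 1.0 Beta</GetVersionResult>
    </GetVersionResponse>
  </soap:Body>
</soap:Envelope>"""


def compress_body(body):
    """What a server could do before sending Content-Encoding: gzip."""
    return gzip.compress(body)


def read_body(raw, content_encoding=""):
    """Decode a response body according to its Content-Encoding header."""
    if content_encoding.strip().lower() == "gzip":
        return gzip.decompress(raw)
    return raw


compressed = compress_body(SOAP_RESPONSE)
print(f"original:   {len(SOAP_RESPONSE)} bytes")
print(f"compressed: {len(compressed)} bytes")

# The client recovers the identical envelope.
assert read_body(compressed, "gzip") == SOAP_RESPONSE
```

Even this tiny envelope shrinks, because the namespace URIs and element names repeat; larger responses with many repeated elements have more redundancy for the compressor to exploit.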

To sum up: Microsoft has implemented a set of standards-based web services in such a way that ignores most of the benefits of the Web and has made it as easy as possible with its own tools and as difficult as possible without them.

Is it possible to make web services easier? Is it possible to make them more like the Web?
