Securing Web Services

January 15, 2003

What's the Need?

According to the conventional wisdom, web services will not be successful until they're "secure." Without discussing the accuracy of the claim, let's look at what it means. In the web services context security means that a message recipient will be able to do some or all of the following:

Verify the integrity of a message, i.e., that it is unmodified.
Receive a message confidentially, so that unauthorized parties can't see it.
Determine the identity of the sender -- authenticating them.
Determine if the sender is authorized to perform the operation requested (explicitly or implicitly) by the message.

In a distributed environment -- really, any time you have more than a single program running under a local operating system -- these requirements are met by using cryptography. Two of the most fundamental operations, signing and encrypting, can directly meet the first two needs. The other two requirements, and all the supporting infrastructure that they need, are built on top of those operations.

How High the Stack?

There are a lot of security-related activities under way at the various standards organizations. I think these are the most important right now:

XML Digital Signature, a W3C/IETF activity hosted at the W3C. This group is essentially finished, having successfully defined an XML digital signature specification.
XML Encryption, a W3C activity. This group has also completed its deliverable, a definition of how to encrypt (portions of) an XML document.
XML Key Management, a W3C WG that is defining a specification to allow clients to obtain crypto key information (keys, certificates, etc.) and do key management such as initial registration, revocation, etc. The requirements document is in last call and the main specifications seem to be well along.
OASIS Security Services TC, which is known for SAML (Security Assertion Markup Language). SAML is a framework for exchanging identification information; for example, a trusted third party (such as a PKI CA or a network login server) could provide a signed set of assertions about my identity. SAML is the basis of the Liberty Alliance federated single sign-on facility; with time and luck, Microsoft will also adapt Passport to use it.
OASIS eXtensible Access Control Markup Language TC. XACML is a framework for defining a set of privileges required to perform an operation, including identity information and external factors (like access policy and time of day). The XACML documents are in last call phase with only editorial changes expected.
OASIS Digital Signature Services TC, a new committee with the goal of defining an interface for a signature generation and verification service (for use if you don't have or want local facilities).
OASIS Web Services Security TC is building on the WS-Security document from IBM and Microsoft which defines how to sign a SOAP message. The group also intends to build a foundation for higher-level security services, such as policy integration, automatic interoperability, and so on.

In addition, the Web Services Interoperability Organization is defining a security profile to ensure basic interoperability among vendors. IBM and Microsoft, along with various partners, have issued a Security Roadmap that includes six -- SIX! -- new specifications.

All of that is almost enough to make you long for the days of ISO and X.400 specifications. Fortunately, we don't have to wait for the full suite; we can get real work done and solve real problems right now.

Why not SSL?

Let's look at the work involved in protecting a SOAP message. The obvious first question is why SSL isn't good enough. When Netscape turned SSL over to the IETF, the working group was named Transport Layer Security for good reason.

First, SSL operates between communication endpoints, not between applications. That's fine for most of the human-consumable Web, where a user is directing a browser to a web site. But even that falls down in the simple case of CGI scripts -- bugs in the web server (hard to imagine, I know) could result in undetected corruption of the POSTed data. Inexperienced ISPs often seem to have trouble setting up their first multihomed, multi-identity web server. In order to have different SSL server certificates, they must have different network endpoints, which often means either listening on different ports or configuring their network interface to "alias" multiple addresses.

There is also a more subtle, cryptographic point. When using SSL/TLS, you can't save the message for later to prove that it hasn't been modified. To begin with, you would have to save the entire two-way byte stream and "replay" the SSL traffic to get the plain text. Even if that were feasible, the fundamental architecture of SSL/TLS makes such a proof impossible.

When a connection is established, parties exchange a transient session key to encrypt the data between them. Both parties have the same key, but it's only intended to be used for a short period of time. In case of a dispute, it's impossible for either party to prove that it has the unmodified message. As the saying goes, two people can keep a secret, but only if one of them is dead.
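The shared-key problem can be sketched in a few lines of Python. This is only an illustration, not how SSL/TLS actually frames its records: the `session_key`, the `protect` and `verifies` helpers, the sample order messages, and the choice of HMAC-SHA256 are all my own assumptions. The point it shows is that anyone holding the symmetric key can produce a valid integrity tag for any message.

```python
import hashlib
import hmac
import os

# Hypothetical stand-in for the transient session key both endpoints hold.
session_key = os.urandom(32)

def protect(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag over the message (illustrative only)."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verifies(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag the way either endpoint could."""
    return hmac.compare_digest(protect(key, message), tag)

# The sender tags the message it really sent.
original = b"<order><qty>10</qty></order>"
tag_from_sender = protect(session_key, original)

# The recipient holds the same key, so it can tag a message of its own making.
altered = b"<order><qty>1000</qty></order>"
tag_from_recipient = protect(session_key, altered)

print(verifies(session_key, original, tag_from_sender))    # True
print(verifies(session_key, altered, tag_from_recipient))  # True
```

Both checks succeed, so a third party shown either message and its tag has no way to tell which endpoint produced it.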

Since SSL/TLS isn't appropriate, we need other mechanisms. We need something that sits at the same layer as the message, something that's part of the SOAP message itself. Lucky for us, SOAP headers fit nicely with detached signatures to make this possible.

Signature Types

XML DSIG supports three signature types and each has its use:

Enveloped, where the signature is carried along within the data.
Enveloping, the inverse of enveloped, where the data being signed is carried within the signature.
Detached, where the data being signed is somewhere else, identified by a URI.

An enveloped signature is useful when you have a simple XML document whose integrity you want to guarantee. For example, XKMS messages can use enveloped signatures to convey "trustable" answers from a server back to a client. An enveloping signature is useful when the signing facility wants to add its own metadata (such as a timestamp) to a signature -- it doesn't have to modify the source document, but can include additional data covered by the signature within the signature document it generates. (An XML DSIG signature can sign multiple objects at once, so enveloping is usually combined with another format.)

A detached signature is useful when you can't modify the source; the downside is that it requires two XML documents -- the source and its signature -- to be carried together. In other words, it requires a packaging format -- enter SOAP headers.
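To make the three layouts concrete, here is a rough Python sketch that builds an empty skeleton of each one with the standard library's ElementTree. The `make_signature` helper, the `Order` and `Item` elements, and the example.com URI are hypothetical; nothing is actually signed, and a real signature needs the digest, algorithm, and key elements as well.

```python
import xml.etree.ElementTree as ET

DSIG = "http://www.w3.org/2000/09/xmldsig#"

def make_signature(reference_uri: str) -> ET.Element:
    """Build a bare Signature skeleton with one Reference (hypothetical helper)."""
    sig = ET.Element(f"{{{DSIG}}}Signature")
    signed_info = ET.SubElement(sig, f"{{{DSIG}}}SignedInfo")
    ET.SubElement(signed_info, f"{{{DSIG}}}Reference", URI=reference_uri)
    return sig

# Enveloped: the Signature is a child of the document it signs.
# An empty URI refers to the document that contains the signature.
enveloped = ET.Element("Order")
ET.SubElement(enveloped, "Item").text = "widget"
enveloped.append(make_signature(""))

# Enveloping: the signed data is carried inside the Signature, in an Object.
enveloping = make_signature("#payload")
obj = ET.SubElement(enveloping, f"{{{DSIG}}}Object", Id="payload")
ET.SubElement(obj, "Order")

# Detached: the Signature stands alone and points at data held elsewhere.
detached = make_signature("http://example.com/order.xml")

for name, tree in [("enveloped", enveloped), ("enveloping", enveloping), ("detached", detached)]:
    print(name, ET.tostring(tree, encoding="unicode"))
```

The only structural difference among the three is where the signed data lives relative to the Signature element and what the Reference URI points at.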

WS-Security

WS-Security specifies how to sign and encrypt SOAP messages. At the simplest level, it says to put the XML DSIG signature in a SOAP Header element with a specific qname:

<SOAP:Envelope xmlns:SOAP='http://schemas.xmlsoap.org/soap/envelope/'>
  <SOAP:Header>
    <wsse:Security xmlns:wsse='http://schemas.xmlsoap.org/ws/2002/07/secext'>
      <Signature xmlns='http://www.w3.org/2000/09/xmldsig#'>
        ...defined below...
      </Signature>
    </wsse:Security>
  </SOAP:Header>
  <SOAP:Body id='Body'>
      ...message body...
  </SOAP:Body>
</SOAP:Envelope>

Unfortunately WS-Security only allows one instance of this header, making multiparty signatures (imagine an online auction with buyer, seller, and auctioneer) a little awkward. Nevertheless, if I want to keep the tamper-evident proof for my application message, I only have to keep the SOAP message, which is a much more tractable problem than saving an entire SSL/TLS transaction.
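A recipient's first job is simply to find that header. The following Python sketch, using only the standard library, looks up the Signature by its qualified name. The `find_signatures` function, the prefix map, and the `Ping` body are my own illustrative choices; the namespace URIs are the ones from the listing above.

```python
import xml.etree.ElementTree as ET

# Prefixes here are local to this script; only the namespace URIs matter.
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "wsse": "http://schemas.xmlsoap.org/ws/2002/07/secext",
    "ds": "http://www.w3.org/2000/09/xmldsig#",
}

# A cut-down message with an empty Signature and a hypothetical Ping body.
MESSAGE = """\
<SOAP:Envelope xmlns:SOAP='http://schemas.xmlsoap.org/soap/envelope/'>
  <SOAP:Header>
    <wsse:Security xmlns:wsse='http://schemas.xmlsoap.org/ws/2002/07/secext'>
      <Signature xmlns='http://www.w3.org/2000/09/xmldsig#'/>
    </wsse:Security>
  </SOAP:Header>
  <SOAP:Body id='Body'><Ping/></SOAP:Body>
</SOAP:Envelope>
"""

def find_signatures(envelope_xml: str) -> list:
    """Return every ds:Signature found under Header/wsse:Security."""
    root = ET.fromstring(envelope_xml)
    return root.findall("soap:Header/wsse:Security/ds:Signature", NS)

signatures = find_signatures(MESSAGE)
print(len(signatures))  # 1
```

Matching on namespace URI rather than on the literal prefix matters, because a sender is free to pick different prefixes for the same namespaces.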

The Signature itself contains three parts: a reference to what is being signed (there can be multiple references), a signature value that covers the references, and information about the signing keys.

The SignedInfo element mainly contains references to data, a description of how to process and generate a hash for it, and what that hash (or more accurately, message digest) value is.

<SignedInfo>
  <CanonicalizationMethod Algorithm='http://www.w3.org/TR/2001/REC-xml-c14n-20010315'/>
  <SignatureMethod Algorithm='http://www.w3.org/2000/09/xmldsig#rsa-sha1'/>
  <Reference URI='#Body'>
    <Transforms>
      <Transform Algorithm='http://www.w3.org/TR/2001/REC-xml-c14n-20010315'/>
    </Transforms>
    <DigestMethod Algorithm='http://www.w3.org/2000/09/xmldsig#sha1'/>
    <DigestValue>riJUygbyupbDqcIiV+jgIdHe7WQ=</DigestValue>
  </Reference>
</SignedInfo>

Without getting into too many details, this fragment says that the SignedInfo element is canonicalized, hashed with SHA-1, and signed with an RSA key. The node whose ID attribute is "Body" is XML data that, once canonicalized, has the specified SHA-1 digest value.
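The DigestValue is nothing more exotic than a base64-encoded SHA-1 hash, which is why the one shown above decodes to exactly 20 bytes. A minimal Python sketch of the check follows. It assumes the bytes passed in are already in canonical form, since producing them is the job of the canonicalization transform and is not attempted here; the `digest_value` and `reference_matches` helpers and the sample body are hypothetical and are not the message that produced the digest in the article.

```python
import base64
import hashlib

def digest_value(canonical_bytes: bytes) -> str:
    """Base64-encode the SHA-1 digest, the form carried in DigestValue."""
    return base64.b64encode(hashlib.sha1(canonical_bytes).digest()).decode("ascii")

def reference_matches(canonical_bytes: bytes, expected: str) -> bool:
    """Recompute the digest and compare it with the one in the Reference."""
    return digest_value(canonical_bytes) == expected

# Hypothetical canonical bytes for a tiny Body element.
body = b'<Body id="Body"><Ping></Ping></Body>'
expected = digest_value(body)

print(reference_matches(body, expected))         # True
print(reference_matches(body + b" ", expected))  # False

# The DigestValue from the article decodes to a 20-byte SHA-1 hash.
print(len(base64.b64decode("riJUygbyupbDqcIiV+jgIdHe7WQ=")))  # 20
```

Even a single added space changes the digest, which is exactly why the data must be canonicalized before it is hashed.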

The signature value itself is computed over the SignedInfo element (its canonical form is hashed, and that hash is signed with the private key) and is fairly boring. It's important to realize, however, that XML DSIG gains its flexibility by always doing an indirection; the data itself isn't signed, but rather a reference that specifies the data's digest.

<SignatureValue>fHP...ZOA=</SignatureValue>
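The indirection is easier to see in code. The sketch below is deliberately simplified and makes several assumptions: the SignedInfo is a hand-built byte string rather than canonicalized XML, and HMAC-SHA1 with a made-up key stands in for the RSA signature, because Python's standard library has no RSA implementation. All function names are hypothetical. What it does show faithfully is the two-step check: the signature covers SignedInfo, and SignedInfo carries the digest of the data.

```python
import base64
import hashlib
import hmac

def b64_sha1(data: bytes) -> str:
    """Base64-encoded SHA-1 digest, as carried in DigestValue."""
    return base64.b64encode(hashlib.sha1(data).digest()).decode("ascii")

def signed_info_for(body: bytes) -> bytes:
    """Simplified, hand-built SignedInfo holding only the body's digest."""
    return (b"<SignedInfo><Reference URI='#Body'><DigestValue>"
            + b64_sha1(body).encode("ascii")
            + b"</DigestValue></Reference></SignedInfo>")

def sign(key: bytes, signed_info: bytes) -> bytes:
    """Stand-in signature: HMAC-SHA1 over SignedInfo (assumption, not RSA)."""
    return hmac.new(key, signed_info, hashlib.sha1).digest()

def verify(key: bytes, body: bytes, signed_info: bytes, signature_value: bytes) -> bool:
    # Step 1: the signature must cover the SignedInfo exactly as received.
    if not hmac.compare_digest(sign(key, signed_info), signature_value):
        return False
    # Step 2: the digest inside SignedInfo must match the data in hand.
    return signed_info == signed_info_for(body)

key = b"demo-key-not-for-real-use"
body = b"<Body id='Body'><Ping></Ping></Body>"
signed_info = signed_info_for(body)
signature_value = sign(key, signed_info)

print(verify(key, body, signed_info, signature_value))               # True
print(verify(key, body + b"tampered", signed_info, signature_value)) # False
```

Tampering with the body fails at the second step, while tampering with SignedInfo fails at the first, so the signature protects the data without ever being computed over it directly.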

Finally we have information about the key that generated the signature. In this case it's an X.509 digital certificate.

<KeyInfo>
  <X509Data>
    <X509Certificate>MII...AABvi</X509Certificate>
  </X509Data>
</KeyInfo>

Where Are We?

We've done a lot of work; more accurately, we've looked at a lot of complicated XML. All we've done is guarantee that the SOAP message that we have in hand is exactly the same message that someone sent. We haven't even started trying to answer any of the other important questions, such as who that entity is, whether we should trust them, and whether their claimed identity is still valid. Perhaps all those standards groups are really necessary after all?