Privacy and XML, Part 2

May 1, 2002

Paul Madsen and Carlisle Adams

XML-Based Techniques Relevant to Privacy

There are a number of efforts currently underway in standards bodies and other organizations to address various aspects of privacy with XML-based technologies. This section gives an overview of this work, noting which of the privacy concepts discussed in the first part of this article are addressed by different XML initiatives.

P3P

Relevant Privacy Concepts
• Privacy Policy
• Transparency

The Platform for Privacy Preferences (P3P) is a protocol developed by the World Wide Web Consortium (W3C). P3P (on the server side) defines an XML-based language by which Web sites can describe their privacy policies in a machine-readable format. Categories of information include the contact information of the legal entity making the privacy statement, whether users will have access to information collected about them, the types of data being collected, the purpose(s) for collection, and which organizations will have access to the collected data. P3P is a response to the long, non-machine-readable, and often confusing or vague "Privacy Policies" that many sites offer to users. The following listing displays a P3P policy for a fictitious Web site.

<POLICIES xmlns="http://www.w3.org/2002/01/P3Pv1">
<POLICY discuri="http://www.website.example.com/p3p.html">
    <ENTITY>
    <DATA-GROUP>
        <DATA ref="#business.name">WebSite.com</DATA>
        <DATA ref="#business.contact-info.postal.street">200
            Main Street</DATA>
    </DATA-GROUP>
    </ENTITY>
    <ACCESS><nonident/></ACCESS>
    <STATEMENT>
        <PURPOSE><admin/><develop/></PURPOSE>
        <RETENTION><stated-purpose/></RETENTION>
        <DATA-GROUP>
            <DATA ref="#dynamic.http"/>
        </DATA-GROUP>
    </STATEMENT>
</POLICY>
</POLICIES>

This P3P policy expresses that the company "WebSite.com" does not acquire any Personally Identifiable Information and tracks dynamic HTTP data for administrative and development purposes only.
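
To illustrate what a machine-readable policy makes possible, here is a minimal Python sketch that pulls the access level, purposes, and data references out of a policy like the one above. It uses only the standard library. The function names `summarize_policy` and `local_name` are illustrative assumptions, the ENTITY block is left out for brevity, and a real user agent would handle far more of the P3P vocabulary.

```python
import xml.etree.ElementTree as ET

P3P_NS = "http://www.w3.org/2002/01/P3Pv1"  # namespace used in the listing above

# Trimmed copy of the article's policy (ENTITY block omitted for brevity).
POLICY_XML = """\
<POLICIES xmlns="http://www.w3.org/2002/01/P3Pv1">
<POLICY discuri="http://www.website.example.com/p3p.html">
    <ACCESS><nonident/></ACCESS>
    <STATEMENT>
        <PURPOSE><admin/><develop/></PURPOSE>
        <RETENTION><stated-purpose/></RETENTION>
        <DATA-GROUP>
            <DATA ref="#dynamic.http"/>
        </DATA-GROUP>
    </STATEMENT>
</POLICY>
</POLICIES>
"""


def local_name(element):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return element.tag.split("}", 1)[-1]


def summarize_policy(xml_text):
    """Return access level, purposes, and data references of the first POLICY."""
    ns = {"p3p": P3P_NS}
    policy = ET.fromstring(xml_text).find("p3p:POLICY", ns)
    access = [local_name(e) for e in policy.find("p3p:ACCESS", ns)]
    purposes = [local_name(e)
                for e in policy.findall("p3p:STATEMENT/p3p:PURPOSE/*", ns)]
    data_refs = [d.get("ref")
                 for d in policy.findall("p3p:STATEMENT/p3p:DATA-GROUP/p3p:DATA", ns)]
    return {"access": access, "purposes": purposes, "data": data_refs}


summary = summarize_policy(POLICY_XML)
print(summary)
```

The resulting dictionary is the kind of structure a browser could compare against a user's stored preferences.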

Microsoft has included client support for P3P in Internet Explorer 6, providing an interface by which users can indicate their own privacy preferences, as shown in the following screen capture.

[Screen capture: Internet Explorer 6 privacy options]

As this figure demonstrates, Microsoft's early support for client-side P3P entails giving the user control over cookie behaviour.

Note: P3P policies covering the use of cookies can be expressed in a condensed form called compact policies. The cookie filtering of IE6 actually requires these compact policies as opposed to the full XML policy.

If the user browses a site with either a policy that conflicts with the user's recorded preferences, or a missing privacy policy, the user is warned of the situation through an icon in the Explorer window's Status bar (see the following figure) and can then make a choice regarding how they interact with that site.
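
The decision described above can be sketched in a few lines of Python. This is not how Internet Explorer 6 is implemented; it is a hypothetical illustration in which the user's preferences are reduced to a set of P3P purposes they do not accept, and the function name `check_site` is an assumption.

```python
def check_site(site_purposes, blocked_purposes):
    """Decide whether to warn the user about a site.

    site_purposes: set of P3P purpose names declared by the site,
                   or None when the site publishes no policy at all.
    blocked_purposes: set of purposes the user has chosen not to accept.
    """
    if site_purposes is None:
        return "warn: no privacy policy"
    conflicts = sorted(site_purposes & blocked_purposes)
    if conflicts:
        return "warn: conflicts with preferences (" + ", ".join(conflicts) + ")"
    return "ok"


# A user who rejects telemarketing and contact purposes (hypothetical preferences).
preferences = {"telemarketing", "contact"}
print(check_site({"admin", "develop"}, preferences))
print(check_site({"admin", "telemarketing"}, preferences))
print(check_site(None, preferences))
```

Either warning corresponds to the status bar icon described above; what happens next is left to the user.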

XACML - eXtensible Access Control Markup Language

Relevant Privacy Concepts
• Privacy Policy
• Access Control

The OASIS XACML Technical Committee is producing XACML, a proposal for capturing authorization policies for resources. XACML is expected to address fine-grained control of authorized activities (e.g., read, write, copy) based on, among other criteria, access requester characteristics ("only Senior VPs and above can view this document"), the protocol over which the request is made ("this data is only viewable if accessed over HTTPS"), and the authentication mechanism ("requester must have authenticated using a Digital ID"). XACML is relevant to privacy for two reasons: access control is one security mechanism by which corporations will ensure privacy (with respect to unintended use) for their customers, and XACML could be used as the syntax for capturing the privacy policy that a customer defines for information resources (such as a health record).

The following is a sample XACML policy stating that patients are allowed to read their own medical records (which, it is assumed, are stored in XML). The policy illustrates the connection between XACML and the Security Assertion Markup Language (SAML), another OASIS initiative. SAML defines an XML syntax by which authentication and authorization assertions and queries can be expressed for delivery over the network. The example assumes that a SAML AuthorizationDecisionQuery has been received asking what entitlements should be granted to a particular medical patient requesting access to their medical records.

<?xml version="1.0"?>
<rule>
  <target>
    <subject>
      samlp:AuthorizationDecisionQuery/Subject/NameIdentifier/Name
    </subject>    
    <resource>
      <patternMatch>
        <attributeRef>
          samlp:AuthorizationDecisionQuery/Resource
        </attributeRef>
        <attributeValue>medico.com/record.*</attributeValue>
      </patternMatch>
    </resource>

    <actions>
      <saml:Actions>
        <saml:Action>read</saml:Action>
      </saml:Actions>
    </actions>
  </target> 

  <condition>    
    <equal>
      <attributeRef>
        samlp:AuthorizationDecisionQuery/Subject/NameIdentifier/Name
      </attributeRef>
      <attributeRef>
        //medico.com/records/patient/patientName
      </attributeRef>
    </equal>
  </condition>

  <effect>Permit</effect>

</rule>

The resources for which read access is being requested are highlighted in blue; the conditions under which the access should be granted are highlighted in red.
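
The logic of this rule can be sketched in Python. This is an illustration of the rule's intent rather than an XACML engine: the function name `evaluate_rule`, its parameters, and the "NotApplicable" result are assumptions, and the values that the rule reads from the SAML query and from the medical record are passed in directly.

```python
import re

RESOURCE_PATTERN = r"medico.com/record.*"  # pattern taken from the rule above


def evaluate_rule(subject_name, resource, action, record_patient_name):
    """Return "Permit" when the rule's target matches and its condition holds.

    Any other outcome is reported as "NotApplicable"; a real policy engine
    would combine this result with those of other rules.
    """
    # Target: the resource must match the pattern and the action must be "read".
    in_target = (re.match(RESOURCE_PATTERN, resource) is not None
                 and action == "read")
    # Condition: the requester must be the patient named in the record.
    if in_target and subject_name == record_patient_name:
        return "Permit"
    return "NotApplicable"


print(evaluate_rule("alice", "medico.com/records/alice.xml", "read", "alice"))
print(evaluate_rule("bob", "medico.com/records/alice.xml", "read", "alice"))
```

The first call is a patient reading their own record; the second is a different requester, so the rule does not grant access.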

XML Encryption

XML Encryption, currently a W3C Candidate Recommendation, is a proposal for an XML vocabulary for capturing the results of an encryption operation performed on arbitrary (but most likely XML) data. A critical feature of the proposal is that it supports encrypting only specific portions of an XML document. This not only minimizes the encryption processing but, more importantly, leaves non-sensitive information in plain text so that general (i.e., non-security-related) processing of the XML can proceed. The following listing displays an employee record with the salary element encrypted, with the XML Encryption tags in a separate 'xenc' namespace. This example omits many of the details; the XML Encryption proposal is available if you'd like to see more.

<?xml version="1.0"?>
<employee id="b3456">
  <name>John Smith</name>
  <title>Senior Analyst</title>
  <salary>
    <xenc:EncryptedData>
      <xenc:CipherData>
        <xenc:CipherValue>AbC234ndZ...</xenc:CipherValue>
      </xenc:CipherData>
    </xenc:EncryptedData>
  </salary>
</employee>

Only the intended recipient, presumably the employee in question or an appropriately authorized Human Resources representative, would be able to decrypt the contents of the <CipherValue> element to view the employee's actual salary.
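
The benefit of partial encryption is that an application with no key can still work with the rest of the record. The Python sketch below shows this for a copy of the listing above. The `xenc` namespace declaration is added because the article's listing omits it, and the function name `directory_entry` is an illustrative assumption.

```python
import xml.etree.ElementTree as ET

XENC_NS = "http://www.w3.org/2001/04/xmlenc#"  # XML Encryption namespace

# The article's record, with the xenc namespace declared so that it parses.
RECORD = """\
<employee id="b3456" xmlns:xenc="http://www.w3.org/2001/04/xmlenc#">
  <name>John Smith</name>
  <title>Senior Analyst</title>
  <salary>
    <xenc:EncryptedData>
      <xenc:CipherData>
        <xenc:CipherValue>AbC234ndZ...</xenc:CipherValue>
      </xenc:CipherData>
    </xenc:EncryptedData>
  </salary>
</employee>
"""


def directory_entry(xml_text):
    """Build a staff-directory entry without touching the encrypted salary."""
    employee = ET.fromstring(xml_text)
    salary = employee.find("salary")
    # The salary content is opaque: all we can tell is that it is encrypted.
    encrypted = salary.find("{%s}EncryptedData" % XENC_NS) is not None
    return {
        "id": employee.get("id"),
        "name": employee.findtext("name"),
        "title": employee.findtext("title"),
        "salary_encrypted": encrypted,
    }


entry = directory_entry(RECORD)
print(entry)
```

Decrypting the salary itself would require the appropriate key and a cryptographic library, which this sketch deliberately leaves out.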

XML Signature

Relevant Privacy Concepts
• Information Sharing
• Opt-In
• Authentication

XML Signature is another W3C proposal, currently a Proposed Recommendation. XML Signature defines an XML Schema for capturing the result of a digital signature operation applied to arbitrary (but often XML) data. Unlike previous non-XML digital signature standards, XML Signature has been designed to both account for and take advantage of the Internet and XML. Please refer to our previous article for a detailed discussion of XML Signature.

XML Signature is especially relevant to privacy for two reasons. First, in information sharing programs like .NET My Services and Liberty, a requesting application will likely authenticate itself through an XML Signature calculated on a SOAP request. Second, the user's opt-in confirmations could be captured as XML Signatures to guard against the user later repudiating their choice. This scenario is represented in the following diagram.

If the user were to click on the "Sign Confirmation" button, an XML Signature (using the user's private key) would be calculated over the relevant portion of the displayed HTML page and then archived, thereby preventing the user from subsequent repudiation.

Of course, current browsers do not provide the ability to perform digital signatures on displayed HTML, much less XML Signatures. Technologies do exist for extending the browser to provide these signing capabilities, and it's likely that XML Signature will eventually be supported.

WS-Security

Relevant Privacy Concepts
• Information Sharing
• Confidentiality
• Authentication
• Authorization

WS-Security is a recent proposal from Microsoft, IBM, and VeriSign for adding security metadata to SOAP messages. We'll discuss it here in the context of a site requesting user information from .NET My Services. In this context, there are two privacy aspects to WS-Security:

  1. The requesting application will use WS-Security mechanisms to authenticate to .NET My Services, so that .NET My Services can determine whether the relevant user has granted the appropriate authorizations, as defined in that user's privacy policy.
  2. The confidentiality of the returned user information will be ensured through WS-Security's support for encryption.

To ensure that only authorized parties gain access to this data, the site must first prove its identity by authenticating itself. Rather than authenticating directly to .NET My Services, the application will authenticate to Microsoft's Passport service in order to obtain a token (a Kerberos ticket), which it will then present to .NET My Services. After verifying the token and confirming that it came from Passport, .NET My Services will check that the request conforms to the appropriate user's privacy preferences. It will then accept the request and return the requested data in a SOAP response.

An example of the SOAP request to .NET My Services is shown below, illustrating both the elements of the WS-Security namespace (identified by the namespace prefix "wsse") and those of HSDL (.NET My Services data manipulation language). For clarity, the actual namespace declarations are omitted.

<SOAP:Envelope>
    <SOAP:Header>
    <wsse:Security>
    <wsse:BinarySecurityToken wsse:ValueType="wsse:Kerberosv5"
        EncodingType="wsse:Base64Binary" Id="token">
        MIIEZzCCA9CgAwIBAgIQEmtJZc0...
    </wsse:BinarySecurityToken>
    
    <dsig:Signature>
        <dsig:Reference URI="#busmsg"/>
        <dsig:SignatureMethod Algorithm="#hmac-sha1">
            <HMACOutputLength>128</HMACOutputLength> 
        </dsig:SignatureMethod>
        <dsig:SignatureValue>IU(89.Hl8*.</dsig:SignatureValue>
        <dsig:KeyInfo>
            <wsse:SecurityTokenReference>
                <wsse:Reference URI="#token"/>
            </wsse:SecurityTokenReference>
        </dsig:KeyInfo>
    </dsig:Signature>
    </wsse:Security>
    </SOAP:Header>

    <SOAP:Body>
    <hsdl:queryRequest id="busmsg">
        <xpQuery select='/contact[Name="Smith"]'/>
    </hsdl:queryRequest>
    </SOAP:Body>

</SOAP:Envelope>

The binary encoded Kerberos ticket in the <BinarySecurityToken> element is highlighted in green in the SOAP header. The ticket is encrypted with a secret key that .NET My Services shares with Passport.

The <Signature> element is highlighted in red in the SOAP Header. The enclosed WS-Security <SecurityTokenReference> references the ID of the previous <BinarySecurityToken>.

The HSDL elements of the SOAP Body are highlighted in blue. The HSDL request is to return all contacts with the name of "Smith" for the user on whose behalf the request is made.

To authenticate and authorize this SOAP message, .NET My Services would perform the following processing:

  1. Use the secret key it shares with Passport to decrypt the Kerberos ticket in the <BinarySecurityToken> element. The ticket will contain:
    a. The Passport PUID of the user on whose behalf the request is being made.
    b. An identifier for the requesting application.
    c. The temporary session key (of which the requesting application has the other copy).
  2. Examine the <Reference> element in the XML Signature to determine which elements of the SOAP message the MAC was calculated over.
  3. Use the session key it extracted from the Kerberos ticket (1c above) to calculate a MAC over the same elements.
  4. Compare the MAC it calculates to that supplied by the requesting application.
  5. If the MACs match, compare the authenticated requesting application with the user's privacy policy to determine if the request should be granted.
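
Steps 2 through 5 can be sketched in Python with the standard `hmac` module. This is a simplification under stated assumptions: the decrypted Kerberos ticket is represented by a plain dictionary, the user's privacy policy by a mapping from PUID to authorized applications, and the MAC is computed directly over the signed bytes. In XML Signature proper, the MAC is calculated over the canonicalized SignedInfo element, which in turn holds digests of the referenced elements. All names and values here are hypothetical.

```python
import hashlib
import hmac


def compute_mac(session_key, signed_bytes, output_bits=128):
    """HMAC-SHA1 over the signed bytes, truncated to the leftmost output_bits."""
    digest = hmac.new(session_key, signed_bytes, hashlib.sha1).digest()
    return digest[: output_bits // 8]


def authorize(ticket, signed_bytes, supplied_mac, allowed_apps):
    """Recompute the MAC, compare it, then consult the user's privacy policy.

    ticket: dict standing in for the decrypted Kerberos ticket (step 1).
    allowed_apps: maps a PUID to the set of applications that user authorized.
    """
    expected = compute_mac(ticket["session_key"], signed_bytes)
    # Constant-time comparison of the two MACs (step 4).
    if not hmac.compare_digest(expected, supplied_mac):
        return False  # altered message, or sender does not hold the session key
    # Step 5: is this application authorized by this user?
    return ticket["app_id"] in allowed_apps.get(ticket["puid"], set())


# Hypothetical values for illustration only.
ticket = {"puid": "user-123", "app_id": "shop.example",
          "session_key": b"hypothetical-session-key"}
message = b"bytes of the signed SOAP elements"
mac = compute_mac(ticket["session_key"], message)
policy = {"user-123": {"shop.example"}}

print(authorize(ticket, message, mac, policy))
print(authorize(ticket, message + b"tampered", mac, policy))
```

The 128-bit truncation mirrors the HMACOutputLength value in the request listing; a mismatch at either the MAC check or the policy check causes the request to be refused.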

Assuming that the nature of the user profile data to be returned in the SOAP response is sensitive, the user's privacy requires that the confidentiality of this information be protected in transit. An example of the SOAP response is shown here:

<SOAP:Envelope>
    <SOAP:Header>
  
        <wsse:Security>

         <xenc:EncryptedData id="encdata">
         <xenc:EncryptionMethod
             Algorithm='http://www.w3.org/2001/04/xmlenc#tripledes-cbc'/>
         <xenc:CipherData>
             <xenc:CipherValue>JS*du89sad7</xenc:CipherValue>
         </xenc:CipherData>
        </xenc:EncryptedData>

        </wsse:Security>
 
   </SOAP:Header>

    <SOAP:Body>
        <hsdl:queryResponse id="busmsg">
        <xenc:CipherReference idref="#encdata"/>
       </hsdl:queryResponse>
    </SOAP:Body>

</SOAP:Envelope>

The XML Encryption <EncryptedData> element is highlighted in red. It holds the encrypted contents of the <queryResponse> element. The actual encryption operation used the 3DES algorithm and the secret key previously established between the requesting application and .NET My Services.

The HSDL elements of the SOAP Body are highlighted in blue. The contents of the <queryResponse> have been replaced with the <CipherReference> element of the XML Encryption namespace. This element points at the EncryptedData element in the SOAP header above.
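
A recipient has to follow that pointer before it can decrypt anything. The Python sketch below resolves the reference in a cut-down version of the response. The SOAP, WS-Security, and HSDL namespaces are dropped because the article omits their declarations, the `idref` attribute follows the listing above, and the function name `find_cipher_value` is an illustrative assumption.

```python
import xml.etree.ElementTree as ET

XENC_NS = "http://www.w3.org/2001/04/xmlenc#"  # XML Encryption namespace

# Cut-down response: only the xenc elements keep their namespace.
RESPONSE = """\
<Envelope xmlns:xenc="http://www.w3.org/2001/04/xmlenc#">
  <Header>
    <xenc:EncryptedData id="encdata">
      <xenc:CipherData>
        <xenc:CipherValue>JS*du89sad7</xenc:CipherValue>
      </xenc:CipherData>
    </xenc:EncryptedData>
  </Header>
  <Body>
    <queryResponse id="busmsg">
      <xenc:CipherReference idref="#encdata"/>
    </queryResponse>
  </Body>
</Envelope>
"""


def find_cipher_value(xml_text):
    """Follow the CipherReference in the body to the matching EncryptedData."""
    ns = {"xenc": XENC_NS}
    envelope = ET.fromstring(xml_text)
    reference = envelope.find(".//xenc:CipherReference", ns)
    target_id = reference.get("idref").lstrip("#")
    for encrypted in envelope.iter("{%s}EncryptedData" % XENC_NS):
        if encrypted.get("id") == target_id:
            return encrypted.findtext("xenc:CipherData/xenc:CipherValue",
                                      namespaces=ns)
    return None  # dangling reference


cipher_text = find_cipher_value(RESPONSE)
print(cipher_text)
```

The returned string is still ciphertext; decrypting it with the shared secret key is a separate step that needs a cryptographic library.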

SAML

Relevant Privacy Concepts
• Information Sharing
• Authentication
• Authorization

SAML (Security Assertion Markup Language) is another OASIS initiative. SAML will provide a standard way to define user authentication, authorization, and attribute information in XML documents. As its name suggests, SAML will allow business entities to make assertions regarding the identity, authorizations, and attributes of a subject to other entities, which may be partner companies, other enterprise applications, and so on. These assertions will be passed as XML documents, either pushed from the Asserting Party to the Relying Party, or pulled from the Asserting Party by the Relying Party.

SAML is relevant to privacy because the Liberty Alliance, a consortium of companies committed to defining a framework of standards and processes for federated identity, has decided to use SAML as the syntax for communicating authentication assertions from one business to another to enable Single Sign-On (SSO) for Web users. In a typical SSO scenario, a Web user would log in to one site and then, without needing to present any additional credentials, be able to access the resources of a second site as if they had actually logged in there. In addition to SSO, Liberty will define mechanisms for passing user information from one Web site to another; SAML supports this information sharing through its attribute assertions. The following listing gives an example of such an attribute assertion (omitting some of the namespace details for clarity). The assertion states that the designated subject has an AAA credit rating.

<saml:Assertion>
  <saml:AttributeStatement>
    <saml:Subject>
      <saml:NameIdentifier
        SecurityDomain="www.creditcheck.com"
        Name="averagejoe" />
    </saml:Subject>
    <saml:Attribute
      AttributeName="CreditRating"
      AttributeNamespace="http://www.creditcheck.com">
      <saml:AttributeValue>
        AAA
      </saml:AttributeValue>
    </saml:Attribute>
  </saml:AttributeStatement>
</saml:Assertion>

In order for the recipient of such an assertion to trust it, the recipient must be able to authenticate that the assertion did indeed come from the entity who claims to have sent it (i.e., it hasn't been sent by an impersonator). This authentication will be accomplished through the sender attaching an XML Signature over the SAML assertion; by verifying this XML Signature, the recipient of the assertion will be able to determine its authenticity. The confidentiality of this information -- obviously a requirement if the privacy of the individual is to be respected -- could be achieved through the use of XML Encryption. (Note that if the confidentiality of the information received does not need to persist, then confidentiality of the information in transit could be achieved through a transport layer mechanism like SSL.)
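
Once the signature has been checked, reading the asserted attribute is straightforward. The Python sketch below does so for a copy of the assertion above. Signature verification is not shown, the namespace URI is a placeholder (the article omits the real declaration), and the function name `read_attributes` is an illustrative assumption.

```python
import xml.etree.ElementTree as ET

SAML_NS = "urn:example:saml-draft"  # placeholder namespace, not the real one

ASSERTION = """\
<saml:Assertion xmlns:saml="urn:example:saml-draft">
  <saml:AttributeStatement>
    <saml:Subject>
      <saml:NameIdentifier
        SecurityDomain="www.creditcheck.com"
        Name="averagejoe" />
    </saml:Subject>
    <saml:Attribute
      AttributeName="CreditRating"
      AttributeNamespace="http://www.creditcheck.com">
      <saml:AttributeValue>
        AAA
      </saml:AttributeValue>
    </saml:Attribute>
  </saml:AttributeStatement>
</saml:Assertion>
"""


def read_attributes(xml_text):
    """Return the subject name and a dict of attribute names to values."""
    ns = {"saml": SAML_NS}
    statement = ET.fromstring(xml_text).find("saml:AttributeStatement", ns)
    subject = statement.find("saml:Subject/saml:NameIdentifier", ns).get("Name")
    attributes = {}
    for attribute in statement.findall("saml:Attribute", ns):
        value = attribute.findtext("saml:AttributeValue", default="",
                                   namespaces=ns)
        # Strip the whitespace that surrounds the value in the listing.
        attributes[attribute.get("AttributeName")] = value.strip()
    return subject, attributes


subject, attributes = read_attributes(ASSERTION)
print(subject, attributes)
```

A relying party should only act on these values after the signature over the assertion has been verified.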

Conclusion

Privacy, ensuring that e-business customers remain in control of the personal information they share with online businesses, is one of the key issues facing today's companies as they attempt to take advantage of the great potential the Internet has for creating personalized and trusted relationships with customers. XML, not surprisingly, plays a critical role: it aggravates the problem through the free flow of information it enables, and it provides syntax that offers a piece of the technology puzzle for solving the problem.