Implementing XML Signatures in WSS4J
October 20, 2004
In the first column of this series, I introduced the WSS4J API. In the second column, I demonstrated the use of XSS4J for XML encryption. In the third column, I implemented the encryption features of WSS4J using the concepts discussed in the second column. And the fourth column of this series demonstrated the use of XSS4J to author XML digital signatures.
In this column, I'll use the concepts discussed in the previous columns to implement XML signature support in our WSS4J API.
As you know from previous columns, the WSS4J API implements the Token interface separately for different types of security tokens in WSS. In this column, I'll discuss six types of WSS signature tokens:
- X509 binary security token
- Username password text token
- Username password digest token
- X509 key name token
- X509 pointer token
- X509 key identifier token
Let's start our discussion by considering a WSS message before and after XML signatures.
A WSS Message Before and After XML Signature
In this section I'll present an input message (i.e. a SOAP message containing data to be signed) that we will use to demonstrate our WSS4J API for signing. Then I'll discuss the output after signing of the input message with six different tokens supported by the WSS specifications.
Listing 1 shows the SOAP input message that we will use as input for the XML signature process. Listings 2, 3, 4, 5, 6, and 7 show the output after signing the input SOAP message with different tokens.
You can see from Listings 2, 3, 4, 5, 6, and 7 that they are not very different from each other. Therefore, we will first explain Listing 2 and then outline how the listings for the other tokens differ from it.
Listing 2 shows the result after signing the input SOAP message using the wsse:BinarySecurityToken element. I covered the details of using the wsse:BinarySecurityToken element in our web services security series (especially in Part 2). Notice the following points from Listing 2:
- You can see in Listing 2 that the wsse:BinarySecurityToken element is included inside the wsse:Security element in the SOAP Header.
- This wsse:BinarySecurityToken element is actually a security token, and in order to act as a security token it has to wrap authentication data.
- The wsse:BinarySecurityToken can wrap different types of authentication data in binary format (e.g. X509 certificates and Kerberos tickets).
- You can see in Listing 2 that the wsse:BinarySecurityToken element contains three attributes, namely ValueType, EncodeType, and wsu:Id.
- The ValueType attribute of the wsse:BinarySecurityToken element in Listing 2 specifies the type of the security token wrapped inside this wsse:BinarySecurityToken element. Here the ValueType attribute contains "X509v3", which means this binary security token wraps an X509 version 3 certificate.
- The wsu:Id attribute specifies an identifier for the token. Here we specified its value as "BinarySecurityTokenWithReference". Whenever we need this token for signing a message, we will refer to the token using its wsu:Id.
- The EncodeType attribute specifies the encoding type of the token. In Listing 2 the value of the EncodeType attribute is wsse:Base64Binary. Raw binary data cannot be wrapped inside XML markup as such, because it may cause problems during XML parsing. Therefore we encode it in base-64 before wrapping it inside XML markup. The result of base-64 encoding does not contain any byte that conflicts with XML processing.
- Now look at the ds:Signature element in the wsse:Security element. We discussed the details of the ds:Signature format in the second article of the Web Services Security series.
- How do we specify the token that a particular signature element uses for signing? Look at the ds:KeyInfo element inside the ds:Signature element in Listing 2. It contains a wsse:SecurityTokenReference element, which specifies the token that we have used to produce the signature. In Listing 2 we have included a wsse:Reference element inside the wsse:SecurityTokenReference element. The URI attribute of the wsse:Reference element specifies the ID of the token we use to sign our message. For example, in Listing 2 the value of the URI attribute is "#BinarySecurityTokenWithReference". This attribute value points to the token whose wsu:Id is "BinarySecurityTokenWithReference".
We will call the wsse:BinarySecurityToken element that wraps an X509 certificate the X509 binary security token.
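To make the role of base-64 encoding concrete, here is a minimal sketch that wraps raw token bytes in a wsse:BinarySecurityToken element. The class and method names are hypothetical and not part of WSS4J, the "wsse:X509v3" attribute value is an assumption based on the description above, and the wsse and wsu namespace prefixes are assumed to be declared on an enclosing element.

```java
import java.util.Base64;

// Illustrative sketch only; BinaryTokenSketch is a hypothetical helper,
// not a class from WSS4J.
class BinaryTokenSketch {

    // Wraps raw token bytes (for example an encoded X509 certificate)
    // in a wsse:BinarySecurityToken element.
    static String wrap(byte[] rawToken, String wsuId) {
        // Base-64 output uses only A-Z, a-z, 0-9, '+', '/' and '=',
        // none of which needs escaping inside XML character data.
        String encoded = Base64.getEncoder().encodeToString(rawToken);
        return "<wsse:BinarySecurityToken"
             + " ValueType=\"wsse:X509v3\""          // assumed value
             + " EncodeType=\"wsse:Base64Binary\""
             + " wsu:Id=\"" + wsuId + "\">"
             + encoded
             + "</wsse:BinarySecurityToken>";
    }

    public static void main(String[] args) {
        // Bytes that would be illegal or ambiguous as raw XML content.
        byte[] raw = {0x00, 0x3C, 0x26, (byte) 0xFF};
        System.out.println(wrap(raw, "BinarySecurityTokenWithReference"));
    }
}
```

The wsu:Id passed in is the same value that the signature's security token reference later points to with a leading "#".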
Now look at Listing 3. The output in Listing 3 is generated after signing the SOAP message with another type of token named UsernameToken, which is also defined by the WSS specification. The only difference between Listings 2 and 3 is that the wsse:BinarySecurityToken element is replaced with a wsse:UsernameToken element in Listing 3. The following points explain the wsse:UsernameToken security token:
- You can see in Listing 3 that a wsse:UsernameToken element contains a username in the wsse:Username element and a password wrapped in the wsse:Password element.
- The UsernameToken signs the SOAP message with the password specified in the wsse:Password element.
- The wsse:Password element contains a Type attribute, which specifies the form in which the password is wrapped inside the wsse:Password element. In Listing 3 its value is "PasswordText", which indicates that the password is in plain-text form. Some users might not like to send their password in plain-text form. They can use another type of security token named Username password digest token, which I will explain shortly.
We will call the username token with the password in text form (Listing 3) the Username password text token.
Now look at Listing 4, which is a very slightly modified form of Listing 3. In Listing 4 the Type attribute of the wsse:Password element is specified as "PasswordDigest". Moreover, two elements named wsse:Nonce and wsse:Created are included in the wsse:UsernameToken element. In this case the password (which is a secret key), wsse:Nonce (which wraps a random number), and wsse:Created (which is a timestamp) are combined to form the digest value corresponding to the password.
We have already covered the details of password digests as well as the wsse:Nonce and wsse:Created elements while discussing the wsse:UsernameToken element in Listing 1 of the fourth article of the Web Services Security series.
We will call the username token of type "PasswordDigest" (Listing 4) the Username password digest token.
Listing 5 shows the output after signing the message with the key name token. You can see in Listing 5 that it contains a ds:KeyName element as a child of the wsse:SecurityTokenReference element. This ds:KeyName element contains the alias of the X509 certificate used to sign the WSS message.
Notice that unlike Listings 2, 3, and 4, Listing 5 does not contain any token. This is because in Listing 5 a reference to the token is directly included in the wsse:SecurityTokenReference child of the ds:KeyInfo element. In this case we are assuming that the application that receives this WSS message can itself find the certificate using the alias. Therefore, we do not need to send the certificate along with the message.
We will call the key name token (Listing 5) the X509 key name token.
Now have a look at Listing 6, which is a slightly modified form of Listing 5. In Listing 6 the ds:KeyName element inside the wsse:SecurityTokenReference is replaced with the ds:X509Data element structure. The following points explain the ds:X509Data element:
- The ds:X509Data element wraps an element named X509IssuerSerial.
- The X509IssuerSerial element contains two child elements named X509IssuerName and X509SerialNumber. Together, these two elements uniquely identify the X509 certificate used for signing.
Also notice that in the case of Listing 6, we are assuming that the recipient can find the certificate with the information provided in the ds:X509Data element. Therefore, we do not need to send the certificate along with the message.
We will refer to the token with the X509 issuer serial (Listing 6) as the X509 pointer token.
Listing 7 shows the output after signing the message with the key identifier token. You can see that Listing 7 is a slightly modified form of Listing 5. In Listing 7 the ds:KeyName element is replaced with the wsse:KeyIdentifier element as the child of the wsse:SecurityTokenReference element. The wsse:KeyIdentifier wraps an identifier of a key.
The following points explain the wsse:KeyIdentifier element:
- You can see in Listing 7 that the wsse:KeyIdentifier element contains two attributes, namely EncodeType and ValueType.
- The ValueType attribute of the wsse:KeyIdentifier element in Listing 7 specifies the type of the identifier wrapped inside this wsse:KeyIdentifier element. Here the value of the ValueType attribute is wsse:X509SubjectKeyIdentifier, which means this key identifier identifies the signer's X509 certificate.
- The EncodeType attribute specifies the encoding type of the key identifier. In our case we are using base-64 encoding.
Also notice that in the case of Listing 7, we are assuming that the recipient can find the certificate with the information provided in the wsse:KeyIdentifier element. Therefore, we do not need to send the certificate along with the message.
We will refer to the key identifier token (Listing 7) as the X509 key identifier token.
Implementing the Signature Tokens
In this section, I will demonstrate the implementation of different signature tokens discussed so far.
As you know WSS4J works on the idea of implementing the Token interface. Therefore, I will implement the Token interface for every token defined by Listings 2, 3, 4, 5, 6, and 7.
From the listings discussed in the previous section you can see that the ds:Signature elements in Listings 2, 3, and 4 have exactly the same structure. Notice in particular the following points regarding the ds:Signature element:
- The ds:Signature element in each listing contains a ds:SignedInfo element.
- The ds:SignedInfo element in Listings 2, 3, and 4 contains ds:CanonicalizationMethod, ds:SignatureMethod, and ds:Reference elements as its children.
- You can see from Listings 2, 3, and 4 that the ds:Signature element in each listing also contains the ds:SignatureValue and the ds:KeyInfo elements. The ds:KeyInfo element contains the wsse:SecurityTokenReference element, which in turn contains a wsse:Reference child element.
The features common among Listings 2, 3, and 4 lead us to implement a generic SignatureToken class that handles the common signing responsibilities shared by XML signing tokens. The different token classes will then extend this generic SignatureToken for common functionality and implement token-specific additional functionality.
I have also written a SignatureTemplate XML file (shown in Listing 8) that contains the ds:Signature element structure, which we want to use as a signature template.
You can see from the ds:Signature element in Listing 8 that it is similar to the ds:Signature elements in Listings 2, 3, 4, 5, 6, and 7. The only difference is that the signature template has some empty fields:
- In Listing 8 the Algorithm attributes in the ds:CanonicalizationMethod, ds:DigestMethod, and ds:SignatureMethod elements are empty, while in Listings 2, 3, 4, 5, 6, and 7 the Algorithm attributes contain values.
- You can see in Listing 8 that the ds:SignatureValue element in the ds:Signature element wraps nothing, while in Listings 2, 3, 4, 5, 6, and 7 the ds:SignatureValue element contains the actual signature value.
- Listing 8 does not contain the ds:KeyInfo element in the ds:Signature element, while Listings 2, 3, 4, 5, 6, and 7 contain the ds:KeyInfo element.
We will fill the missing values in the signature template during the signature process.
I will now discuss the implementation details of the SignatureToken class. Listing 9 shows the code for the SignatureToken class. Notice the following points from Listing 9:
- The SignatureToken class implements the Token interface.
- The SignatureToken class contains two constructors, one with five parameters and the other with three. The actual token classes that extend the SignatureToken class decide which constructor they want to use.
- The five-parameter constructor takes the following parameters:
  - keyStoreFileName: The full path name of a Java key store file. The key store file contains the public key of the recipient of the WSS message.
  - keyStorePassword: The string representation of the password that we need in order to access the key store.
  - keyName: The alias of the certificate. We will use the value of this parameter to extract the certificate from the key store.
  - tokenWSUId: The identifier for the token used to sign the message.
  - wssMessage: The parent WSSMessage object that wraps this token.
- The SignatureToken constructor with five parameters performs the following steps:
  - First it stores the parameters in class-level variables. For example, it stores the value of the keyName parameter in the class variable keyName, the value of the tokenWSUId parameter in the class variable tokenWSUId, and the value of the wssMessage parameter in the wssMessage class variable.
  - Then it loads the signature template (Listing 8) into a DOM Document object named templateRootEl.
  - Then it loads the KeyStore whose path is specified in keyStoreFileName.
- The SignatureToken constructor with three parameters takes the following parameters:
  - tokenWSUId: The identifier for the token used to sign the message.
  - userName: The name of the user who is signing the message.
  - wssMessage: The parent WSSMessage object that wraps this token.
- The SignatureToken constructor with three parameters performs the following steps:
  - First it stores the parameters in class-level variables. For example, it stores the value of the tokenWSUId parameter in the class variable tokenWSUId, the value of the userName parameter in the class variable userName, and the value of the wssMessage parameter in the wssMessage class variable.
  - Then it loads the signature template (Listing 8) into a DOM Document object named templateRootEl.
- The SignatureToken class will never be instantiated, as it is not really an actual WSS4J token. It only represents the common functionality of several tokens. So you may ask why I am writing constructors for the SignatureToken class. The answer is that the actual token classes will extend SignatureToken and their constructors will call the superclass constructor.
- The SignatureToken class implements all the methods of the Token interface (shown in Listing 3 of the first article).
- Many Token interface methods are empty in the SignatureToken class. Only three methods of the Token interface have actual implementations in the SignatureToken class: sign(), signWithXPath(), and setSecret(). We will explain these three methods shortly.
- The SignatureToken class (Listing 9) also contains two helper methods named loadDocument() and getXMLString().
- The loadDocument() method takes an XML structure in the form of a string, loads that string into a DOM document, and then returns the document instance to the calling method.
- The getXMLString() method is the opposite of the loadDocument() method. It takes a document, converts it into a string, and returns the string representation of the document.
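The two helper methods can be sketched with the standard JAXP classes. This is only an illustration of the idea, not the code of Listing 9: the class name is hypothetical, and the choices to parse namespace-aware and to omit the XML declaration on output are my assumptions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustrative sketch only; DomHelpersSketch is a hypothetical class.
class DomHelpersSketch {

    // Parses an XML string into a DOM document.
    static Document loadDocument(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Signature processing depends on namespaces (ds:, wsse:, wsu:).
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Serializes a DOM document back into its string form.
    static String getXMLString(Document doc) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = loadDocument("<a><b>text</b></a>");
        System.out.println(getXMLString(doc));
    }
}
```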
Now have a look at the setSecret() method shown in Listing 9. The setSecret() method takes a secret in byte array form and uses the secret to fetch the private key that we will use for signing.
The setSecret() method performs the following steps:
- First it extracts the X509 certificate from the key store.
- Then it extracts the private key by calling the getKey() method of the KeyStore object.
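The two steps above map onto the standard java.security.KeyStore API roughly as follows. This is a sketch under assumptions rather than the code of Listing 9: the class and method names are hypothetical, and I am assuming that the secret bytes are the UTF-8 encoding of the key's password.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyStore;
import java.security.PrivateKey;
import java.security.cert.Certificate;

// Illustrative sketch only; KeyStoreLookupSketch is a hypothetical class.
class KeyStoreLookupSketch {

    // Returns the private key stored under the alias, or null when the
    // key store has no certificate or private key for that alias.
    static PrivateKey findPrivateKey(KeyStore store, String alias, byte[] secret) {
        try {
            // Step 1: look up the certificate by its alias.
            Certificate certificate = store.getCertificate(alias);
            if (certificate == null) {
                return null;
            }
            // Step 2: fetch the matching private key.
            // Assumption: the secret is the key password encoded as UTF-8.
            char[] password = new String(secret, StandardCharsets.UTF_8).toCharArray();
            Key key = store.getKey(alias, password);
            return (key instanceof PrivateKey) ? (PrivateKey) key : null;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Key store lookup failed", e);
        }
    }

    // Creates an empty in-memory key store, handy for trying the lookup.
    static KeyStore emptyStore() {
        try {
            KeyStore store = KeyStore.getInstance(KeyStore.getDefaultType());
            store.load(null, null);
            return store;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("Could not create key store", e);
        }
    }

    public static void main(String[] args) {
        PrivateKey key = findPrivateKey(emptyStore(), "signer",
                "changeit".getBytes(StandardCharsets.UTF_8));
        System.out.println(key == null ? "no key for alias" : key.getAlgorithm());
    }
}
```

In a real application the key store would be loaded from the file named by keyStoreFileName instead of being created empty.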
Some of the token classes that extend the SignatureToken class will override the setSecret() method to implement their own token-specific setSecret() functionality.
Now have a look at the sign() method in Listing 9. This method authors the Signature element. The sign() method takes four parameters:
- wsuElementID: The wsu:Id of the element that we are going to sign.
- digestAlgo: The algorithm used to digest the message.
- signatureAlgo: The signature algorithm used to generate the signature value.
- canonicalizationAlgo: The canonicalization algorithm to be used before signing. Please refer to the Resources section for more details on canonicalization.
The sign() method performs the following nine steps (marked with comments in Listing 9) to fill in the blanks in the signature template:
Step 1: In this step, we import the ds:Signature element from the signature template document (Listing 8) into the DOM document that represents the parent WSS message.
Step 2: You can see in the signature template (Listing 8) that the URI attribute of the ds:Reference element in the ds:Signature element is empty. So, we fill it with the value of the incoming parameter named wsuElementID. This URI attribute value refers to the wsu:Id of the element that we are going to sign.
Step 3: If you look at the signature template (Listing 8), you will find that there is an Algorithm attribute in the ds:CanonicalizationMethod, ds:SignatureMethod, and ds:DigestMethod elements, but the values are not specified. In this step we set the values of all three Algorithm attributes. The value of the canonicalizationAlgo parameter is set in the Algorithm attribute of the ds:CanonicalizationMethod element. The value of the digestAlgo parameter is set in the Algorithm attribute of the ds:DigestMethod element. The value of the signatureAlgo parameter is set in the Algorithm attribute of the ds:SignatureMethod element.
Step 4: You can see from Listings 2, 3, 4, 5, 6, and 7 that each of them contains a ds:KeyInfo element. But our signature template does not contain any ds:KeyInfo element. Therefore, in this step, we insert a ds:KeyInfo element in the Signature element.
Step 5: Now we author the wsse:SecurityTokenReference element as the child of the ds:KeyInfo element.
Step 6: In this step we insert a wsse:Reference element as a child of the wsse:SecurityTokenReference element and set its URI attribute to the wsu:Id of the security token (i.e. the fourth parameter, named tokenWSUId, that was passed to the SignatureToken constructor).
Step 7: Next, we create an instance of the SignatureContext class. The SignatureContext class is explained in the fourth article of the Web Services Security for Java series.
Step 8: Now let's make a call to the setIDResolver() method of the SignatureContext class and pass it the WSUIdResolver object. The ID resolver that XSS4J provides cannot resolve the wsu:Id of an element. Therefore, we have implemented our own ID resolver class named WSUIdResolver. I will discuss the details of the WSUIdResolver class after the discussion of the SignatureToken class.
Step 9: Now we are ready for signing. We call the sign() method of SignatureContext, passing it the signature template element (populated in the steps above) and a private key (the one mentioned while discussing the setSecret() method).
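Steps 2 to 6 are plain DOM manipulation, so they can be sketched without XSS4J. The following is a minimal illustration and not the code of Listing 9: the class name, the cut-down template string, the wsse namespace URI, and the leading "#" added to both URI values are my assumptions. Step 1 and the XSS4J-specific Steps 7 to 9 are left out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Illustrative sketch only; TemplateFillSketch is a hypothetical class.
class TemplateFillSketch {
    static final String DS = "http://www.w3.org/2000/09/xmldsig#";
    // Assumed wsse namespace; use the one that matches your WSS version.
    static final String WSSE = "http://docs.oasis-open.org/wss/2004/01/"
            + "oasis-200401-wss-wssecurity-secext-1.0.xsd";

    // Cut-down stand-in for the signature template with empty fields.
    static final String TEMPLATE =
          "<ds:Signature xmlns:ds='" + DS + "'><ds:SignedInfo>"
        + "<ds:CanonicalizationMethod Algorithm=''/>"
        + "<ds:SignatureMethod Algorithm=''/>"
        + "<ds:Reference URI=''><ds:DigestMethod Algorithm=''/>"
        + "<ds:DigestValue/></ds:Reference>"
        + "</ds:SignedInfo><ds:SignatureValue/></ds:Signature>";

    static Element fill(String wsuElementID, String digestAlgo,
                        String signatureAlgo, String canonicalizationAlgo,
                        String tokenWSUId) {
        Element signature = parse(TEMPLATE).getDocumentElement();
        Document doc = signature.getOwnerDocument();

        // Step 2: point the reference at the element to be signed.
        first(signature, "Reference").setAttribute("URI", "#" + wsuElementID);

        // Step 3: fill in the three Algorithm attributes.
        first(signature, "CanonicalizationMethod")
                .setAttribute("Algorithm", canonicalizationAlgo);
        first(signature, "DigestMethod").setAttribute("Algorithm", digestAlgo);
        first(signature, "SignatureMethod").setAttribute("Algorithm", signatureAlgo);

        // Step 4: add the ds:KeyInfo element that the template lacks.
        Element keyInfo = doc.createElementNS(DS, "ds:KeyInfo");
        signature.appendChild(keyInfo);

        // Step 5: add wsse:SecurityTokenReference under ds:KeyInfo.
        Element tokenRef = doc.createElementNS(WSSE, "wsse:SecurityTokenReference");
        keyInfo.appendChild(tokenRef);

        // Step 6: add wsse:Reference pointing at the token's wsu:Id.
        Element reference = doc.createElementNS(WSSE, "wsse:Reference");
        reference.setAttribute("URI", "#" + tokenWSUId);
        tokenRef.appendChild(reference);
        return signature;
    }

    static Element first(Element scope, String localName) {
        return (Element) scope.getElementsByTagNameNS(DS, localName).item(0);
    }

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse template", e);
        }
    }
}
```

The populated element is what Step 9 would hand to the signing call, which then computes the digest and signature values.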
The SignatureToken class of Listing 9 contains another public method named signWithXPath(). Like the sign() method discussed above, this method signs a WSS message. The two methods differ in how they reference the element to be signed in the WSS message.
The signWithXPath() method references the element to be signed with an XPath expression. It takes the same parameters as the sign() method above except for the first parameter, which is the XPath expression that identifies the element to be signed.
The implementation steps for the signWithXPath() method are the same as those discussed for the sign() method, except for the third and eighth steps:
Step 3: This time the URI attribute of the ds:Reference element will remain empty. Instead, we author a Transforms child of the Reference element to resolve the XPath expression. This XPath expression identifies the element in the WSS message that needs to be signed.
Step 8: We do not need any IDResolver class to resolve the wsu:Id. Therefore, Step 8 is empty in the signWithXPath() method.
When we sign using XPath, the resulting Signature element will look like Listing 10. You can see that Listing 10 is similar to the Signature elements in Listings 2, 3, 4, 5, 6, and 7. The only difference is that in Listing 10 there is a Transforms child element of the ds:Reference element.
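As a rough illustration of how such a Transforms child can be authored with DOM calls, consider the sketch below. It is not the code of Listing 9: the class and method names are hypothetical, and I am assuming the XPath filtering transform defined by the XML Signature recommendation, whose algorithm identifier is http://www.w3.org/TR/1999/REC-xpath-19991116.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative sketch only; XPathTransformSketch is a hypothetical class.
class XPathTransformSketch {
    static final String DS = "http://www.w3.org/2000/09/xmldsig#";
    // Identifier of the XPath filtering transform in XML Signature.
    static final String XPATH_TRANSFORM = "http://www.w3.org/TR/1999/REC-xpath-19991116";

    // Adds ds:Transforms/ds:Transform/ds:XPath to a ds:Reference element
    // and leaves the URI attribute empty.
    static Element addXPathTransform(Element reference, String xpathExpression) {
        Document doc = reference.getOwnerDocument();
        reference.setAttribute("URI", "");

        Element transforms = doc.createElementNS(DS, "ds:Transforms");
        Element transform = doc.createElementNS(DS, "ds:Transform");
        transform.setAttribute("Algorithm", XPATH_TRANSFORM);
        Element xpath = doc.createElementNS(DS, "ds:XPath");
        xpath.setTextContent(xpathExpression);
        transform.appendChild(xpath);
        transforms.appendChild(transform);

        // ds:Transforms comes before ds:DigestMethod inside ds:Reference.
        reference.insertBefore(transforms, reference.getFirstChild());
        return transforms;
    }

    // Builds a bare ds:Reference with a ds:DigestMethod child for trying it out.
    static Element newReference() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element reference = doc.createElementNS(DS, "ds:Reference");
            reference.appendChild(doc.createElementNS(DS, "ds:DigestMethod"));
            doc.appendChild(reference);
            return reference;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create document", e);
        }
    }
}
```

With an empty URI the reference covers the whole document, and the XPath expression then filters that node set down to the part to be signed.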
Now have a look at Listing 11, which shows the code for the WSUIdResolver class. As mentioned in Step 8 of the discussion on the sign() method, the SignatureContext class will use the WSUIdResolver class to resolve the wsu:Id of the element to be signed.
XSS4J provides a class named AdHocIDResolver to resolve the Ids of elements in a SOAP message. This AdHocIDResolver class implements an interface named IDResolver. The IDResolver interface contains just one method named resolveID(). Whenever the SignatureContext class comes across the URI attribute in the Reference element, it calls the resolveID() method, which returns the element to be signed.
But there is a problem: the AdHocIDResolver class from XSS4J cannot resolve the wsu:Id. Therefore, we have implemented our own class named WSUIdResolver in Listing 11 to resolve the wsu:Id. The WSUIdResolver class implements the resolveID() method, which will return the element we want to sign.
Note the following points from the resolveID() method in Listing 11:
- The resolveID() method takes a DOM document object and the wsu:Id to be resolved as parameters. The DOM document object is the input document in which the resolveID() method will search for the element to be signed.
- It extracts the root element from the document passed to it.
- Then it creates an XPath query to search the input DOM document for the element whose wsu:Id matches the wsu:Id we are looking for.
- Then the resolveID() method runs the XPath query on the input DOM document and returns the matching element.
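The lookup described in these points can be sketched with the standard javax.xml.xpath API. This is an illustration of the idea only, not Listing 11: the class name is hypothetical, it does not implement the XSS4J IDResolver interface, and the wsu namespace URI is my assumption.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Illustrative sketch only; WsuIdLookupSketch is a hypothetical class.
class WsuIdLookupSketch {
    // Assumed wsu namespace; use the one that matches your WSS version.
    static final String WSU = "http://docs.oasis-open.org/wss/2004/01/"
            + "oasis-200401-wss-wssecurity-utility-1.0.xsd";

    // Returns the first element whose wsu:Id equals the given id, or null.
    static Element resolve(Document doc, String id) {
        if (id.indexOf('\'') >= 0) {
            // Keeps the id from breaking out of the quoted XPath literal.
            throw new IllegalArgumentException("id must not contain a quote");
        }
        // Matching on local name and namespace URI avoids depending on
        // whichever prefix the message happens to bind to the wsu namespace.
        String query = "//*[@*[local-name()='Id' and namespace-uri()='"
                + WSU + "']='" + id + "']";
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Element root = doc.getDocumentElement();
            return (Element) xpath.evaluate(query, root, XPathConstants.NODE);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath query failed", e);
        }
    }

    // Namespace-aware parse helper for trying the lookup.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

An Id attribute that is not in the wsu namespace is deliberately ignored by this query.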
The next discussion will explain the implementation of actual signature tokens.
Signing a WSS Message Using X509 Binary Security Token
Listing 12 shows the code for the BinarySecurityTokenWithReference class that implements the X509 binary security token of Listing 2. The following points explain the code shown in Listing 12:
- The BinarySecurityTokenWithReference class extends the SignatureToken class.
- The constructor of the BinarySecurityTokenWithReference class makes a call to the superclass constructor.
- The BinarySecurityTokenWithReference class implements a getXMLString() method. The getXMLString() method authors the XML for the token and returns it in string form. The wsse:BinarySecurityToken element in Listing 2 shows the typical XML string that this method authors.
- You can see in Listing 12 that the BinarySecurityTokenWithReference class also implements a public method named getType(), which returns the type of the token in string form. This string uniquely identifies the X509 binary security token.
Signing a WSS Message Using X509 Pointer Token
Now have a look at Listing 13, which shows the code for the X509CertificatePointerToken class. This class implements the X509 pointer token described in Listing 6. The X509CertificatePointerToken class is similar to the BinarySecurityTokenWithReference class except that it does not implement the getXMLString() method and it overrides the sign() method.
You can see in Listing 13 that all the steps of the sign() method of X509CertificatePointerToken are the same as the steps in the sign() method of the SignatureToken class except Step 6:
Step 6 of Listing 13: In this step we insert a ds:X509Data element as a child of the SecurityTokenReference element. We covered the details of the ds:X509Data element while discussing Listing 6.
Signing a WSS Message Using X509 Key Name Token
Now have a look at Listing 14, which shows the code for the KeyNameToken class. This class implements the X509 key name token that we described in Listing 5. The KeyNameToken class is similar to the X509CertificatePointerToken class in Listing 13 except for Step 6, which is described below:
Step 6 of Listing 14: In this step we insert a ds:KeyName element as a child of the wsse:SecurityTokenReference element. This ds:KeyName element wraps the alias that uniquely identifies an X509 certificate.
Signing a WSS Message Using Username Text Password Token
Now have a look at Listing 15, which shows the code for the UserNamePasswordTextTokenWithReference class. This class implements the username password text token, which I described in Listing 3. The UserNamePasswordTextTokenWithReference class is similar to the BinarySecurityTokenWithReference class (Listing 12) except that it implements its own setSecret() method. The setSecret() method takes a secret in byte-array form and builds a secret key based on the secret byte array.
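Building a secret key from a byte array can be sketched with the standard javax.crypto classes. The sketch below is not the code of Listing 15: the class and method names are hypothetical, and the use of HMAC-SHA1 as the algorithm for the password-derived key is my assumption.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

// Illustrative sketch only; PasswordKeySketch is a hypothetical class.
class PasswordKeySketch {
    // Assumed algorithm; the token's real choice may differ.
    static final String ALGORITHM = "HmacSHA1";

    // Wraps the secret bytes (for example a password) in a secret key.
    static SecretKey toSecretKey(byte[] secret) {
        return new SecretKeySpec(secret, ALGORITHM);
    }

    // Computes a keyed hash over the data with the secret key.
    static byte[] hmac(SecretKey key, byte[] data) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(key);
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = toSecretKey("password".getBytes(StandardCharsets.UTF_8));
        byte[] mac = hmac(key, "signed info".getBytes(StandardCharsets.UTF_8));
        System.out.println(mac.length + " bytes");
    }
}
```

Unlike the private key of the X509 tokens, this key is symmetric, so the recipient needs the same password to verify the signature.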
Signing a WSS Message Using Username Digest Password Token
Now have a look at Listing 16, which shows the code for the UserNamePasswordDigestTokenWithReference class. This class implements the username password digest token described in Listing 4. The UserNamePasswordDigestTokenWithReference class is similar to the UserNamePasswordTextTokenWithReference class (Listing 15) except that it implements its own getXMLString() method, as described below:
- You can see in Listing 16 that the getXMLString() method first generates an eight-byte random number.
- Then it encodes the random number in base-64, which forms a nonce value.
- Then it gets the current system time and converts it into UTC format, which is a WSS requirement. This forms a timestamp.
- Then it concatenates the nonce, timestamp, and a base-64 encoded secret key. The token already knows the secret key from the setSecret() method shown in Listing 16.
- Then it calculates the SHA-1 digest value over the result of Step 4.
- Now it applies base-64 encoding to the result of Step 5. This is the required digest value, which it wraps inside the wsse:Password element as shown in Listing 4.
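The six steps above can be sketched with standard Java classes as follows. This is an illustration that follows the steps as listed here, not the code of Listing 16: the class and method names are hypothetical, and the character encoding (UTF-8) and timestamp precision (seconds) are my assumptions. Note that the published WSS UsernameToken profile defines the digest as Base64(SHA-1(nonce + created + password)), so check the exact input format that your recipient expects.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Base64;

// Illustrative sketch only; PasswordDigestSketch is a hypothetical class.
class PasswordDigestSketch {

    // Steps 1 and 2: eight random bytes, base-64 encoded, form the nonce.
    static String newNonce() {
        byte[] random = new byte[8];
        new SecureRandom().nextBytes(random);
        return Base64.getEncoder().encodeToString(random);
    }

    // Step 3: current time in UTC, e.g. 2004-10-20T12:00:00Z.
    static String nowUtc() {
        return Instant.now().truncatedTo(ChronoUnit.SECONDS).toString();
    }

    // Steps 4 to 6: concatenate, digest with SHA-1, base-64 encode.
    static String digest(String nonce, String created, byte[] secret) {
        try {
            String encodedSecret = Base64.getEncoder().encodeToString(secret);
            byte[] input = (nonce + created + encodedSecret)
                    .getBytes(StandardCharsets.UTF_8);
            byte[] sha1 = MessageDigest.getInstance("SHA-1").digest(input);
            return Base64.getEncoder().encodeToString(sha1);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    public static void main(String[] args) {
        String nonce = newNonce();
        String created = nowUtc();
        byte[] secret = "password".getBytes(StandardCharsets.UTF_8);
        System.out.println("Nonce:    " + nonce);
        System.out.println("Created:  " + created);
        System.out.println("Password: " + digest(nonce, created, secret));
    }
}
```

The nonce and timestamp travel in the wsse:Nonce and wsse:Created elements so that the recipient, who knows the password, can recompute the same digest.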
In this article my focus of discussion was on tokens. I discussed six different security tokens and demonstrated the implementation details of five of them in Java. I discussed the XML representation of X509 key identifier token in Listing 7 but have not yet implemented this token in Java.