XML Transactions for Web Services, Part 2
In the first part of this series, I introduced federated web service applications and transactional web services, including brief descriptions of of the WS-Coordination and WS-Transaction specifications. In this article I discuss and demonstrate the operation of atomic transactions in web services.
The next section discusses the various protocols defined by the WS-Transaction specification to enable atomic transactions. In this section, we will also explain how the different protocols will be used in our sample application scenario of the second section.
The last section will provide a comprehensive demonstration of atomic transactions using the WS-Coordination and WS-Transaction framework. This section will provide the actual XML code listings that will be used in the exchange of data during the course of a typical atomic transaction.
-->What is an Atomic Transaction?
The WS-Transaction specification defines a set of protocols to support atomic transactions. The term "atomic transactions" is not specific to web services, of course. It is a concept well known in database applications. In fact, "database transactions" is effectively a synonym for "atomic transactions". WS-Transaction defines the concept of an Atomic Transaction (AT) based on the proven concept of atomic database transactions.
Since web service ATs are conceptually similar to database transactions, it would be useful to consider them. Generically speaking, you use database transactions in the following way:
-
Create a new connection to the database to mark the start of a new transaction. This creates a new transactional context.
-
Add some SQL statements into the newly created transactional context. You can think of each SQL statement as a database activity (read/write operation). The statements do not execute immediately; rather, the changes produced at this stage are temporary and can be rolled back any time.
-
Next, you commit the temporary changes. When you ask the database management system to commit the activities of the transaction, the system will make a commit attempt.
-
If there were any problems with any of the activities in the transaction, you can roll back the entire set of activities. If you decide to roll back, the state of the database will revert to its original state before the start of the transaction. This means, in case of a roll back, the database will be in a state such that the transaction never happened at all.
-
If there were no problems with any of the activities in the transaction, your commit operation will succeed, thus making the temporary changes permanent.
So how does WS-Transaction define an AT? An AT has the following characteristics:
-
It results either in the commitment of all activities or none of them. Activities involved in an AT are treated as an indivisible whole. The entire set of activities -- the atomic whole -- either succeeds or fails.
The success of the atomic whole is normally referred to as a "Commit" operation, while the failure of any single activity results in a "Rollback" of the set of activities which constitutes the AT.
-
We have seen that a database transaction considers the results of a transaction to be temporary until they are either committed or rolled back. A logical implication of this temporary storage of results is that the transaction are isolated from the rest of the system. If transactions are not isolated, other database operations and transactions may alter the database during the execution of one transaction, thus producing inconsistent or inaccurate results.
Isolation in database transactions is a technical issue. But it implies two important side effects. When we try to keep a transaction isolated, we have to temporarily lock some of the database resources, which means these resources will not be available to other applications during the execution of the transaction. This leads to two side effects:
- A transaction should take a little time as possible, freeing locked database resources as soon as possible. Thus, database transactions are short lived.
- Only trusted users and applications should be allowed to access the transactional features of a database; being able to lock resources indefinitely is a denial of service attack which may be launched by malicious users.
Each of these points is applicable to all ATs. An AT should always be short lived and should operate within a trusted domain.
An Application Scenario
Let's consider the enterprise implementation of the PC assembler that I
introduced in the first article. Recall that the bookOrder
method implementation has to ensure the availability of components
required to fulfill an order from the distributor.
The PC assembler's enterprise application implementation consists of various independent modules; those which will take part in the AT include the Sales, Inventory, PC Assembly Line Management, Accounts, and Database modules.
The bookOrder method is part of the sales module. When
the sales module receives a bookOrder method invocation call,
it performs the following activities, each of which is part of an AT:
-
The sales module asks the inventory module to check the availability of all the required components. If the required components are available, the inventory module will make them available to fulfill the current order. Depending upon business logic, the inventory module may ask the database module to mark those components as "booked" for the current order.
-
The sales module then asks the PC assembly line management module to check whether the PC assembly line has the manufacturing capacity to fulfill the order. If the capacity is available, the PC assembly management module will ask the database module to mark the manufacturing capacity as "booked" for the current order.
These two activities are part of an AT and both should occur simultaneously. In order to commit the transaction, the inventory should have the required components and there should be enough manufacturing capacity in the assembly line. If any one of the two requirements is not met, both the activities will be rolled back.
As evident from the two activities, the database module will have an active involvement in this AT. For the sake of simplicity, we have shown just one database module for the entire set of activities. In actual practice, there may be several databases in an enterprise scale application.
The database module has the task of managing persistent date storage and, thus, will be responsible for updating the outcome of the transaction in its data storage facility.
In addition to the two activities discussed above, the PC assembler's business logic has another requirement associated with the same AT:
-
The sales module will require the involvement of the accounts module in this transaction. The accounts module will be a silent listener. It will not directly participate in any activity in the transaction. But in order that it keep a record of the relevant accounting information it will be informed about the outcome of the AT.
Now that we have elaborated the application scenario, it's time to discuss the various WS-Transaction protocols that will be used to implement the AT.
Protocols for Atomic Transactions
Each participant has a role to play in the AT. The role of the participant depends upon its responsibilities. WS-Transaction defines five protocols for atomic transactions, each of which is actually a role that a participant will play while taking part in an AT.
Each of the five protocols defines a two-party communication mechanism, which means there will always be exactly two parties communicating according to any of the five protocols. One of the communicating parties will be a participant and the other will be a coordinator.
The coordinator provides two important services, Activation and Registration, to enable the AT. The participant and the coordinator form the two ends or two ports for each of the five protocols. So each protocol has a coordinator and a participant port.
Thus, all participants of an AT will talk to the coordinator in order to fulfill the business logic requirements of the AT. This is represented graphically in Figure 1, where all the participants of our sample AT scenario are shown interacting with a coordinator. Don't worry about the arrows and numbers in Figure 1. The last section of this article will explain Figure 1 in detail.
|
| Figure 1: Exchange of Messages during the course of an AT |
The five protocols for atomic transactions include the following.
Completion Protocol
A participant who wishes to initiate the commit or rollback of a transaction will use this protocol. Recall that in general an RDBMS application creates a new transactional context, throws SQL statements into the transaction, and then initiates a commit. Thus, the participant which initiates an AT will also initiate the commit or the rollback and will register for the completion protocol.
In our application scenario, the sales module initiates the transaction and will also ask the coordinator to try to commit. Therefore the sales module registers for the completion protocol.
At the end, when the transaction concludes with either a commit or a rollback, the coordinator will send a notification back to the participant who has registered for the completion protocol. The notification conveys the information about the outcome of the transaction.
CompletionWithAck Protocol
This protocol is the same as the completion protocol, except for one detail. The participant which registers for the CompletionWithAck protocol will need to send an acknowledgment back to the coordinator after receiving the outcome notification.
2PC Protocol
The "two phase commit" idea from transactions in distributed database applications. Imagine several database management applications participating in a single transaction. How can you design a transaction control system in such distributed environments?
An obvious strategy is to implement distributed transactions in two phases:
-
The first phase (the prepare phase) is akin to asking all database applications whether they have any problem with any of the activities we want them to perform during the transaction. In the prepare phase, the applications are not required to actually do anything. They are only required to report any problems they may encounter if they make the requested changes. If any of the participating applications reports a problem, the prepare phase will fail and the transaction will roll back.
-
If the prepare phase goes successfully, we are sure that none of the database applications have any problem with the changes required by the transaction. So we can go ahead with the actual commit operation and ask all the participants to perform the actual changes already conveyed to them before the start of the prepare phase.
This type of two-phase commit allows everyone to prepare before starting to perform a coordinated activity.
In our application scenario, we have included just one database module for the sake of simplicity. The database module will register for the 2PC protocol.
PhaseZero Protocol
The name "PhaseZero" suggests a phase that occurs before the start of the first of the two phases in the 2PC sequence. In our application scenario, the inventory module checks the availability of components with the database module and then asks the database module to mark the components as "booked". This requires the database module to know about the requirement of booking components for this order before the start of the 2PC sequence. Once the 2PC sequence has started, the inventory module will not have the chance of sending any more data to the database module.
Similarly, the assembly line management module also needs to tell the database module to book the assembly line capacity before the start of the 2PC sequence. The PhaseZero protocol provides an opportunity for applications to send data updates before the start of the 2PC sequence.
Therefore, both the inventory and the PC assembly line management modules will register for the PhaseZero protocol. You can think of PhaseZero protocol as an opportunity for middleware applications to flush any outstanding data to database applications before the start of the 2PC sequence.
OutcomeNotification protocol
This protocol is for applications that want to act as silent listeners in an AT. Such applications will have no active participation in the transaction, but they will be informed about its outcome.
In our application scenario, the accounts module will register for the OutcomeNotification protocol.
Pages: 1, 2 |