XML.com: XML From the Inside Out


P2P and XML in Business

July 11, 2001

Following the growth of business-to-business exchanges and supply chain management systems, peer-to-peer (P2P) computing is likely to become another deployment arena for XML technology. Whether exchanging user messages, application state, or processing instructions, relaying information effectively is a critical component of any P2P application. By using XML, system designers can establish rules for peer interaction that allow developers to build applications independently. By facilitating this communication, XML plays an important role in P2P application design.

What is P2P?

As with any technology emerging under the media spotlight, P2P as a whole is open to misinterpretation. Much of the confusion surrounding the term "peer-to-peer" arises from companies applying the label to dozens of distinct types of system. For instance, SETI@home, the well-known distributed computing project designed to analyze data with the hope of finding extraterrestrial life, has little in common with the infamous Napster music community. Similarly, Groove Networks' collaboration system cannot be directly compared to the Jabber Open Source project that focuses on instant messaging. Yet these disparate systems are all touted as key elements of the P2P movement.

As a result, one is hard pressed to find a common technical thread among these P2P applications. Complicating matters further, no single field monopolizes these initiatives, as notable contributions to P2P technology have been made in many areas. Nor is there a single industry sector driving the effort. Network equipment manufacturers, open source projects, educational institutions, and scores of unaffiliated independent programmers have all played an important role in furthering the development of P2P systems.

Without a suitable definition in terms of technology or contributors, the industry is left to describe P2P in terms of the intent of its supporters. Framed this way, peer-to-peer is best defined as the set of technologies targeted at better utilizing resources that are networked together. Defining peer-to-peer as any system designed with the explicit intent to take advantage of under-utilized networking, disk space, processing, or user resources at the edges of the Internet is the best way to accurately depict the underlying movement while still encompassing all aspects of the technology.

Does P2P make sense?

The timing of this new interest in peer-to-peer technology is interesting. Just when IT managers have begun to adapt to the shift from client-server applications to web-based application services, users are showing newfound interest in exploiting the dormant resources of their networked desktops. In fact, users are beginning to demand more control over their computing resources every day.

Whether creating chat rooms with colleagues or sharing files with clients directly, users want the ability to use applications without relying on IT departments to set up user accounts or create virtual private networks to support them. For years, IT administrators have been pressured to consolidate IT support operations by locking down corporate desktops and centralizing computing resources. Now they are being told that their systems are too rigid and don't allow users enough control. Not surprisingly, the demand for new peer systems has been met with harsh resistance.

Many IT managers thought that their jobs would be getting more bearable as decreasing server costs allowed them to meet the budgetary constraints of their departments. The pendulum seems to be swinging once again as the indirect costs of under-utilized desktop computing resources have offset the hardware savings of server-centric IT systems. This current shift highlights the continuing oscillation between central and distributed control of computer systems. Those who witnessed the prior shifts, from mainframes to client-server applications and more recently from client-server applications to server-centric ASP architecture, should find the rationale behind P2P architectures vaguely familiar. Looking at computing architecture over the course of the last quarter century, one sees that the P2P movement is just the most recent phase of this centralized-distributed cycle.

Despite the historical and theoretical justifications for P2P systems, the costs associated with developing, deploying, and supporting client applications are not insignificant. So before starting a P2P crusade within an organization, one should be certain it makes economic sense. Although there is much discussion concerning this topic, any viable P2P system should offer benefits that cannot be achieved with another computing architecture that is less costly to maintain.

XML and Peer-to-Peer Technology

After determining that P2P technology is appropriate for an IT project, there are several design challenges that will have to be solved before any development can begin. Since pure P2P systems have no central servers for dispatching information between peers, devising a mechanism for peers to communicate is a critical aspect of P2P design. Efficiently distributing and storing application data for peer access is not a trivial task either, since data often has to reside locally on the peer for processing. Finally, managing updates to the peer application components themselves is of paramount concern, as even a simple bug fix can lead to a distribution nightmare. It is no coincidence that these are the areas in P2P technology that benefit the most from XML.


Messaging

XML offers an ideal mechanism to transfer short, structured messages between peer applications. XML can be easily customized for specific P2P systems and readily transmitted over today's Internet protocols. XML data can be encrypted using existing technologies, making it an ideal candidate for secure messages. There are already several implementations of XML-based messaging schemes, including SOAP and XML-RPC.
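
To make the idea of a short, structured peer message concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The message vocabulary (a `message` element with `from`, `to`, and `body` children) and the function names are assumptions invented for illustration; they are not taken from SOAP, XML-RPC, or any particular P2P system.

```python
import xml.etree.ElementTree as ET


def build_message(sender: str, recipient: str, kind: str, body: str) -> bytes:
    """Serialize a peer message to UTF-8 XML bytes (hypothetical vocabulary)."""
    root = ET.Element("message", {"type": kind})
    ET.SubElement(root, "from").text = sender
    ET.SubElement(root, "to").text = recipient
    # ElementTree escapes characters such as '<' and '&' on output.
    ET.SubElement(root, "body").text = body
    return ET.tostring(root, encoding="utf-8")


def parse_message(data: bytes) -> dict:
    """Parse bytes produced by build_message back into a plain dict."""
    root = ET.fromstring(data)
    if root.tag != "message":
        raise ValueError("not a peer message")
    return {
        "type": root.get("type"),
        "from": root.findtext("from"),
        "to": root.findtext("to"),
        "body": root.findtext("body"),
    }


if __name__ == "__main__":
    wire = build_message("peer-a", "peer-b", "chat", "Hello <world> & friends")
    print(wire.decode("utf-8"))
    print(parse_message(wire))
```

Because both peers only need to agree on the element names, each side can be written independently, in any language with an XML parser. The resulting bytes can be carried over whatever transport the system already uses.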

Data Storage

Using XML to cache application data locally in P2P systems offers several advantages. Caching data in XML allows for more flexibility and easier retrieval than custom or unstructured formats, and it has a much smaller overhead than installing a relational database on each peer. Developers can take advantage of XML handlers to search, validate, retrieve, and manipulate the data needed to support the peer application, which reduces the overall complexity of the P2P system. In many cases XML stores are easier to implement than storing unstructured data directly in the file system, and they require fewer system resources to operate than relational databases.
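
A local XML cache of this kind might look like the following sketch, again in Python with xml.etree.ElementTree. The `cache` and `file` element names, their attributes, and the helper functions are hypothetical, chosen only to show storing, searching, and reloading peer data without a database.

```python
import xml.etree.ElementTree as ET


def new_cache() -> ET.Element:
    """Create an empty cache document (hypothetical <cache> root)."""
    return ET.Element("cache")


def add_entry(cache: ET.Element, name: str, owner: str, size: int) -> None:
    """Record one shared file as a <file> element with attributes."""
    ET.SubElement(cache, "file", {"name": name, "owner": owner, "size": str(size)})


def files_owned_by(cache: ET.Element, owner: str) -> list:
    """Search the cache for the names of all files shared by one peer."""
    return [e.get("name") for e in cache.findall("file") if e.get("owner") == owner]


def save(cache: ET.Element, path: str) -> None:
    """Write the cache to a local file as UTF-8 XML."""
    ET.ElementTree(cache).write(path, encoding="utf-8", xml_declaration=True)


def load(path: str) -> ET.Element:
    """Read a cache file back and return its root element."""
    return ET.parse(path).getroot()


if __name__ == "__main__":
    cache = new_cache()
    add_entry(cache, "report.doc", "alice", 48213)
    add_entry(cache, "notes.txt", "bob", 912)
    add_entry(cache, "budget.xls", "alice", 10544)
    save(cache, "peer-cache.xml")
    print(files_owned_by(load("peer-cache.xml"), "alice"))
```

The whole store is a single text file that the peer can inspect, validate, or ship to another peer unchanged, which is where much of the reduced complexity comes from.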

Application Deployment

XML can also be used to help manage the deployment of the application components to peers in the network -- often one of the most difficult challenges of P2P systems. With the potential of having millions of peers interacting, having an effective process to distribute software updates is essential to the long-term success of any P2P system. One XML-based solution to this problem is Open Software Description (OSD). OSD files allow system architects to define the application components required for peer applications along with the location to download these components and any component dependencies. Effectively integrating OSD files into a P2P deployment strategy shifts the burden of software upgrades from the user to the P2P application itself. Each peer can verify that it has the most recent software components and automatically download upgrades if needed.
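
The update check described above can be sketched as follows. The manifest is loosely modeled on OSD: the `SOFTPKG`, `IMPLEMENTATION`, `CODEBASE`, and `DEPENDENCY` names and the comma-separated version format are given here as recalled from the OSD note and should be checked against it before use. The package names, the example.org URL, and the Python helper functions are purely hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest, loosely modeled on OSD; verify names against the OSD note.
MANIFEST = """\
<SOFTPKG NAME="org.example.peerchat" VERSION="1,2,0,0">
  <TITLE>PeerChat</TITLE>
  <IMPLEMENTATION>
    <CODEBASE HREF="http://example.org/peerchat-1.2.zip"/>
  </IMPLEMENTATION>
  <DEPENDENCY>
    <SOFTPKG NAME="org.example.peercore" VERSION="2,0,0,0"/>
  </DEPENDENCY>
</SOFTPKG>
"""


def parse_version(text: str) -> tuple:
    """Turn '1,2,0,0' into (1, 2, 0, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split(","))


def read_manifest(xml_text: str) -> dict:
    """Extract name, version, download URL, and dependencies from a manifest."""
    root = ET.fromstring(xml_text)
    codebase = root.find("./IMPLEMENTATION/CODEBASE")
    deps = [
        (dep.get("NAME"), parse_version(dep.get("VERSION")))
        for dep in root.findall("./DEPENDENCY/SOFTPKG")
    ]
    return {
        "name": root.get("NAME"),
        "version": parse_version(root.get("VERSION")),
        "url": codebase.get("HREF") if codebase is not None else None,
        "dependencies": deps,
    }


def upgrade_url(manifest: dict, installed_version: tuple):
    """Return the download URL if the manifest is newer than what is installed."""
    if manifest["version"] > installed_version:
        return manifest["url"]
    return None


if __name__ == "__main__":
    manifest = read_manifest(MANIFEST)
    print(upgrade_url(manifest, (1, 1, 0, 0)))  # older install: URL is returned
    print(upgrade_url(manifest, (1, 2, 0, 0)))  # up to date: None
```

A real peer would fetch the manifest from a known location, repeat the check for each dependency, and verify the downloaded component before installing it. Those steps are omitted here.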


It seems likely that P2P technology will have an influence on many aspects of the IT industry. Whether or not it can live up to the aggressive hype with which companies are promoting it remains to be seen. Whatever the outcome, XML will continue to play an instrumental role in its future.