XML.com: XML From the Inside Out


XP and XML

February 19, 2003

I discovered Extreme Programming (XP) in April 2001, at the SD West conference, through a workshop by James Grenning and Robert Martin and a presentation by Chet Hendrickson and Ron Jeffries. Since then I have been vexed to see that XP is out of my reach for two reasons. First, I am mostly interested in XML applications, and XML isn't that great when you are using XP practices. Second, and more irritatingly, I work remotely from home and this is absolutely not XP kosher.

However, the more I think about it, the more I am convinced that both XP and XML could benefit from working more closely together. And there may even be some hope for remote pair programming. I can't pretend to have real experience with XP but only with some of its practices, which I have been able to follow despite my remoteness. Therefore, most of this article is theoretical, but I hope that these ideas will still be useful.

The links between the XP and XML communities were covered last year in an XML.com article by Leigh Dodds, "XP Meets XML". I will cover the technical relationships between XP and XML in the remainder of this article.

I am not proposing to compare apples and oranges: I know that XML and XP don't share anything except the fact that they are young concepts defended by quasi-religious communities. Even the superficial name similarity is misleading: XP stands for eXtreme Programming, while XML stands for eXtensible Markup Language. XP and XML don't even operate at the same level: XP is a methodology and XML is a technology. That said, I like both and they don't seem to be incompatible. Leigh Dodds has shown that while XP zealots may find XML too complex to be "extreme", XML users are rather inclined to be "extreme" themselves. My goal is to explore in more detail how these acronyms play together: how XML as a technology might help XP programmers, and how well XP as a methodology is adapted to developing XML applications.

20 Second Guide to Extreme Programming (Part 2)

I won't introduce Extreme Programming again, so please consider this to be the continuation of "10 Second Guide to Extreme Programming" in Dodds' article. In addition to Dodds' guide, we need to say a word about the "twelve practices" that are the foundations of XP:

  1. Planning Process: the planning is based on the description of elementary features (the "user stories"), whose cost is evaluated separately (and re-evaluated as many times as needed over the course of a project).
  2. Small Releases: a simple version of the system is released at an early stage and frequently updated.
  3. System Metaphor: the naming conventions are standardized.
  4. Simple Design: the design should be the simplest possible to meet the current set of requirements.
  5. Testing: tests are essential; unit and acceptance tests are written before the code and new tests are added for each bug found.
  6. Refactoring: the system is continuously revised and duplications of code are removed.
  7. Pair Programming: the code is written by pairs of programmers working on the same machine.
  8. Collective Code Ownership: any code segment belongs to the whole team.
  9. Continuous Integration: the system is built and integrated several times a day.
  10. 40-hour Week: developers are kept "fresh, healthy, and effective".
  11. On-site Customer: the customer is an integral part of the project.
  12. Coding Standards: everyone must write the code in the same way.

What XML could do for XP

The XP practices rely heavily on communication between team members and on the fluidity of code that is continuously refactored. Communication and fluidity are where XML excels. One might think that implementers of XP tools and even XP users would have been eager to use XML pretty much everywhere. This is not really the case: even though a few XML applications have been developed for some practices, I haven't found any cross-practice effort to define an XML framework that could facilitate XP as a whole.

Being feebly tooled appears to be a more general characteristic of XP. During my research I was surprised to see that there doesn't seem to be an "XP IDE" taking care of the twelve practices; on the contrary, XP developers rely on conventional tools: text editors, browsers, CVS repositories, Wiki-wiki webs, instant messengers or IRC. The two practices which are the most advanced in terms of tools are probably testing, with the development of the "xUnit" test frameworks, and refactoring, traditionally strong in the Smalltalk world, which has greatly influenced XP.

Is it the influence of the "Simple Design" practice, which recommends never writing any piece of code "for the future" if it is not immediately needed? This principle is pretty much a killer for the development of tools, which by definition are written to be used in the future. Or is it because XP defines itself less as a methodology than as a mindset that pushes best practices to their extreme, and a mindset isn't a matter of tools?

Whatever the reason, the fact is that very little effort seems to have been put into even integrating these traditional tools, not to mention developing a complete IDE for XP developers. If there are few tools and little integration, there are few opportunities for using XML.

As a remote worker, I can't pretend to have ever practiced XP. Even though I am not an XP specialist (or maybe because I am not one), I would expect benefits to come from better tools and integration, especially in the "Planning Process", "System Metaphor", "Testing", "Refactoring", "Pair Programming", "On-site Customer" and "Coding Standards" practices.

Planning Process

The basic atom used by XP's Planning Process is the user story. The definition of the user story is pretty simple. Ron Jeffries has said that "a User Story is a story, told by the user, specifying how the system is supposed to work, written on a card, and of a complexity permitting estimation of how long it will take to implement." User stories have also been defined as being "in the format of about three sentences of text written by the customer in the customers terminology without techno-syntax."

This seems so simple that it would hardly deserve an XML "techno-syntax", but it's only one piece of the problem. Other information needs to be added, such as scheduling information and acceptance tests. The fact that we need to glue all this together, eventually associate the engineering time consumed to implement the story, and later keep track of the origin of each acceptance test, is a good argument for something more formal which could be implemented in XML.

An XML-based proposal (PDF) has been made by Karin K. Breitman and Julio Cesar Sampaio do Prado Leite which covers not only the user story (called a "scenario") but also the description of the actual task of implementation and a glossary which could be used for establishing the naming conventions needed for "System Metaphor". Unfortunately, this proposal doesn't seem to have gained much traction. Among the benefits of this approach, the authors note that the consistency checks which can be done on user stories expressed in XML should improve their quality.
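
As a rough illustration of the kind of gluing and checking this would allow, the sketch below uses Python's standard xml.etree.ElementTree module to total the estimates of a few user stories and to flag the stories which have no acceptance test yet. The vocabulary (stories, story, estimate-days, acceptance-test) is invented for this example and is not taken from Breitman and Leite's proposal.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary, invented for this sketch.
STORIES_XML = """
<stories>
  <story id="S1" estimate-days="2">
    <text>The clerk looks up a customer by name.</text>
    <acceptance-test ref="AT1"/>
  </story>
  <story id="S2" estimate-days="3.5">
    <text>The clerk prints an invoice for an order.</text>
  </story>
</stories>
"""


def summarize(xml_text):
    """Return (total estimate in days, ids of stories lacking an acceptance test)."""
    root = ET.fromstring(xml_text)
    total = 0.0
    untested = []
    for story in root.findall("story"):
        total += float(story.get("estimate-days", "0"))
        if story.find("acceptance-test") is None:
            untested.append(story.get("id"))
    return total, untested


total_days, untested_ids = summarize(STORIES_XML)
print(total_days, untested_ids)
```

The same document could just as well feed a planning report or a list of cards to print, which is where a formal syntax starts to pay off.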

System Metaphor

Already mentioned in Breitman and Leite's proposal, the system metaphor is also something which can be formalized. Any format could be used and the glossary might be stored in an RDBMS, a spreadsheet, or a CSV document. However, formalizing it in XML would give the flexibility to publish it on various media and enable consistency checks with different documents, including user stories.
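
A consistency check between a glossary and the user stories could be as small as the following sketch. It assumes, purely for illustration, that the glossary is a list of entry elements and that stories mark up the vocabulary they use with inline term elements; neither convention comes from an existing format.

```python
import xml.etree.ElementTree as ET

# Both documents use a hypothetical vocabulary invented for this sketch.
GLOSSARY_XML = """
<glossary>
  <entry name="clerk"/>
  <entry name="invoice"/>
</glossary>
"""

STORIES_XML = """
<stories>
  <story id="S1">The <term>clerk</term> prints an <term>invoice</term>.</story>
  <story id="S2">The <term>cashier</term> refunds an <term>invoice</term>.</story>
</stories>
"""


def undefined_terms(glossary_text, stories_text):
    """Map each story id to the terms it uses that the glossary does not define."""
    known = {e.get("name") for e in ET.fromstring(glossary_text).iter("entry")}
    problems = {}
    for story in ET.fromstring(stories_text).iter("story"):
        missing = sorted({t.text for t in story.iter("term")} - known)
        if missing:
            problems[story.get("id")] = missing
    return problems


report = undefined_terms(GLOSSARY_XML, STORIES_XML)
print(report)
```

Here the second story talks about a "cashier" where the glossary only knows a "clerk", which is exactly the kind of drift in naming that the system metaphor is meant to prevent.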


Testing

XP defines two different types of tests: acceptance (or functional) tests, which test a user feature "end to end" over the complete system, and belong to the users or customers; and unit tests, which test the behavior of a single class or function and belong to the programmers.

For the acceptance tests, an XML format would seem interesting, especially if they are attached to user stories which are themselves expressed in XML. The case for using XML is probably stronger for acceptance tests than for unit tests, because acceptance tests are owned by the customer. It seems like a good idea to decouple the way these tests are expressed from the technologies and languages used by the implementation.

Outside the scope of XP, many test suites are written in XML, but they appear to be pretty specific to each application. For Relax NG, for instance, James Clark published a test suite as an XML document including a set of schemas, either valid or invalid, and for each valid schema a set of valid and invalid instance documents. If Clark had defined the test suite using jUnit, the unit test package for Java, the language he used for his implementation, it would have been tough for me to use it to test my Python implementation. Since it is expressed as XML, this test suite is easy to transform into documentation and into individual documents which can be processed using generic unit test libraries (in my Relax NG implementation, XVIF, I use Python's unittest package to perform the actual testing).

Even though such test suites will likely remain specific (you don't define the tests for a schema language in the same way as you define the tests for a user interface), at a high level there are many invariants in a test definition: one or more inputs are processed to generate one or more outputs which are compared against expected results. It should thus be interesting to define an extensible vocabulary to express these invariants.
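
To make these invariants concrete, here is a minimal sketch of test cases stored as XML and executed through Python's unittest package. The suite, case, input and expected elements are a made-up vocabulary, and the process function is only a stand-in for whatever system is under test; this is neither the Relax NG test suite format nor JXUnit's.

```python
import unittest
import xml.etree.ElementTree as ET

# Hypothetical test vocabulary: each case pairs an input with an expected output.
SUITE_XML = """
<suite>
  <case name="lowercase"><input>abc</input><expected>ABC</expected></case>
  <case name="mixed"><input>Xp and Xml</input><expected>XP AND XML</expected></case>
</suite>
"""


def process(text):
    """Stand-in for the system under test."""
    return text.upper()


def load_cases(xml_text):
    """Return (name, input, expected) tuples read from the suite document."""
    return [(c.get("name"), c.findtext("input"), c.findtext("expected"))
            for c in ET.fromstring(xml_text).findall("case")]


class XmlDrivenTest(unittest.TestCase):
    def test_cases_from_xml(self):
        # One sub-test per XML case, so a failure is reported under its name.
        for name, given, expected in load_cases(SUITE_XML):
            with self.subTest(case=name):
                self.assertEqual(process(given), expected)


def run_suite():
    """Run the XML-driven tests and return the unittest result object."""
    tests = unittest.defaultTestLoader.loadTestsFromTestCase(XmlDrivenTest)
    return unittest.TextTestRunner(verbosity=0).run(tests)


if __name__ == "__main__":
    run_suite()
```

Adding a test then means adding a case element, which a customer can do without touching the Python code, and the same document could drive an implementation written in another language.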

This is more or less what has been proposed by Bill la Forge in JXUnit, with a focus which is more on unit tests than on acceptance tests. In a nutshell, JXUnit is the integration of an XML binding framework with jUnit. Test cases are expressed as XML, the Java objects are created by the binding framework, and jUnit is called to perform the tests. Bill la Forge sees the following benefits to separating the tests from the code, and these benefits apply equally well to acceptance tests:

  • test data can be edited, making it easy to add additional test cases;
  • test data can be validated, as a means of reducing "false failures"; and
  • test data can be externally generated or captured from a production process.

The case for separating test data from the code is weaker for unit tests, unless one tries to combine the paradigm of XP with Literate Programming, another methodology which I find very promising. As proposed by Norm Walsh, "Literate Programming in XML" stores both the documentation and the code of a program unit in XML fragments. In such a context, this fragment could be extended to include the definition of the unit tests for the same program unit. Grouping each unit of code with its documentation, tests, and source in a format which is easy to process with XML tools and can be stored in a repository seems really powerful.


Refactoring

Along with unit testing, refactoring is the second practice where a number of tools are available. Support for refactoring is integrated into many IDEs and they seem to do pretty well without XML. That said, if we had implemented the "eXtreme Literate Programming" framework mentioned above, evaluating the impact of a refactoring would be much easier since the relations between code units would be formalized as XML. Code units would contain the source, tests, and documentation grouped together, and the relations between code units and conformance tests could also be formalized to evaluate the possible impact of a modification of a unit of code on the system.
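
As an illustration only, once the relations between code units are available as XML, estimating the impact of a change becomes a simple graph traversal. The units, unit and uses elements below are a hypothetical format invented for this sketch, not part of any existing Literate Programming vocabulary.

```python
import xml.etree.ElementTree as ET
from collections import deque

# Hypothetical description of code units and the units each one relies on.
UNITS_XML = """
<units>
  <unit name="parser"/>
  <unit name="validator"><uses unit="parser"/></unit>
  <unit name="report"><uses unit="validator"/></unit>
  <unit name="mailer"/>
</units>
"""


def impacted_by(xml_text, changed):
    """Return the units that depend, directly or indirectly, on `changed`."""
    dependents = {}  # unit name -> names of the units that use it
    for unit in ET.fromstring(xml_text).findall("unit"):
        for use in unit.findall("uses"):
            dependents.setdefault(use.get("unit"), set()).add(unit.get("name"))
    seen = set()
    queue = deque([changed])
    while queue:  # breadth-first walk over the reversed "uses" relation
        for dep in dependents.get(queue.popleft(), ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return sorted(seen)


print(impacted_by(UNITS_XML, "parser"))
```

The units returned are those whose tests would be worth rerunning first after refactoring the changed unit.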

Pair Programming

Pair Programming, or at least face to face pair programming, is the most restrictive XP practice since it goes against three major trends in Western society:

  1. An increasing number of employees now work from home. This trend is dramatically reducing the time lost in physical commuting and the office square meters needed by the companies to keep all their employees in a single location. It is usually seen as an improvement of the working conditions.
  2. Companies are more and more interdependent either because some tasks are externalized and subcontracted to other companies, which may be located on the other side of the planet, or because they build partnerships to cooperatively develop projects which are too big for one of them.
  3. Open source projects are becoming part of the strategy of major players in the IT business, and they group employees from different companies at different locations together with independent consultants, students, and other volunteers often working on their own.

These three trends mean that it will be increasingly restrictive to require that all the members of a team work in a single location, the prerequisite for face to face pair programming, and that virtual pair programming will become a necessity.

The concept of virtual pair programming isn't new and many papers have been written on the subject. Pair programming can be implemented without XML using standard conferencing tools. However, I think that using an XML infrastructure to build the support of pair programming into the IDE itself could make the concept of virtual pair programming fly.

Some will argue that text based conferences are not sufficient for effective virtual pair programming and that voice and even video are needed. I am not convinced: daily use of IRC makes me think that simple text based conference systems are enough to let you "see" what your partners actually "think". Even if I am wrong, one can always complement the built-in collaborative features of an IDE with traditional audio or video conferences. The reverse seems to be less obvious, and I am not that sure that the generic file or desktop sharing features of audio or video conference systems can be as well adapted to virtual pair programming as the collaborative features of an IDE.
