Checkmate XML

August 25, 2004

John E. Simpson

As I explained last month, in my final XML Q&A column, my new monthly column will focus on the ways people use XML. The XML ocean is a big one, populated with a good number of whales (EDI, web services, RDF, and so on). But there are also plenty of pilot fish and guppies swimming around -- on up to lesser whales that you may simply have missed. It's these unsung applications I'm interested in.

I also mentioned last month that I'll be covering XML "applications" in both senses of that word:

  • Formally, a conceptual model -- a vocabulary, a schema -- for some knowledge domain represented in XML.
  • Less formally, software for processing XML documents.

This inaugural XML Tourist addresses a subject that lends itself to both meanings of "application" -- the game of chess.

Why XMLize Chess?

When most outsiders think of chess, they no doubt think of playing the game. They may know the names of the various pieces -- pawns, rooks, knights, bishops, king, and queen. They may know, too, that each type of piece has its own unique type of movement -- rooks, for example, can move forward and back and side to side, but not diagonally, while bishops move diagonally only. They may have actually played chess and may have become quite proficient at it. Even if they haven't played it, they may still be aware of the game when it makes the news -- as, for example, when some Grandmaster plays against a computer.

What outsiders may not know about is the devotion of chess insiders to studying games for which the moves have been recorded for posterity.

For at least 10 years, the prevailing method for recording the context and actual play of chess games has been in the form of something called Portable Game Notation (PGN), developed by the newsgroup. The authoritative source of information on PGN is the PGN Specification and Implementation Guide (last revised, apparently, in 1994). You might also want to take a look at a simple overview of the language. Here's a portion of a sample PGN document (extra line breaks added for legibility):

[Event "F/S Return Match"]
[Site "Belgrade, Serbia JUG"]
[Date "1992.11.04"]
[Round "29"]
[White "Fischer, Robert J."]
[Black "Spassky, Boris V."]
[Result "1/2-1/2"]

1. e4 e5
2. Nf3 Nc6
21. Nc4 Nxc4
22. Bxc4 Nb6
23. Ne5 Rae8
24. Bxf7+ Rxf7
25. Nxf7 Rxe1+
43. Re6 1/2-1/2

The square-bracketed information at the top is understandable enough, describing the circumstances under which the game was played. The numbered items, on the other hand, verge on the inscrutable.

Each numbered item represents one turn, from 1 on up to the number of turns it took to complete the game. The pair of codes to the right of each number -- the "movetext" -- depicts each player's move on that turn, white's first. The capital letters refer to pieces (R for rook, K for king, N for knight, and so on; a move with no capital letter, such as e4, is a pawn move); the characters following the piece letter designate a square on the board; and various special characters (like x and + here) denote special circumstances occurring during gameplay (a capture and a check, respectively). The syntax for locations on the chessboard is plain old Standard Algebraic Notation (SAN); from the perspective of the player of the white pieces, the leftmost square in the nearest row is a1, and the rightmost square in the farthest row is h8. This particular game ended with a draw -- that's the "1/2-1/2" -- in turn 43.
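To make that description concrete, here is a minimal Python sketch that pulls the sample apart the way the paragraph above reads it. It is not a full PGN parser: the names parse_pgn, describe_move, and square_to_coords are made up for illustration, and castling, promotion, comments, and annotations are deliberately left out.

```python
import re

PGN_SAMPLE = """[Event "F/S Return Match"]
[Site "Belgrade, Serbia JUG"]
[Date "1992.11.04"]
[Round "29"]
[White "Fischer, Robert J."]
[Black "Spassky, Boris V."]
[Result "1/2-1/2"]

1. e4 e5
2. Nf3 Nc6
24. Bxf7+ Rxf7
43. Re6 1/2-1/2
"""

TAG_PAIR = re.compile(r'^\[(\w+)\s+"(.*)"\]$')
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}
PIECES = {"K": "king", "Q": "queen", "R": "rook", "B": "bishop", "N": "knight"}
# piece letter, optional origin hint, optional capture, target square, check/mate
SAN_MOVE = re.compile(r"^([KQRBN]?)([a-h]?[1-8]?)(x?)([a-h][1-8])([+#]?)$")


def parse_pgn(text):
    """Return (header tags, list of (turn number, moves), result)."""
    tags, turns, result = {}, [], None
    for line in text.splitlines():
        line = line.strip()
        header = TAG_PAIR.match(line)
        if header:
            tags[header.group(1)] = header.group(2)
            continue
        for token in line.split():
            if token in RESULTS:
                result = token                      # game-termination marker
            elif token.rstrip(".").isdigit():
                turns.append((int(token.rstrip(".")), []))  # "24." opens turn 24
            elif turns:
                turns[-1][1].append(token)          # a move within the current turn
    return tags, turns, result


def describe_move(san):
    """Decode one simple SAN move (no castling or promotion)."""
    match = SAN_MOVE.match(san)
    if match is None:
        raise ValueError("unsupported move: " + san)
    piece, hint, capture, target, suffix = match.groups()
    return {
        "piece": PIECES.get(piece, "pawn"),   # no capital letter means a pawn
        "from_hint": hint or None,            # e.g. the "a" in Rae8
        "capture": capture == "x",
        "to": target,
        "check": suffix == "+",
    }


def square_to_coords(square):
    """Map a1 to (0, 0) and h8 to (7, 7), seen from white's side."""
    return ord(square[0]) - ord("a"), int(square[1]) - 1


tags, turns, result = parse_pgn(PGN_SAMPLE)
print(tags["White"], "vs.", tags["Black"])   # Fischer, Robert J. vs. Spassky, Boris V.
print(turns[0])                              # (1, ['e4', 'e5'])
print(describe_move("Bxf7+")["piece"])       # bishop
print(square_to_coords("h8"))                # (7, 7)
```

Note what the sketch does not do: it never checks whether a move is legal, or which bishop actually travelled to f7. Answering those questions takes knowledge of the rules of chess and of the position on the board.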

Once you get the hang of it, PGN is pretty easy to read. But you're not a computer. For satisfying both human- and machine-readable requirements, naturally some tech-savvy chess aficionados have turned to the charms of an XML-based game notation.

Note: The operative word in that last clause is "some." Skepticism lives on, both inside and outside the corridors of chessdom. Even if you're not a fan, for instance, you may observe that the structure of information about a chess game (as PGN shows well) is serial rather than hierarchical: not your typical XML-friendly application.

A Familiar Scenario: Competing "Standards"

One lesson which chess-in-XML developers have clearly learned from the rest of the XML community is that there's not just one way to express something via markup. Here, for starters, are links to five variations of depicting a chess game using XML (numbers in parentheses are the years of each language's most recent version, as far as I've been able to determine):

I'll cover one of these applications (formal sense) in this column. I'm not covering it because it's necessarily the "best" or most complete, but because it's referenced in other standards and because applications (informal sense) have been built around it: Andreas Saremba's ChessGML (for "Chess Game Markup Language").

Here's a portion of a ChessGML document, taken from Saremba's ChessGML: The Why and Wherefore:

<event>Großes Internationales Schachmeister-Turnier</event>
<player id="Maroczy.Geza" table-ref="sw-2">
<person cbuffId="Maroczy,G">

There's nothing remarkable about this; it captures much of the same information found in the bracketed headers in a PGN document.

A bit more unsettling is the portion of this ChessGML document that records the actual moves of a game:

1. d4 d5 2. e3 Nf6 3. c4 e6 4. Nc3 c5 5. Nf3 Nc6 6. Bd3 dxc4 7. Bxc4 a6
8. a3 b5 9. Bd3 Bb7 10. O-O Qc7 11. Qe2 Bd6 12. dxc5 Bxc5 13. e4 Nd4
14. Nxd4 Bxd4 15. Bd2 Rd8 16. Rac1 Qb8 17. Nd1 O-O 18. Bc3 Qf4 19. Bd2
Qh4 20. Re1 Ng4 21. h3 Ne5 22. Bb1 f5 23. Kh1 f4 24. f3 g5 25. Be3 g4
26. Bxd4 Rxd4 27. Qf2 Qxf2 28. Nxf2 gxf3 29. Ba2 Rd2 30. Bxe6+ Kh8 31.
Kg1 fxg2 32. Red1 Rxb2 33. Bd5 Bxd5 34. Rxd5 Re8 35. Rxe5 Rxe5 36. Nd3
Rxe4 37. Nxb2 Re3 38. a4 f3 39. axb5 axb5 40. Rd1 Kg7 41. Nd3 Re2 42.
Nf4 Re4 43. Nd5 Re2 44. Nf4 Re4 45. Nd5 Re2 1/2-1/2

Yes, that's good old PGN movetext there (hence the sanMoves element name). I found this unsettling because Saremba devotes much of his "Why and Wherefore" paper to arguing the virtues of XML-type notation over PGN. ("[S]oftware has to understand the rules of chess in order to be able to process SAN, which is certainly not the case for standard XML software.") Ultimately, his point is that PGN movetext, while "dumb," is nonetheless understood by existing chess-smart software. But he's also aware of the contradiction; as he points out:

If you intend to process your file with standard XML tools, however, the content of the sanMoves element will be just a sequence of characters without any semantics for the tool, so it's necessary to transform it into something meaningful.
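A quick way to see the point of that passage is to hand such an element to a generic parser. The sketch below uses Python's standard xml.etree.ElementTree module. The tiny document is a made-up stand-in: only the sanMoves element name comes from ChessGML, and the enclosing game element is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper document; only the sanMoves name is taken from ChessGML.
DOC = """<game>
  <sanMoves>1. d4 d5 2. e3 Nf6 3. c4 e6</sanMoves>
</game>"""

root = ET.fromstring(DOC)
san = root.find("sanMoves")

child_count = len(list(san))   # the parser finds no structure inside the element
movetext = san.text            # just one undifferentiated string

print(child_count)             # 0
print(repr(movetext))          # '1. d4 d5 2. e3 Nf6 3. c4 e6'

# Any further meaning has to come from code outside the XML toolchain,
# for example by splitting the string on whitespace.
tokens = movetext.split()
print(tokens[:3])              # ['1.', 'd4', 'd5']
```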

Consequently, he also offers up an alternative pure-XML (and attribute-heavy) notation, such as this:

<moves ply-count="90">
<m c="w">
<p c="w" n="p"/>
<sq n="d2"/>
<sq n="d4"/>

This is output from a Java program, supplied with the ChessGML distribution, which transforms SAN-based movetext into XML.

This has one undeniable advantage over its PGN ancestor. Like the rest of the ChessGML document, this one can now be processed by "standard XML tools" -- say, an XSLT transformation.
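As a rough illustration of what "standard XML tools" can do with this notation, here is a short Python sketch using the standard xml.etree.ElementTree module. Several things in it are assumptions rather than facts from the ChessGML documentation: the closing tags (the excerpt above is cut off before them), the reading of c as color and n as name, and the reading of the two sq elements as origin and destination. That last reading is at least consistent with white's opening move, 1. d4, in the movetext shown earlier.

```python
import xml.etree.ElementTree as ET

# The excerpt from above, closed off by hand so that it is well-formed.
FRAGMENT = """<moves ply-count="90">
  <m c="w">
    <p c="w" n="p"/>
    <sq n="d2"/>
    <sq n="d4"/>
  </m>
</moves>"""

moves = ET.fromstring(FRAGMENT)
ply_count = int(moves.get("ply-count"))

summary = []
for move in moves.findall("m"):
    piece = move.find("p")
    squares = [sq.get("n") for sq in move.findall("sq")]
    summary.append({
        "color": move.get("c"),    # assumed: "w" for white
        "piece": piece.get("n"),   # assumed: "p" for pawn
        "from": squares[0],        # assumed: first sq is the origin
        "to": squares[-1],         # assumed: last sq is the destination
    })

print(ply_count)   # 90
print(summary)     # [{'color': 'w', 'piece': 'p', 'from': 'd2', 'to': 'd4'}]
```

No chess knowledge is needed here: the origin square, which SAN leaves the reader to infer, is spelled out in the markup. The ply-count of 90 also squares with the 45 turns of the game shown earlier, at two plies per turn.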

Which brings us to the informal sort of application.

"Doing Chess" with ChessGML

The first and most obvious thing to do with raw ChessGML is to transform it to (X)HTML, to depict a simple game or an entire tournament. Saremba himself offers up a good example of this application. In his example each numbered cell is a hyperlink to the move-by-move play of a single game -- expressed as PGN movetext. As you can see from any game's page, the movetext (taken from the non-XMLized sanMoves element) is simply dumped to an area at the left of the page, without modification.

What if you don't know PGN?

One (very clever) next stage is Greg Griffiths' Javascript Chess Simulator. The full demo allows you to see a game in progress, move by move, by clicking on the corresponding hyperlink. As you'll see in Part 3 of Griffiths' article, the XML behind this example is not actually ChessGML (notably, the moves themselves are expressed as XML). But it definitely suggests the possibility of a pure XML-style ChessGML solution.

A little further along the road of complexity, check out RenderX's Chess Viewer demonstrator application. Here, again, the source document is a highly simplified version of something like ChessGML. In this case, though, the output is a static but high-quality print document, created via XSLT transformation from source document, to XSL-FO, and finally (through XEP, RenderX's own XSL-FO rendering engine) to a PDF document.

But for the best demonstration to date of what a pure ChessGML application might be, see Max Froumentin's ChessGML to SVG demo.

You'll need to have an SVG viewing engine installed, such as Adobe's SVG Viewer, in order to visit that page. This is an animated demonstration. Icons representing the pieces move across the board, and with each move-and-response pair a scrolling window shows you the corresponding turn in PGN -- even though the underlying source describing the moves is in Saremba's alternative XML-only format. (The demo alone, without Froumentin's notes, is also available.) Be sure to watch the game play through to the end, to see all the pieces go floating back to their starting points. Very cool.

Figure 1: Playing chess with XML

Froumentin includes a pointer to Peter Watkins' interactive chessboard; this enables you to move pieces on the board yourself, outputting -- in a small area at the bottom right -- the PGN notation for the entire game you're "playing": another impressive piece of coding. In Figure 1, you can see what this looks like in practice. The piece I've clicked on in order to move is the white queen (row/column 5B). When I click on that piece, squares around it light up in color: in this example, blue for available moves without captures and green for available moves with captures.

Some six years ago, the estimable Peter Murray-Rust began an entertaining thread on the XML-DEV mailing list, the subject of which was "XML Is Boring." One of his correspondents was Simon North, who agreed -- and urged his readers (almost exclusively XML developers and other early adopters) to "have a look at XML chess or some other 'sexy' application."

"Sexy" isn't a word I'd naturally apply to chess, so I appreciate North's scare quotes. Still, showing someone chess pieces drifting back and forth across a board is leagues ahead of a demonstration with angle brackets and ampersands. I can't promise all the applications I'll visit in the XML Tourist column will be as "sexy," but I promise to keep looking.

If you're the proprietor of an XML application (of either kind), guppy- or narwhal-sized, feel free to let me know about it. In any case, thanks for reading.