To Tag or Not to Tag

May 26, 2004

Patrick O'Kelley

The New Variorum Shakespeare and XML

Every generation remakes Shakespeare for itself with new costumes, new set designs, and new interpretations. But, despite numerous advances in humanities computing, variorum editions of the works of Shakespeare have relied on models established well before the digital age. Since the nineteenth century, scholars have slowly worked through the plays and sonnets to create definitive variorum editions that include the variations of each line of each play as well as all of the major critical commentary. The undertaking, which was assumed by the Modern Language Association (MLA) in 1936, is obviously mammoth. But until recently, the variorum editions -- thick with special typographical marks and a complex web of cross-references -- were prepared solely as print texts running to hundreds of pages. Later this year, however, the MLA will bring the New Variorum Shakespeare (NVS) project into the world of XML for the first time.

What is a Shakespeare Variorum Edition?

Paul Werstine, General Editor of the New Variorum Shakespeare editions, says, "the essential purpose of the New Variorum Shakespeare has always been to provide a detailed history of critical commentary together with an exhaustive study of the text.... In its inclusiveness and its historical orientation the NVS differs from regular scholarly editions."

The idea of creating Shakespeare variorum editions on this model began with Horace Howard Furness's (1833-1912) publication of his first variorum Shakespeare in 1871. Furness, a member of the Shakespeare Society of Philadelphia, was troubled by the lack of historical context for variant readings of the plays, and he set out to provide the bibliographic apparatus to fill this gap. Furness's son, Horace Howard Furness, Jr. (1865-1930), continued in his father's footsteps, completing his last Shakespeare editorial work in 1928.

Since 1936 the MLA has shepherded nine NVS plays into print, with two more nearing the final stages and others in various stages of completion. Each is a massive undertaking for an individual scholar, as the weight of new literary criticism continues to mount each year. "It is really time consuming. We're talking decades," says Judith Altreuter, Print and Electronic Production Director at the MLA.

XML for Shakespeare Variorum Editions?

In the 1990s, the MLA turned to the Center for Electronic Texts in the Humanities (CETH), then a Princeton/Rutgers organization, to produce a feasibility study regarding the possible production of electronic editions to accompany the print versions of the NVS.

The MLA had several reasons to seek advice toward a digital strategy. Given the physical size of the print format, NVS editions, like dictionaries, were an obvious candidate for digitization simply to make them more compact.

At the same time, the electronic form needed to be malleable enough to suit a variety of research uses. One could imagine multiple front-ends, each optimized for a particular purpose. While the MLA did not have specific, immediate clients to serve, they needed to produce a text that could be adapted quickly to future electronic research needs.

Finally, some early e-text projects had met with failure as the proprietary software used to create and read the texts became outdated. Given the life-cycle of a typical variorum edition (10-30 years), the editors needed to know that the software or encoding they used at the beginning of the project would still be viable when the edition was completed.

The CETH researchers (including the author of this article) proposed Text Encoding Initiative (TEI) P3 SGML as the single best answer to the MLA committee's question of which electronic format to use for the NVS. (The most recent update, the 2002 P4 version of the standard, moved to XML while remaining backward compatible.) By the mid-1990s, the TEI standard already included detailed guidelines for encoding a rich and complicated play with extensive commentary, like the NVS editions. The CETH group also prepared a full demo for the MLA: an SGML marked-up version of the first few pages of the New Variorum Antony and Cleopatra, which had been published in 1990.

The TEI, "an international and interdisciplinary standard" under the auspices of the Association for Computers and the Humanities, provided the academic clout and computing experience necessary to promote the comprehensive, long-term, stable solution that the MLA was looking for. And, as the CETH group demonstrated, TEI encoding could capture the sophisticated apparatus that accompanied the primary text.

Still, simply deciding to use the TEI was not enough. The MLA needed to commit resources to determine precisely how the flexible tagging system would be implemented. And, of course, they needed a play to work on.

Winter's Tale in XML

After several years of discussion and debate, the MLA is now developing an official encoding plan. Winter's Tale, which is expected to be completed in late winter, will be the first NVS edition to appear since the CETH recommendations were made, and the MLA confirms that they will, indeed, be providing a CD-ROM in the back of the print edition that includes a full TEI XML version of the play. "[TEI XML] seemed like a clear choice for a project of this scope and importance, because it offers a well-tested basis for high-quality XML encoding and because it has a strong institutional and organizational basis (hence likely to exist and be supported well into the future). There isn't really any other encoding system that would be adequate," says Julia Flanders.

Flanders, who met MLA's Judith Altreuter at a TEI training seminar, was taken on as a consultant by the MLA. Flanders has 12 years of humanities computing experience on the Women Writers Project at Brown University, and the consulting group she works with, Ridgeback, is creating a specification and detailed documentation for NVS encoding. It is also doing the actual tagging of the Winter's Tale in association with Altreuter and the NVS editors.

While the encoding plan is still in prototype mode, Flanders provided some samples that suggest the direction the Ridgeback group is going. The tagging will capture several different kinds of information, she notes:

  • "Structural information about the text as a whole"
  • "Details of bibliographic references"
  • "Cross-references and other linking information"
  • "Editorial apparatus"
  • "Some renditional information (or rather, encoding that can be used to motivate formatting: e.g. names that will be highlighted in the print version, foreign-language words, etc.)"

This excerpt of a few lines from Winter's Tale (which Flanders stresses is in a "preliminary state") includes the beginning of a new act (2) and scene (1), a stage direction, and dialogue. The prefix "tln" is an acronym for "'through line numbering' which refers to the First Folio lineation and provides the overall internal reference system for the play text and notes," Flanders says.

<div1 type="act" n="2">
<div2 type="scene" n="1">
<lb id="tln.583"/><head type="scene">Actus Secundus. Scena Prima.</head><note type="asn">2.1</note>
<lb id="tln.584"/>Enter Hermione, Mamillius, Ladies: Leontes,
<lb id="tln.585" n="585"/><stage type="enter">Antigonus, Lords.</stage><?sgmlp pgbrk pg="140"?>
<lb id="tln.586"/><sp who="Hermione"><speaker>Her.</speaker>
<p>Take the Boy to you: he so troubles me,
<lb id="tln.587"/>'Tis past enduring.</p></sp>
<lb id="tln.588"/><sp><speaker>Lady.</speaker>
<p>Come (my gracious Lord)
<lb id="tln.589"/>Shall I be your play-fellow?</p></sp>
<lb id="tln.590" n="590"/><sp who="Mamillus"><speaker>Mam.</speaker>
<p>No, Ile none of you. </p></sp>




The <?sgmlp pgbrk pg="140"?> processing instruction records the actual page break in the print edition, so that the XML can be mapped to PDFs of the print pages if needed at some time. The note type "asn" stands for "act/scene number" and provides an alternate notation. Flanders writes that "the 'n=' attribute seems in this example to duplicate the tln number, but there are cases where the two get out of whack, so this isn't as redundant as it looks here."
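
To make the relationship between the "tln" ids, the "n" attribute, and the page-break marker concrete, here is a minimal Python sketch that reads the excerpt with the standard library. It is an illustration only, not part of the MLA or Ridgeback tooling: the closing </div2> and </div1> tags are appended so the fragment parses, and the function names tln_lines and page_breaks are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# The article's excerpt; the final </div2></div1> are our addition so it parses.
SAMPLE = """<div1 type="act" n="2">
<div2 type="scene" n="1">
<lb id="tln.583"/><head type="scene">Actus Secundus. Scena Prima.</head><note type="asn">2.1</note>
<lb id="tln.584"/>Enter Hermione, Mamillius, Ladies: Leontes,
<lb id="tln.585" n="585"/><stage type="enter">Antigonus, Lords.</stage><?sgmlp pgbrk pg="140"?>
<lb id="tln.586"/><sp who="Hermione"><speaker>Her.</speaker>
<p>Take the Boy to you: he so troubles me,
<lb id="tln.587"/>'Tis past enduring.</p></sp>
<lb id="tln.588"/><sp><speaker>Lady.</speaker>
<p>Come (my gracious Lord)
<lb id="tln.589"/>Shall I be your play-fellow?</p></sp>
<lb id="tln.590" n="590"/><sp who="Mamillus"><speaker>Mam.</speaker>
<p>No, Ile none of you. </p></sp>
</div2>
</div1>"""


def tln_lines(root):
    """Return (tln, n) pairs for every <lb>; n is None when the attribute is absent."""
    pairs = []
    for lb in root.iter("lb"):
        tln = int(lb.get("id").split(".", 1)[1])  # "tln.585" -> 585
        n = lb.get("n")
        pairs.append((tln, int(n) if n is not None else None))
    return pairs


def page_breaks(source):
    """Find print page numbers recorded in <?sgmlp pgbrk pg="..."?> instructions."""
    return [int(pg) for pg in re.findall(r'<\?sgmlp\s+pgbrk\s+pg="(\d+)"\?>', source)]


root = ET.fromstring(SAMPLE)
lines = tln_lines(root)

# Lines where an explicit n disagrees with the tln id (none in this excerpt).
mismatched = [(tln, n) for tln, n in lines if n is not None and n != tln]

print(lines)
print(mismatched)
print(page_breaks(SAMPLE))
print(root.findtext(".//note[@type='asn']"))  # the act/scene notation
```

The page breaks are pulled out with a regular expression because ElementTree discards processing instructions by default. A production tool would more likely use a parser that keeps them in the tree.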

The sample textual commentary note that follows demonstrates how quickly the NVS can become complicated. Here, the note discusses a fragment of a single line but also cross-references the bibliographic entry for "Brook," creating a new chain of connections:

<note id="cc.575" target="tln.511">
<p><app><lem>declare</lem></app> <name type="author">Abbott</name>
<quote>The Subjunctive after verbs of command [<emph rend="italic">coniure</emph> &lpar;509&rpar;&rsqb;&hellip;is especially common.</quote> See also <ref targType="bibl" target="b.bro76"><name type="author">Brook</name> &lpar;1976, p. 107&rpar;</ref>. Cf. n.


The "bibl" reference (a target from the above note) gives a glimpse at the kind of entries that will populate the full bibliography of the NVS XML Winter's Tale:

<bibl id="b.bro76"><author>Brook, G&lsqb;eorge&rsqb; L.</author>
<title level="m">The Language of
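
The chain of connections in these two excerpts (a note pointing at a line, and a reference pointing at a bibliography entry) comes down to matching "target" values against "id" values. The Python sketch below shows one way a reader of the CD-ROM might follow those links. It is a simplified stand-in, not the NVS files: the <text> wrapper and the bare <lb id="tln.511"/> are hypothetical, the SGML entities are replaced with literal characters, the quoted Abbott passage is dropped, and the truncated title is left out rather than guessed.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the excerpts (see assumptions above).
DOC = """<text>
<lb id="tln.511"/>
<note id="cc.575" target="tln.511">
<p><app><lem>declare</lem></app> <name type="author">Abbott</name>
See also <ref targType="bibl" target="b.bro76"><name type="author">Brook</name> (1976, p. 107)</ref>.</p>
</note>
<bibl id="b.bro76"><author>Brook, G[eorge] L.</author></bibl>
</text>"""


def build_id_index(root):
    """Map every id attribute in the document to its element."""
    return {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def dangling_targets(root, index):
    """List target values that do not match any id in the document."""
    return sorted({
        el.get("target")
        for el in root.iter()
        if el.get("target") is not None and el.get("target") not in index
    })


root = ET.fromstring(DOC)
index = build_id_index(root)

note = index["cc.575"]
ref = note.find(".//ref")

# Follow the note to the line it comments on, and the ref to its bibliography entry.
print(index[note.get("target")].tag)
print(index[ref.get("target")].findtext("author"))
print(dangling_targets(root, index))
```

A check like dangling_targets is the kind of validation that becomes cheap once cross-references are explicit in the markup instead of implied by typography.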


While the tagging is going to be solid in the edition that arrives this fall, the MLA is still debating what, exactly, to package on the CD-ROM. There will probably not be a front-end to read the XML text, and some discussion has involved including PDF files linked to the marked-up play. But Altreuter believes the important thing is that they have entered the digital age. "We want to put this out there and let people play with it," she says. "We had to start somewhere."

Werstine is hoping that members of the humanities computing community will step up to build on the efforts of the NVS editors. "It is also our hope that someone will be sufficiently interested when they get [the] XML version of the Winter's Tale to see what can be done with it," he says. "We are hoping that people will give their work to MLA."

The Future of the NVS

So what is down the road for the NVS editions? In the near term, the MLA will be changing some of its basic processes. "At a minimum, the XML will be used to generate the printed books," says Flanders, "but in addition I expect that in the future it may serve as the basis for some kind of electronic edition to accompany the print. In such a case, we can imagine that we'd want to provide for various kinds of searching and analysis, but the specifics remain to be determined." One could also imagine, as Werstine does, a number of the texts being made available in a single, searchable database, a "docuverse". And Professor Braunmuller, the current chairman of the MLA's NVS committee, is hoping that the NVS editors themselves will do the markup as part of their preparation of the text.

But the biggest area for innovation is likely to come from the open-endedness enabled by a digital text. "The moment that you publish [New Variorum Shakespeare editions], they are out of date," says Altreuter, thinking of all the new scholarship that hits the presses even a month after a volume is bound and shipped to bookstores. Though she doesn't have a business model worked out, her personal vision is that the texts could be part of a Website that allows ongoing expansion and annotation -- a true community effort.

Of course, to make this leap would require grant money of some kind, since open access to the texts on the Web would preclude the self-sustaining support that comes from sales of the print editions. It would also require a shift in the traditional notion of authorship in an academic edition.

There are precedents for academic editions of Shakespeare made available on the Web, though on a smaller scale. The Enfolded Hamlet, for example, provides a simple interface for searching Bernice Kliman's The Enfolded Hamlet text, which includes both the Second Quarto and First Folio editions of the play. And the Web is already home to numerous open-access communities for scholarly discussion, though the need for editorial filtering still remains a topic of debate.

For now, the NVS team members are excited to see the first step -- the move to XML -- finally being taken. What role the NVS electronic editions will play in the academic debates of the coming years, and how the editions will adapt to changing technology, remains to be seen. "The scholarly community has to tell which direction to go," says Altreuter. By building all future NVS editions on the foundation of XML, though, the MLA has already helped provide some direction itself.

For more information about the NVS project, contact David G. Nicholls, Director of MLA Book Publications.