Taking the Pulse of XML Editing
October 1, 2003
In February, roughly coincident with XML's five-year anniversary as a technology standard, I wrote an XML-Deviant column ("The Pace of Innovation") in which I discussed the future of XML, focusing on what is widely thought to be a slackening of the pace of its innovation. People react differently to this perceived innovation bear market, and at least some XML developers have expressed gratitude at being granted a bit of breathing room in which to catch up. Others suggest that there is just as much innovation in XML as there ever was, especially if we allow ourselves a minor redefinition: the innovation today happens not so much to XML as with it.
In trying to reach reasonable assessments, about both the state of XML and the pace of innovation, I examined one of the persistent complaints about XML:
Every programming language has one, if not many ways to create XML programmatically, many of them clever indeed. But the "XML editor for humans" area remains underdeveloped, particularly if we judge maturity by reference to more than five years of complaints about and claims for more tool support, by both vendors and advocates alike. Have we all simply misjudged how hard it is to build XML tools, including editors, with which ordinary people can create XML content naturally and simply?
The problem with XML developers, and the XML-DEV list in particular, working out answers to this question is that XML programmers, to put it bluntly, are not ordinary people. We typically have very little understanding of what it's like to create XML as an ordinary user. Our judgments about the maturity of various end user tools are likely to be colored by our experience, knowledge, and training.
Thus, when I got a chance recently to attend a one-day conference of authoring and editing vendors, my only question was whether the conference was pitched to developers or managers. I would ordinarily avoid a conference pitched to managers because, well, I'm not a manager. But in this case it was important because I wanted to check my views, hunches, and surmises about what's natural and simple about creating XML against the views, hunches, and surmises of people who are interested, invested, but not expert in XML.
In the rest of this column I do two things: first, I describe the interesting bits of the most interesting vendor presentations I saw at a one-day conference in the Washington, DC, metro area earlier this week; second, I offer some impressions and opinions about the state of XML tools represented by these vendors.
The Vendors and Tools
Smartdraw's VisualScript is an interesting editor for at least two reasons. First, users of the tool create XML by manipulating various kinds of visual representations of both high-level domain objects and processes, as well as visual representations of various XML patterns (like: "parent with child" and "empty element with attributes"). Think: I manipulate some tinker toys and XML is spit out at the end. The user never has to see any textual XML. Second, the array of XML vocabularies which VisualScript can produce out of the box is not trivial: XSLT, XHTML, XForms, WXS, RSS, CPA, BPSS, BOD, BPEL, WSDL, UDDI, SMIL, HTML+TIME. Developers can extend the tool by creating new libraries of symbols which represent a new XML vocabulary. Users manipulate these and pre-defined symbols in order to create XML instances.
Actually, VisualScript isn't interesting because of those two facts considered separately but, rather, because of the conjunction of those two facts. In other words, it's an interesting tool because its interface is aimed at non-technical users but its vocabulary coverage is extremely broad and very technically-oriented. One of the oddities of the VisualScript presentation, however, was that the vendor rep failed to give the barest hint to the audience that, having generated reams of complex, multivocabulary XML, someone was going to have to write an awful lot of programming code to consume it.
Corel's product offering is notable because, looking at it from the outside, it might seem a hodgepodge. Corel owns WordPerfect and acquired SoftQuad's SGML-XML business. That gives Corel the successors of the first commercial SGML, HTML, and XML editor and one of the best selling MS DOS applications of all time -- an odd mixture.
Out of this range of acquisitions Corel has put together an XML authoring-editing product, XMetaL, which seemed to me fairly compelling for a range of uses. XMetaL, like many of the editors on display at this conference, is really a family of products: a version for developers, a "thick" client for end users, and a "thin" client (a Windows ActiveX control, apparently) for use in-browser. This pattern seems to be shaking out as a kind of law of the genre; many vendors have partitioned the space in just this way: developer tool, thick client, thin client.
There are two notable bits. First, given Corel's graphics tools, XMetaL has impressive SVG integration. If I had a group of end users who needed to do lots of stuff with SVG and XML creation, I'd probably give them XMetaL, and the Corel graphics tools, on that basis alone. Second, XMetaL and WordPerfect are customizable and extensible in a wide variety of ways. For XMetaL alone, you can extend and customize it via Java, COM, ActiveX, and WSH. I'm neither a Windows user nor a Windows developer, but that's a fairly ecumenical range of extensibility options.
The last vendor I want to mention at length is Xcential, a company based in San Diego. Xcential has built a system to create and manage legislation for the State of California. This system has some noteworthy aspects.
First, it's the only system presented that mentioned RDF, and I'm an RDF user and fan. Second, LegisPro (Xcential's system) demonstrates that there are seriously complex XML systems to be built which are very domain-specific. While I may be the only person who ever thought this, though I doubt that, there was a time when I thought that having data in XML meant that one would always get some benefit from generalized XML tools, particularly editors. But I have increasingly been convinced that this supposed general benefit is of marginal value, when it's of any value at all. There are many domains in which XML is the right choice, but in which general authoring-editing applications are simply not of much use in the common case. Supporting legislative and regulatory applications is one such domain.
There were at least ten vendor presentations at the one-day conference I attended. I've elided discussion of some of the duller presentations, as well as ones which presented very similar tools. There were also a few presentations of some very interesting tools -- Altova's XMLSpy and Software AG's Tamino come to mind -- which aren't really aimed at end user authoring-editing, and so I've omitted extensive discussion of those tools here.
In general, however, I think that the state of authoring-editing, especially for ordinary users, is maturing rapidly. I'm encouraged both by the vendor presentations I saw and by the obvious ongoing maturation of this space.
I want to end this column by offering a list of impressions, some vague and only half-formed, I took away from this conference as a whole.
"Tags" are the Real Enemy. I simply lost count of the number of times I heard vendors or audience members swear that end users simply will not countenance "tags." These comments were often passed in a tone which I found surprisingly vehement and angry. In this part of Washington, DC, the Real Enemy is not terrorism but XML tags. I don't know what this means, but it's worth thinking about.
"Tag Soup" doesn't mean, for vendors and managers, what it means for XML developers. For these users it means, simply, "lots of tags." The only thing worse than "tags" is "lots of tags." See above.
Office 2003's much-reported XML support generates less fear on the part of vendors and (though I'm less sure about this point) less interest on the part of managers than I expected. I'm surprised by the former point, since many of the tools presented are parasitic on MS Word in one way or another. I'm also surprised because Microsoft has a very long tradition of encouraging and then destroying third-party developers (cf. disk defragmentation and compression vendors, as well as antivirus vendors, among many others).
I'm very surprised by the latter point, but perhaps I shouldn't be. Many of the managers at the conference seem to have already partially implemented some kind of authoring-editing standards in their work groups, and it seems unlikely that Office's new XML support is enough to overturn those solutions.
Further, there may be something to the fact that every vendor who showed XML output from its tool showed XML that was vastly cleaner and more comprehensible than any XML output I've seen from Office 2003. That doesn't necessarily mean much, but it may be an indication of overall fit and finish.
Some author-editor tool features which I have tended to think of, when I have thought of them at all, as domain-invariant are, in fact, domain-variant. In other words, "change tracking," for example, is both harder and more domain-specific than I realized. In the legislative and regulatory domain, change tracking has a different life cycle and scope than it does for general business documents. Change tracking for legislative and regulatory documents bleeds into a very specific kind of legislative activity ("document markup") and vendors have to take account of that difference. This is one reason why I suggested above that the general benefit of XML tools is less real than I once thought.
The separation of presentation from content is both fundamental and illusory. It is fundamental to warding off proprietary vendor capture (about which more below), but it's also an illusion in any strict sense. Another legislative-regulatory requirement makes this point clearer. Line numbering, which would seem to be a presentation issue, given that it's an artifact of print, is actually a semantic feature of legislative-regulatory documents. Line numbers simply are part of the content of these documents.
Even worse, from a vendor's perspective, there isn't one line numbering algorithm. In some contexts line numbers run from the beginning of a document to the end, irrespective of page breaks. In other contexts line numbers are per-page line numbers. That is, the index is a concatenation of page number and line number on that page. This is tricky because XSL-FO apparently has little support for line numbering, much less for various kinds of line numbering.
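To make the difference between the two schemes concrete, here is a minimal sketch in Python. It assumes the document has already been broken into pages and lines; the function names, the sample data, and the ":" separator are all hypothetical choices for illustration, not anything a vendor showed.

```python
from typing import List

# A paginated document modeled as a list of pages, each a list of lines.
# The text is placeholder data, not real legislative content.
pages: List[List[str]] = [
    ["first line", "second line", "third line"],
    ["fourth line", "fifth line"],
]


def continuous_labels(pages: List[List[str]]) -> List[str]:
    """Number lines from the start of the document to the end,
    irrespective of page breaks."""
    labels = []
    counter = 0
    for page in pages:
        for _ in page:
            counter += 1
            labels.append(str(counter))
    return labels


def per_page_labels(pages: List[List[str]], sep: str = ":") -> List[str]:
    """Restart numbering on every page; each label concatenates the page
    number and the line number on that page."""
    labels = []
    for page_no, page in enumerate(pages, start=1):
        for line_no, _ in enumerate(page, start=1):
            labels.append(f"{page_no}{sep}{line_no}")
    return labels


print(continuous_labels(pages))  # ['1', '2', '3', '4', '5']
print(per_page_labels(pages))    # ['1:1', '1:2', '1:3', '2:1', '2:2']
```

Note that both functions need the page and line breaks as input. That is the crux of the problem: the labels are content, but they can only be computed after layout.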
The chasm between document and data uses of XML among developers is replicated among vendors, some of which are rooted in one tradition or the other (the older ones tend to be either comfortable in both camps or, given their SGML heritage, document-rooted). One emblem of this distinction among vendors is the varying support for W3C XML Schema (WXS) or DTDs. Some vendors support both, of course, but other vendors support only one or the other. Vendors targeting pure publishing contexts are often focused on DTD instead of WXS, which makes a certain kind of sense.
Some technologies that I think of as important (or even superior to their competitors) are totally missing in action in this vendor space. They include RDF, XML Topic Maps, Relax NG (which wasn't mentioned a single time, alas), XLink, and XPointer. I'm not sure what to conclude from this. It could be that vendors are using some of these standards (though I doubt any of them are using Relax NG), but just not mentioning that fact to potential users because potential users wouldn't recognize or care about that. I think it's more likely that they aren't being used at all.
On the other hand, some standards are simply dominant, and those include XSLT and WXS. With regard to moving XML around between client and server, WebDAV is ubiquitous. XQuery seems destined to join this list; it had much more vendor and manager uptake than I expected.
Just about every vendor with an author-editor tool has a "thin" client version to run in-browser. Each of these vendors claimed endlessly that its thin client allows "anyone, anywhere" to create XML documents. Without exception, all of these thin clients run in Internet Explorer only. The tension was either unnoticed or unremarked.
XML developers may like to engage in W3C bashing; it's a sport I have learned to play, if not well, then joyously. But for vendors and managers the W3C is the source from which all things of goodness and light steadily flow. Every vendor with any W3C working group presence made sure to make this point repeatedly.
I was impressed, as a developer, with Altova's XMLSpy and Software AG's Tamino (a native XML database). If I had a team of XML developers on the Windows platform, I'd have a license for XMLSpy. It is the best XML IDE I've seen.
Avoiding vendor capture, ironically enough, was a major element of every vendor's spiel. That's a good thing. However, there were at least two areas for potential, perhaps unwitting vendor lock-in that managers need to think carefully about.
First, many authoring-editing tools use a non-standard templating language to customize displays, build forms, style documents for in-tool presentation, and so on. In some of the tools this information is associated with XML instances and vocabularies by way of an out-of-band mechanism; in other tools XML processing instructions are used. However, I would be wary of building too much of my infrastructure into these non-standard mechanisms. That could be a hidden cost in trying to move to a new vendor.
Second, several tools, especially those with strong server-side or CMS support, offered fairly impressive support for the reuse of XML fragments or chunks. That is, after all, one of the very first promises of markup, going back to SGML. But I heard very little discussion of the precise mechanisms for creating the links between centrally-stored fragments and instances in which those fragments are reused. There is, obviously, a range of relevant W3C standards (XLink, XPointer, RDF, and XML namespaces come to mind), but it's not clear whether fragment reuse in these tools relies on any of them or is vendor-specific. Insofar as it is vendor-specific, for some or all of these vendors, it may also pose a hidden cost in a tool or vendor migration project.
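For a sense of what a standards-based link between an instance and a centrally-stored fragment could look like, here is a small sketch in Python. It uses XInclude, a W3C mechanism that none of the vendors mentioned, so treat it as my substitution and not as a description of any product. The in-memory FRAGMENTS store, the href value, and the element names are all hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical central fragment store, keyed by href. A real system would
# fetch from a CMS or server; a dict keeps the sketch self-contained.
FRAGMENTS = {
    "fragments/definitions.xml": "<definitions><term>agency</term></definitions>",
}

# An instance document that reuses the fragment by reference.
DOCUMENT = """<bill xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Example</title>
  <xi:include href="fragments/definitions.xml"/>
</bill>"""


def fragment_loader(href, parse, encoding=None):
    """Resolve an href against the in-memory store instead of the filesystem."""
    if parse != "xml":
        raise ValueError("this sketch only handles parse='xml'")
    return ET.fromstring(FRAGMENTS[href])


def resolve(document_text):
    """Parse the instance and expand its xi:include elements in place."""
    root = ET.fromstring(document_text)
    ElementInclude.include(root, loader=fragment_loader)
    return root


resolved = resolve(DOCUMENT)
print(ET.tostring(resolved, encoding="unicode"))
```

The point of the sketch is that the link itself (the xi:include element and its href) lives in the document in a vendor-neutral form, so a migration only has to replace the loader, not rewrite every instance.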
While many of the very biggest, most visible vendors attended this conference, I am confident that they represent only a portion of this entire space. If they are a representative portion, however, there is some good reason to think that the authoring-editing space is maturing nicely. Constant innovation is fun, but tool maturity is equally crucial to XML's success in the long run.