Support for XML in mainstream products

May 5, 1998

Liora Alschuler

The Seybold Report on Internet Publishing
Vol. 2, No. 9
May, 1998

Another indication of change in the editorial marketplace is support for XML from mainstream editing vendors. Certainly, the SGML editing tool vendors are embracing XML, but it is encouraging to hear Adobe state its commitment as well. Curiously, Microsoft's Office group is making much less of an XML commitment than its browser group.

Adobe: strategic commitment. Lani Hajagos, senior product marketing manager for Adobe's FrameMaker+SGML, announced that Adobe is making a "strategic commitment" to XML across all its product lines. FrameMaker, as well as FrameMaker+SGML, will write XML.

The extent of that support is unclear, however, beyond the inclusion of metadata tags. The announcement said all product lines. What does Photoshop have to do with XML? According to Hajagos, XML can be a wrapper for interprocess communication. Product managers for PageMaker and Acrobat are also looking at ways to use XML. In a digital print workflow, once the data is cooked, PDF is a better format than XML for describing the document to be imaged. XML metadata, however, can provide the information for a workflow wrapper to go from prepress to author, or from publisher to service provider.

Microsoft: Office banks on HTML. Where does the vendor with the largest share of the editorial marketplace plan to take XML editing? Not very far, evidently. We recently spoke with Matthew Price, group product manager for Microsoft Office, on this topic. With a beta version of Office 98 due out this summer, Price has a fairly good idea of what XML support will look like in the next release of Word.

The Office product suite is being pulled in the direction of Web authoring. But rather than embrace XML as a potential file format, the Office group has chosen HTML. Office 98 will offer users the option of using HTML as a default file format, instead of the internal binary format. Price cited a number of reasons for this shift, among them the desire to have all Office documents viewable on the Web, without ActiveX controls or plug-ins. Microsoft wants Office to be the leading Web authoring package, and to accomplish that objective, Office will have to adopt HTML as a default file format, with no exporting to a second file required.

In this context, Microsoft has decided to use XML as a way to write Office-specific information into HTML files. For example, Price said that Microsoft will use XML to keep track of edit-state information (the marks you see when you turn on revision control in Word). The browser will simply ignore the XML tagging, but Word, on importing the HTML document, will read the XML tagging and interpret the revision marking accordingly. Price said that Microsoft would also use XML to mark pointers to external source files, such as graphs or charts, that would be converted to GIF or JPEG for Web viewing.
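The division of labor described here, in which a browser skips tagging it does not recognize while the authoring tool reads it back, can be illustrated with a minimal Python sketch. The "revision" element and its "kind" and "author" attributes are hypothetical names invented for this example; Microsoft had not published its actual markup, and nothing below should be read as the Office format.

```python
from html.parser import HTMLParser

# Hypothetical markup: the "revision" element and its attributes are
# invented for illustration and are not Microsoft's actual tagging.
SAMPLE = (
    '<p>The quarterly results were '
    '<revision kind="insert" author="jdoe">very </revision>'
    'strong.</p>'
)


class BrowserView(HTMLParser):
    """Keeps only character data, as a browser that skips unknown tags would."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


class EditorView(HTMLParser):
    """Records the attributes and text of each hypothetical revision element."""

    def __init__(self):
        super().__init__()
        self.revisions = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "revision":
            self._current = dict(attrs)
            self._current["text"] = ""

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "revision" and self._current is not None:
            self.revisions.append(self._current)
            self._current = None


def browser_text(html):
    """Return the text a reader would see once unknown tags are ignored."""
    parser = BrowserView()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


def editor_revisions(html):
    """Return the revision records an editing tool could recover."""
    parser = EditorView()
    parser.feed(html)
    parser.close()
    return parser.revisions
```

Calling browser_text(SAMPLE) yields the plain sentence, while editor_revisions(SAMPLE) recovers the inserted text together with its attributes, so the same file serves both readers.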

To retain user-defined styles in HTML, Office 98 will make use of cascading style sheets. These styles will support user-defined names, but the paragraph tags will conform to an HTML tagset.

Though he admitted that "we'll have to make HTML jump through some hoops to handle Office documents," Price made it clear that at this juncture, Microsoft's Office group is not even considering editing XML source documents, saying that the mainstream market just isn't ready for it. That means Word won't be checking to see if documents are well formed, and it certainly won't be validating files against DTDs, as it did with Microsoft's SGML Author product, which flopped in the marketplace.
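For readers unfamiliar with the distinction, a well-formedness check asks only whether the tags nest and close properly; validation additionally checks the document against the rules of a DTD. The Python sketch below illustrates the first kind of check only, using the standard library's non-validating parser. The memo documents are made-up examples, and the function name is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as well-formed XML, False otherwise.

    This checks syntax only (proper nesting, closed tags, one root element).
    It does not validate against a DTD; ElementTree is a non-validating parser.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Made-up sample documents for illustration.
GOOD = "<memo><to>Staff</to><body>Meeting at noon.</body></memo>"
BAD = "<memo><to>Staff</memo>"  # the <to> element is never closed
```

Here is_well_formed(GOOD) returns True and is_well_formed(BAD) returns False. Checking that a memo contains the elements a DTD requires, in the order it requires, would need a validating parser, which this sketch does not attempt.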