OAXAL: Open Architecture for XML Authoring and Localization
by Andrzej Zydron
|
Pages: 1, 2, 3, 4, 5
Figure 6 shows how the xml:tm namespace coexists within an XML document:

Figure 6. xml:tm tree
XML namespace is used to map a text memory view onto a document. This process is called segmentation. The text memory works at the sentence level of granularity—the text unit. Each individual xml:tm text unit is allocated a unique identifier. This unique identifier is immutable for the life of the document. As a document goes through its lifecycle the unique identifiers are maintained and new ones are allocated as required. This aspect of text memory is called author memory. It can be used to build author memory systems, which can be used to simplify and improve the consistency of authoring. A detailed technical article about xml:tm has been published on xml.com.
The use of xml:tm greatly improves upon the traditional translation route. Each text unit in a document has a unique identifier. When the document is translated, the target version of the file has the same identifiers. The source and target documents are therefore perfectly aligned at the text unit level. Figure 7 shows how perfect matching is achieved:

Figure 7. Matching
In xml:tm terms, this type of perfect matching is known as in context exact (ICE) matching. The xml:tm concept has been successfully proven within large-scale technical publishing applications.
For authoring, xml:tm represents a systematic way of identifying and storing all previously authored sentences. This repository can then be appropriately indexed and serve as input to authoring tools, to encourage authors to reuse existing sentences where appropriate rather than coming up with different ways of saying the same thing.

Figure 8. Author Memory
The author can enter keywords and search the author memory for appropriate sentences. If a sentence has already been used, then it is very likely that it has also been translated, in which case a leveraged translation match will be found, further reducing translation costs.
DITA and xml:tm
DITA and xml:tm are a perfect fit. While DITA enables reuse at the topic level, xml:tm complements this by enabling reuse at the sentence level. The xml:tm namespace can be seeded transparently into a DITA document. As the document goes through its lifecycle, the unique identifiers of each text unit are maintained, as are modifications to the document. In effect, xml:tm provides a change tracker for the document. The easiest way of implementing this is on the back of a content management system (CMS). Using DITA invariably means investing in an appropriate CMS.