XML.com

Release of XMLmind Word To XML v1.3

November 8, 2017

Submitted by Hussein Shafie, XMLmind.

What is XMLmind Word To XML?

XMLmind Word To XML can automatically convert DOCX files to:

  • Clean, styled, valid HTML (single page or multi-page HTML, Web Help, EPUB) looking very much like the source DOCX file.
  • Unstyled, structured, valid DITA bookmap, map, topic, DocBook, XHTML (single page or multi-page XHTML, Web Help, EPUB) or XML conforming to your custom schema.

Download: http://www.xmlmind.com/w2x/download.shtml

Free online DOCX conversion services: http://www.xmlmind.com/w2x/online_w2x.html

Enhancements:

  • Upgraded XMLmind Web Help Compiler (whc for short) to version 2.0, which supports two layouts for the generated Web Help: classic (the default layout) and simple (a new layout). To try the new layout when generating Web Help, pass the w2x option -p webhelp.wh-layout simple.
  • Setup assistant of w2x-app:
    • Added a "Layout of the generated Web Help" combobox to the "Output format options" screen when the chosen output format is Web Help. This combobox makes it easy to choose between the classic and simple layouts.
    • The dialog box used to add or modify an entry of the MS-Word style to XML element map now displays the localized name of a style (e.g. "Definition Char") next to the w2x name of this style (e.g. "c-DefinitionChar"). This is especially useful when you give, for example, Japanese names to your custom MS-Word styles.
  • Parameter edit.remove-styles.preserved-classes now accepts class patterns as well as class names. For example, specify -p edit.remove-styles.preserved-classes "^(t|(tr)|(tc)|(tp)|p|(pn)|n|c)-.+$" if you want to preserve in the semantic XHTML the class names corresponding to all the CSS styles generated during the Convert step.
  • Hidden text runs (<w:vanish/>) are now converted to <span style="display:none">. When generating semantic XML, these invisible span elements are then discarded.
  • “Word To XML” servlet: added an optional params servlet parameter, which lets you augment or override some of the conversion options specified by the conv servlet parameter. Example:
    curl -s -S -o manual.epub \
      -F "docx=@manual.docx;type=application/vnd.openxmlformats-officedocument.wordprocessingml.document" \
      -F "conv=epub" \
      -F "params=-p epub.identifier urn:x-mlmind:w2x:manual -p epub.split-before-level 8" \
      http://localhost:8080/w2x/convert
  • XMLmind Word To XML is now available as a native macOS .dmg distribution including a private Java™ 1.8.0_152 runtime.
  • All programs which are part of XMLmind Word To XML are now officially supported on macOS High Sierra (version 10.13).
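
The class-pattern form of edit.remove-styles.preserved-classes described above can be checked against sample class names before running a conversion. The following is a minimal sketch in Python, not part of w2x: the helper names is_preserved and split_classes are hypothetical, the sample class name "p-Normal" is made up for illustration, and it assumes that w2x tests each class name against the whole pattern (w2x is a Java program, and Python's re module only approximates Java regular expressions, although this particular pattern uses syntax common to both).

```python
import re

# Pattern copied verbatim from the release note.
PRESERVED = re.compile(r"^(t|(tr)|(tc)|(tp)|p|(pn)|n|c)-.+$")


def is_preserved(class_name):
    """Return True if class_name matches the preserved-classes pattern.

    Assumption: a class name is preserved when the anchored pattern
    matches the entire name.
    """
    return PRESERVED.search(class_name) is not None


def split_classes(class_attr):
    """Split a space-separated class attribute into (kept, dropped) lists."""
    names = class_attr.split()
    kept = [n for n in names if is_preserved(n)]
    dropped = [n for n in names if not is_preserved(n)]
    return kept, dropped


if __name__ == "__main__":
    # "c-DefinitionChar" and "role-bridgeheadI" appear in the release note;
    # "p-Normal" is a hypothetical paragraph style class.
    kept, dropped = split_classes("p-Normal c-DefinitionChar role-bridgeheadI")
    print("kept:", kept)
    print("dropped:", dropped)
```

With this pattern, names such as "c-DefinitionChar" match because they start with one of the listed prefixes followed by a hyphen and at least one character, while a name such as "role-bridgeheadI" does not match because "role" is not one of the listed prefixes.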

Bug fixes:

  • When a table was inserted inside a sequence of paragraphs having the same border, the conversion to styled XHTML (and to all output formats based on styled XHTML, like EPUB) failed with the following error message: error in action "group": missing attribute "g:container" for element .../html:p[NN].
  • When generating semantic XHTML, in some rare cases, the class name role-bridgeheadI was added to li elements.
  • Field codes like "XE" (index entry) were not normalized to upper-case. For example, this bug could cause some index entries to be missing in the generated semantic XML.
  • It was not possible to use the built-in image converter factory com.xmlmind.w2x_ext.emf2png.EMF2PNG to convert WMF to PNG, even though this factory supports the WMF format in addition to the EMF format.
  • Marking all the text contained in a DOCX table as deleted caused w2x to generate an invalid XHTML table having no cells at all.
  • w2x generated invalid DITA when a table or figure caption contained index terms.

News items may be commercial in nature and are published as received.