XMLmind Word To XML Manual
Explains how to install and use XMLmind Word To XML (w2x for short), how to customize the output of w2x and how to embed a w2x processor in a Java™ application.
Hussein Shafie
XMLmind Software
35 rue Louis Leblanc
78120 Rambouillet
France
Phone: +33 (0)9 52 80 80 37
Web: www.xmlmind.com/w2x/
Email: w2x-support@xmlmind.com (public mailing list)
Contents
2.1 Contents of the installation directory
3 Alternatives to using the w2x command-line utility
3.1 The w2x-app graphical application
3.2 The “Word To XML” add-on for XMLmind XML Editor
3.2.1 Installing the “Word To XML” add-on
3.3 The “Word To XML” servlet
3.3.1 Contents of the servlet software distribution
3.3.2 Installing the servlet
3.3.3 Configuring the servlet
3.3.4 Using the servlet to convert DOCX files
3.3.5 Non interactive requests
4.1 How to generate useful multi-page HTML
6 Customizing the output of w2x
6.1 Customizing the XHTML+CSS files generated by w2x
6.1.1 Using a XED script to modify the styles embedded in the XHTML+CSS file
6.1.2 Appending custom styles to the styles embedded in the XHTML+CSS file
6.1.3 Using an external CSS file rather than embedded CSS styles
6.1.4 Combining all the above methods
6.2 Customizing the semantic XML files generated by w2x
6.2.1 Converting custom character styles to semantic tags
6.2.2 Converting custom paragraph styles to semantic tags
6.3 Generating XML conforming to a custom schema
6.4 Packaging your customization as a w2x plugin
6.4.2 Registering a plugin with w2x
7 The w2x command-line utility
7.1 Variables substituted in the parameter values passed to the -p and -pu options
7.2 Default conversion steps
7.3 Automatic conversion step parameters
8 Conversion step reference
9 Embedding w2x in a Java™ application
9.1.1 Custom conversion step
9.1.2 Custom image converters
9.1.2.1 Specifying an external image converter
9.1.2.2 Controlling how image files found in the input DOCX file are converted to standard formats