When you execute the following command:
..\..\bin\w2x -o docbook5 manual.docx out\manual.xml
you in fact execute a sequence of three conversion steps:
1. Convert the input DOCX file to an XHTML+CSS document.
2. Edit this XHTML document using “semantic” XED scripts. The entry point of these scripts is found in w2x_install_dir/xed/main.xed. The XED scripts edit the input XHTML document in place. Therefore, the result of this step is the same XHTML document, still valid, but this time containing no CSS styles whatsoever.
3. Transform the edited XHTML document using XSLT stylesheets. The XSLT stylesheets are all found in w2x_install_dir/xslt/. In the above case, we want to generate DocBook v5, therefore we use w2x_install_dir/xslt/docbook5.xslt.
This sequence of conversion steps can be made visible in every detail by specifying the -vv option (very verbose):
..\..\bin\w2x -vv -o docbook5 manual.docx out\manual.xml
VERBOSE: Converting "manual.docx" to XHTML...
DEBUG: convert.xhtml-file=C:\w2x-1_12_0\doc\manual\out\manual.xhtml
VERBOSE: Editing XHTML document using "C:\w2x-1_12_0\xed\main.xed"...
DEBUG: edit.xed-url-or-file=file:/C:/w2x-1_12_0/xed/main.xed
DEBUG: Loading script "file:/C:/w2x-1_12_0/xed/main.xed"...
DEBUG: Loading script "file:/C:/w2x-1_12_0/xed/after-translate.xed"...
[...]
DEBUG: Loading script "file:/C:/w2x-1_12_0/xed/before-save.xed"...
VERBOSE: Transforming document using "C:\w2x-1_12_0\xslt\docbook5.xslt" then saving it to "C:\w2x-1_12_0\doc\manual\out\manual.xml"...
DEBUG: transform.out-file=C:\w2x-1_12_0\doc\manual\out\manual.xml transform.xslt-url-or-file=file:/C:/w2x-1_12_0/xslt/docbook5.xslt
[...]
In fact, option -o docbook5 is a shorthand for the following w2x command-line options:
-c
    Execute a Convert step called “convert”.
-p convert.xhtml-file C:\w2x-1_12_0\doc\manual\out\manual.xhtml
    Pass the above xhtml-file parameter to the conversion step called “convert”.
-e
    Execute an Edit step called “edit”.
-p edit.xed-url-or-file file:/C:/w2x-1_12_0/xed/main.xed
    Pass the above xed-url-or-file parameter to the conversion step called “edit”.
-t
    Execute a Transform step called “transform”.
-p transform.xslt-url-or-file file:/C:/w2x-1_12_0/xslt/docbook5.xslt
-p transform.out-file C:\w2x-1_12_0\doc\manual\out\manual.xml
    Pass the above xslt-url-or-file and out-file parameters to the conversion step called “transform”.
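To make the expansion concrete, the following Python sketch rebuilds the option list that -o docbook5 stands for, starting from an installation directory and an output file. It is only an illustration, not part of w2x. The helper names file_url and docbook5_options are hypothetical. The sketch also assumes that the intermediate XHTML file sits next to the output file with an .xhtml extension, which is what the sample paths above suggest.

```python
from pathlib import PureWindowsPath


def file_url(path: PureWindowsPath) -> str:
    """Return a URL in the same "file:/C:/..." form as the sample paths."""
    return "file:/" + path.as_posix()


def docbook5_options(install_dir: str, out_file: str) -> list[str]:
    """Build the explicit options that "-o docbook5" is a shorthand for."""
    install = PureWindowsPath(install_dir)
    out = PureWindowsPath(out_file)
    # Assumption: the intermediate XHTML file is the output file
    # with its extension replaced by ".xhtml".
    xhtml = out.with_suffix(".xhtml")
    return [
        "-c",  # Convert step called "convert"
        "-p", "convert.xhtml-file", str(xhtml),
        "-e",  # Edit step called "edit"
        "-p", "edit.xed-url-or-file", file_url(install / "xed" / "main.xed"),
        "-t",  # Transform step called "transform"
        "-p", "transform.xslt-url-or-file",
        file_url(install / "xslt" / "docbook5.xslt"),
        "-p", "transform.out-file", str(out),
    ]
```

Calling docbook5_options(r"C:\w2x-1_12_0", r"C:\w2x-1_12_0\doc\manual\out\manual.xml") yields the same options and values as those listed above.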
If you need to learn about the details of the conversion steps to be executed, the simplest way is to use the -liststeps command-line option.
Example: w2x -o docbook5 -liststeps
The order of the -c, -e and -t options is significant because it means: first convert, then edit and finally transform. The order of the -p (and -pu) options is not important, as a parameter name must be prefixed by the name of the step to which it applies.
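The following Python sketch is a toy model of this rule, not the actual w2x option parser. The function name plan_steps is hypothetical, and the sketch handles only -c, -e, -t and -p (not -pu). It shows why the step options must be read in order, while each -p option can be routed to its step by the prefix of the parameter name alone.

```python
STEP_OPTIONS = {"-c": "convert", "-e": "edit", "-t": "transform"}


def plan_steps(args):
    """Return [(step_name, {param: value}), ...] in execution order."""
    steps = []    # execution order follows the order of -c/-e/-t
    params = {}   # step name -> {parameter name: value}
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in STEP_OPTIONS:
            steps.append(STEP_OPTIONS[arg])
            i += 1
        elif arg == "-p":
            if i + 2 >= len(args):
                raise ValueError("-p requires a name and a value")
            qualified_name, value = args[i + 1], args[i + 2]
            # "edit.xed-url-or-file" -> step "edit", parameter "xed-url-or-file"
            step, dot, name = qualified_name.partition(".")
            if not dot or not name:
                raise ValueError(f"missing step prefix: {qualified_name!r}")
            params.setdefault(step, {})[name] = value
            i += 3
        else:
            raise ValueError(f"unsupported option in this sketch: {arg!r}")
    return [(step, params.get(step, {})) for step in steps]
```

In this model, moving a -p option elsewhere on the command line leaves the resulting plan unchanged, whereas swapping -c and -e changes the order in which the steps would run.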
The Convert, Edit and Transform steps are the most important steps. There are other conversion steps though, which are all documented in chapter Conversion step reference. Moreover, a Java™ programmer may implement their own custom conversion steps[5] and instruct the w2x command-line utility to give them names (required to pass them parameters) and to execute them. See option -step.
A w2x processor executes a sequence of conversion steps whatever the output format. Only the conversion steps themselves, their order, number and parameters, depend on the desired output format. This is depicted in the figure below.
The first sequence in the above figure reads as follows: in order to convert a DOCX file to styled XHTML, first convert the DOCX file to an XHTML+CSS document, then “polish up” this document (e.g. process consecutive paragraphs having identical borders) using XED script w2x_install_dir/xed/main-styled.xed, and finally save the possibly modified XHTML+CSS document to disk.
[5] A custom conversion step derives from abstract class com.xmlmind.w2x.processor.ProcessStep.