DocBook Roundtripping

Explain's Steve Ball, in conjunction with Bob Stayton, has developed a set of XSL stylesheets that convert Microsoft's WordML into DocBook and back again. These stylesheets are intended to allow "roundtripping" of DocBook documents, i.e. converting DocBook documents into Word and back into DocBook with no loss of data or structure. The aim is to allow a word processor to be used to edit DocBook XML documents.

All of the XSL stylesheets are part of the DocBook XSL project. Until the next release, you must either extract the files from the CVS repository or use a snapshot build.

More than one word processor application is supported. At present there is support for Microsoft Office 2003 and Apple Pages. To use MS Word, documents must be saved in XML format (i.e. as WordML). To use Apple Pages, copy the index.xml.gz file from the document bundle and uncompress it.

Using the XSL Stylesheets

There are two sets of XSL stylesheets: one set for converting DocBook into Word (WordML) or Pages, and another set for converting WordML or Pages back into DocBook.

DocBook to WordML

The XSL stylesheet docbook.xsl is used to transform DocBook documents into WordML documents. An additional file, the document template, is needed to provide definitions of style formatting properties. This is specified as the wordml.template stylesheet parameter.

Example usage:

xsltproc -o my-word.xml --stringparam wordml.template template.xml docbook.xsl my-docbook.xml

The document template is a WordML document, but its body text is not used. This allows the user to change the formatting properties used by the various styles by simply using Word's menus and dialogs.
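
When the conversion is run from a script, the command line above can be assembled programmatically. The following is a minimal Python sketch, not part of the DocBook XSL distribution. The function names are hypothetical, and the sketch assumes that xsltproc is on the PATH and that docbook.xsl and the template file are in the current directory.

```python
import subprocess


def docbook_to_wordml_command(docbook_file, output_file,
                              template="template.xml",
                              stylesheet="docbook.xsl"):
    """Build the xsltproc argument list from the example usage.

    The template is passed as the wordml.template stylesheet parameter.
    """
    return [
        "xsltproc",
        "-o", output_file,
        "--stringparam", "wordml.template", template,
        stylesheet,
        docbook_file,
    ]


def docbook_to_wordml(docbook_file, output_file, **options):
    """Run the transformation; raises CalledProcessError if xsltproc fails."""
    command = docbook_to_wordml_command(docbook_file, output_file, **options)
    subprocess.run(command, check=True)
```

Calling docbook_to_wordml("my-docbook.xml", "my-word.xml") would run the same command as the example usage. Passing the arguments as a list rather than a single string avoids shell quoting problems with file names that contain spaces.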

WordML to DocBook

Transforming a WordML document into DocBook involves "chaining" the XML document through a "pipeline" of XSL stylesheets. There are four XSL stylesheets involved: wordml-normalise.xsl, wordml-sections.xsl, wordml-blocks.xsl and wordml-final.xsl.

Example usage:

xsltproc -o normalised.xml wordml-normalise.xsl my-word.xml
xsltproc -o sections.xml wordml-sections.xsl normalised.xml
xsltproc -o blocks.xml wordml-blocks.xsl sections.xml
xsltproc -o my-docbook.xml wordml-final.xsl blocks.xml
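
The four steps can also be driven from a script, so that each stage reads the output of the previous one. The following Python sketch is one possible way to do this, under the assumption that xsltproc is on the PATH and the stylesheets are in the current directory. The function names and the intermediate file names (stage1.xml and so on) are hypothetical and differ from the names used in the example above.

```python
import subprocess

# Stylesheets in the order they must be applied.
WORDML_STYLESHEETS = [
    "wordml-normalise.xsl",
    "wordml-sections.xsl",
    "wordml-blocks.xsl",
    "wordml-final.xsl",
]


def pipeline_commands(source, result, stylesheets=WORDML_STYLESHEETS):
    """Return one xsltproc command per stage of the pipeline."""
    commands = []
    current = source
    last = len(stylesheets) - 1
    for i, stylesheet in enumerate(stylesheets):
        # Every stage but the last writes to a numbered intermediate file.
        target = result if i == last else f"stage{i + 1}.xml"
        commands.append(["xsltproc", "-o", target, stylesheet, current])
        current = target  # the next stage reads what this stage wrote
    return commands


def run_pipeline(source, result):
    """Run every stage in order, stopping at the first one that fails."""
    for command in pipeline_commands(source, result):
        subprocess.run(command, check=True)
```

For example, run_pipeline("my-word.xml", "my-docbook.xml") would perform the same four transformations as the commands above.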

DocBook to Pages

The XSL stylesheet docbook-pages.xsl is used to transform DocBook documents into Pages XML index documents. An additional file, the document template, is needed to provide definitions of style formatting properties. This is specified as the pages.template stylesheet parameter.

Example usage:

xsltproc -o index.xml --stringparam pages.template template-pages.xml docbook-pages.xsl my-docbook.xml

The result document, index.xml, must be placed in a bundle in order to be opened by the Pages application. The simplest way to do this is to create a directory that has ".pages" as its file extension and then copy or move index.xml into that directory. The index file does not need to be compressed before opening with Pages.
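
Creating the bundle can be scripted as well. The following Python sketch shows one way to do it, using only the standard library. The function name make_pages_bundle and the bundle name in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def make_pages_bundle(index_file, bundle_dir):
    """Create a '.pages' directory and copy the index document into it."""
    bundle = Path(bundle_dir)
    if bundle.suffix != ".pages":
        raise ValueError("bundle directory name must end in '.pages'")
    bundle.mkdir(parents=True, exist_ok=True)
    # The index document is stored uncompressed, under the name index.xml.
    target = bundle / "index.xml"
    shutil.copyfile(index_file, target)
    return target
```

For example, make_pages_bundle("index.xml", "my-document.pages") would create the directory my-document.pages and place a copy of index.xml inside it.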

The document template is a Pages index document, but its body text is not used. This allows the user to change the formatting properties used by the various styles by simply using the menus and dialogs in Pages.

Pages to DocBook

A Pages document is actually a "bundle", i.e. although it appears as a single icon it is really a directory that contains all of the files needed for the document. To see inside it, Control-click on the Pages document icon and select "Show Package Contents". Inside the bundle is an index file, either index.xml.gz or index.xml. If the index file is compressed (index.xml.gz), it must be uncompressed before being used.
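
Because the bundle is an ordinary directory, the index file can also be extracted without the Finder. The following Python sketch handles both the compressed and the uncompressed case using the standard gzip module. The function name extract_pages_index and the file names in the usage note are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def extract_pages_index(bundle_dir, output_file):
    """Copy the index document out of a Pages bundle, uncompressing if needed."""
    bundle = Path(bundle_dir)
    compressed = bundle / "index.xml.gz"
    plain = bundle / "index.xml"
    if compressed.exists():
        # Stream the gzip data into an uncompressed copy.
        with gzip.open(compressed, "rb") as src, open(output_file, "wb") as dst:
            shutil.copyfileobj(src, dst)
    elif plain.exists():
        shutil.copyfile(plain, output_file)
    else:
        raise FileNotFoundError(f"no index.xml or index.xml.gz in {bundle}")
    return Path(output_file)
```

For example, extract_pages_index("my-document.pages", "index.xml") would leave an uncompressed index.xml in the current directory, ready to be transformed.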

Transforming the Pages index document into DocBook involves "chaining" the XML document through a "pipeline" of XSL stylesheets. There are four XSL stylesheets involved: pages-normalise.xsl, wordml-sections.xsl, wordml-blocks.xsl and wordml-final.xsl. Note that the last three stylesheets are the same ones used for transforming WordML into DocBook.

Example usage:

xsltproc -o normalised.xml pages-normalise.xsl index.xml
xsltproc -o sections.xml wordml-sections.xsl normalised.xml
xsltproc -o blocks.xml wordml-blocks.xsl sections.xml
xsltproc -o my-docbook.xml wordml-final.xsl blocks.xml
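
Since the Word and Pages pipelines differ only in their first stylesheet, a script that supports both formats can share the remaining stages. The following Python sketch illustrates the idea. The function name stylesheets_for and the format keys "wordml" and "pages" are hypothetical labels chosen for this example.

```python
# First stage depends on the word processor that produced the document.
NORMALISERS = {
    "wordml": "wordml-normalise.xsl",
    "pages": "pages-normalise.xsl",
}

# Remaining stages are common to both source formats.
SHARED_STAGES = [
    "wordml-sections.xsl",
    "wordml-blocks.xsl",
    "wordml-final.xsl",
]


def stylesheets_for(source_format):
    """Return the ordered list of stylesheets for the given source format."""
    try:
        first = NORMALISERS[source_format]
    except KeyError:
        raise ValueError(f"unsupported source format: {source_format!r}") from None
    return [first] + SHARED_STAGES
```

Each stylesheet in the returned list would then be applied in order with xsltproc, as in the example usage above.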

Supported Elements

The roundtripping system does not support all of the DocBook elements. See Supported Elements for the current status of support of DocBook elements.

Getting Help

Contact Explain for support. Explain offers commercial support for organisations that need it.


Copyright © 2005 explain. All rights reserved.