Creating documentation in XML

Published on 24 November 2022 by Andrew Owen


If your documentation has reached the limits of what’s possible in Markdown, and you’d prefer not to fall back to HTML, it’s time to consider authoring in XML. And no, I don’t mean using Microsoft Word and saving in .DOCX format. Whichever schema you choose (DITA, DocBook, XHTML or something else), you’ll get the benefits of single sourcing and structured authoring, which will save you time and money, especially if your documentation is translated into other languages.

In 1993, Jean-Yves Belmonte, Olivier Ishacian and Hussein Shafie created Pixware SARL. One of its first customers was Renault Sport, which had a requirement for engine telemetry and test bed software. In 2001, they created an XML division and in 2017 it was spun off into XMLmind Software. There are lots of XML editors, and I’ve used most of the well-known ones. But XMLmind’s XML Editor (XXE) is my favorite, because when it was released it was the only one aimed at writers instead of developers. With a graphical WYSIWYG editor, you rarely have to touch an XML tag.

I’ve written previously about why you should consider DocBook for your documentation needs (although XXE supports other schemas). During my involvement with the Mega65 project, I was unable to persuade the team, and they went for LaTeX instead. It uses its own form of markup in place of XML, but it supports structured authoring. The main thing it has going for it is that it uses the TeX typesetting system, which Donald Knuth created to typeset complex mathematical formulae while writing “The Art of Computer Programming”.

But compared to DocBook, the LaTeX toolchain is a headache to use. And you can do complex formulae just as easily in DocBook as in TeX by using MathML. The other reason that LaTeX is popular is its free software license. But guess what, XXE is available free for personal and open source use. And it’s written in Java (with pre-built macOS and Windows packages) so it will run on any system that supports Java 8 or higher.

For the purposes of the license, personal use means any document where you’re the sole copyright owner. You can also freely use XXE Personal Edition if you’re creating or editing documents:

  • For software published under an open source license as defined by the Open Source Initiative.
  • In the public domain or licensed under Creative Commons or an equivalent license.
  • Owned by a non-profit non-governmental organization.
  • Within an educational institution for the purpose of using it as a teaching tool.

And if you don’t fall into any of those categories, you can use it on a trial basis for any purpose for 30 days. With the commercial version, you get an XSL-FO (Extensible Stylesheet Language Formatting Objects) tool for converting the source into outputs such as Microsoft Word. But there are free alternatives such as Apache FOP. Personally, I use Asciidoctor fopub, which is based on Apache FOP. It has a nice default style sheet for PDF that’s easy to modify.

I don’t want to do a deep dive on the power of XML, but I’ll mention a few useful tools from my earlier DocBook article:

  • XPath enables you to navigate the tree of an XML document and select nodes that match a given set of criteria.
  • XPointer builds on XPath to address a specific fragment inside an XML document, such as a single element or a range of text.
  • XInclude enables you to assemble an XML document from other XML documents, including translations or boilerplate text.
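
To make the first and last of these a little more concrete, here is a minimal sketch in Python using only the standard library. The document content, the `legal.xml` file name and the `load` helper are all made up for illustration, and the "files" live in a dictionary rather than on disk. Note that ElementTree only supports a subset of XPath, so a full XPath processor would let you write richer queries than this.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical boilerplate "file", kept in memory for this sketch.
FILES = {
    "legal.xml": '<para xmlns="http://docbook.org/ns/docbook">Copyright notice.</para>',
}

# Hypothetical DocBook article that pulls in the boilerplate with XInclude.
BOOK = """
<article xmlns="http://docbook.org/ns/docbook"
         xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Example guide</title>
  <section role="install"><title>Installing</title></section>
  <section role="usage"><title>Using</title></section>
  <xi:include href="legal.xml"/>
</article>
"""

NS = {"db": "http://docbook.org/ns/docbook"}


def load(href, parse, encoding=None):
    """Loader that ElementInclude calls for each xi:include element."""
    text = FILES[href]
    return ET.fromstring(text) if parse == "xml" else text


root = ET.fromstring(BOOK.strip())

# XPath: select the titles of sections whose role attribute is "usage".
usage_titles = [
    title.text
    for title in root.findall(".//db:section[@role='usage']/db:title", NS)
]

# XInclude: replace each xi:include element with the document it points to.
ElementInclude.include(root, loader=load)
paras = [para.text for para in root.findall("db:para", NS)]

print(usage_titles)
print(paras)
```

The same pattern scales to translations: swapping the loader (or the `href`) for a different language’s file assembles a different document from the same skeleton.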

And my final selling point for XML authoring is that, back before we were all using Git, every source control system I used would destroy docs, but seemed to be able to handle XML. There are some great content management systems out there. But if you have limited resources, you can easily manage XML docs in a normal code repository.