Component single-source multi-channel publishing with Markdown

Published on 1 June 2023 by Andrew Owen (5 minutes)

This week, I finished work on the final beta of the classic BASIC interpreter I’ve been working on for the Chloe 280SE FPGA retro computer project. Like many home computers from the 1970s to the early 1990s, when you switch it on, you go straight into the BASIC editor. But unlike most home computers of the same period, it has effectively unlimited storage. And that means that it can include built-in help, a feature that Microsoft didn’t add to QuickBASIC until 1987.

Although it’s a retro project, there’s no reason to limit development to what was available in the era. But it still has an incredibly constrained microprocessor by modern standards. So a lightweight document format is crucial for a responsive help system, and therefore the logical choice is Markdown. This is also a good choice because all the current documentation is in a GitHub wiki in Markdown format, which means the content can be reused.

There are some technical limitations to take into consideration. The Chloe has an 80×24 character screen with a fixed-width font. The wiki is currently made up of nine files, each equivalent to a chapter in the user guide, and these are too big to load in one go with a 64-kilobyte address space. And with a text-based display and no mouse, users will need to type in the names of the wiki links. So the logical solution is to chunk the content into components, breaking each document down into its individual sections and headings. Unlike a real component authoring system, heading levels are fixed. But breaking the chapters into components makes content reuse easy. For example, the descriptions of the BASIC functions and statements could be used in a separate quick reference document. And if the syntax changes, when the docs are republished, the changes are reflected everywhere.
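To make the chunking idea concrete, here is a minimal Python sketch of how a chapter could be split into one component per heading. It is not the tool used for the project. The function names are hypothetical, and the 8.3-style naming rule (upper case, `$` becomes `S`, spaces become `_`, truncate to eight characters) is an assumption inferred from filenames such as CHRS.MD, DEF_FN.MD and RANDOMIZ.MD.

```python
import re

# Matches ATX headings such as "## ABS" (one to six leading hashes).
HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")


def component_filename(title):
    """Derive an 8.3-style filename from a heading title (assumed naming rule)."""
    stem = title.upper().replace("$", "S")
    stem = re.sub(r"[^A-Z0-9]+", "_", stem).strip("_")
    return stem[:8] + ".MD"


def split_into_components(markdown_text):
    """Split a chapter into {filename: text}, one entry per heading.

    Text before the first heading is dropped, and sections whose titles
    map to the same filename are merged into one component.
    """
    components = {}
    current = None
    in_fence = False
    for line in markdown_text.splitlines():
        # Do not treat "#" lines inside fenced code blocks as headings.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            current = component_filename(match.group(2))
            components.setdefault(current, [])
        if current is not None:
            components[current].append(line)
    return {name: "\n".join(lines).strip() + "\n"
            for name, lines in components.items()}
```

Each value in the returned dictionary could then be written to its own file, small enough to load within the address space and short enough to name in a typed link.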

So now we have single-sourcing between the GitHub wiki and the built-in help. But what about a printed manual? How can we go from an enormous collection of Markdown files to a PDF or other formats such as EPUB, HTML or Word? The answer is Pandoc, an open-source document converter created by Berkeley philosophy professor John MacFarlane. It can import from a vast array of sources and has built-in support for many output formats. However, for PDF it relies on LaTeX for formatting, so you’ll also need to install that.
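As a rough illustration of multi-channel output, the sketch below builds one Pandoc command per target format and lets Pandoc infer the format from the output file extension. The function names and the default list of extensions are my own assumptions. Only the `--standalone`, `--toc` and `--output` options are taken from Pandoc itself.

```python
import subprocess


def pandoc_command(sources, output):
    """Build a Pandoc command line; the output format follows the extension."""
    return ["pandoc", "--standalone", "--toc", "--output", str(output),
            *[str(source) for source in sources]]


def publish(sources, stem, extensions=("pdf", "epub", "html", "docx")):
    """Run Pandoc once per format (needs Pandoc installed, and LaTeX for PDF)."""
    for extension in extensions:
        subprocess.run(pandoc_command(sources, f"{stem}.{extension}"), check=True)
```

For example, `publish(["cover.txt", "BASIC.MD"], "manual")` would attempt to write manual.pdf, manual.epub, manual.html and manual.docx from the same sources.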

As always, I recommend installing using Homebrew on macOS and Scoop on Windows. Pandoc is included in the repositories of most varieties of Linux. On macOS there’s an extra consideration, which is that the full LaTeX install uses four gigabytes of storage. So unless you need all the features, you should use BasicTeX instead.

brew install pandoc
brew install librsvg python homebrew/cask/basictex

The next step is to build a container for your content, in the form of a set of parameters for Pandoc. This contains your table of contents and any additional LaTeX directives for setting things like cover images and page sizes. For convenience, I put it in a makefile so that I can call it from the command line using make. I also created two text files containing LaTeX directives. These can be stored in the root of the wiki repository, and they won’t show up in the GitHub wiki (you can get the git clone link for the repository when you view the wiki in your browser). There are various ways of passing settings to LaTeX, but the simplest is to add a YAML header to a file that you save with a .txt extension.

---
titlepage: true
title: "Programming SE Basic IV"
author: "Rob Hagemans"
date: \today
subtitle: "Cordelia 4.2.0"
subject: "Classic BASIC Programming"
keywords: "BASIC"
toc: true
numbersections: true
geometry: margin=2cm
urlcolor: blue
# \lfoot and \rfoot come from fancyhdr, which Pandoc's default template does not load
header-includes: |
  \usepackage{fancyhdr}
  \pagestyle{fancy}
  \lfoot{SE Basic IV}
  \rfoot{Cordelia 4.2.0}
---

This example provides a title page and includes some metadata and formatting information for the PDF.

I also created a text file to produce a page break between chapters. It contains only one line.

\newpage

And here’s a partial sample makefile.

# Build the PDF manual. As the first target, it also runs with a bare "make".
.PHONY: prg
prg:
	pandoc --toc --output "Programming SE Basic IV.pdf" -s \
	cover.txt \
	newpage.txt BASIC.MD \
					FUNCTION.MD \
						ABS.MD ACOS.MD ASC.MD ASIN.MD ATAN.MD CHRS.MD COS.MD \
						DEEK.MD EXP.MD FIX.MD FN.MD INKEYS.MD INP.MD INT.MD \
						LEFTS.MD LEN.MD LOG.MD MIDS.MD PEEK.MD RIGHTS.MD \
						RND.MD SGN.MD SIN.MD SQR.MD STRS.MD STRINGS.MD TAN.MD \
						USR.MD VAL.MD VALS.MD \
					STATEMEN.MD \
						BLOAD.MD BSAVE.MD CALL.MD CHDIR.MD CIRCLE.MD CLEAR.MD \
						CLOSE.MD CLS.MD COLOR.MD CONT.MD COPY.MD DATA.MD \
						DEF_FN.MD DELETE.MD DIM.MD DOKE.MD DRAW.MD EDIT.MD \
						ELSE.MD END.MD ERROR.MD FILES.MD FOR.MD GOSUB.MD \
						GOTO.MD IF.MD INPUT.MD KEY.MD KILL.MD LET.MD LINE.MD \
						LIST.MD LOAD.MD LOCATE.MD MERGE.MD MKDIR.MD NAME.MD \
						NEW.MD NEXT.MD OLD.MD ON.MD ON_ERROR.MD OPEN.MD OUT.MD \
						PALETTE.MD PLAY.MD PLOT.MD POKE.MD PRINT.MD \
						RANDOMIZ.MD READ.MD REM.MD RENUM.MD RMDIR.MD RUN.MD \
						SAVE.MD SCREEN.MD SOUND.MD STOP.MD TRACE.MD WAIT.MD \
						WEND.MD WHILE.MD \
	newpage.txt LICENSE.MD

The backslashes join all the lines into a single command, while the line breaks and indentation keep the table of contents readable. Including newpage.txt before each chapter makes it start on a new page. Heading levels are taken from the Markdown files, so you can only insert content at the appropriate heading level. Therefore, reusable content should have a minimum heading level of three (`###`) or four (`####`).
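The heading-level rule is easy to check before publishing. The following Python sketch is a hypothetical helper rather than part of the project: it lists the heading levels in a component and reports whether the shallowest one is at least level three.

```python
import re

# Matches ATX headings such as "### ABS" and captures the run of hashes.
HEADING = re.compile(r"^(#{1,6})\s+\S")


def heading_levels(markdown_text):
    """Return the level of every heading, ignoring fenced code blocks."""
    levels = []
    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        elif not in_fence:
            match = HEADING.match(line)
            if match:
                levels.append(len(match.group(1)))
    return levels


def is_reusable(markdown_text, minimum_level=3):
    """True if the component has headings and none is shallower than the minimum."""
    levels = heading_levels(markdown_text)
    return bool(levels) and min(levels) >= minimum_level
```

A component that starts with `### ABS` passes, while one that starts with `# BASIC` fails, because it could only be used as a chapter.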

Then you run make and, provided that your filenames don’t contain any characters that are special to the shell and that your Markdown is valid, you’ll get a PDF from your wiki with a cover page and a table of contents. It’s not fancy, but it should be adequate for simple open source projects.
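If you want to check the filenames first, one possible approach is to ask Python’s standard `shlex.quote` whether a name would need quoting in a POSIX shell. The helper name below is hypothetical, and this is only a sketch: it says nothing about other shells.

```python
import shlex


def unsafe_filenames(filenames):
    """Return the names a POSIX shell would not treat as one plain word."""
    # shlex.quote returns its argument unchanged only when no quoting is needed.
    return [name for name in filenames if shlex.quote(name) != name]
```

Given `["ABS.MD", "CHR$.MD", "DEF FN.MD"]`, it flags the last two, which is why names such as CHRS.MD and DEF_FN.MD are the safer choice. Note that `$` is also special to make itself, which is another reason to avoid it.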