Doing continuous translation with Weblate

Published on 7 April 2022 by Andrew Owen (3 minutes)

This week, I want to give a shout-out to Weblate, a web-based translation tool with Git integration that’s available free to open source projects. I discovered it by chance because a developer I was chatting with on Telegram had contributed to a project that made use of it.

First some background. I’m community manager for the Chloe 280SE FPGA retro computer project. Part of the firmware is a classic BASIC interpreter, which can be localized for any language whose character set can be encoded in an 8-bit code page (256 characters). Through my role as a volunteer editor for Translation Commons, I was able to request assistance from volunteer translators to localize the software for 20 languages.

But when the pandemic hit, the translation project manager moved on to another role, and the translation project stagnated. I looked at machine translation as a possible solution. But copying and pasting strings from Google Sheets into Google Translate and back again is a slow process, and there’s no validation.

So when I heard about Weblate, I thought I should try it. It provides:

  • Continuous localization through integration with the most common code hosting services (GitHub, GitLab, Bitbucket, Pagure and Azure Repos).
  • Quality checks including string length, double spaces, repeated words and so on.
  • Suggestions from its own translation memory and from machine translation services, including DeepL, Google, LibreTranslate and Microsoft.
  • Glossaries, for maintaining translation consistency.
  • Crowdsourcing with attribution. Anyone can make anonymous suggestions. Registered users can contribute entire translations.
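
To give a feel for what the quality checks in that list look for, here is a rough Python sketch of three of them. It is an illustration only, not Weblate's implementation: the function name, the warning labels and the `max_ratio` length threshold are all assumptions made up for this example.

```python
import re


def check_translation(source, target, max_ratio=1.5):
    """Return a list of warnings for one source/target string pair.

    max_ratio is a hypothetical threshold: the target is flagged when it is
    more than max_ratio times as long as the source.
    """
    warnings = []
    if "  " in target:
        warnings.append("double space")
    # The same word twice in a row, ignoring case ("nicht nicht").
    if re.search(r"\b(\w+)\s+\1\b", target, flags=re.IGNORECASE):
        warnings.append("repeated word")
    if source and len(target) > max_ratio * len(source):
        warnings.append("too long")
    return warnings


print(check_translation("File not found", "Datei nicht nicht da"))
```

On a retro computer with a fixed-width screen, the length check is the one most likely to matter in practice.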

Weblate provides a web interface that’s a simplified version of what most professional translators will be familiar with from other translation software. It supports most translation formats understood by translate-toolkit. I chose to use a monolingual JSON file because:

  • I was already familiar with JSON.
  • It was relatively trivial to convert the original Google Sheets doc by exporting it in CSV format.
  • It’s human-readable.
  • It supports UTF-8.
  • Command-line tools were available to automate the production of a binary file from the JSON source.
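
The CSV-to-JSON conversion can be sketched in a few lines of Python. This is a minimal example rather than the script I actually used, and it assumes a sheet layout with a `key` column followed by one column per language code; your own export may well be arranged differently.

```python
import csv
import io
import json


def csv_to_locales(csv_text):
    """Split a sheet export into one {key: string} dict per language column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Assumed layout: a "key" column, then one column per language code.
    languages = [name for name in reader.fieldnames if name != "key"]
    locales = {lang: {} for lang in languages}
    for row in reader:
        for lang in languages:
            text = row[lang]
            if text:  # skip cells that have not been translated yet
                locales[lang][row["key"]] = text
    return locales


def dump_locale(strings):
    """Serialize one language as readable JSON, keeping non-ASCII text as-is."""
    return json.dumps(strings, ensure_ascii=False, indent=2, sort_keys=True) + "\n"


# Made-up sample data standing in for the exported sheet.
sample = "key,en,de\nready,Ready,Bereit\nerror,Error,\n"
locales = csv_to_locales(sample)
print(dump_locale(locales["de"]))
```

Each dictionary can then be written to its own file, such as en.json or de.json, which is the one-file-per-language shape a monolingual format expects.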

I was already using GitHub to host the software. All I had to do to integrate with Weblate was add the Weblate bot as a user to the repository, create a locales folder in the root of the project, and drop the original English strings file en.json in the folder.

From there I could create new translations and push them to the remote repository when I was happy with them. If I made changes in the remote repository, I could rebase the Weblate repository.

I wrote a script that parses the contents of the locales folder and generates the binaries required by the software. Currently, I have to trigger it manually. The final step will be to trigger it automatically whenever anything changes in the locales folder on the main branch. More on that in a future article.
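
As a rough idea of what a build script like this involves, here is a minimal Python sketch. It is not the project's actual script: the `CODE_PAGES` table, the `.bin` file names and the zero-terminated output layout are all assumptions chosen for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical mapping from locale file name to an 8-bit code page.
CODE_PAGES = {"en": "cp437", "de": "cp850", "pl": "cp852"}


def build_locale(json_path, out_dir):
    """Encode one locale's strings into a zero-terminated binary blob."""
    lang = json_path.stem
    codec = CODE_PAGES[lang]
    strings = json.loads(json_path.read_text(encoding="utf-8"))
    blob = bytearray()
    for key in sorted(strings):
        # Strict encoding makes the build fail on any character that
        # does not exist in the target code page.
        blob += strings[key].encode(codec, errors="strict") + b"\x00"
    out_path = out_dir / f"{lang}.bin"
    out_path.write_bytes(blob)
    return out_path


def build_all(locales_dir, out_dir):
    """Build a binary for every JSON file in the locales folder."""
    out_dir.mkdir(parents=True, exist_ok=True)
    return [build_locale(p, out_dir) for p in sorted(locales_dir.glob("*.json"))]


# Demonstration against a throwaway folder instead of a real repository.
with tempfile.TemporaryDirectory() as tmp:
    demo_locales = Path(tmp) / "locales"
    demo_locales.mkdir()
    (demo_locales / "en.json").write_text(
        json.dumps({"ready": "Ready", "error": "Error"}), encoding="utf-8"
    )
    built = build_all(demo_locales, Path(tmp) / "build")
    print([p.name for p in built])
```

Failing loudly on unencodable characters is a deliberate choice in this sketch: a translation that can't be represented in an 8-bit code page is better caught at build time than on the device.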

I used this setup to increase the language coverage from 20 languages to 48 over the course of a couple of weeks. A side benefit of switching to Weblate and automating the process was that it exposed some inconsistencies in the translations that had not been obvious in the Google Sheets file.

But the real test came this week when a translator offered to add Basque support. They were able to go through the entire process of signing up to Weblate and creating the new translation with minimal help from me. I only needed to advise them on the contents of a few project-specific strings used in the build process. When they were done, I reviewed their contribution, pushed their changes to the Git repository and ran the build script.

If you’re using a Git repository, and you’re currently doing your translations using a spreadsheet, I think it would be well worth your time to take a look at Weblate.