Title: From English wiki to translated PDF and epub via Docbook
Tags: english, debian, debian edu, docbook
<p>The <a href="http://www.skolelinux.org/">Debian Edu / Skolelinux
project</a> provides an instruction manual for teachers, system
administrators and other users that contains useful tips for setting up
and maintaining a Debian Edu installation. This text is about how the
text processing of this manual is handled in the project.</p>
<p>One goal of the project is to provide information in the native
language of its users, and for this we need to handle translations.
But we also want to make sure each language contains the same
information, so for this we need a good way to keep the translations
in sync. And we want it to be easy for our users to improve the
documentation, avoiding the need to learn special formats or tools to
contribute, and the obvious way to do this is to make it possible to
edit the documentation using a web browser. We also want it to be
easy for translators to keep the translation up to date, and give them
help in figuring out what needs to be translated. Here is the list of
tools and the process we have found trying to reach all these
goals.</p>
<p>We maintain the authoritative source of our manual in the
<a href="https://wiki.debian.org/DebianEdu/Documentation/Wheezy/">Debian
wiki</a>, as several wiki pages written in English. It consists of one
front page with references to the different chapters, several pages
for each chapter, and finally one "collection page" gluing all the
chapters together into one large web page (aka
<a href="https://wiki.debian.org/DebianEdu/Documentation/Wheezy/AllInOne">the
AllInOne page</a>). The AllInOne page is the one used for further
processing and translations. Thanks to the fact that the
<a href="http://moinmo.in/">MoinMoin</a> installation on
wiki.debian.org supports exporting pages in
<a href="http://www.docbook.org/">the Docbook format</a>, we can fetch
the list of pages to export using the raw version of the AllInOne
page, and loop over them to generate a Docbook XML version of the
manual. This process also downloads images and transforms image
references to use the locally downloaded images. The generated
Docbook XML files are slightly broken, so some post-processing is done
using the <tt>documentation/scripts/get_manual</tt> program, and the
result is a nice Docbook XML file (debian-edu-wheezy-manual.xml) and
a handful of images. The XML file can now be used to generate PDF, HTML
and epub versions of the English manual. This is the basic step of
our process, making PDF (using dblatex), HTML (using xsltproc) and
epub (using dbtoepub) versions from Docbook XML, and the resulting files
are placed in the debian-edu-doc-en binary package.</p>
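<p>To illustrate the page-list step, here is a minimal Python sketch.
It is not the <tt>get_manual</tt> program. It assumes that the AllInOne
page pulls in its chapters using MoinMoin
<tt>&lt;&lt;Include(...)&gt;&gt;</tt> macros and that the Docbook
export is reachable with the query string
<tt>?action=format&amp;mimetype=text/docbook</tt>. The function names
and the sample wiki text are made up for the example.</p>

```python
import re
from urllib.parse import quote

# Assumption: each chapter is pulled into the AllInOne page with a
# MoinMoin <<Include(PageName, ...)>> macro.
INCLUDE_RE = re.compile(r"<<Include\(\s*([^,)\s]+)")

# Assumption: query string selecting MoinMoin's Docbook formatter.
DOCBOOK_QUERY = "?action=format&mimetype=text/docbook"


def included_pages(raw_wiki_text):
    """Return the page names named in Include macros, in document order."""
    return INCLUDE_RE.findall(raw_wiki_text)


def docbook_urls(raw_wiki_text, wiki_base="https://wiki.debian.org/"):
    """Build one Docbook export URL per included page."""
    return [wiki_base + quote(page) + DOCBOOK_QUERY
            for page in included_pages(raw_wiki_text)]


# Illustrative raw wiki text, not a copy of the real AllInOne page.
sample = """\
= Debian Edu manual =
<<Include(DebianEdu/Documentation/Wheezy/Architecture)>>
<<Include(DebianEdu/Documentation/Wheezy/Installation, , from="^----")>>
"""

for url in docbook_urls(sample):
    print(url)
```

<p>A real run would fetch each URL, save the XML and images locally,
and then apply the post-processing mentioned above. This pattern does
not handle page names containing spaces.</p>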
<p>But English documentation is not enough for us. We want translated
documentation too, and we want to make it easy for translators to
track the English original. For this we use the
<a href="http://packages.qa.debian.org/p/poxml.html">poxml</a> package,
which allows us to transform the English Docbook XML file into a
translation file (a .pot file), usable with the normal gettext based
translation tools used by those translating free software. The .pot
file is used to create and maintain translation files (several .po
files), which the translators update with the native language
translations of all titles, paragraphs and blocks of text in the
original. The next step is combining the original English Docbook XML
and the translation file (say debian-edu-wheezy-manual.nb.po), to
create a translated Docbook XML file (in this case
debian-edu-wheezy-manual.nb.xml). This translated (or partly
translated, if the translation is not complete) Docbook XML file can
then be used like the original to create a PDF, HTML and epub version
of the documentation.</p>
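<p>The translation round trip can be sketched as a list of commands,
here built in Python so the file naming is explicit. This is an
illustration and not the build rules of the debian-edu-doc package. It
assumes the <tt>xml2pot</tt> and <tt>po2xml</tt> programs from poxml,
<tt>msgmerge</tt> from gettext, and a stylesheet path that matches a
typical Debian docbook-xsl installation. The function names are made
up for the example.</p>

```python
import shlex

# Assumption: location of the Docbook XSL HTML stylesheet on Debian.
HTML_XSL = "/usr/share/xml/docbook/stylesheet/docbook-xsl/html/docbook.xsl"


def translation_commands(base, lang):
    """Shell commands to refresh one .po file and build the translated XML."""
    q = shlex.quote
    xml, pot = f"{base}.xml", f"{base}.pot"
    po, translated = f"{base}.{lang}.po", f"{base}.{lang}.xml"
    return [
        # Extract the translatable strings from the English original.
        f"xml2pot {q(xml)} > {q(pot)}",
        # Carry new and changed strings into the existing translation.
        f"msgmerge --update {q(po)} {q(pot)}",
        # Combine the English XML and the .po file into translated XML.
        f"po2xml {q(xml)} {q(po)} > {q(translated)}",
    ]


def output_commands(xml_file):
    """Argument lists for building PDF, HTML and epub from one XML file."""
    stem = xml_file[:-len(".xml")]
    return [
        ["dblatex", "-o", stem + ".pdf", xml_file],
        ["xsltproc", "-o", stem + ".html", HTML_XSL, xml_file],
        ["dbtoepub", "-o", stem + ".epub", xml_file],
    ]


for line in translation_commands("debian-edu-wheezy-manual", "nb"):
    print(line)
for argv in output_commands("debian-edu-wheezy-manual.nb.xml"):
    print(" ".join(argv))
```

<p>The sketch only prints the commands. Running them requires the
tools to be installed and the input files to be present.</p>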
<p>The translators use different tools to edit the .po files. We
recommend using
<a href="http://www.kde.org/applications/development/lokalize/">lokalize</a>,
while some use emacs and vi, and others use web based editors like
<a href="http://pootle.translatehouse.org/">Pootle</a> or
<a href="https://www.transifex.com/">Transifex</a>. All we care about
is that the .po files end up in our git repository. Updated
translations can either be committed directly to git, or submitted as
<a href="https://bugs.debian.org/src:debian-edu-doc">bug reports
against the debian-edu-doc package</a>.</p>
<p>One challenge is images, which might need to be translated (if
they show translated user applications), and which are needed in
different formats when creating PDF and HTML versions (epub is an HTML
version in this regard). For this we transform the original PNG images
to the needed density and format during build, and have a way to provide
translated images by storing translated versions in
images/$LANGUAGECODE/. I am a bit unsure about the details here. The
package maintainers know more.</p>
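<p>One possible way to select a translated image is sketched below in
Python. Only the images/$LANGUAGECODE/ directory comes from the
description above. The fallback to an untranslated copy directly under
images/, the function name and the file names are assumptions made for
the example, and the real build may work differently.</p>

```python
import tempfile
from pathlib import Path


def pick_image(name, lang, root="images"):
    """Prefer <root>/<lang>/<name>, else fall back to <root>/<name>.

    The fallback location is an assumption, not taken from the package.
    """
    translated = Path(root) / lang / name
    if translated.is_file():
        return translated
    return Path(root) / name


# Small demonstration using a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "images"
    (root / "nb").mkdir(parents=True)
    (root / "login.png").touch()         # hypothetical original image
    (root / "nb" / "login.png").touch()  # hypothetical translated image
    print(pick_image("login.png", "nb", root).relative_to(tmp))
    print(pick_image("login.png", "de", root).relative_to(tmp))
```

<p>With this layout a language without its own screenshot simply
reuses the original image.</p>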
<p>If you wonder what the result looks like, we provide
<a href="http://maintainer.skolelinux.org/debian-edu-doc/">the content
of the documentation packages on the web</a>. See for example the
<a href="http://maintainer.skolelinux.org/debian-edu-doc/it/debian-edu-wheezy-manual.pdf">Italian
PDF version</a> or the
<a href="http://maintainer.skolelinux.org/debian-edu-doc/de/debian-edu-wheezy-manual.html">German
HTML version</a>. We do not yet build the epub version by default,
but perhaps it will be done in the future.</p>
<p>To learn more, check out
<a href="http://packages.qa.debian.org/d/debian-edu-doc.html">the
debian-edu-doc package</a>,
<a href="https://wiki.debian.org/DebianEdu/Documentation/Wheezy/">the
manual on the wiki</a> and
<a href="https://wiki.debian.org/DebianEdu/Documentation/Wheezy/Translations">the
translation instructions</a> in the manual.</p>