A few days ago, during a discussion in EFN about interesting books to read about copyright and the data retention directive, a suggestion to read the 1968 short story Kodémus by Tore Åge Bringsværd came up. The text was only available in old paper books, and thus not easily available for current and future generations. Some of the people participating in the discussion contacted the author, and reported back 2013-03-19 that the author was OK with releasing the short story using a Creative Commons license. The text was quickly scanned and OCR-ed, and we were ready to start on the editing and typesetting.
As I already had some experience formatting text in my project to provide a Norwegian version of the Free Culture book by Lawrence Lessig, I chipped in and set up a DocBook processing framework to generate PDF, HTML and EPUB versions of the short story. The tools to transform DocBook to different formats are already in my Linux distribution of choice, Debian, so all I had to do was to use the dblatex, dbtoepub and xmlto tools to do the conversion. After a few days, we decided to replace dblatex with xsltproc/fop (aka docbook-xsl), to get the copyright information to show up in the PDF and to get a nicer <variablelist> typesetting, but that is just a minor technical detail.
There were a few challenges, of course. We wanted to typeset the short story to look like the original, and that requires fairly good control over the layout. The original short story has three parts/scenes separated by a single horizontally centred star (*), and the paragraphs do not contain only flowing text, but also dialogue and text that starts on a new line in the middle of the paragraph.
I initially solved the first challenge by using a paragraph with a single star in it, i.e. <para>*</para>, which at least made sure a placeholder indicated where the scene shifted. This did not look too good without the centring. The next approach was to create a new processing instruction <?newscene?>, mapping to "<hr/>" for HTML and "<fo:block text-align="center"><fo:leader leader-pattern="rule" rule-thickness="0.5pt"/></fo:block>" for FO/PDF output (I did not try to implement this in dblatex, as we had switched by this time). The HTML XSL file looked like this:

    <?xml version='1.0'?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'>
      <xsl:template match="processing-instruction('newscene')">
        <hr/>
      </xsl:template>
    </xsl:stylesheet>

And the FO/PDF XSL file looked like this:

    <?xml version='1.0'?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'
      xmlns:fo="http://www.w3.org/1999/XSL/Format">
      <!-- The fo prefix must be declared, or the stylesheet is not namespace-well-formed -->
      <xsl:template match="processing-instruction('newscene')">
        <fo:block text-align="center">
          <fo:leader leader-pattern="rule" rule-thickness="0.5pt"/>
        </fo:block>
      </xsl:template>
    </xsl:stylesheet>

Finally, I came across the <bridgehead> tag, which seems to be a good fit for the task at hand, and I replaced <?newscene?> with <bridgehead>*</bridgehead>. It isn't centred, but we can fix that with some XSL rule if the current visual layout isn't good enough.
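Such an XSL rule could look something like the following for the FO/PDF output. This is an untested sketch, not what the project uses. It assumes a non-namespaced DocBook 4 source (DocBook 5 would need a namespace prefix in the match pattern) and that the file is loaded so that it takes precedence over the stock docbook-xsl bridgehead template, for example from a customization layer that imports the stock stylesheet. The spacing values are arbitrary.

```xml
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'
                xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- Assumption: every bridgehead in the document is a scene separator,
       so all of them can be centred unconditionally. -->
  <xsl:template match="bridgehead">
    <fo:block text-align="center" space-before="1em" space-after="1em">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>
</xsl:stylesheet>
```

For the HTML output, a CSS rule on the heading element generated for the bridgehead is probably a simpler route than overriding the template.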
I did not find a good DocBook compliant way to solve the linebreak/paragraph challenge, so I ended up creating a new processing instruction <?linebreak?>, mapping to <br/> in HTML, and <fo:block/> in FO/PDF. I suspect there are better ways to do this, and welcome ideas and patches on GitHub. The HTML XSL file now looks like this:

    <?xml version='1.0'?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'>
      <!-- Turn <?linebreak?> in the DocBook source into an HTML line break -->
      <xsl:template match="processing-instruction('linebreak')">
        <br/>
      </xsl:template>
    </xsl:stylesheet>

And the FO/PDF XSL file looks like this:

    <?xml version='1.0'?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'
      xmlns:fo="http://www.w3.org/1999/XSL/Format">
      <!-- An empty block inside the paragraph forces a line break in FO -->
      <xsl:template match="processing-instruction('linebreak')">
        <fo:block/>
      </xsl:template>
    </xsl:stylesheet>

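To show how the instruction is meant to be used in the DocBook source, here is a minimal sketch of a paragraph with forced line breaks. The paragraph text is a made-up placeholder, not a quote from the short story.

```xml
<para>
  Flowing text that ends where the original breaks the line.<?linebreak?>
  Text that starts on a new line inside the same paragraph.<?linebreak?>
  The paragraph then continues as ordinary flowing text.
</para>
```

Under XSLT's built-in rules, a processing instruction with no matching template produces no output, so a toolchain lacking these templates should simply ignore the breaks rather than fail.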
One unsolved challenge is our wish to expose different ISBN numbers per publication format, while keeping all of them in some conditional structure in the DocBook source. No idea how to do this, so we ended up listing all the ISBN numbers next to their format in the colophon page.
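One candidate I have not tried is the profiling (conditional text) support in docbook-xsl, where elements carry a condition attribute and the stylesheets keep only the elements matching the profile.condition parameter. A sketch of the source side could look like this. The bookinfo wrapper, the condition values and the ISBN placeholders are all assumptions made for illustration, not values from the project.

```xml
<bookinfo>
  <!-- Placeholders only: replace with the real ISBN for each format -->
  <biblioid class="isbn" condition="pdf">ISBN-FOR-PDF-EDITION</biblioid>
  <biblioid class="isbn" condition="epub">ISBN-FOR-EPUB-EDITION</biblioid>
  <biblioid class="isbn" condition="html">ISBN-FOR-HTML-EDITION</biblioid>
</bookinfo>
```

Each output format would then be built with a profiling-aware stylesheet and its own profile.condition value, for example pdf for the FO build. Whether and where the remaining biblioid shows up still depends on the title page or colophon templates in use.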
If you want to see the finished result, check out the source repository at GitHub (future/new/official repository). We expect it to be ready and announced in a few days.