X-Git-Url: https://pere.pagekite.me/gitweb/homepage.git/blobdiff_plain/f36e6858875b50fe302f52ec06848879052040e3..e00539284613586c38abc01c62c034ba949fbdf9:/blog/index.rss diff --git a/blog/index.rss b/blog/index.rss index 63c588c714..12f1ce1906 100644 --- a/blog/index.rss +++ b/blog/index.rss @@ -6,6 +6,130 @@ http://people.skolelinux.org/pere/blog/ + + Typesetting a short story using docbook for PDF, HTML and EPUB + http://people.skolelinux.org/pere/blog/Typesetting_a_short_story_using_docbook_for_PDF__HTML_and_EPUB.html + http://people.skolelinux.org/pere/blog/Typesetting_a_short_story_using_docbook_for_PDF__HTML_and_EPUB.html + Sun, 24 Mar 2013 17:30:00 +0100 + <p>A few days ago, during a discussion in +<a href="http://www.efn.no/">EFN</a> about interesting books to read +about copyright and the data retention directive, a suggestion to read +the 1968 short story Kodémus by +<a href="http://web2.gyldendal.no/toraage/">Tore Åge Bringsværd</a> +came up. The text was only available in old paper books, and thus not +easily available for current and future generations. Some of the +people participating in the discussion contacted the author, and +reported back on 2013-03-19 that the author was OK with releasing the +short story using a <a href="http://www.creativecommons.org/">Creative +Commons</a> license. The text was quickly scanned and OCR-ed, and we +were ready to start on the editing and typesetting.</p> + +<p>As I already had some experience formatting text in my project to +provide a Norwegian version of the Free Culture book by Lawrence +Lessig, I chipped in and set up a +<a href="http://www.docbook.org/">DocBook</a> processing framework to +generate PDF, HTML and EPUB versions of the short story. 
The tools to +transform DocBook to different formats are already in my Linux +distribution of choice, <a href="http://www.debian.org/">Debian</a>, so +all I had to do was to use the +<a href="http://dblatex.sourceforge.net/">dblatex</a>, +<a href="http://docbook.sourceforge.net/release/xsl/current/epub/README">dbtoepub</a> +and <a href="https://fedorahosted.org/xmlto/">xmlto</a> tools to do the +conversion. After a few days, we decided to replace dblatex with +xsltproc/fop (aka +<a href="http://wiki.docbook.org/DocBookXslStylesheets">docbook-xsl</a>), +to get the copyright information to show up in the PDF and to get a +nicer &lt;variablelist&gt; typesetting, but that is just a minor +technical detail.</p> + +<p>There were a few challenges, of course. We want to typeset the +short story to look like the original, and that requires fairly good +control over the layout. The original short story has three +parts/scenes separated by a single horizontally centred star (*), and +the paragraphs contain not only flowing text, but also dialogue and text +that starts on a new line in the middle of the paragraph.</p> + +<p>I initially solved the first challenge by using a paragraph with a +single star in it, i.e. &lt;para&gt;*&lt;/para&gt;, which at least made sure a +placeholder indicated where the scene shifted. This did not look too +good without the centring. The next approach was to create a new +processing instruction &lt;?newscene?&gt;, mapping to "&lt;hr/&gt;" +for HTML and "&lt;fo:block text-align="center"&gt;&lt;fo:leader +leader-pattern="rule" rule-thickness="0.5pt"/&gt;&lt;/fo:block&gt;" +for FO/PDF output (I did not try to implement this in dblatex, as we had +switched by this time). 
The HTML XSL file looked like this:</p> + +<p><blockquote><pre> +&lt;?xml version='1.0'?&gt; +&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'&gt; + &lt;xsl:template match="processing-instruction('newscene')"&gt; + &lt;hr/&gt; + &lt;/xsl:template&gt; +&lt;/xsl:stylesheet&gt; +</pre></blockquote></p> + +<p>And the FO/PDF XSL file looked like this:</p> + +<p><blockquote><pre> +&lt;?xml version='1.0'?&gt; +&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" version='1.0'&gt; + &lt;xsl:template match="processing-instruction('newscene')"&gt; + &lt;fo:block text-align="center"&gt; + &lt;fo:leader leader-pattern="rule" rule-thickness="0.5pt"/&gt; + &lt;/fo:block&gt; + &lt;/xsl:template&gt; +&lt;/xsl:stylesheet&gt; +</pre></blockquote></p> + +<p>Finally, I came across the &lt;bridgehead&gt; tag, which seems to be +a good fit for the task at hand, and I replaced &lt;?newscene?&gt; +with &lt;bridgehead&gt;*&lt;/bridgehead&gt;. It isn't centred, but we +can fix it with some XSL rule if the current visual layout isn't +enough.</p> + +<p>I did not find a good DocBook-compliant way to solve the +linebreak/paragraph challenge, so I ended up creating a new processing +instruction &lt;?linebreak?&gt;, mapping to &lt;br/&gt; in HTML, and +&lt;fo:block/&gt; in FO/PDF. I suspect there are better ways to do +this, and welcome ideas and patches on github. 
The HTML XSL file now +looks like this:</p> + +<p><blockquote><pre> +&lt;?xml version='1.0'?&gt; +&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'&gt; + &lt;xsl:template match="processing-instruction('linebreak')"&gt; + &lt;br/&gt; + &lt;/xsl:template&gt; +&lt;/xsl:stylesheet&gt; +</pre></blockquote></p> + +<p>And the FO/PDF XSL file looked like this:</p> + +<p><blockquote><pre> +&lt;?xml version='1.0'?&gt; +&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0' + xmlns:fo="http://www.w3.org/1999/XSL/Format"&gt; + &lt;xsl:template match="processing-instruction('linebreak')"&gt; + &lt;fo:block/&gt; + &lt;/xsl:template&gt; +&lt;/xsl:stylesheet&gt; +</pre></blockquote></p> + +<p>One unsolved challenge is our wish to expose different ISBN numbers +per publication format, while keeping all of them in some conditional +structure in the DocBook source. No idea how to do this, so we ended +up listing all the ISBN numbers next to their format in the colophon +page.</p> + +<p>If you want to check out the finished result, check out the +<a href="https://github.com/sickel/kodemus">source repository at +github</a> +(<a href="https://github.com/EFN/kodemus">future/new/official +repository</a>). 
We expect it to be ready and announced in a few +days.</p> + + + Regjeringen, FAD og DIFI går inn for å fjerne ODF som obligatorisk standard i det offentlige http://people.skolelinux.org/pere/blog/Regjeringen__FAD_og_DIFI_g_r_inn_for___fjerne_ODF_som_obligatorisk_standard_i_det_offentlige.html @@ -620,90 +744,5 @@ map you can just edit the - - "Electronic" paper invoices - using vCard in a QR code - http://people.skolelinux.org/pere/blog/_Electronic__paper_invoices___using_vCard_in_a_QR_code.html - http://people.skolelinux.org/pere/blog/_Electronic__paper_invoices___using_vCard_in_a_QR_code.html - Tue, 12 Feb 2013 10:30:00 +0100 - <p>Here in Norway, electronic invoices are spreading, and the -<a href="http://www.anskaffelser.no/e-handel/faktura">solution promoted -by the Norwegian government</a> require that invoices are sent through -one of the approved facilitators, and it is not possible to send -electronic invoices without an agreement with one of these -facilitators. This seem like a needless limitation to be able to -transfer invoice information between buyers and sellers. My preferred -solution would be to just transfer the invoice information directly -between seller and buyer, for example using SMTP, or some HTTP based -protocol like REST or SOAP. But this might also be overkill, as the -"electronic" information can be transferred using paper invoices too, -using a simple bar code. My bar code encoding of choice would be QR -codes, as this encoding can be read by any smart phone out there. The -content of the code could be anything, but I would go with -<a href="http://en.wikipedia.org/wiki/VCard">the vCard format</a>, as -it too is supported by a lot of computer equipment these days.</p> - -<p>The vCard format support extentions, and the invoice specific -information can be included using such extentions. 
For example an -invoice from SLX Debian Labs (picked because we -<a href="http://www.linuxiskolen.no/slxdebianlabs/donations.html">ask -for donations to the Debian Edu project</a> and thus have bank account -information publicly available) for NOK 1000.00 could have these extra -fields:</p> - -<p><pre> -X-INVOICE-NUMBER:1 -X-INVOICE-AMOUNT:NOK1000.00 -X-INVOICE-KID:123412341234 -X-INVOICE-MSG:Donation to Debian Edu -X-BANK-ACCOUNT-NUMBER:16040884339 -X-BANK-IBAN-NUMBER:NO8516040884339 -X-BANK-SWIFT-NUMBER:DNBANOKKXXX -</pre></p> - -<p>The X-BANK-ACCOUNT-NUMBER field was proposed in a stackoverflow -answer regarding -<a href="http://stackoverflow.com/questions/10045664/storing-bank-account-in-vcard-file">how -to put bank account information into a vCard</a>. For payments in -Norway, either X-INVOICE-KID (payment ID) or X-INVOICE-MSG could be -used to pass on information to the seller when paying the invoice.</p> - -<p>The complete vCard could look like this:</p> - -<p><pre> -BEGIN:VCARD -VERSION:2.1 -ORG:SLX Debian Labs Foundation -ADR;WORK:;;Gunnar Schjelderups vei 29D;OSLO;;0485;Norway -URL;WORK:http://www.linuxiskolen.no/slxdebianlabs/ -EMAIL;PREF;INTERNET:sdl-styret@rt.nuug.no -REV:20130212T095000Z -X-INVOICE-NUMBER:1 -X-INVOICE-AMOUNT:NOK1000.00 -X-INVOICE-MSG:Donation to Debian Edu -X-BANK-ACCOUNT-NUMBER:16040884339 -X-BANK-IBAN-NUMBER:NO8516040884339 -X-BANK-SWIFT-NUMBER:DNBANOKKXXX -END:VCARD -</pre></p> - -<p>The resulting QR code created using -<a href="http://fukuchi.org/works/qrencode/">qrencode</a> would look -like this, and should be readable (and thus checkable) by any smart -phone, or for example the <a href="http://zbar.sourceforge.net/">zbar -bar code reader</a> and feed right into the approval and accounting -system.</p> - -<p><img src="http://people.skolelinux.org/pere/blog/images/2013-02-12-qr-invoice.png"></p> - -<p>The extension fields will most likely not show up in any normal -vCard reader, so those parts would have to go directly into a 
system -handling invoices. I am a bit unsure how vCards without name parts -are handled, but a simple test indicate that this work just fine.</p> - -<p><strong>Update 2013-02-12 11:30</strong>: Added KID to the proposal -based on feedback from Sturle Sunde.</p> - - -