- <div class="title"><a href="http://people.skolelinux.org/pere/blog/Litt_statistikk_over_offentlige_anbud_annonsert_via_Doffin_siden_2008.html">Some statistics on public tenders announced via Doffin since 2008</a></div>
- <div class="date">11th February 2013</div>
- <div class="body"><p>Half a year ago
-<a href="http://people.skolelinux.org/pere/blog/SQL_database_med_anbud_publisert_p__Doffin.html">I set
-up a system to build a database</a> with information about
-public tenders from <a href="http://www.doffin.no/">Doffin</a>,
-<a href="https://scraperwiki.com/scrapers/norwegian-doffin/">using
-Scraperwiki</a>. As far as I can tell, the database is now complete, with
-data going all the way back to 2008. Here are some statistics on the
-<a href="https://api.scraperwiki.com/api/1.0/datastore/sqlite?format=htmltable&name=norwegian-doffin&query=select%20strftime(%22%25Y-%25m%22%2C%20publishdate)%20as%20publishmonth%2C%20count(*)%20from%20%60swdata%60%20group%20by%20publishmonth%20order%20by%20publishmonth%20desc">number
-of tenders published each month</a>:</p>
-
-<p><table border="1">
-<tr> <th>Publication month</th> <th>Count</th> </tr>
-<tr> <td>2013-01</td> <td>1015</td> </tr>
-<tr> <td>2012-12</td> <td>756</td> </tr>
-<tr> <td>2012-11</td> <td>979</td> </tr>
-<tr> <td>2012-10</td> <td>1093</td> </tr>
-<tr> <td>2012-09</td> <td>1023</td> </tr>
-<tr> <td>2012-08</td> <td>951</td> </tr>
-<tr> <td>2012-07</td> <td>1103</td> </tr>
-<tr> <td>2012-06</td> <td>1334</td> </tr>
-<tr> <td>2012-05</td> <td>1435</td> </tr>
-<tr> <td>2012-04</td> <td>1169</td> </tr>
-<tr> <td>2012-03</td> <td>1573</td> </tr>
-<tr> <td>2012-02</td> <td>1335</td> </tr>
-<tr> <td>2012-01</td> <td>1147</td> </tr>
-<tr> <td>2011-12</td> <td>1045</td> </tr>
-<tr> <td>2011-11</td> <td>1114</td> </tr>
-<tr> <td>2011-10</td> <td>1230</td> </tr>
-<tr> <td>2011-09</td> <td>1165</td> </tr>
-<tr> <td>2011-08</td> <td>966</td> </tr>
-<tr> <td>2011-07</td> <td>1148</td> </tr>
-<tr> <td>2011-06</td> <td>1410</td> </tr>
-<tr> <td>2011-05</td> <td>1536</td> </tr>
-<tr> <td>2011-04</td> <td>1350</td> </tr>
-<tr> <td>2011-03</td> <td>1574</td> </tr>
-<tr> <td>2011-02</td> <td>1370</td> </tr>
-<tr> <td>2011-01</td> <td>1049</td> </tr>
-<tr> <td>2010-12</td> <td>992</td> </tr>
-<tr> <td>2010-11</td> <td>1089</td> </tr>
-<tr> <td>2010-10</td> <td>1110</td> </tr>
-<tr> <td>2010-09</td> <td>1132</td> </tr>
-<tr> <td>2010-08</td> <td>883</td> </tr>
-<tr> <td>2010-07</td> <td>1126</td> </tr>
-<tr> <td>2010-06</td> <td>1440</td> </tr>
-<tr> <td>2010-05</td> <td>1236</td> </tr>
-<tr> <td>2010-04</td> <td>1249</td> </tr>
-<tr> <td>2010-03</td> <td>1556</td> </tr>
-<tr> <td>2010-02</td> <td>1256</td> </tr>
-<tr> <td>2010-01</td> <td>1140</td> </tr>
-<tr> <td>2009-12</td> <td>1013</td> </tr>
-<tr> <td>2009-11</td> <td>1220</td> </tr>
-<tr> <td>2009-10</td> <td>1320</td> </tr>
-<tr> <td>2009-09</td> <td>1294</td> </tr>
-<tr> <td>2009-08</td> <td>953</td> </tr>
-<tr> <td>2009-07</td> <td>1162</td> </tr>
-<tr> <td>2009-06</td> <td>1605</td> </tr>
-<tr> <td>2009-05</td> <td>1568</td> </tr>
-<tr> <td>2009-04</td> <td>1522</td> </tr>
-<tr> <td>2009-03</td> <td>1599</td> </tr>
-<tr> <td>2009-02</td> <td>1376</td> </tr>
-<tr> <td>2009-01</td> <td>1080</td> </tr>
-<tr> <td>2008-12</td> <td>1028</td> </tr>
-<tr> <td>2008-11</td> <td>949</td> </tr>
-<tr> <td>2008-10</td> <td>1047</td> </tr>
-<tr> <td>2008-09</td> <td>965</td> </tr>
-<tr> <td>2008-08</td> <td>725</td> </tr>
-<tr> <td>2008-07</td> <td>1015</td> </tr>
-<tr> <td>2008-06</td> <td>1304</td> </tr>
-<tr> <td>2008-05</td> <td>323</td> </tr>
-</table></p>
-
-<p>Here are the corresponding
-<a href="https://api.scraperwiki.com/api/1.0/datastore/sqlite?format=htmltable&name=norwegian-doffin&query=select%20strftime(%22%25Y%22%2C%20publishdate)%20as%20publishyear%2C%20count(*)%20from%20%60swdata%60%20group%20by%20publishyear%20order%20by%20publishyear%20desc">figures
-per year</a>, which show a slight decline in the number of tenders:</p>
-
-<p><table border="1">
-<tr> <th>Publication year</th> <th>Count</th> </tr>
-<tr> <td>2012</td> <td>13898</td> </tr>
-<tr> <td>2011</td> <td>14957</td> </tr>
-<tr> <td>2010</td> <td>14209</td> </tr>
-<tr> <td>2009</td> <td>15712</td> </tr>
-<tr> <td>2008</td> <td>7356</td> </tr>
-</table></p>
-
-<p>I dropped the incomplete month and year from the table. See
-the link for updated figures.</p>
+ <div class="title"><a href="http://people.skolelinux.org/pere/blog/Typesetting_a_short_story_using_docbook_for_PDF__HTML_and_EPUB.html">Typesetting a short story using docbook for PDF, HTML and EPUB</a></div>
+ <div class="date">24th March 2013</div>
+ <div class="body"><p>A few days ago, during a discussion in
+<a href="http://www.efn.no/">EFN</a> about interesting books to read
+about copyright and the data retention directive, a suggestion to read
+the 1968 short story Kodémus by
+<a href="http://web2.gyldendal.no/toraage/">Tore Åge Bringsværd</a>
+came up. The text was only available in old paper books, and thus not
+easily available for current and future generations. Some of the
+people participating in the discussion contacted the author, and
+reported back on 2013-03-19 that the author was OK with releasing the
+short story using a <a href="http://www.creativecommons.org/">Creative
+Commons</a> license. The text was quickly scanned and OCR-ed, and we
+were ready to start on the editing and typesetting.</p>
+
+<p>As I already had some experience formatting text in my project to
+provide a Norwegian version of the Free Culture book by Lawrence
+Lessig, I chipped in and set up a
+<a href="http://www.docbook.org/">DocBook</a> processing framework to
+generate PDF, HTML and EPUB versions of the short story. The tools to
+transform DocBook to different formats are already in my Linux
+distribution of choice, <a href="http://www.debian.org/">Debian</a>, so
+all I had to do was to use the
+<a href="http://dblatex.sourceforge.net/">dblatex</a>,
+<a href="http://docbook.sourceforge.net/release/xsl/current/epub/README">dbtoepub</a>
+and <a href="https://fedorahosted.org/xmlto/">xmlto</a> tools to do the
+conversion. After a few days, we decided to replace dblatex with
+xsltproc/fop (aka
+<a href="http://wiki.docbook.org/DocBookXslStylesheets">docbook-xsl</a>),
+to get the copyright information to show up in the PDF and to get a
+nicer <variablelist> typesetting, but that is just a minor
+technical detail.</p>
+
+<p>There were a few challenges, of course. We want to typeset the
+short story to look like the original, and that requires fairly good
+control over the layout. The original short story has three
+parts/scenes separated by a single horizontally centred star (*), and
+the paragraphs do not contain only flowing text, but also dialogue and
+text that starts on a new line in the middle of the paragraph.</p>
+
+<p>I initially solved the first challenge by using a paragraph with a
+single star in it, i.e. <para>*</para>, which at least made sure a
+placeholder indicated where the scene shifted. This did not look too
+good without the centring. The next approach was to create a new
+processing instruction <?newscene?>, mapping to "<hr/>"
+for HTML and "<fo:block text-align="center"><fo:leader
+leader-pattern="rule" rule-thickness="0.5pt"/></fo:block>"
+for FO/PDF output (I did not try to implement this in dblatex, as we had
+switched by this time). The HTML XSL file looked like this:</p>
+
+<p><blockquote><pre>
+<?xml version='1.0'?>
+<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'>
+ <xsl:template match="processing-instruction('newscene')">
+ <hr/>
+ </xsl:template>
+</xsl:stylesheet>
+</pre></blockquote></p>
+
+<p>And the FO/PDF XSL file looked like this:</p>
+
+<p><blockquote><pre>
+<?xml version='1.0'?>
+<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'
+ xmlns:fo="http://www.w3.org/1999/XSL/Format">
+ <xsl:template match="processing-instruction('newscene')">
+ <fo:block text-align="center">
+ <fo:leader leader-pattern="rule" rule-thickness="0.5pt"/>
+ </fo:block>
+ </xsl:template>
+</xsl:stylesheet>
+</pre></blockquote></p>
+
+<p>Finally, I came across the <bridgehead> tag, which seems to be
+a good fit for the task at hand, and I replaced <?newscene?>
+with <bridgehead>*</bridgehead>. It isn't centred, but we
+can fix that with an XSL rule if the current visual layout isn't
+good enough.</p>
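+
+<p>A minimal sketch of such an XSL rule for the FO/PDF output could
+look like the following. It is untested against this project, assumes
+a customisation layer that imports the stock DocBook XSL FO
+stylesheet, and uses arbitrary spacing values:</p>
+
+<p><blockquote><pre>
+<?xml version='1.0'?>
+<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'
+ xmlns:fo="http://www.w3.org/1999/XSL/Format">
+ <!-- Assumption: the stock FO stylesheet, resolved via the XML catalog -->
+ <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>
+ <!-- Override the imported bridgehead rendering with a centred block -->
+ <xsl:template match="bridgehead">
+  <fo:block text-align="center" space-before="1em" space-after="1em">
+   <xsl:apply-templates/>
+  </fo:block>
+ </xsl:template>
+</xsl:stylesheet>
+</pre></blockquote></p>
+
+<p>Because the template lives in the importing stylesheet, it takes
+precedence over the imported one. Note that it would apply to every
+bridgehead in the document, not only the scene separators.</p>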
+
+<p>I did not find a good DocBook compliant way to solve the
+linebreak/paragraph challenge, so I ended up creating a new processing
+instruction <?linebreak?>, mapping to <br/> in HTML, and
+<fo:block/> in FO/PDF. I suspect there are better ways to do
+this, and welcome ideas and patches on github. The HTML XSL file now
+looks like this:</p>
+
+<p><blockquote><pre>
+<?xml version='1.0'?>
+<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'>
+ <xsl:template match="processing-instruction('linebreak')">
+ <br/>
+ </xsl:template>
+</xsl:stylesheet>
+</pre></blockquote></p>
+
+<p>And the FO/PDF XSL file looks like this:</p>
+
+<p><blockquote><pre>
+<?xml version='1.0'?>
+<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'
+ xmlns:fo="http://www.w3.org/1999/XSL/Format">
+ <xsl:template match="processing-instruction('linebreak')">
+ <fo:block/>
+ </xsl:template>
+</xsl:stylesheet>
+</pre></blockquote></p>
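+
+<p>For illustration, a paragraph in the DocBook source using the
+instruction might look like this (the text is a made-up placeholder,
+not taken from the short story):</p>
+
+<p><blockquote><pre>
+<para>
+ First line of a paragraph with flowing text.<?linebreak?>
+ "A line of dialogue starting on a new line,"<?linebreak?>
+ followed by more text in the same paragraph.
+</para>
+</pre></blockquote></p>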
+
+<p>One unsolved challenge is our wish to expose different ISBN numbers
+per publication format, while keeping all of them in some conditional
+structure in the DocBook source. No idea how to do this, so we ended
+up listing all the ISBN numbers next to their format in the colophon
+page.</p>
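+
+<p>One approach that might work, untested here, is the profiling
+support in the DocBook XSL stylesheets: mark each ISBN with a
+condition attribute, then run the source through
+profiling/profile.xsl with the profile.condition parameter set to the
+wanted format before rendering it. A rough sketch of the source side,
+where the condition values and the ISBN placeholders are made up for
+illustration:</p>
+
+<p><blockquote><pre>
+<!-- One biblioid per output format; profiling with
+     profile.condition=pdf would keep only the first one -->
+<biblioid class="isbn" condition="pdf">ISBN-FOR-PDF</biblioid>
+<biblioid class="isbn" condition="epub">ISBN-FOR-EPUB</biblioid>
+<biblioid class="isbn" condition="html">ISBN-FOR-HTML</biblioid>
+</pre></blockquote></p>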
+
+<p>If you want to see the finished result, check out the
+<a href="https://github.com/sickel/kodemus">source repository at
+github</a>
+(<a href="https://github.com/EFN/kodemus">future/new/official
+repository</a>). We expect it to be ready and announced in a few
+days.</p>