Petter Reinholdtsen

Speeding up the Debian installer using eatmydata and dpkg-divert
16th September 2014

The Debian installer could be a lot quicker. When we install more than 2000 packages in Skolelinux / Debian Edu using tasksel in the installer, unpacking the binary packages take forever. A part of the slow I/O issue was discussed in bug #613428 about too much file system sync-ing done by dpkg, which is the package responsible for unpacking the binary packages. Other parts (like code executed by postinst scripts) might also sync to disk during installation. All this sync-ing to disk do not really make sense to me. If the machine crash half-way through, I start over, I do not try to salvage the half installed system. So the failure sync-ing is supposed to protect against, hardware or system crash, is not really relevant while the installer is running.

A few days ago, I thought of a way to get rid of all the file system sync()-ing in a fairly non-intrusive way, without the need to change the code in several packages. The idea is not new, but I have not heard anyone propose the approach using dpkg-divert before. It depend on the small and clever package eatmydata, which uses LD_PRELOAD to replace the system functions for syncing data to disk with functions doing nothing, thus allowing programs to live dangerous while speeding up disk I/O significantly. Instead of modifying the implementation of dpkg, apt and tasksel (which are the packages responsible for selecting, fetching and installing packages), it occurred to me that we could just divert the programs away, replace them with a simple shell wrapper calling "eatmydata $program $@", to get the same effect. Two days ago I decided to test the idea, and wrapped up a simple implementation for the Debian Edu udeb.

The effect was stunning. In my first test it reduced the running time of the pkgsel step (installing tasks) from 64 to less than 44 minutes (20 minutes shaved off the installation) on an old Dell Latitude D505 machine. I am not quite sure what the optimised time would have been, as I messed up the testing a bit, causing the debconf priority to get low enough for two questions to pop up during installation. As soon as I saw the questions I moved the installation along, but do not know how long the question were holding up the installation. I did some more measurements using Debian Edu Jessie, and got these results. The time measured is the time stamp in /var/log/syslog between the "pkgsel: starting tasksel" and the "pkgsel: finishing up" lines, if you want to do the same measurement yourself. In Debian Edu, the tasksel dialog do not show up, and the timing thus do not depend on how quickly the user handle the tasksel dialog.

Machine/setup Original tasksel Optimised tasksel Reduction
Latitude D505 Main+LTSP LXDE 64 min (07:46-08:50) <44 min (11:27-12:11) >20 min 18%
Latitude D505 Roaming LXDE 57 min (08:48-09:45) 34 min (07:43-08:17) 23 min 40%
Latitude D505 Minimal 22 min (10:37-10:59) 11 min (11:16-11:27) 11 min 50%
Thinkpad X200 Minimal 6 min (08:19-08:25) 4 min (08:04-08:08) 2 min 33%
Thinkpad X200 Roaming KDE 19 min (09:21-09:40) 15 min (10:25-10:40) 4 min 21%

The test is done using a netinst ISO on a USB stick, so some of the time is spent downloading packages. The connection to the Internet was 100Mbit/s during testing, so downloading should not be a significant factor in the measurement. Download typically took a few seconds to a few minutes, depending on the amount of packages being installed.

The speedup is implemented by using two hooks in Debian Installer, the pre-pkgsel.d hook to set up the diverts, and the finish-install.d hook to remove the divert at the end of the installation. I picked the pre-pkgsel.d hook instead of the post-base-installer.d hook because I test using an ISO without the eatmydata package included, and the post-base-installer.d hook in Debian Edu can only operate on packages included in the ISO. The negative effect of this is that I am unable to activate this optimization for the kernel installation step in d-i. If the code is moved to the post-base-installer.d hook, the speedup would be larger for the entire installation.

I've implemented this in the debian-edu-install git repository, and plan to provide the optimization as part of the Debian Edu installation. If you want to test this yourself, you can create two files in the installer (or in an udeb). One shell script need do go into /usr/lib/pre-pkgsel.d/, with content like this:

#!/bin/sh
set -e
. /usr/share/debconf/confmodule
info() {
    logger -t my-pkgsel "info: $*"
}
error() {
    logger -t my-pkgsel "error: $*"
}
override_install() {
    apt-install eatmydata || true
    if [ -x /target/usr/bin/eatmydata ] ; then
        for bin in dpkg apt-get aptitude tasksel ; do
            file=/usr/bin/$bin
            # Test that the file exist and have not been diverted already.
            if [ -f /target$file ] ; then
                info "diverting $file using eatmydata"
                printf "#!/bin/sh\neatmydata $bin.distrib \"\$@\"\n" \
                    > /target$file.edu
                chmod 755 /target$file.edu
                in-target dpkg-divert --package debian-edu-config \
                    --rename --quiet --add $file
                ln -sf ./$bin.edu /target$file
            else
                error "unable to divert $file, as it is missing."
            fi
        done
    else
        error "unable to find /usr/bin/eatmydata after installing the eatmydata pacage"
    fi
}

override_install

To clean up, another shell script should go into /usr/lib/finish-install.d/ with code like this:

#! /bin/sh -e
. /usr/share/debconf/confmodule
error() {
    logger -t my-finish-install "error: $@"
}
remove_install_override() {
    for bin in dpkg apt-get aptitude tasksel ; do
        file=/usr/bin/$bin
        if [ -x /target$file.edu ] ; then
            rm /target$file
            in-target dpkg-divert --package debian-edu-config \
                --rename --quiet --remove $file
            rm /target$file.edu
        else
            error "Missing divert for $file."
        fi
    done
    sync # Flush file buffers before continuing
}

remove_install_override

In Debian Edu, I placed both code fragments in a separate script edu-eatmydata-install and call it from the pre-pkgsel.d and finish-install.d scripts.

By now you might ask if this change should get into the normal Debian installer too? I suspect it should, but am not sure the current debian-installer coordinators find it useful enough. It also depend on the side effects of the change. I'm not aware of any, but I guess we will see if the change is safe after some more testing. Perhaps there is some package in Debian depending on sync() and fsync() having effect? Perhaps it should go into its own udeb, to allow those of us wanting to enable it to do so without affecting everyone.

Tags: debian, debian edu, english.
Good bye subkeys.pgp.net, welcome pool.sks-keyservers.net
10th September 2014

Yesterday, I had the pleasure of attending a talk with the Norwegian Unix User Group about the OpenPGP keyserver pool sks-keyservers.net, and was very happy to learn that there is a large set of publicly available key servers to use when looking for peoples public key. So far I have used subkeys.pgp.net, and some times wwwkeys.nl.pgp.net when the former were misbehaving, but those days are ended. The servers I have used up until yesterday have been slow and some times unavailable. I hope those problems are gone now.

Behind the round robin DNS entry of the sks-keyservers.net service there is a pool of more than 100 keyservers which are checked every day to ensure they are well connected and up to date. It must be better than what I have used so far. :)

Yesterdays speaker told me that the service is the default keyserver provided by the default configuration in GnuPG, but this do not seem to be used in Debian. Perhaps it should?

Anyway, I've updated my ~/.gnupg/options file to now include this line:

keyserver pool.sks-keyservers.net

With GnuPG version 2 one can also locate the keyserver using SRV entries in DNS. Just for fun, I did just that at work, so now every user of GnuPG at the University of Oslo should find a OpenGPG keyserver automatically should their need it:

% host -t srv _pgpkey-http._tcp.uio.no
_pgpkey-http._tcp.uio.no has SRV record 0 100 11371 pool.sks-keyservers.net.
%

Now if only the HKP lookup protocol supported finding signature paths, I would be very happy. It can look up a given key or search for a user ID, but I normally do not want that, but to find a trust path from my key to another key. Given a user ID or key ID, I would like to find (and download) the keys representing a signature path from my key to the key in question, to be able to get a trust path between the two keys. This is as far as I can tell not possible today. Perhaps something for a future version of the protocol?

Tags: debian, english, personvern, sikkerhet.
Do you need an agreement with MPEG-LA to publish and broadcast H.264 video in Norway?
25th August 2014

Two years later, I am still not sure if it is legal here in Norway to use or publish a video in H.264 or MPEG4 format edited by the commercially licensed video editors, without limiting the use to create "personal" or "non-commercial" videos or get a license agreement with MPEG LA. If one want to publish and broadcast video in a non-personal or commercial setting, it might be that those tools can not be used, or that video format can not be used, without breaking their copyright license. I am not sure. Back then, I found that the copyright license terms for Adobe Premiere and Apple Final Cut Pro both specified that one could not use the program to produce anything else without a patent license from MPEG LA. The issue is not limited to those two products, though. Other much used products like those from Avid and Sorenson Media have terms of use are similar to those from Adobe and Apple. The complicating factor making me unsure if those terms have effect in Norway or not is that the patents in question are not valid in Norway, but copyright licenses are.

These are the terms for Avid Artist Suite, according to their published end user license text (converted to lower case text for easier reading):

18.2. MPEG-4. MPEG-4 technology may be included with the software. MPEG LA, L.L.C. requires this notice:

This product is licensed under the MPEG-4 visual patent portfolio license for the personal and non-commercial use of a consumer for (i) encoding video in compliance with the MPEG-4 visual standard (“MPEG-4 video”) and/or (ii) decoding MPEG-4 video that was encoded by a consumer engaged in a personal and non-commercial activity and/or was obtained from a video provider licensed by MPEG LA to provide MPEG-4 video. No license is granted or shall be implied for any other use. Additional information including that relating to promotional, internal and commercial uses and licensing may be obtained from MPEG LA, LLC. See http://www.mpegla.com. This product is licensed under the MPEG-4 systems patent portfolio license for encoding in compliance with the MPEG-4 systems standard, except that an additional license and payment of royalties are necessary for encoding in connection with (i) data stored or replicated in physical media which is paid for on a title by title basis and/or (ii) data which is paid for on a title by title basis and is transmitted to an end user for permanent storage and/or use, such additional license may be obtained from MPEG LA, LLC. See http://www.mpegla.com for additional details.

18.3. H.264/AVC. H.264/AVC technology may be included with the software. MPEG LA, L.L.C. requires this notice:

This product is licensed under the AVC patent portfolio license for the personal use of a consumer or other uses in which it does not receive remuneration to (i) encode video in compliance with the AVC standard (“AVC video”) and/or (ii) decode AVC video that was encoded by a consumer engaged in a personal activity and/or was obtained from a video provider licensed to provide AVC video. No license is granted or shall be implied for any other use. Additional information may be obtained from MPEG LA, L.L.C. See http://www.mpegla.com.

Note the requirement that the videos created can only be used for personal or non-commercial purposes.

The Sorenson Media software have similar terms:

With respect to a license from Sorenson pertaining to MPEG-4 Video Decoders and/or Encoders: Any such product is licensed under the MPEG-4 visual patent portfolio license for the personal and non-commercial use of a consumer for (i) encoding video in compliance with the MPEG-4 visual standard (“MPEG-4 video”) and/or (ii) decoding MPEG-4 video that was encoded by a consumer engaged in a personal and non-commercial activity and/or was obtained from a video provider licensed by MPEG LA to provide MPEG-4 video. No license is granted or shall be implied for any other use. Additional information including that relating to promotional, internal and commercial uses and licensing may be obtained from MPEG LA, LLC. See http://www.mpegla.com.

With respect to a license from Sorenson pertaining to MPEG-4 Consumer Recorded Data Encoder, MPEG-4 Systems Internet Data Encoder, MPEG-4 Mobile Data Encoder, and/or MPEG-4 Unique Use Encoder: Any such product is licensed under the MPEG-4 systems patent portfolio license for encoding in compliance with the MPEG-4 systems standard, except that an additional license and payment of royalties are necessary for encoding in connection with (i) data stored or replicated in physical media which is paid for on a title by title basis and/or (ii) data which is paid for on a title by title basis and is transmitted to an end user for permanent storage and/or use. Such additional license may be obtained from MPEG LA, LLC. See http://www.mpegla.com for additional details.

Some free software like Handbrake and FFMPEG uses GPL/LGPL licenses and do not have any such terms included, so for those, there is no requirement to limit the use to personal and non-commercial.

Tags: english, multimedia, opphavsrett, standard, video, web.
Lenker for 2014-08-03
3rd August 2014

Lenge siden jeg har hatt tid til å publisere lenker til skriverier jeg har hatt glede og nytte av av å lese. Her er en liten norsk lenkesamling.

Tags: lenker, norsk, opphavsrett, personvern.
Debian Edu interview: Bernd Zeitzen
31st July 2014

The complete and free “out of the box” software solution for schools, Debian Edu / Skolelinux, is used quite a lot in Germany, and one of the people involved is Bernd Zeitzen, who show up on the project mailing lists from time to time with interesting questions and tips on how to adjust the setup. I managed to interview him this summer.

Who are you, and how do you spend your days?

My name is Bernd Zeitzen and I'm married with Hedda, a self employed physiotherapist. My former profession is tool maker, but I haven't worked for 30 years in this job. 30 years ago I started to support my wife and become her officeworker and a few years later the administrator for a small computer network, today based on Ubuntu Server (Samba, OpenVPN). For her daily work she has to use Windows Desktops because the software she needs to organize her business only works with Windows . :-(

In 1988 we started with one PC and DOS, then I learned to use Windows 98, 2000, XP, …, 8, Ubuntu, MacOSX. Today we are running a Linux server with 6 Windows clients and 10 persons (teacher of children with special needs, speech therapist, occupational therapist, psychologist and officeworkers) using our Samba shares via OpenVPN to work with the documentations of our patients.

How did you get in contact with the Skolelinux / Debian Edu project?

Two years ago a friend of mine asked me, if I want to get a job in his school (Gymnasium Harsewinkel). They started with Skolelinux / Debian Edu and they were looking for people to give support to the teachers using the software and the network and teaching the pupils increasing their computer skills in optional lessons. I'm spending 4-6 hours a week with this job.

What do you see as the advantages of Skolelinux / Debian Edu?

The independence.

First: Every person is allowed to use, share and develop the software. Even if you are poor, you are allowed to use the software included in Skolelinux/Debian Edu and all the other Free Software.

Second: The software runs on old machines and this gives us the possibility to recycle computers, weeded out from offices. The servers and desktops are running for more than two years and they are working reliable.

We have two servers (one tjener and one terminal server), 45 workstations in three classrooms and seven laptops as a mobile solution for all classrooms. These machines are all booting from the terminal server. In the moment we are installing 30 laptops as mobile workstations. Then the pupils have the possibility to work with these machines in their classrooms. Internet access is realized by a WLAN router, connected to the schools network. This is all done without a dedicated system administrator or a computer science teacher.

What do you see as the disadvantages of Skolelinux / Debian Edu?

Teachers and pupils are Windows users. <Irony on> And Linux isn't cool. It's software for freaks using the command line. <Irony off> They don't realize the stability of the system.

Which free software do you use daily?

Firefox, Thunderbird, LibreOffice, Ubuntu Server 12.04 (Samba, Apache, MySQL, Joomla!, … and Skolelinux / Debian Edu)

Which strategy do you believe is the right one to use to get schools to use free software?

In Germany we have the situation: every school is free to decide which software they want to use. This decision is influenced by teachers who learned to use Windows and MS Office. They buy a PC with Windows preinstalled and an additional testing version of MS Office. They don't know about the possibility to use Free Software instead. Another problem are the publisher of school books. They develop their software, added to the school books, for Windows.

Tags: debian edu, english, intervju.
98.6 percent done with the Norwegian draft translation of Free Culture
23rd July 2014

This summer I finally had time to continue working on the Norwegian docbook version of the 2004 book Free Culture by Lawrence Lessig, to get a Norwegian text explaining the problems with todays copyright law. Yesterday, I finally completed translated the book text. There are still some foot/end notes left to translate, the colophon page need to be rewritten, and a few words and phrases still need to be translated, but the Norwegian text is ready for the first proof reading. :) More spell checking is needed, and several illustrations need to be cleaned up. The work stopped up because I had to give priority to other projects the last year, and the progress graph of the translation show this very well:

If you want to read the result, check out the github project pages and the PDF, EPUB and HTML version available in the archive directory.

Please report typos, bugs and improvements to the github project if you find any.

Tags: docbook, english, freeculture.
From English wiki to translated PDF and epub via Docbook
17th June 2014

The Debian Edu / Skolelinux project provide an instruction manual for teachers, system administrators and other users that contain useful tips for setting up and maintaining a Debian Edu installation. This text is about how the text processing of this manual is handled in the project.

One goal of the project is to provide information in the native language of its users, and for this we need to handle translations. But we also want to make sure each language contain the same information, so for this we need a good way to keep the translations in sync. And we want it to be easy for our users to improve the documentation, avoiding the need to learn special formats or tools to contribute, and the obvious way to do this is to make it possible to edit the documentation using a web browser. We also want it to be easy for translators to keep the translation up to date, and give them help in figuring out what need to be translated. Here is the list of tools and the process we have found trying to reach all these goals.

We maintain the authoritative source of our manual in the Debian wiki, as several wiki pages written in English. It consist of one front page with references to the different chapters, several pages for each chapter, and finally one "collection page" gluing all the chapters together into one large web page (aka the AllInOne page). The AllInOne page is the one used for further processing and translations. Thanks to the fact that the MoinMoin installation on wiki.debian.org support exporting pages in the Docbook format, we can fetch the list of pages to export using the raw version of the AllInOne page, loop over each of them to generate a Docbook XML version of the manual. This process also download images and transform image references to use the locally downloaded images. The generated Docbook XML files are slightly broken, so some post-processing is done using the documentation/scripts/get_manual program, and the result is a nice Docbook XML file (debian-edu-wheezy-manual.xml) and a handfull of images. The XML file can now be used to generate PDF, HTML and epub versions of the English manual. This is the basic step of our process, making PDF (using dblatex), HTML (using xsltproc) and epub (using dbtoepub) version from Docbook XML, and the resulting files are placed in the debian-edu-doc-en binary package.

But English documentation is not enough for us. We want translated documentation too, and we want to make it easy for translators to track the English original. For this we use the poxml package, which allow us to transform the English Docbook XML file into a translation file (a .pot file), usable with the normal gettext based translation tools used by those translating free software. The pot file is used to create and maintain translation files (several .po files), which the translations update with the native language translations of all titles, paragraphs and blocks of text in the original. The next step is combining the original English Docbook XML and the translation file (say debian-edu-wheezy-manual.nb.po), to create a translated Docbook XML file (in this case debian-edu-wheezy-manual.nb.xml). This translated (or partly translated, if the translation is not complete) Docbook XML file can then be used like the original to create a PDF, HTML and epub version of the documentation.

The translators use different tools to edit the .po files. We recommend using lokalize, while some use emacs and vi, others can use web based editors like Poodle or Transifex. All we care about is where the .po file end up, in our git repository. Updated translations can either be committed directly to git, or submitted as bug reports against the debian-edu-doc package.

One challenge is images, which both might need to be translated (if they show translated user applications), and are needed in different formats when creating PDF and HTML versions (epub is a HTML version in this regard). For this we transform the original PNG images to the needed density and format during build, and have a way to provide translated images by storing translated versions in images/$LANGUAGECODE/. I am a bit unsure about the details here. The package maintainers know more.

If you wonder what the result look like, we provide the content of the documentation packages on the web. See for example the Italian PDF version or the German HTML version. We do not yet build the epub version by default, but perhaps it will be done in the future.

To learn more, check out the debian-edu-doc package, the manual on the wiki and the translation instructions in the manual.

Tags: debian, debian edu, docbook, english.
Hvordan enkelt laste ned filmer fra NRK med den "nye" løsningen
16th June 2014

Jeg har fortsatt behov for å kunne laste ned innslag fra NRKs nettsted av og til for å se senere når jeg ikke er på nett, men min oppskrift fra 2011 sluttet å fungere da NRK byttet avspillermetode. I dag fikk jeg endelig lett etter oppdatert løsning, og jeg er veldig glad for å fortelle at den enkleste måten å laste ned innslag er å bruke siste versjon 2014.06.07 av youtube-dl. Støtten i youtube-dl kom inn for 23 dager siden og versjonen i Debian fungerer fint også som backport til Debian Wheezy. Det er et lite problem, det håndterer kun URLer med små bokstaver, men hvis en har en URL med store bokstaver kan en bare gjøre alle store om til små bokstaver for å få youtube-dl til å laste ned. Rapporterte nettopp problemet til utviklerne, og antar de får fikset det snart.

Dermed er alt klart til å laste ned dokumentarene om USAs hemmelige avlytting og Selskapene bak USAs avlytting, i tillegg til intervjuet med Edward Snowden gjort av den tyske tv-kanalen ARD. Anbefaler alle å se disse, sammen med foredraget til Jacob Appelbaum på siste CCC-konferanse, for å forstå mer om hvordan overvåkningen av borgerne brer om seg.

Takk til gode venner på foreningen NUUGs IRC-kanal #nuug på irc.freenode.net for tipsene som fikk meg i mål.

Oppdatering 2014-06-17: Etter at jeg publiserte denne, ble jeg tipset om bloggposten "Downloading HD content from tv.nrk.no" av Ingvar Hagelund, som har alternativ implementasjon og tips for å lage mkv-fil med undertekstene inkludert. Kanskje den passer bedre for deg? I tillegg ble feilen i youtube-dl ble fikset litt senere ut på dagen i går, samt at youtube-dl fikk støtte for å laste ned undertitler. Takk til Anders Einar Hilden for god innsats og youtube-dl-utviklerne for rask respons.

Tags: multimedia, norsk, video, web.
Free software car computer solution?
29th May 2014

Dear lazyweb. I'm planning to set up a small Raspberry Pi computer in my car, connected to a small screen next to the rear mirror. I plan to hook it up with a GPS and a USB wifi card too. The idea is to get my own "Carputer". But I wonder if someone already created a good free software solution for such car computer.

This is my current wish list for such system:

If you know of any free software car computer system supporting some or all of these features, please let me know.

Tags: english.
Half the Coverity issues in Gnash fixed in the next release
29th April 2014

I've been following the Gnash project for quite a while now. It is a free software implementation of Adobe Flash, both a standalone player and a browser plugin. Gnash implement support for the AVM1 format (and not the newer AVM2 format - see Lightspark for that one), allowing several flash based sites to work. Thanks to the friendly developers at Youtube, it also work with Youtube videos, because the Javascript code at Youtube detect Gnash and serve a AVM1 player to those users. :) Would be great if someone found time to implement AVM2 support, but it has not happened yet. If you install both Lightspark and Gnash, Lightspark will invoke Gnash if it find a AVM1 flash file, so you can get both handled as free software. Unfortunately, Lightspark so far only implement a small subset of AVM2, and many sites do not work yet.

A few months ago, I started looking at Coverity, the static source checker used to find heaps and heaps of bugs in free software (thanks to the donation of a scanning service to free software projects by the company developing this non-free code checker), and Gnash was one of the projects I decided to check out. Coverity is able to find lock errors, memory errors, dead code and more. A few days ago they even extended it to also be able to find the heartbleed bug in OpenSSL. There are heaps of checks being done on the instrumented code, and the amount of bogus warnings is quite low compared to the other static code checkers I have tested over the years.

Since a few weeks ago, I've been working with the other Gnash developers squashing bugs discovered by Coverity. I was quite happy today when I checked the current status and saw that of the 777 issues detected so far, 374 are marked as fixed. This make me confident that the next Gnash release will be more stable and more dependable than the previous one. Most of the reported issues were and are in the test suite, but it also found a few in the rest of the code.

If you want to help out, you find us on the gnash-dev mailing list and on the #gnash channel on irc.freenode.net IRC server.

Tags: english, multimedia, video, web.

RSS feed

Created by Chronicle v4.6