- <div class="title"><a href="http://people.skolelinux.org/pere/blog/A_program_should_be_able_to_open_its_own_files_on_Linux.html">A program should be able to open its own files on Linux</a></div>
- <div class="date"> 5th June 2016</div>
- <div class="body"><p>Many years ago, when koffice was fresh and with few users, I
-decided to test its presentation tool when making the slides for a
-talk I was giving for NUUG on Japhar, a free Java virtual machine. I
-wrote the first draft of the slides, saved the result and went to bed
-the day before I would give the talk. The next day I took a plane to
-the location where the meeting should take place, and on the plane I
-started up koffice again to polish the talk a bit, only to discover
-that kpresenter refused to load its own data file. I cursed a bit and
-started making the slides again from memory, to have something to
-present when I arrived. I tested that the saved files could be
-loaded, and the day seemed to be rescued. I continued to polish the
-slides until I suddenly discovered that the saved file could no longer
-be loaded into kpresenter. In the end I had to rewrite the slides
-three times, condensing the content until the talk became shorter and
-shorter. After the talk I was able to pinpoint the problem –
-kpresenter wrote inline images in a way itself could not understand.
-Eventually that bug was fixed and kpresenter ended up being a great
-program to make slides. The point I'm trying to make is that we
-expect a program to be able to load its own data files, and it is
-embarrassing to its developers if it can't.</p>
-
-<p>Did you ever experience a program failing to load its own data
-files from the desktop file browser? It is not a uncommon problem. A
-while back I discovered that the screencast recorder
-gtk-recordmydesktop would save an Ogg Theora video file the KDE file
-browser would refuse to open. No video player claimed to understand
-such file. I tracked down the cause being <tt>file --mime-type</tt>
-returning the application/ogg MIME type, which no video player I had
-installed listed as a MIME type they would understand. I asked for
-<a href="http://bugs.gw.com/view.php?id=382">file to change its
-behavour</a> and use the MIME type video/ogg instead. I also asked
-several video players to add video/ogg to their desktop files, to give
-the file browser an idea what to do about Ogg Theora files. After a
-while, the desktop file browsers in Debian started to handle the
-output from gtk-recordmydesktop properly.</p>
-
-<p>But history repeats itself. A few days ago I tested the music
-system Rosegarden again, and I discovered that the KDE and xfce file
-browsers did not know what to do with the Rosegarden project files
-(*.rg). I've reported <a href="http://bugs.debian.org/825993">the
-rosegarden problem to BTS</a> and a fix is commited to git and will be
-included in the next upload. To increase the chance of me remembering
-how to fix the problem next time some program fail to load its files
-from the file browser, here are some notes on how to fix it.</p>
-
-<p>The file browsers in Debian in general operates on MIME types.
-There are two sources for the MIME type of a given file. The output from
-<tt>file --mime-type</tt> mentioned above, and the content of the
-shared MIME type registry (under /usr/share/mime/). The file MIME
-type is mapped to programs supporting the MIME type, and this
-information is collected from
-<a href="https://www.freedesktop.org/wiki/Specifications/desktop-entry-spec/">the
-desktop files</a> available in /usr/share/applications/. If there is
-one desktop file claiming support for the MIME type of the file, it is
-activated when asking to open a given file. If there are more, one
-can normally select which one to use by right-clicking on the file and
-selecting the wanted one using 'Open with' or similar. In general
-this work well. But it depend on each program picking a good MIME
-type (preferably
-<a href="http://www.iana.org/assignments/media-types/media-types.xhtml">a
-MIME type registered with IANA</a>), file and/or the shared MIME
-registry recognizing the file and the desktop file to list the MIME
-type in its list of supported MIME types.</p>
-
-<p>The <tt>/usr/share/mime/packages/rosegarden.xml</tt> entry for
-<a href="http://www.freedesktop.org/wiki/Specifications/shared-mime-info-spec">the
-Shared MIME database</a> look like this:</p>
-
-<p><blockquote><pre>
-<?xml version="1.0" encoding="UTF-8"?>
-<mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">
- <mime-type type="audio/x-rosegarden">
- <sub-class-of type="application/x-gzip"/>
- <comment>Rosegarden project file</comment>
- <glob pattern="*.rg"/>
- </mime-type>
-</mime-info>
-</pre></blockquote></p>
-
-<p>This states that audio/x-rosegarden is a kind of application/x-gzip
-(it is a gzipped XML file). Note, it is much better to use an
-official MIME type registered with IANA than it is to make up ones own
-unofficial ones like the x-rosegarden type used by rosegarden.</p>
-
-<p>The desktop file of the rosegarden program failed to list
-audio/x-rosegarden in its list of supported MIME types, causing the
-file browsers to have no idea what to do with *.rg files:</p>
-
-<p><blockquote><pre>
-% grep Mime /usr/share/applications/rosegarden.desktop
-MimeType=audio/x-rosegarden-composition;audio/x-rosegarden-device;audio/x-rosegarden-project;audio/x-rosegarden-template;audio/midi;
-X-KDE-NativeMimeType=audio/x-rosegarden-composition
+ <div class="title"><a href="http://people.skolelinux.org/pere/blog/Where_did_that_package_go___mdash__geolocated_IP_traceroute.html">Where did that package go? — geolocated IP traceroute</a></div>
+ <div class="date"> 9th January 2017</div>
+ <div class="body"><p>Did you ever wonder where the web trafic really flow to reach the
+web servers, and who own the network equipment it is flowing through?
+It is possible to get a glimpse of this from using traceroute, but it
+is hard to find all the details. Many years ago, I wrote a system to
+map the Norwegian Internet (trying to figure out if our plans for a
+network game service would get low enough latency, and who we needed
+to talk to about setting up game servers close to the users. Back
+then I used traceroute output from many locations (I asked my friends
+to run a script and send me their traceroute output) to create the
+graph and the map. The output from traceroute typically look like
+this:
+
+<p><pre>
+traceroute to www.stortinget.no (85.88.67.10), 30 hops max, 60 byte packets
+ 1 uio-gw10.uio.no (129.240.202.1) 0.447 ms 0.486 ms 0.621 ms
+ 2 uio-gw8.uio.no (129.240.24.229) 0.467 ms 0.578 ms 0.675 ms
+ 3 oslo-gw1.uninett.no (128.39.65.17) 0.385 ms 0.373 ms 0.358 ms
+ 4 te3-1-2.br1.fn3.as2116.net (193.156.90.3) 1.174 ms 1.172 ms 1.153 ms
+ 5 he16-1-1.cr1.san110.as2116.net (195.0.244.234) 2.627 ms he16-1-1.cr2.oslosda310.as2116.net (195.0.244.48) 3.172 ms he16-1-1.cr1.san110.as2116.net (195.0.244.234) 2.857 ms
+ 6 ae1.ar8.oslosda310.as2116.net (195.0.242.39) 0.662 ms 0.637 ms ae0.ar8.oslosda310.as2116.net (195.0.242.23) 0.622 ms
+ 7 89.191.10.146 (89.191.10.146) 0.931 ms 0.917 ms 0.955 ms
+ 8 * * *
+ 9 * * *
+[...]
+</pre></p>
+
+<p>This show the DNS names and IP addresses of (at least some of the)
+network equipment involved in getting the data traffic from me to the
+www.stortinget.no server, and how long it took in milliseconds for a
+package to reach the equipment and return to me. Three packages are
+sent, and some times the packages do not follow the same path. This
+is shown for hop 5, where three different IP addresses replied to the
+traceroute request.</p>
+
+<p>There are many ways to measure trace routes. Other good traceroute
+implementations I use are traceroute (using ICMP packages) mtr (can do
+both ICMP, UDP and TCP) and scapy (python library with ICMP, UDP, TCP
+traceroute and a lot of other capabilities). All of them are easily
+available in <a href="https://www.debian.org/">Debian</a>.</p>
+
+<p>This time around, I wanted to know the geographic location of
+different route points, to visualize how visiting a web page spread
+information about the visit to a lot of servers around the globe. The
+background is that a web site today often will ask the browser to get
+from many servers the parts (for example HTML, JSON, fonts,
+JavaScript, CSS, video) required to display the content. This will
+leak information about the visit to those controlling these servers
+and anyone able to peek at the data traffic passing by (like your ISP,
+the ISPs backbone provider, FRA, GCHQ, NSA and others).</p>
+
+<p>Lets pick an example, the Norwegian parliament web site
+www.stortinget.no. It is read daily by all members of parliament and
+their staff, as well as political journalists, activits and many other
+citizens of Norway. A visit to the www.stortinget.no web site will
+ask your browser to contact 8 other servers: ajax.googleapis.com,
+insights.hotjar.com, script.hotjar.com, static.hotjar.com,
+stats.g.doubleclick.net, www.google-analytics.com,
+www.googletagmanager.com and www.netigate.se. I extracted this by
+asking <a href="http://phantomjs.org/">PhantomJS</a> to visit the
+Stortinget web page and tell me all the URLs PhantomJS downloaded to
+render the page (in HAR format using
+<a href="https://github.com/ariya/phantomjs/blob/master/examples/netsniff.js">their
+netsniff example</a>. I am very grateful to Gorm for showing me how
+to do this). My goal is to visualize network traces to all IP
+addresses behind these DNS names, do show where visitors personal
+information is spread when visiting the page.</p>
+
+<p align="center"><a href="www.stortinget.no-geoip.kml"><img
+src="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-geoip-small.png" alt="map of combined traces for URLs used by www.stortinget.no using GeoIP"/></a></p>
+
+<p>When I had a look around for options, I could not find any good
+free software tools to do this, and decided I needed my own traceroute
+wrapper outputting KML based on locations looked up using GeoIP. KML
+is easy to work with and easy to generate, and understood by several
+of the GIS tools I have available. I got good help from by NUUG
+colleague Anders Einar with this, and the result can be seen in
+<a href="https://github.com/petterreinholdtsen/kmltraceroute">my
+kmltraceroute git repository</a>. Unfortunately, the quality of the
+free GeoIP databases I could find (and the for-pay databases my
+friends had access to) is not up to the task. The IP addresses of
+central Internet infrastructure would typically be placed near the
+controlling companies main office, and not where the router is really
+located, as you can see from <a href="www.stortinget.no-geoip.kml">the
+KML file I created</a> using the GeoLite City dataset from MaxMind.
+
+<p align="center"><a href="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-scapy.svg"><img
+src="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-scapy-small.png" alt="scapy traceroute graph for URLs used by www.stortinget.no"/></a></p>
+
+<p>I also had a look at the visual traceroute graph created by
+<a href="http://www.secdev.org/projects/scapy/">the scrapy project</a>,
+showing IP network ownership (aka AS owner) for the IP address in
+question.
+<a href="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-scapy.svg">The
+graph display a lot of useful information about the traceroute in SVG
+format</a>, and give a good indication on who control the network
+equipment involved, but it do not include geolocation. This graph
+make it possible to see the information is made available at least for
+UNINETT, Catchcom, Stortinget, Nordunet, Google, Amazon, Telia, Level
+3 Communications and NetDNA.</p>
+
+<p align="center"><a href="https://geotraceroute.com/index.php?node=4&host=www.stortinget.no"><img
+src="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-geotraceroute-small.png" alt="example geotraceroute view for www.stortinget.no"/></a></p>
+
+<p>In the process, I came across the
+<a href="https://geotraceroute.com/">web service GeoTraceroute</a> by
+Salim Gasmi. Its methology of combining guesses based on DNS names,
+various location databases and finally use latecy times to rule out
+candidate locations seemed to do a very good job of guessing correct
+geolocation. But it could only do one trace at the time, did not have
+a sensor in Norway and did not make the geolocations easily available
+for postprocessing. So I contacted the developer and asked if he
+would be willing to share the code (he refused until he had time to
+clean it up), but he was interested in providing the geolocations in a
+machine readable format, and willing to set up a sensor in Norway. So
+since yesterday, it is possible to run traces from Norway in this
+service thanks to a sensor node set up by
+<a href="https://www.nuug.no/">the NUUG assosiation</a>, and get the
+trace in KML format for further processing.</p>
+
+<p align="center"><a href="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-geotraceroute-kml-join.kml"><img
+src="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-geotraceroute-kml-join.png" alt="map of combined traces for URLs used by www.stortinget.no using geotraceroute"/></a></p>
+
+<p>Here we can see a lot of trafic passes Sweden on its way to
+Denmark, Germany, Holland and Ireland. Plenty of places where the
+Snowden confirmations verified the traffic is read by various actors
+without your best interest as their top priority.</p>
+
+<p>Combining KML files is trivial using a text editor, so I could loop
+over all the hosts behind the urls imported by www.stortinget.no and
+ask for the KML file from GeoTraceroute, and create a combined KML
+file with all the traces (unfortunately only one of the IP addresses
+behind the DNS name is traced this time. To get them all, one would
+have to request traces using IP number instead of DNS names from
+GeoTraceroute). That might be the next step in this project.</p>
+
+<p>Armed with these tools, I find it a lot easier to figure out where
+the IP traffic moves and who control the boxes involved in moving it.
+And every time the link crosses for example the Swedish border, we can
+be sure Swedish Signal Intelligence (FRA) is listening, as GCHQ do in
+Britain and NSA in USA and cables around the globe. (Hm, what should
+we tell them? :) Keep that in mind if you ever send anything
+unencrypted over the Internet.</p>
+
+<p>PS: KML files are drawn using
+<a href="http://ivanrublev.me/kml/">the KML viewer from Ivan
+Rublev<a/>, as it was less cluttered than the local Linux application
+Marble. There are heaps of other options too.</p>
+
+<p>As usual, if you use Bitcoin and want to show your support of my
+activities, please send Bitcoin donations to my address
+<b><a href="bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&label=PetterReinholdtsenBlog">15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b</a></b>.</p>
+</div>
+ <div class="tags">
+
+
+ Tags: <a href="http://people.skolelinux.org/pere/blog/tags/debian">debian</a>, <a href="http://people.skolelinux.org/pere/blog/tags/english">english</a>, <a href="http://people.skolelinux.org/pere/blog/tags/kart">kart</a>, <a href="http://people.skolelinux.org/pere/blog/tags/nuug">nuug</a>, <a href="http://people.skolelinux.org/pere/blog/tags/personvern">personvern</a>, <a href="http://people.skolelinux.org/pere/blog/tags/stortinget">stortinget</a>, <a href="http://people.skolelinux.org/pere/blog/tags/surveillance">surveillance</a>, <a href="http://people.skolelinux.org/pere/blog/tags/web">web</a>.
+
+
+ </div>
+ </div>
+ <div class="padding"></div>
+
+ <div class="entry">
+ <div class="title"><a href="http://people.skolelinux.org/pere/blog/Introducing_ical_archiver_to_split_out_old_iCalendar_entries.html">Introducing ical-archiver to split out old iCalendar entries</a></div>
+ <div class="date"> 4th January 2017</div>
+ <div class="body"><p>Do you have a large <a href="https://icalendar.org/">iCalendar</a>
+file with lots of old entries, and would like to archive them to save
+space and resources? At least those of us using KOrganizer know that
+turning on and off an event set become slower and slower the more
+entries are in the set. While working on migrating our calendars to a
+<a href="http://radicale.org/">Radicale CalDAV server</a> on our
+<a href="https://freedomboxfoundation.org/">Freedombox server</a/>, my
+loved one wondered if I could find a way to split up the calendar file
+she had in KOrganizer, and I set out to write a tool. I spent a few
+days writing and polishing the system, and it is now ready for general
+consumption. The
+<a href="https://github.com/petterreinholdtsen/ical-archiver">code for
+ical-archiver</a> is publicly available from a git repository on
+github. The system is written in Python and depend on
+<a href="http://eventable.github.io/vobject/">the vobject Python
+module</a>.</p>
+
+<p>To use it, locate the iCalendar file you want to operate on and
+give it as an argument to the ical-archiver script. This will
+generate a set of new files, one file per component type per year for
+all components expiring more than two years in the past. The vevent,
+vtodo and vjournal entries are handled by the script. The remaining
+entries are stored in a 'remaining' file.</p>
+
+<p>This is what a test run can look like:
+
+<p><pre>
+% ical-archiver t/2004-2016.ics
+Found 3612 vevents
+Found 6 vtodos
+Found 2 vjournals
+Writing t/2004-2016.ics-subset-vevent-2004.ics
+Writing t/2004-2016.ics-subset-vevent-2005.ics
+Writing t/2004-2016.ics-subset-vevent-2006.ics
+Writing t/2004-2016.ics-subset-vevent-2007.ics
+Writing t/2004-2016.ics-subset-vevent-2008.ics
+Writing t/2004-2016.ics-subset-vevent-2009.ics
+Writing t/2004-2016.ics-subset-vevent-2010.ics
+Writing t/2004-2016.ics-subset-vevent-2011.ics
+Writing t/2004-2016.ics-subset-vevent-2012.ics
+Writing t/2004-2016.ics-subset-vevent-2013.ics
+Writing t/2004-2016.ics-subset-vevent-2014.ics
+Writing t/2004-2016.ics-subset-vjournal-2007.ics
+Writing t/2004-2016.ics-subset-vjournal-2011.ics
+Writing t/2004-2016.ics-subset-vtodo-2012.ics
+Writing t/2004-2016.ics-remaining.ics