- <title>Where did that package go? &mdash; geolocated IP traceroute</title>
- <link>http://people.skolelinux.org/pere/blog/Where_did_that_package_go___mdash__geolocated_IP_traceroute.html</link>
- <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Where_did_that_package_go___mdash__geolocated_IP_traceroute.html</guid>
- <pubDate>Mon, 9 Jan 2017 12:20:00 +0100</pubDate>
- <description><p>Did you ever wonder where the web traffic really flows to reach the
-web servers, and who owns the network equipment it is flowing through?
-It is possible to get a glimpse of this using traceroute, but it
-is hard to find all the details. Many years ago, I wrote a system to
-map the Norwegian Internet (trying to figure out if our plans for a
-network game service would get low enough latency, and who we needed
-to talk to about setting up game servers close to the users). Back
-then I used traceroute output from many locations (I asked my friends
-to run a script and send me their traceroute output) to create the
-graph and the map. The output from traceroute typically looks like
-this:</p>
-
-<p><pre>
-traceroute to www.stortinget.no (85.88.67.10), 30 hops max, 60 byte packets
- 1 uio-gw10.uio.no (129.240.202.1) 0.447 ms 0.486 ms 0.621 ms
- 2 uio-gw8.uio.no (129.240.24.229) 0.467 ms 0.578 ms 0.675 ms
- 3 oslo-gw1.uninett.no (128.39.65.17) 0.385 ms 0.373 ms 0.358 ms
- 4 te3-1-2.br1.fn3.as2116.net (193.156.90.3) 1.174 ms 1.172 ms 1.153 ms
- 5 he16-1-1.cr1.san110.as2116.net (195.0.244.234) 2.627 ms he16-1-1.cr2.oslosda310.as2116.net (195.0.244.48) 3.172 ms he16-1-1.cr1.san110.as2116.net (195.0.244.234) 2.857 ms
- 6 ae1.ar8.oslosda310.as2116.net (195.0.242.39) 0.662 ms 0.637 ms ae0.ar8.oslosda310.as2116.net (195.0.242.23) 0.622 ms
- 7 89.191.10.146 (89.191.10.146) 0.931 ms 0.917 ms 0.955 ms
- 8 * * *
- 9 * * *
-[...]
-</pre></p>
-
-<p>This shows the DNS names and IP addresses of (at least some of the)
-network equipment involved in getting the data traffic from me to the
-www.stortinget.no server, and how long it took in milliseconds for a
-packet to reach the equipment and return to me. Three packets are
-sent, and sometimes the packets do not follow the same path. This
-is shown for hop 5, where two different IP addresses replied to the
-three traceroute requests.</p>
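-<p>A minimal Python sketch of how the hop lines above can be picked
-apart, assuming the "name (address)" layout shown in the output. The
-function name parse_hop is made up for this illustration, and the
-sketch only handles numbered hop lines, not the header line.</p>

```python
import re

# Matches the "hostname (a.b.c.d)" pairs printed by traceroute.
HOST_RE = re.compile(r"(\S+) \((\d{1,3}(?:\.\d{1,3}){3})\)")


def parse_hop(line):
    """Return (hop number, distinct responding IPs) for one hop line."""
    hop = int(line.split()[0])
    ips = []
    for _name, ip in HOST_RE.findall(line):
        if ip not in ips:  # keep first-seen order, drop repeats
            ips.append(ip)
    return hop, ips


line = (" 5 he16-1-1.cr1.san110.as2116.net (195.0.244.234) 2.627 ms "
        "he16-1-1.cr2.oslosda310.as2116.net (195.0.244.48) 3.172 ms "
        "he16-1-1.cr1.san110.as2116.net (195.0.244.234) 2.857 ms")
print(parse_hop(line))  # (5, ['195.0.244.234', '195.0.244.48'])
```

-<p>Hops that only print stars, like hop 8 above, come back with an
-empty address list.</p>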
-
-<p>There are many ways to measure trace routes. Other good traceroute
-implementations I use are traceroute (using ICMP packets), mtr (can do
-ICMP, UDP and TCP) and scapy (Python library with ICMP, UDP, TCP
-traceroute and a lot of other capabilities). All of them are easily
-available in <a href="https://www.debian.org/">Debian</a>.</p>
-
-<p>This time around, I wanted to know the geographic location of
-different route points, to visualize how visiting a web page spreads
-information about the visit to a lot of servers around the globe. The
-background is that a web site today often will ask the browser to get
-from many servers the parts (for example HTML, JSON, fonts,
-JavaScript, CSS, video) required to display the content. This will
-leak information about the visit to those controlling these servers
-and anyone able to peek at the data traffic passing by (like your ISP,
-the ISP's backbone provider, FRA, GCHQ, NSA and others).</p>
-
-<p>Let's pick an example, the Norwegian parliament web site
-www.stortinget.no. It is read daily by all members of parliament and
-their staff, as well as political journalists, activists and many other
-citizens of Norway. A visit to the www.stortinget.no web site will
-ask your browser to contact 8 other servers: ajax.googleapis.com,
-insights.hotjar.com, script.hotjar.com, static.hotjar.com,
-stats.g.doubleclick.net, www.google-analytics.com,
-www.googletagmanager.com and www.netigate.se. I extracted this by
-asking <a href="http://phantomjs.org/">PhantomJS</a> to visit the
-Stortinget web page and tell me all the URLs PhantomJS downloaded to
-render the page (in HAR format using
-<a href="https://github.com/ariya/phantomjs/blob/master/examples/netsniff.js">their
-netsniff example</a>. I am very grateful to Gorm for showing me how
-to do this). My goal is to visualize network traces to all IP
-addresses behind these DNS names, to show where visitors' personal
-information is spread when visiting the page.</p>
-
-<p align="center"><a href="www.stortinget.no-geoip.kml"><img
-src="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-geoip-small.png" alt="map of combined traces for URLs used by www.stortinget.no using GeoIP"/></a></p>
-
-<p>When I had a look around for options, I could not find any good
-free software tools to do this, and decided I needed my own traceroute
-wrapper outputting KML based on locations looked up using GeoIP. KML
-is easy to work with and easy to generate, and understood by several
-of the GIS tools I have available. I got good help from my NUUG
-colleague Anders Einar with this, and the result can be seen in
-<a href="https://github.com/petterreinholdtsen/kmltraceroute">my
-kmltraceroute git repository</a>. Unfortunately, the quality of the
-free GeoIP databases I could find (and the for-pay databases my
-friends had access to) is not up to the task. The IP addresses of
-central Internet infrastructure would typically be placed near the
-controlling company's main office, and not where the router is really
-located, as you can see from <a href="www.stortinget.no-geoip.kml">the
-KML file I created</a> using the GeoLite City dataset from MaxMind.</p>
-
-<p align="center"><a href="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-scapy.svg"><img
-src="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-scapy-small.png" alt="scapy traceroute graph for URLs used by www.stortinget.no"/></a></p>
-
-<p>I also had a look at the visual traceroute graph created by
-<a href="http://www.secdev.org/projects/scapy/">the scapy project</a>,
-showing IP network ownership (aka AS owner) for the IP address in
-question.
-<a href="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-scapy.svg">The
-graph displays a lot of useful information about the traceroute in SVG
-format</a>, and gives a good indication of who controls the network
-equipment involved, but it does not include geolocation. This graph
-makes it possible to see that the information is made available at
-least to UNINETT, Catchcom, Stortinget, Nordunet, Google, Amazon,
-Telia, Level 3 Communications and NetDNA.</p>
-
-<p align="center"><a href="https://geotraceroute.com/index.php?node=4&host=www.stortinget.no"><img
-src="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-geotraceroute-small.png" alt="example geotraceroute view for www.stortinget.no"/></a></p>
-
-<p>In the process, I came across the
-<a href="https://geotraceroute.com/">web service GeoTraceroute</a> by
-Salim Gasmi. Its methodology of combining guesses based on DNS names
-and various location databases, and finally using latency times to rule
-out candidate locations, seemed to do a very good job of guessing the
-correct geolocation. But it could only do one trace at a time, did not
-have a sensor in Norway and did not make the geolocations easily
-available for postprocessing. So I contacted the developer and asked
-if he would be willing to share the code. He refused until he had time
-to clean it up, but he was interested in providing the geolocations in
-a machine readable format, and willing to set up a sensor in Norway. So
-since yesterday, it is possible to run traces from Norway in this
-service thanks to a sensor node set up by
-<a href="https://www.nuug.no/">the NUUG association</a>, and get the
-trace in KML format for further processing.</p>
-
-<p align="center"><a href="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-geotraceroute-kml-join.kml"><img
-src="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-geotraceroute-kml-join.png" alt="map of combined traces for URLs used by www.stortinget.no using geotraceroute"/></a></p>
-
-<p>Here we can see that a lot of traffic passes through Sweden on its
-way to Denmark, Germany, Holland and Ireland. These are all places
-where the Snowden confirmations verified that the traffic is read by
-various actors without your best interest as their top priority.</p>
-
-<p>Combining KML files is trivial using a text editor, so I could loop
-over all the hosts behind the URLs imported by www.stortinget.no and
-ask for the KML file from GeoTraceroute, and create a combined KML
-file with all the traces (unfortunately only one of the IP addresses
-behind each DNS name is traced this time. To get them all, one would
-have to request traces using IP addresses instead of DNS names from
-GeoTraceroute). That might be the next step in this project.</p>
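-<p>The combining step could also be scripted instead of done by hand.
-This is only a sketch under assumptions, not what I did above: it
-assumes the files use the KML 2.2 namespace and that copying the
-Placemark elements is enough (styles and folders are dropped). The
-function name merge_kml is made up for this illustration.</p>

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)  # write KML as the default namespace


def merge_kml(documents):
    """Collect the Placemark elements of several KML strings in one KML string."""
    root = ET.Element("{%s}kml" % KML_NS)
    doc = ET.SubElement(root, "{%s}Document" % KML_NS)
    for text in documents:
        tree = ET.fromstring(text)
        for placemark in tree.iter("{%s}Placemark" % KML_NS):
            doc.append(placemark)
    return ET.tostring(root, encoding="unicode")


# Two tiny stand-in documents; real traces would hold lines and points.
TEMPLATE = ('<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
            '<Placemark><name>%s</name></Placemark></Document></kml>')
merged = merge_kml([TEMPLATE % "trace-a", TEMPLATE % "trace-b"])
print(merged)
```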
-
-<p>Armed with these tools, I find it a lot easier to figure out where
-the IP traffic moves and who controls the boxes involved in moving it.
-And every time the link crosses for example the Swedish border, we can
-be sure Swedish Signal Intelligence (FRA) is listening, as GCHQ does in
-Britain and NSA in the USA and on cables around the globe. (Hm, what
-should we tell them? :) Keep that in mind if you ever send anything
-unencrypted over the Internet.</p>
-
-<p>PS: KML files are drawn using
-<a href="http://ivanrublev.me/kml/">the KML viewer from Ivan
-Rublev</a>, as it was less cluttered than the local Linux application
-Marble. There are heaps of other options too.</p>
-
-<p>As usual, if you use Bitcoin and want to show your support of my
-activities, please send Bitcoin donations to my address
-<b><a href="bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&label=PetterReinholdtsenBlog">15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b</a></b>.</p>
-</description>
- </item>
-
- <item>
- <title>Introducing ical-archiver to split out old iCalendar entries</title>
- <link>http://people.skolelinux.org/pere/blog/Introducing_ical_archiver_to_split_out_old_iCalendar_entries.html</link>
- <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Introducing_ical_archiver_to_split_out_old_iCalendar_entries.html</guid>
- <pubDate>Wed, 4 Jan 2017 12:20:00 +0100</pubDate>
- <description><p>Do you have a large <a href="https://icalendar.org/">iCalendar</a>
-file with lots of old entries, and would like to archive them to save
-space and resources? At least those of us using KOrganizer know that
-turning on and off an event set becomes slower and slower the more
-entries are in the set. While working on migrating our calendars to a
-<a href="http://radicale.org/">Radicale CalDAV server</a> on our
-<a href="https://freedomboxfoundation.org/">Freedombox server</a>, my
-loved one wondered if I could find a way to split up the calendar file
-she had in KOrganizer, and I set out to write a tool. I spent a few
-days writing and polishing the system, and it is now ready for general
-consumption. The
-<a href="https://github.com/petterreinholdtsen/ical-archiver">code for
-ical-archiver</a> is publicly available from a git repository on
-GitHub. The system is written in Python and depends on
-<a href="http://eventable.github.io/vobject/">the vobject Python
-module</a>.</p>
-
-<p>To use it, locate the iCalendar file you want to operate on and
-give it as an argument to the ical-archiver script. This will
-generate a set of new files, one file per component type per year for
-all components expiring more than two years in the past. The vevent,
-vtodo and vjournal entries are handled by the script. The remaining
-entries are stored in a 'remaining' file.</p>
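-<p>To give a rough idea of the splitting, here is a simplified sketch
-using only the Python standard library. It is not the ical-archiver
-code: the real script uses vobject and looks at when components
-expire, while this sketch only groups VEVENT blocks by the year in
-DTSTART and ignores line folding and recurrence. The function name
-split_by_year is made up for this illustration.</p>

```python
import re
from collections import defaultdict


def split_by_year(ics_text):
    """Group the VEVENT blocks of an iCalendar text by DTSTART year."""
    by_year = defaultdict(list)
    for block in re.findall(r"BEGIN:VEVENT.*?END:VEVENT", ics_text, re.S):
        # DTSTART may carry parameters, e.g. DTSTART;VALUE=DATE:20160301
        match = re.search(r"^DTSTART[^:\n]*:(\d{4})", block, re.M)
        if match:
            by_year[int(match.group(1))].append(block)
    return dict(by_year)


sample = """BEGIN:VCALENDAR
BEGIN:VEVENT
DTSTART:20040105T100000Z
SUMMARY:old meeting
END:VEVENT
BEGIN:VEVENT
DTSTART;VALUE=DATE:20160301
SUMMARY:recent event
END:VEVENT
END:VCALENDAR
"""
groups = split_by_year(sample)
print(sorted(groups))  # [2004, 2016]
```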
-
-<p>This is what a test run can look like:</p>
-
-<p><pre>
-% ical-archiver t/2004-2016.ics
-Found 3612 vevents
-Found 6 vtodos
-Found 2 vjournals
-Writing t/2004-2016.ics-subset-vevent-2004.ics
-Writing t/2004-2016.ics-subset-vevent-2005.ics
-Writing t/2004-2016.ics-subset-vevent-2006.ics
-Writing t/2004-2016.ics-subset-vevent-2007.ics
-Writing t/2004-2016.ics-subset-vevent-2008.ics
-Writing t/2004-2016.ics-subset-vevent-2009.ics
-Writing t/2004-2016.ics-subset-vevent-2010.ics
-Writing t/2004-2016.ics-subset-vevent-2011.ics
-Writing t/2004-2016.ics-subset-vevent-2012.ics
-Writing t/2004-2016.ics-subset-vevent-2013.ics
-Writing t/2004-2016.ics-subset-vevent-2014.ics
-Writing t/2004-2016.ics-subset-vjournal-2007.ics
-Writing t/2004-2016.ics-subset-vjournal-2011.ics
-Writing t/2004-2016.ics-subset-vtodo-2012.ics
-Writing t/2004-2016.ics-remaining.ics
-%
-</pre></p>
-
-<p>As you can see, the original file is untouched and new files are
-written with names derived from the original file. If you are happy
-with their content, the *-remaining.ics file can replace the original
-and the others can be archived or imported as historical calendar
-collections.</p>
-
-<p>The script should probably be improved a bit. The error handling
-when discovering broken entries is not good, and I am not sure yet if
-it makes sense to split different entry types into separate files or
-not. The program is thus likely to change. If you find it
-interesting, please get in touch. :)</p>
-
-<p>As usual, if you use Bitcoin and want to show your support of my
-activities, please send Bitcoin donations to my address
-<b><a href="bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&label=PetterReinholdtsenBlog">15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b</a></b>.</p>
-</description>
- </item>
-
- <item>
- <title>Appstream just learned how to map hardware to packages too!</title>
- <link>http://people.skolelinux.org/pere/blog/Appstream_just_learned_how_to_map_hardware_to_packages_too_.html</link>
- <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Appstream_just_learned_how_to_map_hardware_to_packages_too_.html</guid>
- <pubDate>Fri, 23 Dec 2016 10:30:00 +0100</pubDate>
- <description><p>I received a very nice Christmas present today. As my regular
-readers probably know, I have been working on
-<a href="http://packages.qa.debian.org/isenkram">the Isenkram
-system</a> for many years. The goal of the Isenkram system is to make
-it easier for users to figure out what to install to get a given piece
-of hardware to work in Debian, and a key part of this system is a way
-to map hardware to packages. Isenkram has its own mapping database,
-and also uses data provided by each package using the AppStream
-metadata format. And today,
-<a href="https://tracker.debian.org/pkg/appstream">AppStream</a> in
-Debian learned to look up hardware the same way Isenkram is doing it,
-i.e. using fnmatch():</p>
-
-<p><pre>
-% appstreamcli what-provides modalias \
- usb:v1130p0202d0100dc00dsc00dp00ic03isc00ip00in00
-Identifier: pymissile [generic]
-Name: pymissile
-Summary: Control original Striker USB Missile Launcher
-Package: pymissile
-% appstreamcli what-provides modalias usb:v0694p0002d0000
-Identifier: libnxt [generic]
-Name: libnxt
-Summary: utility library for talking to the LEGO Mindstorms NXT brick
-Package: libnxt
----
-Identifier: t2n [generic]
-Name: t2n
-Summary: Simple command-line tool for Lego NXT
-Package: t2n
----
-Identifier: python-nxt [generic]
-Name: python-nxt
-Summary: Python driver/interface/wrapper for the Lego Mindstorms NXT robot
-Package: python-nxt
----
-Identifier: nbc [generic]
-Name: nbc
-Summary: C compiler for LEGO Mindstorms NXT bricks
-Package: nbc
-%
-</pre></p>
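-<p>The fnmatch() style of lookup used above can be illustrated with a
-few lines of Python. This is a sketch under assumptions: the glob
-patterns and the PATTERNS table are invented for this example and are
-not copied from any package metadata, and fnmatchcase() from the
-Python standard library stands in for the C function.</p>

```python
from fnmatch import fnmatchcase

# Hypothetical modalias glob patterns mapped to package names.
PATTERNS = {
    "usb:v0694p0002d*": ["libnxt", "nbc", "python-nxt", "t2n"],
    "usb:v1130p0202d*": ["pymissile"],
}


def lookup(modalias):
    """Return the sorted package names whose pattern matches the modalias."""
    packages = set()
    for pattern, names in PATTERNS.items():
        if fnmatchcase(modalias, pattern):
            packages.update(names)
    return sorted(packages)


print(lookup("usb:v0694p0002d0000"))
print(lookup("usb:v1130p0202d0100dc00dsc00dp00ic03isc00ip00in00"))
```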
-
-<p>A similar query can be done using the combined AppStream and
-Isenkram databases using the isenkram-lookup tool:</p>
-
-<p><pre>
-% isenkram-lookup usb:v1130p0202d0100dc00dsc00dp00ic03isc00ip00in00
-pymissile
-% isenkram-lookup usb:v0694p0002d0000
-libnxt
-nbc
-python-nxt
-t2n
-%
-</pre></p>
-
-<p>You can find modalias values relevant for your machine using
-<tt>cat $(find /sys/devices/ -name modalias)</tt>.</p>
-
-<p>If you want to make this system a success and help Debian users
-make the most of the hardware they have, please
-help <a href="https://wiki.debian.org/AppStream/Guidelines">add
-AppStream metadata for your package following the guidelines</a>
-documented in the wiki. So far only 11 packages provide such
-information, among the several hundred hardware specific packages in
-Debian. The Isenkram database on the other hand contains 101 packages,
-mostly related to USB dongles. Most of the packages with hardware
-mapping in AppStream are LEGO Mindstorms related, because I have, as
-part of my involvement in
-<a href="https://wiki.debian.org/LegoDesigners">the Debian LEGO
-team</a>, given priority to making sure LEGO users are offered the
-complete set of packages in Debian for that particular hardware. The
-team also got a nice Christmas present today. The
-<a href="https://tracker.debian.org/pkg/nxt-firmware">nxt-firmware
-package</a> made it into Debian. With this package in place, it is
-now possible to use the LEGO Mindstorms NXT unit with only free
-software, as the nxt-firmware package contains the source and firmware
-binaries for the NXT brick.</p>
-
-<p>As usual, if you use Bitcoin and want to show your support of my
-activities, please send Bitcoin donations to my address
-<b><a href="bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&label=PetterReinholdtsenBlog">15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b</a></b>.</p>
-</description>
- </item>
-
- <item>
- <title>Isenkram updated with a lot more hardware-package mappings</title>
- <link>http://people.skolelinux.org/pere/blog/Isenkram_updated_with_a_lot_more_hardware_package_mappings.html</link>
- <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Isenkram_updated_with_a_lot_more_hardware_package_mappings.html</guid>
- <pubDate>Tue, 20 Dec 2016 11:55:00 +0100</pubDate>
- <description><p><a href="http://packages.qa.debian.org/isenkram">The Isenkram
-system</a> I wrote two years ago to make it easier in Debian to find
-and install packages to get your hardware dongles to work, is still
-going strong. It is a system to look up the hardware present on or
-connected to the current system, and map the hardware to Debian
-packages. It can either be done using the tools in isenkram-cli or
-using the user space daemon in the isenkram package. The latter will
-notify you, when inserting new hardware, about what packages to
-install to get the dongle working. It will even provide a button to
-click on to ask packagekit to install the packages.</p>
-
-<p>Here is a command line example from my Thinkpad laptop:</p>
-
-<p><pre>
-% isenkram-lookup
-bluez
-cheese
-ethtool
-fprintd
-fprintd-demo
-gkrellm-thinkbat
-hdapsd
-libpam-fprintd
-pidgin-blinklight
-thinkfan
-tlp
-tp-smapi-dkms
-tp-smapi-source
-tpb
-%
-</pre></p>