<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version='2.0' xmlns:lj='http://www.livejournal.org/rss/lj/1.0/'>
<title>Petter Reinholdtsen - Entries from January 2017</title>
<description>Entries from January 2017</description>
<link>http://people.skolelinux.org/pere/blog/</link>
<title>Is NAV violating its own privacy policy?</title>
<link>http://people.skolelinux.org/pere/blog/Bryter_NAV_sin_egen_personvernerkl_ring_.html</link>
<guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Bryter_NAV_sin_egen_personvernerkl_ring_.html</guid>
<pubDate>Wed, 11 Jan 2017 06:50:00 +0100</pubDate>
<description><p>I read with interest a news story at
<a href="http://www.digi.no/artikler/nav-avslorer-trygdemisbruk-ved-a-spore-ip-adresser/367394">digi.no</a>
and
<a href="https://www.nrk.no/buskerud/trygdesvindlere-avslores-av-utenlandske-ip-adresser-1.13313461">NRK</a>
reporting that it is not just me, but that NAV too geolocates IP
addresses, and that the IP addresses of those submitting employment
status forms (meldekort) are analysed to see whether the form is
submitted from a foreign IP address. Police prosecutor Hans Lyder
Haare in Drammen is quoted by NRK as saying «The two were exposed by,
among other things, IP addresses. One can see that the form comes from
abroad.»</p>
<p>I think it is good that it becomes better known that IP addresses
are tied to individuals and that collected information is used to
determine where people are, also by actors here in Norway. I see it as
yet another argument for using
<a href="https://www.torproject.org/">Tor</a> as much as possible to
make IP geolocation harder, so that one can protect one's privacy and
avoid sharing one's physical location with outsiders.</p>
<p>But one thing about this news worries me. I was tipped off (thanks
#nuug) about
<a href="https://www.nav.no/no/NAV+og+samfunn/Kontakt+NAV/Teknisk+brukerstotte/Snarveier/personvernerkl%C3%A6ring-for-arbeids-og-velferdsetaten">NAV's
privacy policy</a>, which under the heading «Privacy and statistics»
(Personvern og statistikk) states:</p>
<p><blockquote>
<p>«When you visit nav.no, you leave electronic traces. The traces
are created because your browser automatically sends a range of
information to NAV's server each time you request a page. This
includes, for example, which browser and version you use, and your
Internet address (IP address). For each page shown, the following
information is stored:</p>
<ul>
<li>which page you are viewing</li>
<li>date and time</li>
<li>which browser you use</li>
<li>your IP address</li>
</ul>
<p>None of the information will be used to identify individuals. NAV
uses this information to generate aggregate statistics showing, among
other things, which pages are most popular. The statistics are a tool
for improving our […]</p>
</blockquote></p>
<p>I cannot quite see how analysing visitors' IP addresses, to find
out who submits forms via the web from an IP address abroad, can be
done without conflicting with the claim that «none of the information
will be used to identify individuals». It therefore seems to me that
NAV is violating its own privacy policy, which
<a href="http://people.skolelinux.org/pere/blog/Er_lover_brutt_n_r_personvernpolicy_ikke_stemmer_med_praksis_.html">the
Norwegian Data Protection Authority (Datatilsynet) told me in early
December is probably a breach of the Personal Data Act
(personopplysningsloven)</a>.</p>
<p>In addition, the privacy policy is quite misleading, since NAV's
web pages not only supply NAV with personal information, but also ask
the user's browser to contact five other web servers
(script.hotjar.com, static.hotjar.com, vars.hotjar.com,
www.google-analytics.com and www.googletagmanager.com), making
personal information available to the companies Hotjar and Google, and
to anyone able to listen to the traffic along the way (such as FRA,
GCHQ and NSA). Nor can I see how such spreading of personal
information can be in line with the requirements of the Personal Data
Act, or with NAV's privacy policy.</p>
<p>Perhaps NAV should take a close look at its privacy policy? Or
perhaps Datatilsynet should?</p>
<title>Where did that package go? &mdash; geolocated IP traceroute</title>
<link>http://people.skolelinux.org/pere/blog/Where_did_that_package_go___mdash__geolocated_IP_traceroute.html</link>
<guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Where_did_that_package_go___mdash__geolocated_IP_traceroute.html</guid>
<pubDate>Mon, 9 Jan 2017 12:20:00 +0100</pubDate>
<description><p>Did you ever wonder where web traffic really flows to
reach the web servers, and who owns the network equipment it flows
through? It is possible to get a glimpse of this using traceroute, but
it is hard to find all the details. Many years ago, I wrote a system
to map the Norwegian Internet (trying to figure out if our plans for a
network game service would get low enough latency, and who we needed
to talk to about setting up game servers close to the users). Back
then I used traceroute output from many locations (I asked my friends
to run a script and send me their traceroute output) to create the
graph and the map. The output from traceroute typically looks like
this:
<pre>
traceroute to www.stortinget.no (85.88.67.10), 30 hops max, 60 byte packets
 1  uio-gw10.uio.no (129.240.202.1)  0.447 ms  0.486 ms  0.621 ms
 2  uio-gw8.uio.no (129.240.24.229)  0.467 ms  0.578 ms  0.675 ms
 3  oslo-gw1.uninett.no (128.39.65.17)  0.385 ms  0.373 ms  0.358 ms
 4  te3-1-2.br1.fn3.as2116.net (193.156.90.3)  1.174 ms  1.172 ms  1.153 ms
 5  he16-1-1.cr1.san110.as2116.net (195.0.244.234)  2.627 ms he16-1-1.cr2.oslosda310.as2116.net (195.0.244.48)  3.172 ms he16-1-1.cr1.san110.as2116.net (195.0.244.234)  2.857 ms
 6  ae1.ar8.oslosda310.as2116.net (195.0.242.39)  0.662 ms  0.637 ms ae0.ar8.oslosda310.as2116.net (195.0.242.23)  0.622 ms
 7  89.191.10.146 (89.191.10.146)  0.931 ms  0.917 ms  0.955 ms
</pre></p>
<p>This shows the DNS names and IP addresses of (at least some of) the
network equipment involved in getting the data traffic from me to the
www.stortinget.no server, and how long it took in milliseconds for a
packet to reach the equipment and return to me. Three packets are
sent, and sometimes the packets do not follow the same path. This is
shown for hop 5, where the three replies came from two different IP
addresses.</p>
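<p>Hop lines like the ones above can be picked apart with a short
script. The following is a minimal Python sketch and not part of any
tool mentioned in this post. The helper name parse_hop and the regular
expression are made up for illustration, and it assumes IPv4 addresses
printed in parentheses as in the sample output.</p>

```python
import re

# Matches "hostname (a.b.c.d)" pairs as printed by traceroute.
HOST_RE = re.compile(r"(\S+) \((\d+\.\d+\.\d+\.\d+)\)")


def parse_hop(line):
    """Return (hop number, distinct IP addresses in order of appearance)."""
    hop, rest = line.split(None, 1)
    ips = []
    for _name, ip in HOST_RE.findall(rest):
        if ip not in ips:
            ips.append(ip)
    return int(hop), ips


# Hop 5 copied from the traceroute output shown in this post.
sample = (" 5  he16-1-1.cr1.san110.as2116.net (195.0.244.234)  2.627 ms"
          " he16-1-1.cr2.oslosda310.as2116.net (195.0.244.48)  3.172 ms"
          " he16-1-1.cr1.san110.as2116.net (195.0.244.234)  2.857 ms")
print(parse_hop(sample))  # (5, ['195.0.244.234', '195.0.244.48'])
```

<p>A hop with more than one address in the returned list is a hop
where the probes took different paths.</p>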
<p>There are many ways to measure trace routes. Other good traceroute
implementations I use are traceroute (using ICMP packets), mtr (can do
ICMP, UDP and TCP) and scapy (Python library with ICMP, UDP and TCP
traceroute and a lot of other capabilities). All of them are easily
available in <a href="https://www.debian.org/">Debian</a>.</p>
<p>This time around, I wanted to know the geographic location of
different route points, to visualize how visiting a web page spreads
information about the visit to a lot of servers around the globe. The
background is that a web site today will often ask the browser to
fetch the parts required to display the content (for example HTML,
JSON, fonts, JavaScript, CSS, video) from many servers. This will leak
information about the visit to those controlling these servers and
anyone able to peek at the data traffic passing by (like your ISP, the
ISP's backbone provider, FRA, GCHQ, NSA and others).</p>
<p>Let's pick an example, the Norwegian parliament web site
www.stortinget.no. It is read daily by all members of parliament and
their staff, as well as political journalists, activists and many
other citizens of Norway. A visit to the www.stortinget.no web site
will ask your browser to contact 8 other servers: ajax.googleapis.com,
insights.hotjar.com, script.hotjar.com, static.hotjar.com,
stats.g.doubleclick.net, www.google-analytics.com,
www.googletagmanager.com and www.netigate.se. I extracted this by
asking <a href="http://phantomjs.org/">PhantomJS</a> to visit the
Stortinget web page and tell me all the URLs PhantomJS downloaded to
render the page (in HAR format using
<a href="https://github.com/ariya/phantomjs/blob/master/examples/netsniff.js">their
netsniff example</a>. I am very grateful to Gorm for showing me how
to do this). My goal is to visualize network traces to all IP
addresses behind these DNS names, to show where visitors' personal
information is spread when visiting the page.</p>
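<p>Getting the list of contacted servers out of a HAR capture takes
only a few lines. This is a minimal Python sketch, not the script used
for this post. It relies only on the standard HAR layout
(log.entries[].request.url); the function name third_party_hosts and
the tiny hand-written capture, including its URL paths, are made up
for illustration.</p>

```python
import json
from urllib.parse import urlsplit


def third_party_hosts(har_text, site_host):
    """Return the sorted host names requested in a HAR capture, minus the site itself."""
    har = json.loads(har_text)
    hosts = {urlsplit(entry["request"]["url"]).hostname
             for entry in har["log"]["entries"]}
    hosts.discard(site_host)
    return sorted(h for h in hosts if h)


# Hand-written HAR fragment for illustration, not real capture data.
example_har = json.dumps({"log": {"entries": [
    {"request": {"url": "https://www.stortinget.no/"}},
    {"request": {"url": "https://www.google-analytics.com/analytics.js"}},
    {"request": {"url": "https://script.hotjar.com/modules.js"}},
    {"request": {"url": "https://www.google-analytics.com/collect?v=1"}},
]}})
print(third_party_hosts(example_har, "www.stortinget.no"))
# ['script.hotjar.com', 'www.google-analytics.com']
```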
<p align="center"><a href="www.stortinget.no-geoip.kml"><img
src="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-geoip-small.png" alt="map of combined traces for URLs used by www.stortinget.no using GeoIP"/></a></p>
<p>When I had a look around for options, I could not find any good
free software tools to do this, and decided I needed my own traceroute
wrapper outputting KML based on locations looked up using GeoIP. KML
is easy to work with and easy to generate, and understood by several
of the GIS tools I have available. I got good help from my NUUG
colleague Anders Einar with this, and the result can be seen in
<a href="https://github.com/petterreinholdtsen/kmltraceroute">my
kmltraceroute git repository</a>. Unfortunately, the quality of the
free GeoIP databases I could find (and the for-pay databases my
friends had access to) is not up to the task. The IP addresses of
central Internet infrastructure would typically be placed near the
controlling company's main office, and not where the router is really
located, as you can see from <a href="www.stortinget.no-geoip.kml">the
KML file I created</a> using the GeoLite City dataset from MaxMind.</p>
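<p>To give an idea of how little is needed to generate KML, here is a
minimal Python sketch that turns a list of hop locations into a KML
line. It is not the code from kmltraceroute. The function name
trace_to_kml and the two example coordinates (roughly Oslo and
Stockholm) are made up for illustration, and the GeoIP lookup itself
is left out.</p>

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def trace_to_kml(name, points):
    """Build a KML document with one LineString through (longitude, latitude) points."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element("{%s}kml" % KML_NS)
    doc = ET.SubElement(kml, "{%s}Document" % KML_NS)
    placemark = ET.SubElement(doc, "{%s}Placemark" % KML_NS)
    ET.SubElement(placemark, "{%s}name" % KML_NS).text = name
    line = ET.SubElement(placemark, "{%s}LineString" % KML_NS)
    coords = ET.SubElement(line, "{%s}coordinates" % KML_NS)
    # KML expects "longitude,latitude" tuples separated by whitespace.
    coords.text = " ".join("%f,%f" % (lon, lat) for lon, lat in points)
    return ET.tostring(kml, encoding="unicode")


# Hypothetical hop locations, not real GeoIP lookups.
print(trace_to_kml("example trace", [(10.75, 59.91), (18.07, 59.33)]))
```

<p>Note that KML puts longitude before latitude, which is easy to get
wrong when feeding it data from a location database.</p>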
<p align="center"><a href="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-scapy.svg"><img
src="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-scapy-small.png" alt="scapy traceroute graph for URLs used by www.stortinget.no"/></a></p>
<p>I also had a look at the visual traceroute graph created by
<a href="http://www.secdev.org/projects/scapy/">the scapy project</a>,
showing IP network ownership (aka AS owner) for the IP address in
question.
<a href="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-scapy.svg">The
graph displays a lot of useful information about the traceroute in SVG
format</a>, and gives a good indication of who controls the network
equipment involved, but it does not include geolocation. This graph
makes it possible to see that the information is made available at
least to UNINETT, Catchcom, Stortinget, Nordunet, Google, Amazon,
Telia, Level 3 Communications and NetDNA.</p>
<p align="center"><a href="https://geotraceroute.com/index.php?node=4&host=www.stortinget.no"><img
src="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-geotraceroute-small.png" alt="example geotraceroute view for www.stortinget.no"/></a></p>
<p>In the process, I came across the
<a href="https://geotraceroute.com/">web service GeoTraceroute</a> by
Salim Gasmi. Its methodology of combining guesses based on DNS names
and various location databases, and finally using latency times to
rule out candidate locations, seemed to do a very good job of guessing
the correct geolocation. But it could only do one trace at a time, did
not have a sensor in Norway and did not make the geolocations easily
available for postprocessing. So I contacted the developer and asked
if he would be willing to share the code (he declined until he had
time to clean it up), but he was interested in providing the
geolocations in a machine-readable format, and willing to set up a
sensor in Norway. So since yesterday, it is possible to run traces
from Norway in this service thanks to a sensor node set up by
<a href="https://www.nuug.no/">the NUUG association</a>, and get the
trace in KML format for further processing.</p>
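<p>The idea of using latency to rule out candidate locations can be
illustrated with a little arithmetic: a reply that arrives after a
given round trip time cannot come from further away than light can
travel in half that time. The Python sketch below is my own
illustration of that principle, not GeoTraceroute's code, which I have
not seen. The figure of roughly 200 km per millisecond in optical
fibre, the function names and the candidate list are all
assumptions.</p>

```python
import math

# Assumption: light covers roughly 200 km per millisecond in optical fibre.
KM_PER_MS_IN_FIBRE = 200.0


def distance_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def plausible_locations(sensor, rtt_ms, candidates):
    """Keep the candidates a packet could reach and return from within rtt_ms."""
    max_km = rtt_ms / 2 * KM_PER_MS_IN_FIBRE
    return [name for name, pos in candidates.items()
            if distance_km(sensor, pos) <= max_km]


# Hypothetical candidate locations for one router, seen from a sensor in Oslo.
candidates = {
    "Oslo": (59.91, 10.75),
    "Stockholm": (59.33, 18.07),
    "New York": (40.71, -74.01),
}
print(plausible_locations((59.91, 10.75), 5.0, candidates))
# ['Oslo', 'Stockholm']
```

<p>A round trip time of 5 ms limits the distance to about 500 km, so a
location database entry placing the router in New York can be
discarded. Real paths are longer than the great circle, so the bound
only rules locations out, it never confirms one.</p>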
<p align="center"><a href="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-geotraceroute-kml-join.kml"><img
src="http://people.skolelinux.org/pere/blog/images/2017-01-09-www.stortinget.no-geotraceroute-kml-join.png" alt="map of combined traces for URLs used by www.stortinget.no using geotraceroute"/></a></p>
<p>Here we can see that a lot of traffic passes through Sweden on its
way to Denmark, Germany, Holland and Ireland. Plenty of places where
the Snowden confirmations verified the traffic is read by various
actors without your best interest as their top priority.</p>
<p>Combining KML files is trivial using a text editor, so I could loop
over all the hosts behind the URLs imported by www.stortinget.no, ask
for the KML file from GeoTraceroute, and create a combined KML file
with all the traces (unfortunately only one of the IP addresses behind
each DNS name is traced this time. To get them all, one would have to
request traces using IP numbers instead of DNS names from
GeoTraceroute). That might be the next step in this project.</p>
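<p>The combining step can also be scripted. This is a minimal Python
sketch of merging the Placemark elements from several KML documents
into one. It is not what I used for the map above, the function name
combine_kml and the two small input documents are made up for
illustration, and it assumes the input uses the KML 2.2 namespace.
Shared styles referenced from the placemarks are not copied.</p>

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def combine_kml(documents):
    """Merge the Placemark elements of several KML strings into one KML document."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element("{%s}kml" % KML_NS)
    combined = ET.SubElement(kml, "{%s}Document" % KML_NS)
    for text in documents:
        root = ET.fromstring(text)
        # iter() finds Placemarks at any depth, for example inside Folder elements.
        for placemark in root.iter("{%s}Placemark" % KML_NS):
            combined.append(placemark)
    return ET.tostring(kml, encoding="unicode")


# Two tiny hand-written KML documents standing in for downloaded traces.
trace_a = ('<kml xmlns="%s"><Document><Placemark><name>trace a</name>'
           '</Placemark></Document></kml>' % KML_NS)
trace_b = ('<kml xmlns="%s"><Document><Folder><Placemark><name>trace b</name>'
           '</Placemark></Folder></Document></kml>' % KML_NS)
print(combine_kml([trace_a, trace_b]))
```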
<p>Armed with these tools, I find it a lot easier to figure out where
the IP traffic moves and who controls the boxes involved in moving it.
And every time the link crosses for example the Swedish border, we can
be sure Swedish Signal Intelligence (FRA) is listening, as GCHQ does
in Britain and NSA does in the USA and on cables around the globe.
(Hm, what should we tell them? :) Keep that in mind if you ever send
anything unencrypted over the Internet.</p>
<p>PS: KML files are drawn using
<a href="http://ivanrublev.me/kml/">the KML viewer from Ivan
Rublev</a>, as it was less cluttered than the local Linux application
Marble. There are heaps of other options too.</p>
<p>As usual, if you use Bitcoin and want to show your support of my
activities, please send Bitcoin donations to my address
<b><a href="bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&label=PetterReinholdtsenBlog">15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b</a></b>.</p>
<title>Introducing ical-archiver to split out old iCalendar entries</title>
<link>http://people.skolelinux.org/pere/blog/Introducing_ical_archiver_to_split_out_old_iCalendar_entries.html</link>
<guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Introducing_ical_archiver_to_split_out_old_iCalendar_entries.html</guid>
<pubDate>Wed, 4 Jan 2017 12:20:00 +0100</pubDate>
<description><p>Do you have a large
<a href="https://icalendar.org/">iCalendar</a> file with lots of old
entries, and would like to archive them to save space and resources?
At least those of us using KOrganizer know that turning an event set
on and off becomes slower and slower the more entries are in the set.
While working on migrating our calendars to a
<a href="http://radicale.org/">Radicale CalDAV server</a> on our
<a href="https://freedomboxfoundation.org/">Freedombox server</a>, my
loved one wondered if I could find a way to split up the calendar file
she had in KOrganizer, and I set out to write a tool. I spent a few
days writing and polishing the system, and it is now ready for general
use. The
<a href="https://github.com/petterreinholdtsen/ical-archiver">code for
ical-archiver</a> is publicly available from a git repository on
GitHub. The system is written in Python and depends on
<a href="http://eventable.github.io/vobject/">the vobject Python
module</a>.</p>
<p>To use it, locate the iCalendar file you want to operate on and
give it as an argument to the ical-archiver script. This will
generate a set of new files, one file per component type per year for
all components expiring more than two years in the past. The vevent,
vtodo and vjournal entries are handled by the script. The remaining
entries are stored in a 'remaining' file.</p>
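<p>The splitting rule can be sketched in a few lines of Python. This
is a simplified illustration and not the actual ical-archiver code: it
does not use vobject, it takes components as hypothetical (type, end
date, summary) tuples, and it approximates two years as 730 days. Only
the output file naming is taken from the real tool.</p>

```python
import datetime
from collections import defaultdict


def plan_archive(source, components, today, keep_years=2):
    """Map output file names to the summaries of the components written to them."""
    # Approximation: a year is counted as 365 days.
    cutoff = today - datetime.timedelta(days=365 * keep_years)
    files = defaultdict(list)
    for kind, end_date, summary in components:
        if kind in ("vevent", "vtodo", "vjournal") and end_date < cutoff:
            name = "%s-subset-%s-%d.ics" % (source, kind, end_date.year)
        else:
            name = "%s-remaining.ics" % source
        files[name].append(summary)
    return dict(files)


# Hypothetical calendar content for illustration.
components = [
    ("vevent", datetime.date(2004, 5, 1), "old meeting"),
    ("vevent", datetime.date(2016, 12, 24), "recent meeting"),
    ("vtodo", datetime.date(2012, 3, 1), "old task"),
]
plan = plan_archive("t/2004-2016.ics", components, datetime.date(2017, 1, 4))
for name in sorted(plan):
    print(name, plan[name])
```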
<p>This is what a test run can look like:
<pre>
% ical-archiver t/2004-2016.ics
Writing t/2004-2016.ics-subset-vevent-2004.ics
Writing t/2004-2016.ics-subset-vevent-2005.ics
Writing t/2004-2016.ics-subset-vevent-2006.ics
Writing t/2004-2016.ics-subset-vevent-2007.ics
Writing t/2004-2016.ics-subset-vevent-2008.ics
Writing t/2004-2016.ics-subset-vevent-2009.ics
Writing t/2004-2016.ics-subset-vevent-2010.ics
Writing t/2004-2016.ics-subset-vevent-2011.ics
Writing t/2004-2016.ics-subset-vevent-2012.ics
Writing t/2004-2016.ics-subset-vevent-2013.ics
Writing t/2004-2016.ics-subset-vevent-2014.ics
Writing t/2004-2016.ics-subset-vjournal-2007.ics
Writing t/2004-2016.ics-subset-vjournal-2011.ics
Writing t/2004-2016.ics-subset-vtodo-2012.ics
Writing t/2004-2016.ics-remaining.ics
</pre></p>
<p>As you can see, the original file is untouched and new files are
written with names derived from the original file. If you are happy
with their content, the *-remaining.ics file can replace the original
and the others can be archived or imported as historical calendar
collections.</p>
<p>The script should probably be improved a bit. The error handling
when discovering broken entries is not good, and I am not sure yet if
it makes sense to split different entry types into separate files or
not. The program is thus likely to change. If you find it
interesting, please get in touch. :)</p>
<p>As usual, if you use Bitcoin and want to show your support of my
activities, please send Bitcoin donations to my address
<b><a href="bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&label=PetterReinholdtsenBlog">15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b</a></b>.</p>