X-Git-Url: http://pere.pagekite.me/gitweb/homepage.git/blobdiff_plain/5555b9e78965484460c4764b91c11fb8d75e0cf1..b279793fc3681f16b43cc7dba38f0d6f4e22d247:/blog/index.html diff --git a/blog/index.html b/blog/index.html index 1a839e55ab..133f41dd38 100644 --- a/blog/index.html +++ b/blog/index.html @@ -19,6 +19,451 @@ +
+
Release 0.1.1 of free software archive system Nikita announced
+
10th June 2017
+

I am very happy to report that the +Nikita Noark 5 +core project tagged its second release today. The free software +solution is an implementation of the Norwegian archive standard Noark +5 used by government offices in Norway. These were the changes in +version 0.1.1 since version 0.1.0 (from NEWS.md): + +

+ +

If this sound interesting to you, please contact us on IRC (#nikita +on irc.freenode.net) or email +(nikita-noark +mailing list).

+
+
+ + + Tags: english, nuug, offentlig innsyn, standard. + + +
+
+
+ +
+
Idea for storing trusted timestamps in a Noark 5 archive
+
7th June 2017
+

This is a copy of +an +email I posted to the nikita-noark mailing list. Please follow up +there if you would like to discuss this topic. The background is that +we are making a free software archive system based on the Norwegian +Noark +5 standard for government archives.

+ +

I've been wondering a bit lately how trusted timestamps could be +stored in Noark 5. +Trusted +timestamps can be used to verify that some information +(document/file/checksum/metadata) have not been changed since a +specific time in the past. This is useful to verify the integrity of +the documents in the archive.

+ +

Then it occured to me, perhaps the trusted timestamps could be +stored as dokument variants (ie dokumentobjekt referered to from +dokumentbeskrivelse) with the filename set to the hash it is +stamping?

+ +

Given a "dokumentbeskrivelse" with an associated "dokumentobjekt", +a new dokumentobjekt is associated with "dokumentbeskrivelse" with the +same attributes as the stamped dokumentobjekt except these +attributes:

+ + + +

This assume a service following +IETF RFC 3161 is +used, which specifiy the given MIME type for replies and the .tsr file +ending for the content of such trusted timestamp. As far as I can +tell from the Noark 5 specifications, it is OK to have several +variants/renderings of a dokument attached to a given +dokumentbeskrivelse objekt. It might be stretching it a bit to make +some of these variants represent crypto-signatures useful for +verifying the document integrity instead of representing the dokument +itself.

+ +

Using the source of the service in formatDetaljer allow several +timestamping services to be used. This is useful to spread the risk +of key compromise over several organisations. It would only be a +problem to trust the timestamps if all of the organisations are +compromised.

+ +

The following oneliner on Linux can be used to generate the tsr +file. $input is the path to the file to checksum, and $sha256 is the +SHA-256 checksum of the file (ie the ".tsr" value mentioned +above).

+ +

+openssl ts -query -data "$inputfile" -cert -sha256 -no_nonce \
+  | curl -s -H "Content-Type: application/timestamp-query" \
+      --data-binary "@-" http://zeitstempel.dfn.de > $sha256.tsr
+

+ +

To verify the timestamp, you first need to download the public key +of the trusted timestamp service, for example using this command:

+ +

+wget -O ca-cert.txt \
+  https://pki.pca.dfn.de/global-services-ca/pub/cacert/chain.txt
+

+ +

Note, the public key should be stored alongside the timestamps in +the archive to make sure it is also available 100 years from now. It +is probably a good idea to standardise how and were to store such +public keys, to make it easier to find for those trying to verify +documents 100 or 1000 years from now. :)

+ +

The verification itself is a simple openssl command:

+ +

+openssl ts -verify -data $inputfile -in $sha256.tsr \
+  -CAfile ca-cert.txt -text
+

+ +

Is there any reason this approach would not work? Is it somehow against +the Noark 5 specification?

+
+
+ + + Tags: english, offentlig innsyn, standard. + + +
+
+
+ +
+
NÃ¥r nynorskoversettelsen svikter til eksamen...
+
3rd June 2017
+

Aftenposten +melder i dag om feil i eksamensoppgavene for eksamen i politikk og +menneskerettigheter, der teksten i bokmåls og nynorskutgaven ikke var +like. Oppgaveteksten er gjengitt i artikkelen, og jeg ble nysgjerring +på om den fri oversetterløsningen +Apertium ville gjort en bedre +jobb enn Utdanningsdirektoratet. Det kan se slik ut.

+ +

Her er bokmålsoppgaven fra eksamenen:

+ +
+

Drøft utfordringene knyttet til nasjonalstatenes og andre aktørers +rolle og muligheter til å håndtere internasjonale utfordringer, som +for eksempel flykningekrisen.

+ +

Vedlegge er eksempler på tekster som kan gi relevante perspektiver +på temaet:

+
    +
  1. Flykningeregnskapet 2016, UNHCR og IDMC +
  2. «Grenseløst Europa for fall» A-Magasinet, 26. november 2015 +
+ +
+ +

Dette oversetter Apertium slik:

+ +
+

Drøft utfordringane knytte til nasjonalstatane sine og rolla til +andre aktørar og høve til å handtera internasjonale utfordringar, som +til dømes *flykningekrisen.

+ +

Vedleggja er døme på tekster som kan gje relevante perspektiv på +temaet:

+ +
    +
  1. *Flykningeregnskapet 2016, *UNHCR og *IDMC
  2. +
  3. «*Grenseløst Europa for fall» A-Magasinet, 26. november 2015
  4. +
+ +
+ +

Ord som ikke ble forstått er markert med stjerne (*), og trenger +ekstra språksjekk. Men ingen ord er forsvunnet, slik det var i +oppgaven elevene fikk presentert på eksamen. Jeg mistenker dog at +"andre aktørers rolle og muligheter til ..." burde vært oversatt til +"rolla til andre aktørar og deira høve til ..." eller noe slikt, men +det er kanskje flisespikking. Det understreker vel bare at det alltid +trengs korrekturlesning etter automatisk oversettelse.

+
+
+ + + Tags: debian, norsk, stavekontroll. + + +
+
+
+ +
+
Epost inn som arkivformat i Riksarkivarens forskrift?
+
27th April 2017
+

I disse dager, med frist 1. mai, har Riksarkivaren ute en høring på +sin forskrift. Som en kan se er det ikke mye tid igjen før fristen +som går ut på søndag. Denne forskriften er det som lister opp hvilke +formater det er greit å arkivere i +Noark +5-løsninger i Norge.

+ +

Jeg fant høringsdokumentene hos +Norsk +Arkivråd etter å ha blitt tipset på epostlisten til +fri +programvareprosjektet Nikita Noark5-Core, som lager et Noark 5 +Tjenestegresesnitt. Jeg er involvert i Nikita-prosjektet og takket +være min interesse for tjenestegrensesnittsprosjektet har jeg lest en +god del Noark 5-relaterte dokumenter, og til min overraskelse oppdaget +at standard epost ikke er på listen over godkjente formater som kan +arkiveres. Høringen med frist søndag er en glimrende mulighet til å +forsøke å gjøre noe med det. Jeg holder på med +egen +høringsuttalelse, og lurer på om andre er interessert i å støtte +forslaget om å tillate arkivering av epost som epost i arkivet.

+ +

Er du igang med å skrive egen høringsuttalelse allerede? I så fall +kan du jo vurdere å ta med en formulering om epost-lagring. Jeg tror +ikke det trengs så mye. Her et kort forslag til tekst:

+ +

+ +

Viser til høring sendt ut 2017-02-17 (Riksarkivarens referanse + 2016/9840 HELHJO), og tillater oss å sende inn noen innspill om + revisjon av Forskrift om utfyllende tekniske og arkivfaglige + bestemmelser om behandling av offentlige arkiver (Riksarkivarens + forskrift).

+ +

Svært mye av vår kommuikasjon foregår i dag på e-post.  Vi + foreslår derfor at Internett-e-post, slik det er beskrevet i IETF + RFC 5322, + https://tools.ietf.org/html/rfc5322. bør + inn som godkjent dokumentformat.  Vi foreslår at forskriftens + oversikt over godkjente dokumentformater ved innlevering i § 5-16 + endres til å ta med Internett-e-post.

+ +

+ +

Som del av arbeidet med tjenestegrensesnitt har vi testet hvordan +epost kan lagres i en Noark 5-struktur, og holder på å skrive et +forslag om hvordan dette kan gjøres som vil bli sendt over til +arkivverket så snart det er ferdig. De som er interesserte kan +følge +fremdriften på web.

+ +

Oppdatering 2017-04-28: I dag ble høringuttalelsen jeg skrev + sendt + inn av foreningen NUUG.

+
+
+ + + Tags: norsk, offentlig innsyn, standard. + + +
+
+
+ +
+
Offentlig elektronisk postjournal blokkerer tilgang for utvalgte webklienter
+
20th April 2017
+

Jeg oppdaget i dag at nettstedet som +publiserer offentlige postjournaler fra statlige etater, OEP, har +begynt å blokkerer enkelte typer webklienter fra å få tilgang. Vet +ikke hvor mange det gjelder, men det gjelder i hvert fall libwww-perl +og curl. For å teste selv, kjør følgende:

+ +
+% curl -v -s https://www.oep.no/pub/report.xhtml?reportId=3 2>&1 |grep '< HTTP'
+< HTTP/1.1 404 Not Found
+% curl -v -s --header 'User-Agent:Opera/12.0' https://www.oep.no/pub/report.xhtml?reportId=3 2>&1 |grep '< HTTP'
+< HTTP/1.1 200 OK
+%
+
+ +

Her kan en se at tjenesten gir «404 Not Found» for curl i +standardoppsettet, mens den gir «200 OK» hvis curl hevder å være Opera +versjon 12.0. Offentlig elektronisk postjournal startet blokkeringen +2017-03-02.

+ +

Blokkeringen vil gjøre det litt vanskeligere å maskinelt hente +informasjon fra oep.no. Kan blokkeringen være gjort for å hindre +automatisert innsamling av informasjon fra OEP, slik Pressens +Offentlighetsutvalg gjorde for å dokumentere hvordan departementene +hindrer innsyn i +rapporten +«Slik hindrer departementer innsyn» som ble publiserte i januar +2017. Det virker usannsynlig, da det jo er trivielt å bytte +User-Agent til noe nytt.

+ +

Finnes det juridisk grunnlag for det offentlige å diskriminere +webklienter slik det gjøres her? Der tilgang gis eller ikke alt etter +hva klienten sier at den heter? Da OEP eies av DIFI og driftes av +Basefarm, finnes det kanskje noen dokumenter sendt mellom disse to +aktørene man kan be om innsyn i for å forstå hva som har skjedd. Men +postjournalen +til DIFI viser kun to dokumenter det siste året mellom DIFI og +Basefarm. +Mimes brønn neste, +tenker jeg.

+
+
+ + + Tags: norsk, offentlig innsyn. + + +
+
+
+ +
+
Free software archive system Nikita now able to store documents
+
19th March 2017
+

The Nikita +Noark 5 core project is implementing the Norwegian standard for +keeping an electronic archive of government documents. +The +Noark 5 standard document the requirement for data systems used by +the archives in the Norwegian government, and the Noark 5 web interface +specification document a REST web service for storing, searching and +retrieving documents and metadata in such archive. I've been involved +in the project since a few weeks before Christmas, when the Norwegian +Unix User Group +announced +it supported the project. I believe this is an important project, +and hope it can make it possible for the government archives in the +future to use free software to keep the archives we citizens depend +on. But as I do not hold such archive myself, personally my first use +case is to store and analyse public mail journal metadata published +from the government. I find it useful to have a clear use case in +mind when developing, to make sure the system scratches one of my +itches.

+ +

If you would like to help make sure there is a free software +alternatives for the archives, please join our IRC channel +(#nikita on +irc.freenode.net) and +the +project mailing list.

+ +

When I got involved, the web service could store metadata about +documents. But a few weeks ago, a new milestone was reached when it +became possible to store full text documents too. Yesterday, I +completed an implementation of a command line tool +archive-pdf to upload a PDF file to the archive using this +API. The tool is very simple at the moment, and find existing +fonds, series and +files while asking the user to select which one to use if more than +one exist. Once a file is identified, the PDF is associated with the +file and uploaded, using the title extracted from the PDF itself. The +process is fairly similar to visiting the archive, opening a cabinet, +locating a file and storing a piece of paper in the archive. Here is +a test run directly after populating the database with test data using +our API tester:

+ +

+~/src//noark5-tester$ ./archive-pdf mangelmelding/mangler.pdf
+using arkiv: Title of the test fonds created 2017-03-18T23:49:32.103446
+using arkivdel: Title of the test series created 2017-03-18T23:49:32.103446
+
+ 0 - Title of the test case file created 2017-03-18T23:49:32.103446
+ 1 - Title of the test file created 2017-03-18T23:49:32.103446
+Select which mappe you want (or search term): 0
+Uploading mangelmelding/mangler.pdf
+  PDF title: Mangler i spesifikasjonsdokumentet for NOARK 5 Tjenestegrensesnitt
+  File 2017/1: Title of the test case file created 2017-03-18T23:49:32.103446
+~/src//noark5-tester$
+

+ +

You can see here how the fonds (arkiv) and serie (arkivdel) only had +one option, while the user need to choose which file (mappe) to use +among the two created by the API tester. The archive-pdf +tool can be found in the git repository for the API tester.

+ +

In the project, I have been mostly working on +the API +tester so far, while getting to know the code base. The API +tester currently use +the HATEOAS links +to traverse the entire exposed service API and verify that the exposed +operations and objects match the specification, as well as trying to +create objects holding metadata and uploading a simple XML file to +store. The tester has proved very useful for finding flaws in our +implementation, as well as flaws in the reference site and the +specification.

+ +

The test document I uploaded is a summary of all the specification +defects we have collected so far while implementing the web service. +There are several unclear and conflicting parts of the specification, +and we have +started +writing down the questions we get from implementing it. We use a +format inspired by how The +Austin Group collect defect reports for the POSIX standard with +their +instructions for the MANTIS defect tracker system, in lack of an official way to structure defect reports for Noark 5 (our first submitted defect report was a request for a procedure for submitting defect reports :). + +

The Nikita project is implemented using Java and Spring, and is +fairly easy to get up and running using Docker containers for those +that want to test the current code base. The API tester is +implemented in Python.

+
+
+ + + Tags: english, nuug, offentlig innsyn, standard. + + +
+
+
+
Detecting NFS hangs on Linux without hanging yourself...
9th March 2017
@@ -291,456 +736,6 @@ post.

-
-
Detect OOXML files with undefined behaviour?
-
21st February 2017
-

I just noticed -the -new Norwegian proposal for archiving rules in the goverment list -ECMA-376 -/ ISO/IEC 29500 (aka OOXML) as valid formats to put in long term -storage. Luckily such files will only be accepted based on -pre-approval from the National Archive. Allowing OOXML files to be -used for long term storage might seem like a good idea as long as we -forget that there are plenty of ways for a "valid" OOXML document to -have content with no defined interpretation in the standard, which -lead to a question and an idea.

- -

Is there any tool to detect if a OOXML document depend on such -undefined behaviour? It would be useful for the National Archive (and -anyone else interested in verifying that a document is well defined) -to have such tool available when considering to approve the use of -OOXML. I'm aware of the -officeotron OOXML -validator, but do not know how complete it is nor if it will -report use of undefined behaviour. Are there other similar tools -available? Please send me an email if you know of any such tool.

-
-
- - - Tags: english, nuug, standard. - - -
-
-
- -
-
Ruling ignored our objections to the seizure of popcorn-time.no (#domstolkontroll)
-
13th February 2017
-

A few days ago, we received the ruling from -my -day in court. The case in question is a challenge of the seizure -of the DNS domain popcorn-time.no. The ruling simply did not mention -most of our arguments, and seemed to take everything ØKOKRIM said at -face value, ignoring our demonstration and explanations. But it is -hard to tell for sure, as we still have not seen most of the documents -in the case and thus were unprepared and unable to contradict several -of the claims made in court by the opposition. We are considering an -appeal, but it is partly a question of funding, as it is costing us -quite a bit to pay for our lawyer. If you want to help, please -donate to the -NUUG defense fund.

- -

The details of the case, as far as we know it, is available in -Norwegian from -the NUUG -blog. This also include -the -ruling itself.

-
-
- - - Tags: english, nuug, offentlig innsyn, opphavsrett. - - -
-
-
- -
-
A day in court challenging seizure of popcorn-time.no for #domstolkontroll
-
3rd February 2017
-

- -

On Wednesday, I spent the entire day in court in Follo Tingrett -representing the member association -NUUG, alongside the member -association EFN and the DNS registrar -IMC, challenging the seizure of the DNS name popcorn-time.no. It -was interesting to sit in a court of law for the first time in my -life. Our team can be seen in the picture above: attorney Ola -Tellesbø, EFN board member Tom Fredrik Blenning, IMC CEO Morten Emil -Eriksen and NUUG board member Petter Reinholdtsen.

- -

The -case at hand is that the Norwegian National Authority for -Investigation and Prosecution of Economic and Environmental Crime (aka -Økokrim) decided on their own, to seize a DNS domain early last -year, without following -the -official policy of the Norwegian DNS authority which require a -court decision. The web site in question was a site covering Popcorn -Time. And Popcorn Time is the name of a technology with both legal -and illegal applications. Popcorn Time is a client combining -searching a Bittorrent directory available on the Internet with -downloading/distribute content via Bittorrent and playing the -downloaded content on screen. It can be used illegally if it is used -to distribute content against the will of the right holder, but it can -also be used legally to play a lot of content, for example the -millions of movies -available from the -Internet Archive or the collection -available from Vodo. We created -a -video demonstrating legally use of Popcorn Time and played it in -Court. It can of course be downloaded using Bittorrent.

- -

I did not quite know what to expect from a day in court. The -government held on to their version of the story and we held on to -ours, and I hope the judge is able to make sense of it all. We will -know in two weeks time. Unfortunately I do not have high hopes, as -the Government have the upper hand here with more knowledge about the -case, better training in handling criminal law and in general higher -standing in the courts than fairly unknown DNS registrar and member -associations. It is expensive to be right also in Norway. So far the -case have cost more than NOK 70 000,-. To help fund the case, NUUG -and EFN have asked for donations, and managed to collect around NOK 25 -000,- so far. Given the presentation from the Government, I expect -the government to appeal if the case go our way. And if the case do -not go our way, I hope we have enough funding to appeal.

- -

From the other side came two people from Økokrim. On the benches, -appearing to be part of the group from the government were two people -from the Simonsen Vogt Wiik lawyer office, and three others I am not -quite sure who was. Økokrim had proposed to present two witnesses -from The Motion Picture Association, but this was rejected because -they did not speak Norwegian and it was a bit late to bring in a -translator, but perhaps the two from MPA were present anyway. All -seven appeared to know each other. Good to see the case is take -seriously.

- -

If you, like me, believe the courts should be involved before a DNS -domain is hijacked by the government, or you believe the Popcorn Time -technology have a lot of useful and legal applications, I suggest you -too donate to -the NUUG defense fund. Both Bitcoin and bank transfer are -available. If NUUG get more than we need for the legal action (very -unlikely), the rest will be spend promoting free software, open -standards and unix-like operating systems in Norway, so no matter what -happens the money will be put to good use.

- -

If you want to lean more about the case, I recommend you check out -the blog -posts from NUUG covering the case. They cover the legal arguments -on both sides.

-
-
- - - Tags: english, nuug, offentlig innsyn, opphavsrett. - - -
-
-
- -
-
Nasjonalbiblioteket avslutter sin ulovlige bruk av Google Skjemaer
-
12th January 2017
-

I dag fikk jeg en skikkelig gladmelding. Bakgrunnen er at før jul -arrangerte Nasjonalbiblioteket -et -seminar om sitt knakende gode tiltak «verksregister». Eneste -måten å melde seg på dette seminaret var å sende personopplysninger -til Google via Google Skjemaer. Dette syntes jeg var tvilsom praksis, -da det bør være mulig å delta på seminarer arrangert av det offentlige -uten å måtte dele sine interesser, posisjon og andre -personopplysninger med Google. Jeg ba derfor om innsyn via -Mimes brønn i -avtaler -og vurderinger Nasjonalbiblioteket hadde rundt dette. -Personopplysningsloven legger klare rammer for hva som må være på -plass før en kan be tredjeparter, spesielt i utlandet, behandle -personopplysninger på sine vegne, så det burde eksistere grundig -dokumentasjon før noe slikt kan bli lovlig. To jurister hos -Nasjonalbiblioteket mente først dette var helt i orden, og at Googles -standardavtale kunne brukes som databehandlingsavtale. Det syntes jeg -var merkelig, men har ikke hatt kapasitet til å følge opp saken før -for to dager siden.

- -

Gladnyheten i dag, som kom etter at jeg tipset Nasjonalbiblioteket -om at Datatilsynet underkjente Googles standardavtaler som -databehandleravtaler i 2011, er at Nasjonalbiblioteket har bestemt seg -for å avslutte bruken av Googles Skjemaer/Apps og gå i dialog med DIFI -for å finne bedre måter å håndtere påmeldinger i tråd med -personopplysningsloven. Det er fantastisk å se at av og til hjelper -det å spørre hva i alle dager det offentlige holder på med.

-
-
- - - Tags: norsk, personvern, surveillance, web. - - -
-
-
- -
-
Bryter NAV sin egen personvernerklæring?
-
11th January 2017
-

Jeg leste med interesse en nyhetssak hos -digi.no -og -NRK -om at det ikke bare er meg, men at også NAV bedriver geolokalisering -av IP-adresser, og at det gjøres analyse av IP-adressene til de som -sendes inn meldekort for å se om meldekortet sendes inn fra -utenlandske IP-adresser. Politiadvokat i Drammen, Hans Lyder Haare, -er sitert i NRK på at «De to er jo blant annet avslørt av -IP-adresser. At man ser at meldekortet kommer fra utlandet.»

- -

Jeg synes det er fint at det blir bedre kjent at IP-adresser -knyttes til enkeltpersoner og at innsamlet informasjon brukes til å -stedsbestemme personer også av aktører her i Norge. Jeg ser det som -nok et argument for å bruke -Tor så mye som mulig for å -gjøre gjøre IP-lokalisering vanskeligere, slik at en kan beskytte sin -privatsfære og unngå å dele sin fysiske plassering med -uvedkommede.

- -

Men det er en ting som bekymrer meg rundt denne nyheten. Jeg ble -tipset (takk #nuug) om -NAVs -personvernerklæring, som under punktet «Personvern og statistikk» -lyder:

- -

- -

«Når du besøker nav.no, etterlater du deg elektroniske spor. Sporene -dannes fordi din nettleser automatisk sender en rekke opplysninger til -NAVs tjener (server-maskin) hver gang du ber om å få vist en side. Det -er eksempelvis opplysninger om hvilken nettleser og -versjon du -bruker, og din internettadresse (ip-adresse). For hver side som vises, -lagres følgende opplysninger:

- -
    -
  • hvilken side du ser pÃ¥
  • -
  • dato og tid
  • -
  • hvilken nettleser du bruker
  • -
  • din ip-adresse
  • -
- -

Ingen av opplysningene vil bli brukt til å identifisere -enkeltpersoner. NAV bruker disse opplysningene til å generere en -samlet statistikk som blant annet viser hvilke sider som er mest -populære. Statistikken er et redskap til å forbedre våre -tjenester.»

- -

- -

Jeg klarer ikke helt å se hvordan analyse av de besøkendes -IP-adresser for å se hvem som sender inn meldekort via web fra en -IP-adresse i utlandet kan gjøres uten å komme i strid med påstanden om -at «ingen av opplysningene vil bli brukt til å identifisere -enkeltpersoner». Det virker dermed for meg som at NAV bryter sine -egen personvernerklæring, hvilket -Datatilsynet -fortalte meg i starten av desember antagelig er brudd på -personopplysningsloven. - -

I tillegg er personvernerklæringen ganske misvisende i og med at -NAVs nettsider ikke bare forsyner NAV med personopplysninger, men i -tillegg ber brukernes nettleser kontakte fem andre nettjenere -(script.hotjar.com, static.hotjar.com, vars.hotjar.com, -www.google-analytics.com og www.googletagmanager.com), slik at -personopplysninger blir gjort tilgjengelig for selskapene Hotjar og -Google , og alle som kan lytte på trafikken på veien (som FRA, GCHQ og -NSA). Jeg klarer heller ikke se hvordan slikt spredning av -personopplysninger kan være i tråd med kravene i -personopplysningloven, eller i tråd med NAVs personvernerklæring.

- -

Kanskje NAV bør ta en nøye titt på sin personvernerklæring? Eller -kanskje Datatilsynet bør gjøre det?

-
-
- - - Tags: norsk, nuug, personvern, surveillance. - - -
-
-
- -
-
Where did that package go? — geolocated IP traceroute
-
9th January 2017
-

Did you ever wonder where the web trafic really flow to reach the -web servers, and who own the network equipment it is flowing through? -It is possible to get a glimpse of this from using traceroute, but it -is hard to find all the details. Many years ago, I wrote a system to -map the Norwegian Internet (trying to figure out if our plans for a -network game service would get low enough latency, and who we needed -to talk to about setting up game servers close to the users. Back -then I used traceroute output from many locations (I asked my friends -to run a script and send me their traceroute output) to create the -graph and the map. The output from traceroute typically look like -this: - -

-traceroute to www.stortinget.no (85.88.67.10), 30 hops max, 60 byte packets
- 1  uio-gw10.uio.no (129.240.202.1)  0.447 ms  0.486 ms  0.621 ms
- 2  uio-gw8.uio.no (129.240.24.229)  0.467 ms  0.578 ms  0.675 ms
- 3  oslo-gw1.uninett.no (128.39.65.17)  0.385 ms  0.373 ms  0.358 ms
- 4  te3-1-2.br1.fn3.as2116.net (193.156.90.3)  1.174 ms  1.172 ms  1.153 ms
- 5  he16-1-1.cr1.san110.as2116.net (195.0.244.234)  2.627 ms he16-1-1.cr2.oslosda310.as2116.net (195.0.244.48)  3.172 ms he16-1-1.cr1.san110.as2116.net (195.0.244.234)  2.857 ms
- 6  ae1.ar8.oslosda310.as2116.net (195.0.242.39)  0.662 ms  0.637 ms ae0.ar8.oslosda310.as2116.net (195.0.242.23)  0.622 ms
- 7  89.191.10.146 (89.191.10.146)  0.931 ms  0.917 ms  0.955 ms
- 8  * * *
- 9  * * *
-[...]
-

- -

This show the DNS names and IP addresses of (at least some of the) -network equipment involved in getting the data traffic from me to the -www.stortinget.no server, and how long it took in milliseconds for a -package to reach the equipment and return to me. Three packages are -sent, and some times the packages do not follow the same path. This -is shown for hop 5, where three different IP addresses replied to the -traceroute request.

- -

There are many ways to measure trace routes. Other good traceroute -implementations I use are traceroute (using ICMP packages) mtr (can do -both ICMP, UDP and TCP) and scapy (python library with ICMP, UDP, TCP -traceroute and a lot of other capabilities). All of them are easily -available in Debian.

- -

This time around, I wanted to know the geographic location of -different route points, to visualize how visiting a web page spread -information about the visit to a lot of servers around the globe. The -background is that a web site today often will ask the browser to get -from many servers the parts (for example HTML, JSON, fonts, -JavaScript, CSS, video) required to display the content. This will -leak information about the visit to those controlling these servers -and anyone able to peek at the data traffic passing by (like your ISP, -the ISPs backbone provider, FRA, GCHQ, NSA and others).

- -

Lets pick an example, the Norwegian parliament web site -www.stortinget.no. It is read daily by all members of parliament and -their staff, as well as political journalists, activits and many other -citizens of Norway. A visit to the www.stortinget.no web site will -ask your browser to contact 8 other servers: ajax.googleapis.com, -insights.hotjar.com, script.hotjar.com, static.hotjar.com, -stats.g.doubleclick.net, www.google-analytics.com, -www.googletagmanager.com and www.netigate.se. I extracted this by -asking PhantomJS to visit the -Stortinget web page and tell me all the URLs PhantomJS downloaded to -render the page (in HAR format using -their -netsniff example. I am very grateful to Gorm for showing me how -to do this). My goal is to visualize network traces to all IP -addresses behind these DNS names, do show where visitors personal -information is spread when visiting the page.

- -

map of combined traces for URLs used by www.stortinget.no using GeoIP

- -

When I had a look around for options, I could not find any good -free software tools to do this, and decided I needed my own traceroute -wrapper outputting KML based on locations looked up using GeoIP. KML -is easy to work with and easy to generate, and understood by several -of the GIS tools I have available. I got good help from by NUUG -colleague Anders Einar with this, and the result can be seen in -my -kmltraceroute git repository. Unfortunately, the quality of the -free GeoIP databases I could find (and the for-pay databases my -friends had access to) is not up to the task. The IP addresses of -central Internet infrastructure would typically be placed near the -controlling companies main office, and not where the router is really -located, as you can see from the -KML file I created using the GeoLite City dataset from MaxMind. - -

scapy traceroute graph for URLs used by www.stortinget.no

- -

I also had a look at the visual traceroute graph created by -the scrapy project, -showing IP network ownership (aka AS owner) for the IP address in -question. -The -graph display a lot of useful information about the traceroute in SVG -format, and give a good indication on who control the network -equipment involved, but it do not include geolocation. This graph -make it possible to see the information is made available at least for -UNINETT, Catchcom, Stortinget, Nordunet, Google, Amazon, Telia, Level -3 Communications and NetDNA.

- -

example geotraceroute view for www.stortinget.no

- -

In the process, I came across the -web service GeoTraceroute by -Salim Gasmi. Its methology of combining guesses based on DNS names, -various location databases and finally use latecy times to rule out -candidate locations seemed to do a very good job of guessing correct -geolocation. But it could only do one trace at the time, did not have -a sensor in Norway and did not make the geolocations easily available -for postprocessing. So I contacted the developer and asked if he -would be willing to share the code (he refused until he had time to -clean it up), but he was interested in providing the geolocations in a -machine readable format, and willing to set up a sensor in Norway. So -since yesterday, it is possible to run traces from Norway in this -service thanks to a sensor node set up by -the NUUG assosiation, and get the -trace in KML format for further processing.

- -

map of combined traces for URLs used by www.stortinget.no using geotraceroute

- -

Here we can see a lot of trafic passes Sweden on its way to -Denmark, Germany, Holland and Ireland. Plenty of places where the -Snowden confirmations verified the traffic is read by various actors -without your best interest as their top priority.

- -

Combining KML files is trivial using a text editor, so I could loop -over all the hosts behind the urls imported by www.stortinget.no and -ask for the KML file from GeoTraceroute, and create a combined KML -file with all the traces (unfortunately only one of the IP addresses -behind the DNS name is traced this time. To get them all, one would -have to request traces using IP number instead of DNS names from -GeoTraceroute). That might be the next step in this project.

- -

Armed with these tools, I find it a lot easier to figure out where -the IP traffic moves and who control the boxes involved in moving it. -And every time the link crosses for example the Swedish border, we can -be sure Swedish Signal Intelligence (FRA) is listening, as GCHQ do in -Britain and NSA in USA and cables around the globe. (Hm, what should -we tell them? :) Keep that in mind if you ever send anything -unencrypted over the Internet.

- -

PS: KML files are drawn using -the KML viewer from Ivan -Rublev, as it was less cluttered than the local Linux application -Marble. There are heaps of other options too.

- -

As usual, if you use Bitcoin and want to show your support of my -activities, please send Bitcoin donations to my address -15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

-
-
- - - Tags: debian, english, kart, nuug, personvern, stortinget, surveillance, web. - - -
-
-
-

RSS feed