X-Git-Url: http://pere.pagekite.me/gitweb/homepage.git/blobdiff_plain/feb473cda9dc7b09ddc973f7b10e8ee0d15f60ee..8fb691804566ab7f685456448d17c8a6ba0c0fa4:/blog/index.html diff --git a/blog/index.html b/blog/index.html index 4bc327ffb7..ce3ec49f95 100644 --- a/blog/index.html +++ b/blog/index.html @@ -20,63 +20,72 @@
-
Email as an archive format in the National Archivist's regulations?
-
27th April 2017
-

These days, with a deadline of May 1st, the National Archivist has a public consultation out on his regulations. As you can see, there is not much time left before the deadline, which expires on Sunday. These regulations are what list the formats that are acceptable for archiving in Noark 5 solutions in Norway.

- -

I found the consultation documents at Norsk Arkivråd after being tipped off on the mailing list of the free software project Nikita Noark5-Core, which is building a Noark 5 Tjenestegrensesnitt (web service API). I am involved in the Nikita project, and thanks to my interest in the API project I have read quite a few Noark 5 related documents, and discovered to my surprise that standard email is not on the list of approved formats that can be archived. The consultation with its Sunday deadline is an excellent opportunity to try to do something about that. I am working on my own consultation response, and wonder if others are interested in supporting the proposal to allow archiving email as email in the archive.

- -

Are you already writing your own consultation response? If so, consider including a statement about email storage. I do not think much is needed. Here is a short proposed text:

- -

- -

    We refer to the consultation sent out 2017-02-17 (the National Archivist's reference 2016/9840 HELHJO), and take the liberty of submitting some input on the revision of the Regulations on supplementary technical and archival provisions on the handling of public archives (the National Archivist's regulations).

- -

    A very large share of our communication today takes place over email. We therefore propose that Internet email, as described in IETF RFC 5322, https://tools.ietf.org/html/rfc5322, be included as an approved document format. We propose that the regulations' list of approved document formats on submission in § 5-16 be changed to include Internet email.

- -

- -

As part of the work on the API, we have tested how email can be stored in a Noark 5 structure, and are writing a proposal for how this can be done, which will be sent to the National Archives as soon as it is finished. Those interested can follow the progress on the web.

+ +
18th November 2017
+

A month ago, I blogged about my work to automatically check the copyright status of IMDB entries, and to try to count the number of movies listed in IMDB that are legal to distribute on the Internet. I have continued to look for good data sources, and identified a few more. The code used to extract information from various data sources is available in a git repository, currently available from github.

+ +

So far I have identified 3186 unique IMDB title IDs. To gain a better understanding of the structure of the data set, I created a histogram of the year associated with each movie (typically the release year). It is interesting to notice where the peaks and dips in the graph are located, and I wonder why they are placed there. I suspect World War II caused the dip around 1940, but what caused the peak around 2010?

+ +

+ +

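For anyone wanting to reproduce such a histogram from the JSON data files in the repository, a minimal Python sketch follows. The file and key names used here are illustrative assumptions, not the repository's actual schema:

import json
from collections import Counter

# Tally movies per year across the JSON data files (file and key
# names are assumptions for illustration).
years = Counter()
for name in ["free-movies-imdb-pd.json", "free-movies-manual.json"]:
    with open(name) as f:
        for entry in json.load(f):
            if entry.get("year"):
                years[int(entry["year"])] += 1

# Print a simple text histogram, sorted by year.
for year in sorted(years):
    print("%4d %s" % (year, "*" * years[year]))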
I've so far identified ten sources for IMDB title IDs for movies in the public domain or with a free license. These are the statistics reported when running 'make stats' in the git repository:

+ +
+  249 entries (    6 unique) with and   288 without IMDB title ID in free-movies-archive-org-butter.json
+ 2301 entries (  540 unique) with and     0 without IMDB title ID in free-movies-archive-org-wikidata.json
+  830 entries (   29 unique) with and     0 without IMDB title ID in free-movies-icheckmovies-archive-mochard.json
+ 2109 entries (  377 unique) with and     0 without IMDB title ID in free-movies-imdb-pd.json
+  291 entries (  122 unique) with and     0 without IMDB title ID in free-movies-letterboxd-pd.json
+  144 entries (  135 unique) with and     0 without IMDB title ID in free-movies-manual.json
+  350 entries (    1 unique) with and   801 without IMDB title ID in free-movies-publicdomainmovies.json
+    4 entries (    0 unique) with and   124 without IMDB title ID in free-movies-publicdomainreview.json
+  698 entries (  119 unique) with and   118 without IMDB title ID in free-movies-publicdomaintorrents.json
+    8 entries (    8 unique) with and   196 without IMDB title ID in free-movies-vodo.json
+ 3186 unique IMDB title IDs in total
+
+ +

The entries without IMDB title ID are candidates to increase the data set, but might equally well be duplicates of entries already listed with an IMDB title ID in one of the other sources, or represent movies that lack an IMDB title ID. I've seen examples of all these situations when peeking at the entries without IMDB title ID. Based on these data sources, the lower bound for the number of movies listed in IMDB that are legal to distribute on the Internet lies somewhere between 3186 and 4713.
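The upper figure follows directly from the 'make stats' output above: adding the entries that still lack an IMDB title ID to the 3186 confirmed unique IDs gives the most optimistic count. In Python:

# Entries still lacking an IMDB title ID, per source, taken from
# the 'make stats' output above.
missing = {
    "free-movies-archive-org-butter.json": 288,
    "free-movies-publicdomainmovies.json": 801,
    "free-movies-publicdomainreview.json": 124,
    "free-movies-publicdomaintorrents.json": 118,
    "free-movies-vodo.json": 196,
}
confirmed = 3186  # unique IMDB title IDs identified so far

# If every unmatched entry turned out to be a distinct IMDB movie:
print(confirmed + sum(missing.values()))  # -> 4713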

It would improve the accuracy of this measurement a lot if the various sources added IMDB title IDs to their metadata. I have tried to reach the people behind the various sources to ask if they are interested in doing this, but have received no replies so far. Perhaps you can help me get in touch with the people behind VODO, Public Domain Torrents, Public Domain Movies and Public Domain Review to try to convince them to add more metadata to their movie entries?

+ +

Another way you could help is by adding pages to Wikipedia about movies that are legal to distribute on the Internet. If such a page exists and includes a link to both IMDB and The Internet Archive, the script used to generate free-movies-archive-org-wikidata.json should pick up the mapping as soon as Wikidata is updated.

@@ -84,52 +93,83 @@ fremdriften på web.

- -
20th April 2017
-

I discovered today that OEP, the web site publishing public mail journals from Norwegian government agencies, has started blocking certain types of web clients from getting access. I do not know how many clients are affected, but at least libwww-perl and curl are. To test for yourself, run the following:

- -
-% curl -v -s https://www.oep.no/pub/report.xhtml?reportId=3 2>&1 |grep '< HTTP'
-< HTTP/1.1 404 Not Found
-% curl -v -s --header 'User-Agent:Opera/12.0' https://www.oep.no/pub/report.xhtml?reportId=3 2>&1 |grep '< HTTP'
-< HTTP/1.1 200 OK
-%
-
- -

Here you can see that the service returns «404 Not Found» for curl in its default setup, while it returns «200 OK» if curl claims to be Opera version 12.0. Offentlig elektronisk postjournal started the blocking 2017-03-02.

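The same check can be done from Python. This is a minimal sketch using the requests library, with the URL and the Opera User-Agent string taken from the curl test above:

import requests

URL = "https://www.oep.no/pub/report.xhtml?reportId=3"

# The default python-requests User-Agent is expected to be blocked.
print(requests.get(URL).status_code)

# Claiming to be an old Opera release gets through, as in the curl test.
print(requests.get(URL, headers={"User-Agent": "Opera/12.0"}).status_code)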
- -

The blocking will make it a bit harder to fetch information from oep.no programmatically. Could the blocking have been done to prevent automated collection of information from OEP, like the one Pressens Offentlighetsutvalg did to document how the ministries obstruct access to information in the report «Slik hindrer departementer innsyn» (How the ministries obstruct access to information) published in January 2017? It seems unlikely, as it is trivial to change the User-Agent to something new.

- -

Is there a legal basis for the government to discriminate between web clients the way it is done here, where access is granted or denied depending on what the client says its name is? As OEP is owned by DIFI and operated by Basefarm, perhaps there are documents exchanged between these two parties one could request access to in order to understand what has happened. But DIFI's mail journal shows only two documents between DIFI and Basefarm in the last year. Mimes brønn next, I think.

+ +
1st November 2017
+

If you care about how fault tolerant your storage is, you might find these articles and papers interesting. They have shaped how I think when designing a storage system.

+ + + +

Several of these research papers are based on data collected from hundreds of thousands or millions of disks, and their findings are eye opening. The short story is: do not implicitly trust RAID or redundant storage systems. Details matter. And unfortunately there are few options on Linux addressing all the identified issues. Both ZFS and Btrfs are doing a fairly good job, but have legal and practical issues of their own. I wonder how cluster file systems like Ceph do in this regard. After all, there is an old saying: you know you have a distributed system when the crash of a computer you have never heard of stops you from getting any work done. The same holds true if fault tolerance does not work.

+ +

Just remember, in the end, it does not matter how redundant or how fault tolerant your storage is if you do not continuously monitor its status to detect and replace failed disks.

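As a starting point for such monitoring with Linux software RAID, a small sketch like the following can flag degraded md arrays by looking for missing-device markers in /proc/mdstat. This is only one of several ways to monitor; hardware RAID and other setups need their own checks:

import re

# A degraded Linux software RAID set shows up in /proc/mdstat with an
# underscore in its member status field, e.g. "[U_]" for a two disk
# mirror with one failed member.
device = None
with open("/proc/mdstat") as f:
    for line in f:
        m = re.match(r"^(md\d+)\s", line)
        if m:
            device = m.group(1)
        status = re.search(r"\[([U_]+)\]", line)
        if device and status and "_" in status.group(1):
            print("WARNING: %s is degraded (%s)" % (device, status.group(1)))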
- Tags: norsk, offentlig innsyn. + Tags: english, raid, sysadmin.
@@ -137,101 +177,46 @@ tenker jeg.

- -
19th March 2017
-

The Nikita Noark 5 core project is implementing the Norwegian standard for keeping an electronic archive of government documents. The Noark 5 standard documents the requirements for data systems used by the archives in the Norwegian government, and the Noark 5 web interface specification documents a REST web service for storing, searching and retrieving documents and metadata in such an archive. I've been involved in the project since a few weeks before Christmas, when the Norwegian Unix User Group announced it supported the project. I believe this is an important project, and hope it can make it possible for the government archives in the future to use free software to keep the archives we citizens depend on. But as I do not hold such an archive myself, personally my first use case is to store and analyse public mail journal metadata published by the government. I find it useful to have a clear use case in mind when developing, to make sure the system scratches one of my itches.

- -

If you would like to help make sure there is a free software alternative for the archives, please join our IRC channel (#nikita on irc.freenode.net) and the project mailing list.

- -

When I got involved, the web service could store metadata about documents. But a few weeks ago, a new milestone was reached when it became possible to store full text documents too. Yesterday, I completed an implementation of a command line tool archive-pdf to upload a PDF file to the archive using this API. The tool is very simple at the moment, and finds existing fonds, series and files while asking the user to select which one to use if more than one exists. Once a file is identified, the PDF is associated with the file and uploaded, using the title extracted from the PDF itself. The process is fairly similar to visiting the archive, opening a cabinet, locating a file and storing a piece of paper in the archive. Here is a test run directly after populating the database with test data using our API tester:

- -

-~/src//noark5-tester$ ./archive-pdf mangelmelding/mangler.pdf
-using arkiv: Title of the test fonds created 2017-03-18T23:49:32.103446
-using arkivdel: Title of the test series created 2017-03-18T23:49:32.103446
-
- 0 - Title of the test case file created 2017-03-18T23:49:32.103446
- 1 - Title of the test file created 2017-03-18T23:49:32.103446
-Select which mappe you want (or search term): 0
-Uploading mangelmelding/mangler.pdf
-  PDF title: Mangler i spesifikasjonsdokumentet for NOARK 5 Tjenestegrensesnitt
-  File 2017/1: Title of the test case file created 2017-03-18T23:49:32.103446
-~/src//noark5-tester$
-

- -

You can see here how the fonds (arkiv) and series (arkivdel) only had one option, while the user needs to choose which file (mappe) to use among the two created by the API tester. The archive-pdf tool can be found in the git repository for the API tester.

- -

In the project, I have been mostly working on the API tester so far, while getting to know the code base. The API tester currently uses the HATEOAS links to traverse the entire exposed service API and verify that the exposed operations and objects match the specification, as well as trying to create objects holding metadata and upload a simple XML file to store. The tester has proved very useful for finding flaws in our implementation, as well as flaws in the reference site and the specification.

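The core of such a traversal can be sketched in a few lines of Python. This is not the API tester's actual code; it assumes the service exposes HAL-style _links as a list of objects with rel and href members, and the service root URL is hypothetical:

import requests

def walk(url, seen=None):
    """Recursively follow _links relations exposed by the service,
    printing every relation found along the way."""
    seen = set() if seen is None else seen
    if url in seen:
        return
    seen.add(url)
    response = requests.get(url)
    if response.status_code != 200:
        print("FAIL %s %s" % (response.status_code, url))
        return
    for link in response.json().get("_links", []):
        print("%s -> %s" % (link.get("rel"), link.get("href")))
        walk(link["href"], seen)

walk("http://localhost:8092/noark5v4/")  # hypothetical service root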
- -

The test document I uploaded is a summary of all the specification defects we have collected so far while implementing the web service. There are several unclear and conflicting parts of the specification, and we have started writing down the questions we get from implementing it. We use a format inspired by how The Austin Group collects defect reports for the POSIX standard with their instructions for the MANTIS defect tracker system, for lack of an official way to structure defect reports for Noark 5 (our first submitted defect report was a request for a procedure for submitting defect reports :).

The Nikita project is implemented using Java and Spring, and is -fairly easy to get up and running using Docker containers for those -that want to test the current code base. The API tester is -implemented in Python.

+ +
31st October 2017
+

I was surprised today to learn that a friend in academia did not know there are easily available web services for writing LaTeX documents as a team. I thought it was common knowledge, but to make sure at least my readers are aware of it, I would like to mention these useful services for writing LaTeX documents. Some of them even provide a WYSIWYG editor to ease writing even further.

+ +

There are two commercial services available, ShareLaTeX and Overleaf. They are very easy to use. Just start a new document, select which publisher to write for (ie which LaTeX style to use), and start writing. Note, these two have announced their intention to join forces, so soon it will only be one joint service. I've used both for different documents, and they work just fine. ShareLaTeX is free software, while Overleaf is not. According to an announcement from Overleaf, they plan to keep the ShareLaTeX code base maintained as free software.

But these two are not the only alternatives. Fidus Writer is another free software solution with the source available on github. I have not used it myself. Several others can be found on the nice alternativeTo web service.

If you like Google Docs or Etherpad, but would like to write +documents in LaTeX, you should check out these services. You can even +host your own, if you want to. :)

+
@@ -239,114 +224,288 @@ implemented in Python.

- -
9th March 2017
-

Over the years, administrating thousands of NFS-mounting Linux computers at a time, I often needed a way to detect if a machine was experiencing an NFS hang. If you try to use df or look at a file or directory affected by the hang, the process (and possibly the shell) will hang too. So you want to be able to detect this without risking the detection process getting stuck too. It has not been obvious how to do this. When the hang has lasted a while, it is possible to find messages like these in dmesg:

- -

-nfs: server nfsserver not responding, still trying -
nfs: server nfsserver OK -

- -

It is hard to know if the hang is still going on, and it is hard to be sure looking in dmesg is going to work. If there are lots of other messages in dmesg, the lines might have rotated out of sight before they are noticed.

- -

While reading through the NFS client implementation in the Linux kernel code, I came across some statistics that seem to give a way to detect it. The om_timeouts sunrpc value in the kernel will increase every time the above log entry is inserted into dmesg. And after digging a bit further, I discovered that this value shows up in /proc/self/mountstats on Linux.

- -

The mountstats content seems to be shared between files using the same file system context, so it is enough to check one of the mountstats files to get the state of the mount points for the machine. I assume this will not show lazily umounted NFS points, nor NFS mount points in a different process context (ie with a different filesystem view), but that does not worry me.

- -

The content for an NFS mount point looks similar to this:

- -

-[...]
-device /dev/mapper/Debian-var mounted on /var with fstype ext3
-device nfsserver:/mnt/nfsserver/home0 mounted on /mnt/nfsserver/home0 with fstype nfs statvers=1.1
-        opts:   rw,vers=3,rsize=65536,wsize=65536,namlen=255,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,soft,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=129.240.3.145,mountvers=3,mountport=4048,mountproto=udp,local_lock=all
-        age:    7863311
-        caps:   caps=0x3fe7,wtmult=4096,dtsize=8192,bsize=0,namlen=255
-        sec:    flavor=1,pseudoflavor=1
-        events: 61063112 732346265 1028140 35486205 16220064 8162542 761447191 71714012 37189 3891185 45561809 110486139 4850138 420353 15449177 296502 52736725 13523379 0 52182 9016896 1231 0 0 0 0 0 
-        bytes:  166253035039 219519120027 0 0 40783504807 185466229638 11677877 45561809 
-        RPC iostats version: 1.0  p/v: 100003/3 (nfs)
-        xprt:   tcp 925 1 6810 0 0 111505412 111480497 109 2672418560317 0 248 53869103 22481820
-        per-op statistics
-                NULL: 0 0 0 0 0 0 0 0
-             GETATTR: 61063106 61063108 0 9621383060 6839064400 453650 77291321 78926132
-             SETATTR: 463469 463470 0 92005440 66739536 63787 603235 687943
-              LOOKUP: 17021657 17021657 0 3354097764 4013442928 57216 35125459 35566511
-              ACCESS: 14281703 14290009 5 2318400592 1713803640 1709282 4865144 7130140
-            READLINK: 125 125 0 20472 18620 0 1112 1118
-                READ: 4214236 4214237 0 715608524 41328653212 89884 22622768 22806693
-               WRITE: 8479010 8494376 22 187695798568 1356087148 178264904 51506907 231671771
-              CREATE: 171708 171708 0 38084748 46702272 873 1041833 1050398
-               MKDIR: 3680 3680 0 773980 993920 26 23990 24245
-             SYMLINK: 903 903 0 233428 245488 6 5865 5917
-               MKNOD: 80 80 0 20148 21760 0 299 304
-              REMOVE: 429921 429921 0 79796004 61908192 3313 2710416 2741636
-               RMDIR: 3367 3367 0 645112 484848 22 5782 6002
-              RENAME: 466201 466201 0 130026184 121212260 7075 5935207 5961288
-                LINK: 289155 289155 0 72775556 67083960 2199 2565060 2585579
-             READDIR: 2933237 2933237 0 516506204 13973833412 10385 3190199 3297917
-         READDIRPLUS: 1652839 1652839 0 298640972 6895997744 84735 14307895 14448937
-              FSSTAT: 6144 6144 0 1010516 1032192 51 9654 10022
-              FSINFO: 2 2 0 232 328 0 1 1
-            PATHCONF: 1 1 0 116 140 0 0 0
-              COMMIT: 0 0 0 0 0 0 0 0
-
-device binfmt_misc mounted on /proc/sys/fs/binfmt_misc with fstype binfmt_misc
-[...]
-

- -

The key number to look at is the third number in the per-op list. It is the number of NFS timeouts experienced per file system operation. Here there are 22 write timeouts and 5 access timeouts. If these numbers are increasing, I believe the machine is experiencing NFS hang. Unfortunately the timeout value does not start to increase right away. The NFS operations need to time out first, and this can take a while. The exact timeout value depends on the setup. For example the defaults for TCP and UDP mount points are quite different, and the timeout value is affected by the soft, hard, timeo and retrans NFS mount options.

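Putting this together, a detector can be sketched in a few lines of Python. This is only a sketch based on the mountstats format shown above, not a polished tool: it extracts the timeout counter (the third number) for every operation in the per-op statistics of each NFS mount, so a monitoring job can compare the values between runs and alert when they increase:

def nfs_timeouts(path="/proc/self/mountstats"):
    """Return {(mountpoint, operation): timeout count} for NFS mounts."""
    timeouts = {}
    mountpoint = None
    in_per_op = False
    with open(path) as f:
        for line in f:
            words = line.split()
            if line.startswith("device"):
                # "device server:/export mounted on /mnt with fstype nfs ..."
                in_per_op = False
                mountpoint = words[4] if "nfs" in words[7] else None
            elif mountpoint and line.strip() == "per-op statistics":
                in_per_op = True
            elif mountpoint and in_per_op and ":" in line:
                op, numbers = line.split(":", 1)
                # The third number is the timeout count for the operation.
                timeouts[(mountpoint, op.strip())] = int(numbers.split()[2])
    return timeouts

for key, count in sorted(nfs_timeouts().items()):
    if count:
        print("%s %s: %d timeouts" % (key[0], key[1], count))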
- -

The only way I have been able to get working on Debian and Red Hat Enterprise Linux for getting the timeout count is to peek in /proc/. But according to Solaris 10 System Administration Guide: Network Services, the 'nfsstat -c' command can be used to get these timeout values. But this does not work on Linux, as far as I can tell. I asked Debian about this, but have not seen any replies yet.

- -

Is there a better way to figure out if a Linux NFS client is -experiencing NFS hangs? Is there a way to detect which processes are -affected? Is there a way to get the NFS mount going quickly once the -network problem causing the NFS hang has been cleared? I would very -much welcome some clues, as we regularly run into NFS hangs.

+ +
25th October 2017
+

Recently, I needed to automatically check the copyright status of a set of The Internet Movie Database (IMDB) entries, to figure out which of the movies they refer to can be freely distributed on the Internet. This proved to be harder than it sounds. IMDB certainly lists movies without any copyright protection, where the copyright protection has expired or where the movie is licensed using a permissive license like one from Creative Commons. These are mixed with copyright protected movies, and there seems to be no way to separate these classes of movies using the information in IMDB.

+ +

First I tried to look up entries manually in IMDB, Wikipedia and The Internet Archive, to get a feel for how to do this. It is hard to know for sure using these sources, but it should be possible to be reasonably confident a movie is "out of copyright" with a few hours of work per movie. As I needed to check almost 20,000 entries, this approach was not sustainable. I simply cannot work around the clock for about 6 years to check this data set.

+ +

I asked the people behind The Internet Archive if they could introduce a new metadata field in their metadata XML for the IMDB ID, but was told that they leave it completely to the uploaders to update the metadata. Some of the metadata entries had IMDB links in the description, but I found no way to download all the metadata files in bulk to locate those, and put that approach aside.

+ +

In the process I noticed several Wikipedia articles about movies had links to both IMDB and The Internet Archive, and it occurred to me that I could use the Wikipedia RDF data set to locate entries with both, to at least get a lower bound on the number of movies on The Internet Archive with an IMDB ID. This is useful based on the assumption that movies distributed by The Internet Archive can be legally distributed on the Internet. With some help from the RDF community (thank you DanC), I was able to come up with this query to pass to the SPARQL interface on Wikidata:

SELECT ?work ?imdb ?ia ?when ?label
WHERE
{
  # Any instance of (a subclass of) the film class (Q11424) ...
  ?work wdt:P31/wdt:P279* wd:Q11424.
  # ... with both an IMDb ID (P345) ...
  ?work wdt:P345 ?imdb.
  # ... and an Internet Archive ID (P724).
  ?work wdt:P724 ?ia.
  OPTIONAL {
        # Include publication date (P577) and English title if available.
        ?work wdt:P577 ?when.
        ?work rdfs:label ?label.
        FILTER(LANG(?label) = "en").
  }
}
+

+ +

If I understand the query right, for every film entry anywhere in Wikipedia, it will return the IMDB ID and The Internet Archive ID, and when the movie was released and its English title, if either or both of the latter two are available. At the moment the result set contains 2338 entries. Of course, it depends on volunteers including both correct IMDB and The Internet Archive IDs in the Wikipedia articles for the movie. It should be noted that the result will include duplicates if the movie has entries in several languages. There are some bogus entries, either because The Internet Archive ID contains a typo or because the movie is not available from The Internet Archive. I did not verify the IMDB IDs, as I am unsure how to do that automatically.

+ +

I wrote a small python script to extract the data set from Wikidata and check if the XML metadata for the movie is available from The Internet Archive, and after around 1.5 hours it produced a list of 2097 free movies and their IMDB IDs. In total, 171 entries in Wikidata lack the referred Internet Archive entry. I assume the 70 "disappearing" entries (ie 2338-2097-171) are duplicate entries.

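A sketch of how such a script might work is shown below. It is not the actual script from the repository; it assumes the public Wikidata SPARQL endpoint and checks each movie against The Internet Archive metadata API (https://archive.org/metadata/&lt;identifier&gt;), which returns an empty JSON object for unknown identifiers:

import requests

QUERY = """
SELECT ?imdb ?ia WHERE {
  ?work wdt:P31/wdt:P279* wd:Q11424.
  ?work wdt:P345 ?imdb.
  ?work wdt:P724 ?ia.
}
"""

# Fetch (IMDB ID, Internet Archive ID) pairs from Wikidata.
reply = requests.get("https://query.wikidata.org/sparql",
                     params={"query": QUERY, "format": "json"})

for row in reply.json()["results"]["bindings"]:
    imdb, ia = row["imdb"]["value"], row["ia"]["value"]
    # The metadata API returns an empty JSON object for unknown items.
    meta = requests.get("https://archive.org/metadata/%s" % ia).json()
    print("%s %s %s" % ("found" if meta else "missing", imdb, ia))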
+ +

This is not too bad, given that The Internet Archive reports to contain 5331 feature films at the moment, but it also means more than 3000 movies are missing from Wikipedia or are missing the pair of references on Wikipedia.

+ +

I was curious about the distribution by release year, and made a little graph to show how the number of free movies is spread over the years:

+ +

+ +

I expect the relative distribution of the remaining 3000 movies to +be similar.

+ +

If you want to help, and want to ensure Wikipedia can be used to cross-reference The Internet Archive and The Internet Movie Database, please make sure entries like these are listed under the "External links" heading on the Wikipedia article for the movie:

+ +

+* {{Internet Archive film|id=FightingLady}}
+* {{IMDb title|id=0036823|title=The Fighting Lady}}
+

+ +

Please verify the links on the final page, to make sure you did not +introduce a typo.

+ +

Here is the complete list, if you want to correct the 171 +identified Wikipedia entries with broken links to The Internet +Archive: Q1140317, +Q458656, +Q458656, +Q470560, +Q743340, +Q822580, +Q480696, +Q128761, +Q1307059, +Q1335091, +Q1537166, +Q1438334, +Q1479751, +Q1497200, +Q1498122, +Q865973, +Q834269, +Q841781, +Q841781, +Q1548193, +Q499031, +Q1564769, +Q1585239, +Q1585569, +Q1624236, +Q4796595, +Q4853469, +Q4873046, +Q915016, +Q4660396, +Q4677708, +Q4738449, +Q4756096, +Q4766785, +Q880357, +Q882066, +Q882066, +Q204191, +Q204191, +Q1194170, +Q940014, +Q946863, +Q172837, +Q573077, +Q1219005, +Q1219599, +Q1643798, +Q1656352, +Q1659549, +Q1660007, +Q1698154, +Q1737980, +Q1877284, +Q1199354, +Q1199354, +Q1199451, +Q1211871, +Q1212179, +Q1238382, +Q4906454, +Q320219, +Q1148649, +Q645094, +Q5050350, +Q5166548, +Q2677926, +Q2698139, +Q2707305, +Q2740725, +Q2024780, +Q2117418, +Q2138984, +Q1127992, +Q1058087, +Q1070484, +Q1080080, +Q1090813, +Q1251918, +Q1254110, +Q1257070, +Q1257079, +Q1197410, +Q1198423, +Q706951, +Q723239, +Q2079261, +Q1171364, +Q617858, +Q5166611, +Q5166611, +Q324513, +Q374172, +Q7533269, +Q970386, +Q976849, +Q7458614, +Q5347416, +Q5460005, +Q5463392, +Q3038555, +Q5288458, +Q2346516, +Q5183645, +Q5185497, +Q5216127, +Q5223127, +Q5261159, +Q1300759, +Q5521241, +Q7733434, +Q7736264, +Q7737032, +Q7882671, +Q7719427, +Q7719444, +Q7722575, +Q2629763, +Q2640346, +Q2649671, +Q7703851, +Q7747041, +Q6544949, +Q6672759, +Q2445896, +Q12124891, +Q3127044, +Q2511262, +Q2517672, +Q2543165, +Q426628, +Q426628, +Q12126890, +Q13359969, +Q13359969, +Q2294295, +Q2294295, +Q2559509, +Q2559912, +Q7760469, +Q6703974, +Q4744, +Q7766962, +Q7768516, +Q7769205, +Q7769988, +Q2946945, +Q3212086, +Q3212086, +Q18218448, +Q18218448, +Q18218448, +Q6909175, +Q7405709, +Q7416149, +Q7239952, +Q7317332, +Q7783674, +Q7783704, +Q7857590, +Q3372526, +Q3372642, +Q3372816, +Q3372909, +Q7959649, +Q7977485, +Q7992684, +Q3817966, +Q3821852, +Q3420907, +Q3429733, +Q774474

- Tags: debian, english, sysadmin. + Tags: english, opphavsrett.
@@ -354,44 +513,26 @@ much welcome some clues, as we regularly run into NFS hangs.

- -
8th March 2017
-

So the new president in the United States of America claims to be surprised to discover that he was wiretapped during the election before he was elected president. He even claims this must be illegal. Well, doh, if there is one thing the confirmations from Snowden documented, it is that the entire population of the USA is wiretapped, one way or another. Of course the presidential candidates were wiretapped, alongside the senators, judges and the rest of the people in the USA.

- -

Next, the Federal Bureau of Investigation asked the Department of Justice to go public rejecting the claims that Donald Trump was wiretapped illegally. I fail to see the relevance, given that I am sure the surveillance industry in the USA believes it has all the legal backing it needs to conduct mass surveillance on the entire world.

- -

There is even the director of the FBI stating that he never saw an order requesting wiretapping of Donald Trump. That is not very surprising, given how the FISA court works, with all its activity being secret. Perhaps he only heard about it?

- -

What I find most sad in this story is how Norwegian journalists present it. In a news report the other day on the radio from the Norwegian National Broadcasting Company (NRK), I heard the journalist claim that 'the FBI denies any wiretapping', while the reality is that 'the FBI denies any illegal wiretapping'. There is a fundamental and important difference, and it makes me sad that the journalists are unable to grasp it.

- -

Update 2017-03-13: Looks like The Intercept reports that US Senator Rand Paul confirms what I state above.

+ +
14th October 2017
+

I find it fascinating how many of the people being locked inside the proposed border wall between the USA and Mexico support the idea. The proposal to keep Mexicans out reminds me of the propaganda twist from the East German government, which called the Berlin Wall the "Antifascist Bulwark" after erecting it, claiming that the wall was erected to keep enemies from creeping into East Germany, while it was obvious to the people locked inside that it was erected to keep the people from escaping.

+ +

Do the people in the USA supporting this wall really believe it is a one-way wall, only keeping people on the outside from getting in, while not keeping people on the inside from getting out?

- Tags: english, surveillance. + Tags: english.
@@ -399,33 +540,48 @@ Intercept report that US Senator Rand Paul confirm what I state above.

- -
3rd March 2017
-

For almost a year now, we have been working on making a Norwegian Bokmål edition of The Debian Administrator's Handbook. Now, thanks to the tireless effort of Ole-Erik, Ingrid and Andreas, the initial translation is complete, and we are working on the proofreading to ensure consistent language and use of correct computer science terms. The plan is to make the book available on paper, as well as in electronic form. For that to happen, the proofreading must be completed and all the figures need to be translated. If you want to help out, get in touch.

- -

A fresh PDF edition in A4 format (the final book will have smaller pages) of the book, created every morning, is available for proofreading. If you find any errors, please visit Weblate and correct the error. The state of the translation including figures is a useful source for those providing Norwegian Bokmål screen shots and figures.

+ +
9th October 2017
+

At my nearby maker space, Sonen, I heard the story that it was easier to generate gcode files for their 3D printers (Ultimaker 2+) on Windows and MacOS X than on Linux, because the software involved had to be manually compiled and set up on Linux, while premade packages worked out of the box on Windows and MacOS X. I found this annoying, as the software involved, Cura, is free software and should be trivial to get up and running on Linux if someone took the time to package it for the relevant distributions. I even found a request for adding it to Debian from 2013, which had seen some activity over the years but never resulted in the software showing up in Debian. So a few days ago I offered my help to try to improve the situation.

+ +

Now I am very happy to see that all the packages required for a working Cura in Debian have been uploaded into Debian and are waiting in the NEW queue for the ftpmasters to have a look. You can track the progress on the status page for the 3D printer team.

+ +

The uploaded packages are a bit behind upstream, and were uploaded now to get slots in the NEW queue while we work on updating the packages to the latest upstream version.

+ +

On a related note, two competitors to Cura, which I found harder to use and was unable to configure correctly for the Ultimaker 2+ in the short time I spent on them, are already in Debian. If you are looking for 3D printer "slicers" and want something already available in Debian, check out slic3r and slic3r-prusa. The latter is a fork of the former.

@@ -433,77 +589,30 @@ provide Norwegian bokmål screen shots and figures.

- -
1st March 2017
-

A few days ago I ordered a small batch of the ChaosKey, a small USB dongle for generating entropy created by Bdale Garbee and Keith Packard. Yesterday it arrived, and I am very happy to report that it works great! According to its designers, to get it to work out of the box you need Linux kernel version 4.1 or later. I tested on a Debian Stretch machine (kernel version 4.9), and there it worked just fine, increasing the available entropy very quickly. I wrote a small oneliner to test it. It first prints the current entropy level, drains /dev/random, and then prints the entropy level once a second for five seconds. Here is the situation without the ChaosKey inserted:

- -
-% cat /proc/sys/kernel/random/entropy_avail; \
-  dd bs=1M if=/dev/random of=/dev/null count=1; \
-  for n in $(seq 1 5); do \
-     cat /proc/sys/kernel/random/entropy_avail; \
-     sleep 1; \
-  done
-300
-0+1 oppføringer inn
-0+1 oppføringer ut
-28 byte kopiert, 0,000264565 s, 106 kB/s
-4
-8
-12
-17
-21
-%
-
- -

The entropy level increases by 3-4 every second. In such a case, any application requiring random bits (like an HTTPS enabled web server) will halt and wait for more entropy. And here is the situation with the ChaosKey inserted:

- -
-% cat /proc/sys/kernel/random/entropy_avail; \
-  dd bs=1M if=/dev/random of=/dev/null count=1; \
-  for n in $(seq 1 5); do \
-     cat /proc/sys/kernel/random/entropy_avail; \
-     sleep 1; \
-  done
-1079
-0+1 oppføringer inn
-0+1 oppføringer ut
-104 byte kopiert, 0,000487647 s, 213 kB/s
-433
-1028
-1031
-1035
-1038
-%
-
- -

Quite the difference. :) I bought a few more than I need, in case someone wants to buy one here in Norway. :)

- -

Update: The dongle was presented at DebConf last year. You might find the talk recording illuminating. It explains exactly what the source of randomness is, if you are unable to spot it from the schematic available from the ChaosKey web site linked at the start of this blog post.

+ +
4th October 2017
+
When I work on various projects, I constantly need various screws. The latest project I am working on is building a case for an HDMI touch screen to be used with a Raspberry Pi. The case is assembled with screws and bolts, and I have been unsure where to get hold of the right screws. The nearby Clas Ohlson and Jernia stores rarely have what I need. But the other day I got a fantastic tip for those of us living in Oslo. Zachariassen Jernvare AS in Hegermannsgate 23A at Torshov has a fantastic selection, and is open between 09:00 and 17:00. They sell screws, nuts, bolts, washers etc. by the piece, and so far I have found everything I have looked for. They also carry most other hardware, like tools, lamps, wiring, etc. I hope they have enough customers to keep going for a long time, as this is a store I will be visiting often. The store is a real find for those of us in the neighbourhood who like to build things ourselves. :)

- Tags: debian, english. + Tags: norsk.
@@ -511,34 +620,64 @@ post.

- -
21st February 2017
-

I just noticed that the new Norwegian proposal for archiving rules in the government lists ECMA-376 / ISO/IEC 29500 (aka OOXML) as a valid format to put in long term storage. Luckily such files will only be accepted based on pre-approval from the National Archive. Allowing OOXML files to be used for long term storage might seem like a good idea as long as we forget that there are plenty of ways for a "valid" OOXML document to have content with no defined interpretation in the standard, which leads to a question and an idea.

- -

Is there any tool to detect if an OOXML document depends on such undefined behaviour? It would be useful for the National Archive (and anyone else interested in verifying that a document is well defined) to have such a tool available when considering whether to approve the use of OOXML. I'm aware of the officeotron OOXML validator, but do not know how complete it is, nor whether it will report use of undefined behaviour. Are there other similar tools available? Please send me an email if you know of any.

+ +
29th September 2017
+

Every mobile phone announces its existence over radio to the nearby mobile cell towers. And this radio chatter is available for anyone with a radio receiver capable of receiving it. Details about the mobile phones are of course collected with very good accuracy by the phone companies, but that is not the topic of this blog post. The mobile phone radio chatter makes it possible to figure out when a cell phone is nearby, as it includes the SIM card ID (IMSI). By paying attention over time, one can see when a phone arrives and when it leaves an area. I believe it would be nice to make this information more available to the general public, to make more people aware of how their phones are announcing their whereabouts to anyone that cares to listen.

+ +

I am very happy to report that we managed to get something visualizing this information up and running for Oslo Skaperfestival 2017 (Oslo Makers Festival), taking place today and tomorrow at the Deichmanske library. The solution is based on the simple recipe for listening to GSM chatter I posted a few days ago, and will show up at the stand of Åpen Sone from the Computer Science department of the University of Oslo. The presentation shows the nearby mobile phones (aka IMSIs) as dots in a web browser graph, with a line from each phone to the dot representing the mobile base station it is talking to. It was working in the lab yesterday, and was moved into place this morning.

+ +

We set up a fairly powerful desktop machine using Debian Buster/Testing with several (five, I believe) RTL2838 DVB-T receivers connected, and visualize the visible cell phone towers using an English version of Hopglass. A fairly powerful machine is needed, as the grgsm_livemon_headless processes from gr-gsm converting the radio signal to data packets are quite CPU intensive.

+ +

The frequencies to listen to are identified using a slightly patched scan-and-livemon (patched to set the --args values for each receiver), and the Hopglass data is generated using the patches in my meshviewer-output branch. For some reason we could not get more than four SDRs working. There is also a geographical map trying to show the location of the base stations, but their coordinates seem to be hardcoded to some random location in Germany. That code should be replaced with code to look up the location in a text file, a sqlite database or one of the online databases mentioned in the github issue for the topic.
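A lookup along those lines could be as simple as the following sketch, assuming a local SQLite database with a towers(mcc, mnc, lac, cell_id, lat, lon) table populated from one of the databases mentioned in the issue. The schema and example values are made up for illustration:

import sqlite3

def tower_position(db_path, mcc, mnc, lac, cell_id):
    """Return (lat, lon) for a base station, or None if unknown."""
    conn = sqlite3.connect(db_path)
    row = conn.execute(
        "SELECT lat, lon FROM towers"
        " WHERE mcc = ? AND mnc = ? AND lac = ? AND cell_id = ?",
        (mcc, mnc, lac, cell_id)).fetchone()
    conn.close()
    return row

# Example: a hypothetical Norwegian (MCC 242) base station.
print(tower_position("towers.db", 242, 1, 3010, 12345))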

If this sounds interesting, visit the stand at the festival!

@@ -546,33 +685,83 @@ available? Please send me an email if you know of any such tool.

- -
13th February 2017
-

A few days ago, we received the ruling from -my -day in court. The case in question is a challenge of the seizure -of the DNS domain popcorn-time.no. The ruling simply did not mention -most of our arguments, and seemed to take everything ØKOKRIM said at -face value, ignoring our demonstration and explanations. But it is -hard to tell for sure, as we still have not seen most of the documents -in the case and thus were unprepared and unable to contradict several -of the claims made in court by the opposition. We are considering an -appeal, but it is partly a question of funding, as it is costing us -quite a bit to pay for our lawyer. If you want to help, please -donate to the -NUUG defense fund.

- -

The details of the case, as far as we know them, are available in Norwegian from the NUUG blog. This also includes the ruling itself.

+ +
24th September 2017
+

A little more than a month ago I wrote about how to observe the SIM card ID (aka IMSI number) of mobile phones talking to nearby mobile phone base stations using Debian GNU/Linux and a cheap USB software defined radio, and thus being able to pinpoint the location of people and equipment (like cars and trains) with an accuracy of a few kilometers. Since then we have worked to make the procedure even simpler, and it is now possible to do this without any manual frequency tuning and without building your own packages.

+ +

The gr-gsm package is now included in Debian testing and unstable, and the IMSI-catcher code no longer requires root access to fetch and decode the GSM data collected using gr-gsm.

+ +

Here is an updated recipe, using packages built by Debian and a git +clone of two python scripts:

+ +
  1. Start with a Debian machine running the Buster version (aka testing).

  2. Run 'apt install gr-gsm python-numpy python-scipy python-scapy' as root to install the required packages.

  3. Fetch the code decoding GSM packets using 'git clone github.com/Oros42/IMSI-catcher.git'.

  4. Insert a USB software defined radio supported by GNU Radio.

  5. Enter the IMSI-catcher directory and run 'python scan-and-livemon' to locate the frequency of nearby base stations and start listening for GSM packets on one of them.

  6. Enter the IMSI-catcher directory and run 'python simple_IMSI-catcher.py' to display the collected information.

Note, due to a bug somewhere, the scan-and-livemon program (actually its underlying program grgsm_scanner) does not work with the HackRF radio. It does work with RTL 8232 and other similar USB radio receivers you can get very cheaply (for example from ebay), so for now the solution is to scan using the RTL radio and only use the HackRF for fetching GSM data.

+ +

As far as I can tell, a cell phone only shows up on one of the frequencies at a time, so if you are going to track and count every cell phone around you, you need to listen to all the frequencies used. To listen to several frequencies, use the --numrecv argument to scan-and-livemon to use several receivers. Further, I am not sure if phones using 3G or 4G will show up as talking GSM to base stations, so this approach might not see all phones around you. I typically see 0-400 IMSI numbers an hour when looking around where I live.

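To get an hourly count like that, one can log the observed IMSIs with a timestamp, one per line, and count unique values per hour. A small sketch follows; the log format is an assumption for illustration, not the actual output of simple_IMSI-catcher.py:

from collections import defaultdict

# Assumed log format, one observation per line: "<ISO timestamp> <IMSI>",
# for example "2017-09-24T13:02:11 242011234567890" (illustrative only).
per_hour = defaultdict(set)
with open("imsi.log") as f:
    for line in f:
        stamp, imsi = line.split()
        per_hour[stamp[:13]].add(imsi)  # key on "YYYY-MM-DDTHH"

for hour in sorted(per_hour):
    print("%s: %d unique IMSIs" % (hour, len(per_hour[hour])))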
+ +

I've tried to run the scanner on a Raspberry Pi 2 and 3 running Debian Buster, but the grgsm_livemon_headless process seems to be too CPU intensive to keep up. When GNU Radio prints 'O' to stdout, I am told it is caused by a buffer overflow between the radio and GNU Radio, caused by the program being unable to read the radio data fast enough. If you see a stream of 'O's in the terminal where you started scan-and-livemon, you need to give the process more CPU power. Perhaps someone is able to optimize the code to the point where it becomes possible to set up RPi3 based GSM sniffers? I tried using Raspbian instead of Debian, but there seems to be something wrong with GNU Radio on Raspbian, causing glibc to abort().

@@ -580,86 +769,54 @@ ruling itself.

- -
3rd February 2017
-

- -

On Wednesday, I spent the entire day in court in Follo Tingrett -representing the member association -NUUG, alongside the member -association EFN and the DNS registrar -IMC, challenging the seizure of the DNS name popcorn-time.no. It -was interesting to sit in a court of law for the first time in my -life. Our team can be seen in the picture above: attorney Ola -Tellesbø, EFN board member Tom Fredrik Blenning, IMC CEO Morten Emil -Eriksen and NUUG board member Petter Reinholdtsen.

- -

The case at hand is that the Norwegian National Authority for Investigation and Prosecution of Economic and Environmental Crime (aka Økokrim) decided on its own to seize a DNS domain early last year, without following the official policy of the Norwegian DNS authority, which requires a court decision. The web site in question was a site covering Popcorn Time. And Popcorn Time is the name of a technology with both legal and illegal applications. Popcorn Time is a client combining searching a Bittorrent directory available on the Internet with downloading/distributing content via Bittorrent and playing the downloaded content on screen. It can be used illegally if it is used to distribute content against the will of the rights holder, but it can also be used legally to play a lot of content, for example the millions of movies available from the Internet Archive or the collection available from Vodo. We created a video demonstrating legal use of Popcorn Time and played it in court. It can of course be downloaded using Bittorrent.

- -

I did not quite know what to expect from a day in court. The government held on to their version of the story and we held on to ours, and I hope the judge is able to make sense of it all. We will know in two weeks time. Unfortunately I do not have high hopes, as the government has the upper hand here, with more knowledge about the case, better training in handling criminal law and in general higher standing in the courts than a fairly unknown DNS registrar and member associations. It is expensive to be right, also in Norway. So far the case has cost more than NOK 70 000. To help fund the case, NUUG and EFN have asked for donations, and managed to collect around NOK 25 000 so far. Given the presentation from the government, I expect the government to appeal if the case goes our way. And if the case does not go our way, I hope we have enough funding to appeal.

- -

From the other side came two people from Økokrim. On the benches, appearing to be part of the group from the government, were two people from the Simonsen Vogt Wiik lawyer office, and three others I am not quite sure who were. Økokrim had proposed to present two witnesses from The Motion Picture Association, but this was rejected because they did not speak Norwegian and it was a bit late to bring in a translator, but perhaps the two from MPA were present anyway. All seven appeared to know each other. Good to see the case is taken seriously.

- -

If you, like me, believe the courts should be involved before a DNS domain is hijacked by the government, or you believe the Popcorn Time technology has a lot of useful and legal applications, I suggest you too donate to the NUUG defense fund. Both Bitcoin and bank transfer are available. If NUUG gets more than we need for the legal action (very unlikely), the rest will be spent promoting free software, open standards and unix-like operating systems in Norway, so no matter what happens the money will be put to good use.

- -

If you want to learn more about the case, I recommend you check out the blog posts from NUUG covering the case. They cover the legal arguments on both sides.

+ +
7th September 2017
+

A few days ago, Jon Wessel-Aas published a blog post on «Konklusjonen om datalagring som EU-kommisjonen ikke ville at vi skulle få se» (the conclusion about data retention that the EU Commission did not want us to see). It is an interesting review of the EU Court of Justice's view on dragnet surveillance of the population, which is clear that it violates EU law.

+ +

The election campaign is in full swing in Norway, and in a few days the deadline to cast a vote expires. One thing is certain: Høyre and Arbeiderpartiet will not get my vote this time either. I have not forgotten that they forced through the law that would require all data and telecommunications service providers to surveil all their customers. A law that was passed, and never repealed.

+ +

It is clear from the discussion about borderless digital surveillance (or "Digitalt Grenseforsvar" / digital border defense, as it is called in Orwellian newspeak) that neither Høyre nor Arbeiderpartiet have any principled objections to surveilling the entire population, and the discussion so far suggests that several of the other parties do not have any either. Many of those who voted for the Data Retention Directive in the Storting (64 from Arbeiderpartiet, 25 from Høyre) are still active, and still arguing for erasing more of the citizens' private sphere.

+ +

When the authorities demonstrate their distrust of the people, I believe the people themselves should put some effort into protecting their privacy, by using end-to-end encrypted communication with their friends and family, and by limiting how much private information they share with strangers. After all, nothing suggests the authorities are going to protect our privacy. There are many options. I am rather fond of Ring, which is based on p2p technology without central control, is free software, and supports messaging, voice and video. The system is available out of the box from Debian and Ubuntu, and there are packages for Android, MacOSX and Windows. So far there are few Ring users, so I also use Signal as a browser extension.

@@ -685,6 +842,18 @@ on both sides.

  • April (2)
  • +
  • June (5)
  • + +
  • July (1)
  • + +
  • August (1)
  • + +
  • September (3)
  • + +
  • October (5)
  • + +
  • November (2)
  • +
  • 2016 @@ -935,7 +1104,7 @@ on both sides.

    Tags