I came across the text «Killing car privacy by federal mandate» by Leonid Reyzin on Freedom to Tinker today, and I am pleased to see a good walk-through of why making every car broadcast its position and movement by radio is an unreasonable intrusion into people's privacy. The proposal in question, based on Dedicated Short Range Communication (DSRC), is called Basic Safety Message (BSM) in the USA and Cooperative Awareness Message (CAM) in Europe, and the Norwegian Public Roads Administration (Vegvesenet) appears to be among those considering forcing it on all cars, removing yet another piece of the citizens' privacy. I recommend everyone to read the article.

While looking into DSRC on cars in Norway, I came across a quote I find illustrative of how the Norwegian public sector handles issues around the privacy of its citizens, from the SINTEF report «Informasjonssikkerhet i AutoPASS-brikker» (Information security in AutoPASS tags) by Trond Foss:

    «Rapporten ser ikke på informasjonssikkerhet knyttet til personlig
    integritet.»
    (The report does not look at information security related to
    personal integrity.)

So simple, apparently, can an information security assessment be. I suppose it is enough that the people at the top can declare that «privacy is taken care of», the popular but empty phrase that leads many to believe the integrity of individuals is being looked after. The quote made me wonder how often the same approach, simply ignoring the need for personal integrity, is chosen when yet another intrusion into the privacy of people in Norway is being prepared. It rarely provokes reactions. The story of the reactions to the IT outsourcing by the health authority Helse Sør-Øst is sadly an exception, and the tip of the iceberg. I will keep declining AutoPASS and stay as far away from the Norwegian health services as I can, until they have demonstrated and documented that they value the privacy and personal integrity of the individual higher than short-term gain and public utility.
While looking at the scanned copies of the copyright renewal entries for movies published in the USA, an idea occurred to me. There are so few renewals per year that it should be fairly quick to transcribe them all and add references to the corresponding IMDB title IDs. This would give the (presumably) complete list of movies published 28 years earlier that did _not_ enter the public domain for the transcribed year. By fetching the list of USA movies published 28 years earlier and subtracting the movies with renewals, we should be left with the movies registered in IMDB that are now in the public domain. For the year 1955 (the one I have looked at the most), the total number of pages to transcribe is 21. For the 28 years from 1950 to 1978, it should be in the range of 500-600 pages. It is just a few days of work, and spread among a small group of people it should be doable in a few weeks of spare time.

A typical copyright renewal entry looks like this (the first one listed for 1955):
    ADAM AND EVIL, a photoplay in seven reels by Metro-Goldwyn-Mayer
    Distribution Corp. (c) 17Aug27; L24293. Loew's Incorporated (PWH);
    10Jun55; R151558.
The movie title as well as the registration and renewal dates are easy enough to locate with a program (split on the first comma and look for DDmmmYY). The rest of the text is not required to find the movie in IMDB, but is useful to confirm the correct movie is found. I am not quite sure what the L and R numbers mean, but suspect they are reference numbers into the archive of the US Copyright Office.
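To illustrate, here is a minimal parsing sketch in Python. The comma and DDmmmYY heuristics are the ones described above; the function name, regular expression and output format are my own assumptions, not a finished transcription tool:

import re

ENTRY = ("ADAM AND EVIL, a photoplay in seven reels by Metro-Goldwyn-Mayer "
         "Distribution Corp. (c) 17Aug27; L24293. Loew's Incorporated (PWH); "
         "10Jun55; R151558.")

# Dates like 17Aug27: day, three letter month, two digit year.
DATE_RE = re.compile(r'\b(\d{1,2})([A-Z][a-z]{2})(\d{2})\b')

def parse_renewal(entry):
    title, rest = entry.split(',', 1)
    dates = DATE_RE.findall(rest)
    # The first date is the original registration, the last the renewal.
    return {'title': title.strip(),
            'registered': dates[0] if dates else None,
            'renewed': dates[-1] if dates else None}

print(parse_renewal(ENTRY))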
Tracking down the equivalent IMDB title ID is probably going to be a manual task, but given the year it is fairly easy to search for the movie title using for example http://www.imdb.com/find?q=adam+and+evil+1927&s=all. Using this search, I find that the equivalent IMDB title ID for the first renewal entry from 1955 is http://www.imdb.com/title/tt0017588/.

I suspect the best way to do this would be to make a specialised web service that makes it easy for contributors to transcribe and track down IMDB title IDs. In the web service, once an entry is transcribed, the title and year could be extracted from the text, and a search in IMDB conducted for the user to pick the equivalent IMDB title ID right away. By spreading the work among volunteers, it would also be possible to have at least two persons transcribe the same entries, to be able to discover any typos introduced. But I will need help to make this happen, as I lack the spare time to do all of this on my own. If you would like to help, please get in touch. Perhaps you can draft a web service for crowdsourcing the task?

Note, Project Gutenberg already has some transcribed copies of the US Copyright Office renewal protocols, but I have not been able to find any film renewals there, so I suspect they only have copies of renewals for written works. I have not been able to find any transcribed versions of movie renewals so far. Perhaps they exist somewhere?

I would love to figure out methods for finding all the public domain works in other countries too, but that is a lot harder. At least for Norway and Great Britain, such work involves tracking down the people involved in making the movie and figuring out when they died. It is hard enough to figure out who was part of making a movie, and I do not know how to automate such a procedure without a registry of every person involved in making movies and their year of death.

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
It is pleasing to see that the work we put into publishing new editions of the classic Free Culture book by the founder of the Creative Commons movement, Lawrence Lessig, is still being appreciated. I had a look at the latest sales numbers for the paper edition today. Not too impressive, but happy to see some buyers still exist. All the revenue from the books is sent to the Creative Commons Corporation, and they receive the largest cut if you buy directly from Lulu. Most books are sold via Amazon, with Ingram second and only a small fraction directly from Lulu. The ebook edition is available for free from Github.
Title / language | 2016 jan-jun | 2016 jul-dec | 2017 jan-may
---|---|---|---
Culture Libre / French | 3 | 6 | 15
Fri kultur / Norwegian | 7 | 1 | 0
Free Culture / English | 14 | 27 | 16
Total | 24 | 34 | 31
It is a bit sad to see the low sales numbers for the Norwegian edition, and a bit surprising that the English edition is still selling so well.

If you would like to translate and publish the book in your native language, I would be happy to help make it happen. Please get in touch.
Three years ago, a presumed lost animation film, Empty Socks from 1927, was discovered in the Norwegian National Library. At the time it was discovered, it was generally assumed to be copyrighted by The Walt Disney Company, and I blogged about my reasoning to conclude that it would enter the Norwegian equivalent of the public domain in 2053, based on my understanding of Norwegian copyright law. But a few days ago, I came across a blog post claiming the movie is already in the public domain, at least in the USA. The reasoning is as follows: The film was released in November or December 1927 (sources disagree), and presumably its copyright was registered that year. At that time, right holders of movies registered by the copyright office received government protection for their work for 28 years. After 28 years, the copyright had to be renewed if they wanted the government to protect it further. The blog post I found claims such renewal did not happen for this movie, and thus it entered the public domain in 1956. Yet someone claims the copyright was renewed and the movie is still copyright protected. Can anyone help me figure out which claim is correct? I have not been able to find Empty Socks in the Catalog of copyright entries, Ser.3 pt.12-13 v.9-12 1955-1958 Motion Pictures, available from the University of Pennsylvania, neither on page 45 for the first half of 1955, nor on page 119 for the second half of 1955. It is of course possible that the renewal entry was left out of the printed catalog by mistake. Is there some way to rule out this possibility? Please help, and update the Wikipedia page with your findings.
As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
I am very happy to report that the Nikita Noark 5 core project tagged its second release today. The free software solution is an implementation of the Norwegian archive standard Noark 5 used by government offices in Norway. These were the changes in version 0.1.1 since version 0.1.0 (from NEWS.md):
- Continued work on the angularjs GUI, including document upload.
- Implemented correspondencepartPerson, correspondencepartUnit and correspondencepartInternal.
- Applied for Coverity coverage and started submitting code on a regular basis.
- Started fixing bugs reported by Coverity.
- Corrected and completed HATEOAS links to make sure the entire API is available via URLs in _links.
- Corrected all relation URLs to use a trailing slash.
- Added initial support for storing data in ElasticSearch.
- Now able to receive and store uploaded files in the archive.
- Changed JSON output for object lists to have relations in _links.
- Improved JSON output for empty object lists.
- Now uses the correct MIME type application/vnd.noark5-v4+json.
- Added support for Docker container images.
- Added a simple API browser implemented in JavaScript/Angular.
- Started on an archive client implemented in JavaScript/Angular.
- Started on a prototype to show the public mail journal.
- Improved performance by disabling the Spring FileWatcher.
- Added support for 'arkivskaper', 'saksmappe' and 'journalpost'.
- Added support for some metadata code lists.
- Added support for Cross-Origin Resource Sharing (CORS).
- Changed login method from Basic Auth to JSON Web Token (RFC 7519) style.
- Added support for GET-ing ny-* URLs.
- Added support for modifying entities using PUT and eTag.
- Added support for returning XML output on request.
- Removed support for English field and class names, limiting ourselves to the official names.
- ...
If this sounds interesting to you, please contact us on IRC (#nikita on irc.freenode.net) or email (the nikita-noark mailing list).
It would be easier to locate the movie you want to watch in the Internet Archive if the metadata about each movie were more complete and accurate. In the archiving community, a well known saying states that good metadata is a love letter to the future. The metadata in the Internet Archive could use a face lift for the future to love us back. Here is a proposal for a small improvement that would make the metadata more useful today. I've been unable to find any document describing the various standard fields available when uploading videos to the archive, so this proposal is based on my best guess and on searching through several of the existing movies.

I have a few use cases in mind. First of all, I would like to be able to count the number of distinct movies in the Internet Archive, without duplicates. I would further like to identify the IMDB title IDs of the movies in the Internet Archive, to be able to look up an IMDB title ID and know if I can fetch the video from there and share it with my friends.
Second, I would like the Butter data provider for The Internet Archive (available from github) to list as many of the good movies as possible. The plugin currently does a search in the archive with the following parameters:

collection:moviesandfilms
AND NOT collection:movie_trailers
AND -mediatype:collection
AND format:"Archive BitTorrent"
AND year
Most of the cool movies that fail to show up in Butter do so because the 'year' field is missing. The 'year' field is populated from the year part of the 'date' field, and should hold when the movie was released (date or year). Two such examples are Ben Hur from 1905 and Caminandes 2: Gran Dillama from 2013, where the year metadata field is missing.
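To get an idea of how many movies are affected, one can run the Butter query without the year clause and count the matches lacking the field. A hedged sketch, assuming the Internet Archive advancedsearch.php JSON API; treat the endpoint and parameter details as my assumptions:

import requests

# The Butter query quoted above, minus the 'AND year' clause, so that
# items missing the year field are included in the result.
QUERY = ('collection:moviesandfilms AND NOT collection:movie_trailers '
         'AND -mediatype:collection AND format:"Archive BitTorrent"')

params = {
    'q': QUERY,
    'fl[]': ['identifier', 'year'],
    'rows': 100,          # first 100 hits only, for illustration
    'output': 'json',
}
docs = requests.get('https://archive.org/advancedsearch.php',
                    params=params).json()['response']['docs']
missing = [d['identifier'] for d in docs if 'year' not in d]
print('%d of %d matches lack the year field' % (len(missing), len(docs)))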
So, my proposal is simply, for every movie in The Internet Archive where an IMDB title ID exists, please fill in these metadata fields (note, they can be updated long after the video was uploaded, but as far as I can tell, only by the uploader; a scripted example follows below the list):
- mediatype: Should be 'movie' for movies.
- collection: Should contain 'moviesandfilms'.
- title: The title of the movie, without the publication year.
- date: The date or year the movie was released. This makes the movie show up in Butter, makes it possible to know the age of the movie, and is useful for figuring out its copyright status.
- director: The director of the movie. This makes it easier to check whether the correct movie has been found in movie databases.
- publisher: The production company making the movie. Also useful for identifying the correct movie.
- links: Add a link to the IMDB title page, for example like this: <a href="http://www.imdb.com/title/tt0028496/">Movie in IMDB</a>. This makes it easier to find duplicates and allows counting the number of unique movies in the Archive. Other external references, like to TMDB, could be added like this too.
I did consider proposing a custom field for the IMDB title ID (for example 'imdb_title_url', 'imdb_code' or simply 'imdb'), but suspect it will be easier to simply place it in the 'links' free text field.
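For uploaders with many items, the fields could be set programmatically. Here is a sketch using the 'internetarchive' Python library (pip install internetarchive); the identifier and all field values are made-up placeholders, and modify_metadata() only works if you are the uploader (or an archive admin) and have configured credentials with 'ia configure':

import internetarchive

item = internetarchive.get_item('SomeMovieIdentifier')  # hypothetical item
item.modify_metadata({
    'mediatype': 'movie',             # as proposed above
    'collection': 'moviesandfilms',
    'title': 'Some Movie',
    'date': '1927',
    'director': 'Some Director',
    'publisher': 'Some Production Company',
    'links': '<a href="http://www.imdb.com/title/tt0000000/">Movie in IMDB</a>',
})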
I created a list of IMDB title IDs for several thousand movies in the Internet Archive, but I also got a list of several thousand movies without such an IMDB title ID (and quite a few duplicates). It would be great if this data set could be integrated into the Internet Archive metadata to be available for everyone in the future, but with the current policy of leaving metadata editing to the uploaders, it will take a while before this happens. If you have uploaded movies to the Internet Archive, you can help. Please consider following my proposal above for your movies, to ensure they are properly counted. :)

The list is mostly generated using Wikidata, which, based on Wikipedia articles, makes it possible to link between IMDB and movies in the Internet Archive. But there are lots of movies without a Wikipedia article, and some movies where only a collection page exists (like for the Caminandes example above, where there are three movies but only one Wikidata entry).
As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
This is a copy of an email I posted to the nikita-noark mailing list. Please follow up there if you would like to discuss this topic. The background is that we are making a free software archive system based on the Norwegian Noark 5 standard for government archives.

I've been wondering a bit lately how trusted timestamps could be stored in Noark 5. Trusted timestamps can be used to verify that some information (document/file/checksum/metadata) has not been changed since a specific time in the past. This is useful to verify the integrity of the documents in the archive.

Then it occurred to me, perhaps the trusted timestamps could be stored as document variants (ie dokumentobjekt referred to from dokumentbeskrivelse) with the filename set to the hash it is stamping?

Given a "dokumentbeskrivelse" with an associated "dokumentobjekt", a new dokumentobjekt is associated with the "dokumentbeskrivelse" with the same attributes as the stamped dokumentobjekt except these attributes (a hypothetical JSON rendering follows the list):
- format -> "RFC3161"
- mimeType -> "application/timestamp-reply"
- formatDetaljer -> "<source URL for timestamp service>"
- filenavn -> "<sjekksum>.tsr"
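A sketch of how the resulting dokumentobjekt could look on the wire, assuming the DFN timestamp service from the example below and an example SHA-256 checksum; other mandatory Noark 5 attributes are omitted, and the exact field names would have to follow the specification:

{
  "format": "RFC3161",
  "mimeType": "application/timestamp-reply",
  "formatDetaljer": "http://zeitstempel.dfn.de",
  "filenavn": "b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c.tsr"
}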
This assumes a service following IETF RFC 3161 is used, which specifies the given MIME type for replies and the .tsr file ending for the content of such a trusted timestamp. As far as I can tell from the Noark 5 specifications, it is OK to have several variants/renderings of a document attached to a given dokumentbeskrivelse object. It might be stretching it a bit to make some of these variants represent crypto-signatures useful for verifying the document integrity instead of representing the document itself.

Using the source of the service in formatDetaljer allows several timestamping services to be used. This is useful to spread the risk of key compromise over several organisations. It would only be a problem to trust the timestamps if all of the organisations were compromised.
The following one-liner on Linux can be used to generate the tsr file. $inputfile is the path to the file to checksum, and $sha256 is the SHA-256 checksum of the file (ie the first field in the output of sha256sum):

  openssl ts -query -data "$inputfile" -cert -sha256 -no_nonce \
    | curl -s -H "Content-Type: application/timestamp-query" \
      --data-binary "@-" http://zeitstempel.dfn.de > $sha256.tsr
To verify the timestamp, you first need to download the certificate chain of the trusted timestamp service, for example using this command:

  wget -O ca-cert.txt \
    https://pki.pca.dfn.de/global-services-ca/pub/cacert/chain.txt

Note, the public key should be stored alongside the timestamps in the archive to make sure it is also available 100 years from now. It is probably a good idea to standardise how and where to store such public keys, to make them easier to find for those trying to verify documents 100 or 1000 years from now. :)
The verification itself is a simple openssl command:

  openssl ts -verify -data $inputfile -in $sha256.tsr \
    -CAfile ca-cert.txt -text

Is there any reason this approach would not work? Is it somehow against the Noark 5 specification?
A month ago, I blogged about my work to automatically check the copyright status of IMDB entries, and to try to count the number of movies listed in IMDB that are legal to distribute on the Internet. I have continued to look for good data sources, and identified a few more. The code used to extract information from the various data sources is available in a git repository, currently hosted on github.

So far I have identified 3186 unique IMDB title IDs. To gain a better understanding of the structure of the data set, I created a histogram of the year associated with each movie (typically the release year). It is interesting to notice where the peaks and dips in the graph are located. I wonder why they are placed there. I suspect World War II caused the dip around 1940, but what caused the peak around 2010?
I've so far identified ten sources of IMDB title IDs for movies in the public domain or with a free license. These are the statistics reported when running 'make stats' in the git repository:

  249 entries ( 6 unique) with and 288 without IMDB title ID in free-movies-archive-org-butter.json
 2301 entries ( 540 unique) with and 0 without IMDB title ID in free-movies-archive-org-wikidata.json
  830 entries ( 29 unique) with and 0 without IMDB title ID in free-movies-icheckmovies-archive-mochard.json
 2109 entries ( 377 unique) with and 0 without IMDB title ID in free-movies-imdb-pd.json
  291 entries ( 122 unique) with and 0 without IMDB title ID in free-movies-letterboxd-pd.json
  144 entries ( 135 unique) with and 0 without IMDB title ID in free-movies-manual.json
  350 entries ( 1 unique) with and 801 without IMDB title ID in free-movies-publicdomainmovies.json
    4 entries ( 0 unique) with and 124 without IMDB title ID in free-movies-publicdomainreview.json
  698 entries ( 119 unique) with and 118 without IMDB title ID in free-movies-publicdomaintorrents.json
    8 entries ( 8 unique) with and 196 without IMDB title ID in free-movies-vodo.json
 3186 unique IMDB title IDs in total
The entries without an IMDB title ID are candidates for increasing the data set, but they might equally well be duplicates of entries already listed with an IMDB title ID in one of the other sources, or represent movies that lack an IMDB title ID. I have seen examples of all these situations when peeking at the entries without an IMDB title ID. Based on these data sources, the lower bound for the number of movies listed in IMDB that are legal to distribute on the Internet is somewhere between 3186 and 4713.
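The essence of the 'make stats' numbers can be reproduced with a few lines of Python. This is a rough sketch, not the script from the git repository: it assumes each free-movies-*.json file holds a list of entries where an entry may carry an 'imdb' key with the IMDB title ID; the actual field names in the repository may differ:

import glob
import json

seen = set()      # unique IMDB title IDs across all sources
without = 0       # entries lacking an IMDB title ID
for name in sorted(glob.glob('free-movies-*.json')):
    with open(name) as f:
        for entry in json.load(f):
            if entry.get('imdb'):
                seen.add(entry['imdb'])
            else:
                without += 1
print('%d unique IMDB title IDs, %d entries without one' %
      (len(seen), without))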
It would be great for improving the accuracy of this measurement if the various sources added IMDB title IDs to their metadata. I have tried to reach the people behind the various sources to ask if they are interested in doing this, without any replies so far. Perhaps you can help me get in touch with the people behind VODO, Public Domain Torrents, Public Domain Movies and Public Domain Review, to try to convince them to add more metadata to their movie entries?

Another way you could help is by adding pages to Wikipedia about movies that are legal to distribute on the Internet. If such a page exists and includes links to both IMDB and The Internet Archive, the script used to generate free-movies-archive-org-wikidata.json should pick up the mapping as soon as Wikidata is updated.

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
Aftenposten reports today on errors in the exam texts for the exam in politics and human rights, where the Bokmål and Nynorsk versions of the text were not equal. The exam text is quoted in the article, and I got curious whether the free software translation solution Apertium would have done a better job than the Norwegian Directorate for Education and Training (Utdanningsdirektoratet). It would appear so.

Here is the Bokmål text from the exam (kept in Norwegian, as the point is the translation between the two written forms of Norwegian):
- --- -Drøft utfordringene knyttet til nasjonalstatenes og andre aktørers -rolle og muligheter til å håndtere internasjonale utfordringer, som -for eksempel flykningekrisen.
- -Vedlegge er eksempler på tekster som kan gi relevante perspektiver -på temaet:
--
- -- Flykningeregnskapet 2016, UNHCR og IDMC -
- «Grenseløst Europa for fall» A-Magasinet, 26. november 2015 -
Apertium translates this as follows:
- --- -Drøft utfordringane knytte til nasjonalstatane sine og rolla til -andre aktørar og høve til å handtera internasjonale utfordringar, som -til dømes *flykningekrisen.
- -Vedleggja er døme på tekster som kan gje relevante perspektiv på -temaet:
- --
- -- *Flykningeregnskapet 2016, *UNHCR og *IDMC
-- «*Grenseløst Europa for fall» A-Magasinet, 26. november 2015
-
Words that were not understood are marked with an asterisk (*) and need additional language review. But no words disappeared, unlike in the text the pupils were presented with at the exam. I suspect, though, that "andre aktørers rolle og muligheter til ..." should have been translated to "rolla til andre aktørar og deira høve til ..." or something similar, but that may be nitpicking. It merely underlines that automatic translation always needs proofreading afterwards.
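If you want to reproduce the experiment, the translation can be scripted. A small sketch, assuming a locally installed Apertium with the Norwegian language pair (on Debian, the apertium and apertium-nno-nob packages) and that 'nob-nno' is the name of the Bokmål-to-Nynorsk mode:

import subprocess

text = 'Drøft utfordringene knyttet til nasjonalstatenes rolle.'
result = subprocess.run(['apertium', 'nob-nno'], input=text,
                        capture_output=True, text=True, check=True)
print(result.stdout)  # the Nynorsk rendering, unknown words marked with *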
If you care about how fault tolerant your storage is, you might find these articles and papers interesting. They have shaped how I think when designing a storage system.
- USENIX ;login: Redundancy Does Not Imply Fault Tolerance. Analysis of Distributed Storage Reactions to Single Errors and Corruptions by Aishwarya Ganesan, Ramnatthan Alagappan, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau
- ZDNet Why RAID 5 stops working in 2009 by Robin Harris
- ZDNet Why RAID 6 stops working in 2019 by Robin Harris
- USENIX FAST'07 Failure Trends in a Large Disk Drive Population by Eduardo Pinheiro, Wolf-Dietrich Weber and Luiz André Barroso
- USENIX ;login: Data Integrity. Finding Truth in a World of Guesses and Lies by Doug Hughes
- USENIX FAST'08 An Analysis of Data Corruption in the Storage Stack by L. N. Bairavasundaram, G. R. Goodson, B. Schroeder, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau
- USENIX FAST'07 Disk failures in the real world: what does an MTTF of 1,000,000 hours mean to you? by B. Schroeder and G. A. Gibson
- USENIX ;login: Are Disks the Dominant Contributor for Storage Failures? A Comprehensive Study of Storage Subsystem Failure Characteristics by Weihang Jiang, Chongfeng Hu, Yuanyuan Zhou, and Arkady Kanevsky
- SIGMETRICS 2007 An analysis of latent sector errors in disk drives by L. N. Bairavasundaram, G. R. Goodson, S. Pasupathy, and J. Schindler
Several of these research papers are based on data collected from hundreds of thousands or millions of disks, and their findings are eye opening. The short story: do not implicitly trust RAID or redundant storage systems. Details matter. And unfortunately there are few options on Linux addressing all the identified issues. Both ZFS and Btrfs do a fairly good job, but have legal and practical issues of their own. I wonder how cluster file systems like Ceph do in this regard. After all, there is an old saying: you know you have a distributed system when the crash of a computer you have never heard of stops you from getting any work done. The same holds true if fault tolerance does not work.

Just remember, in the end it does not matter how redundant or how fault tolerant your storage is, if you do not continuously monitor its status to detect and replace failed disks.
As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
These days, with a deadline of May 1st, the National Archivist of Norway (Riksarkivaren) has its regulation out for public consultation. As you can see, there is not much time left before the deadline, which expires on Sunday. This regulation is what lists the formats that are acceptable for archiving in Noark 5 solutions in Norway.

I found the consultation documents at Norsk Arkivråd after being tipped off on the mailing list of the free software project Nikita Noark5-Core, which implements a Noark 5 Tjenestegrensesnitt (service interface). I am involved in the Nikita project, and thanks to my interest in the service interface project I have read quite a few Noark 5 related documents, and discovered to my surprise that standard email is not on the list of approved formats that can be archived. The consultation with its deadline on Sunday is an excellent opportunity to try to do something about it. I am working on my own consultation response, and wonder if others are interested in supporting the proposal to allow archiving email as email.

Are you already writing your own consultation response? If so, consider including a wording about email storage. I do not think much is needed. Here is a short text proposal, kept in Norwegian as it would be submitted (in short, it refers to the consultation and proposes adding Internet email, as described in IETF RFC 5322, to the approved document formats in § 5-16):
    Viser til høring sendt ut 2017-02-17 (Riksarkivarens referanse
    2016/9840 HELHJO), og tillater oss å sende inn noen innspill om
    revisjon av Forskrift om utfyllende tekniske og arkivfaglige
    bestemmelser om behandling av offentlige arkiver (Riksarkivarens
    forskrift).

    Svært mye av vår kommunikasjon foregår i dag på e-post. Vi
    foreslår derfor at Internett-e-post, slik det er beskrevet i IETF
    RFC 5322, https://tools.ietf.org/html/rfc5322, bør inn som godkjent
    dokumentformat. Vi foreslår at forskriftens oversikt over godkjente
    dokumentformater ved innlevering i § 5-16 endres til å ta med
    Internett-e-post.
As part of the work on the service interface, we have tested how email can be stored in a Noark 5 structure, and we are writing a proposal for how to do it, which will be sent to the National Archives (Arkivverket) as soon as it is finished. Those interested can follow the progress on the web.

Update 2017-04-28: Today the consultation response I wrote was submitted by the NUUG association.
I was surprised today to learn that a friend in academia did not know there are easily available web services for writing LaTeX documents as a team. I thought it was common knowledge, but to make sure at least my readers are aware of them, I would like to mention these useful services for writing LaTeX documents. Some of them even provide a WYSIWYG editor to ease writing further.

There are two commercial services available, ShareLaTeX and Overleaf. They are very easy to use. Just start a new document, select which publisher to write for (ie which LaTeX style to use), and start writing. Note, these two have announced their intention to join forces, so soon there will only be one joint service. I've used both for different documents, and they work just fine. ShareLaTeX is free software, while Overleaf is not. According to an announcement from Overleaf, they plan to keep the ShareLaTeX code base maintained as free software.

But these two are not the only alternatives. Fidus Writer is another free software solution, with the source available on github. I have not used it myself. Several others can be found on the nice AlternativeTo web service.

If you like Google Docs or Etherpad, but would like to write documents in LaTeX, you should check out these services. You can even host your own, if you want to. :)

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
I discovered today that the web site publishing public mail journals from Norwegian government agencies, OEP, has started blocking some types of web clients from access. I do not know how many are affected, but at least libwww-perl and curl are blocked. To test for yourself, run the following:

  % curl -v -s https://www.oep.no/pub/report.xhtml?reportId=3 2>&1 |grep '< HTTP'
  < HTTP/1.1 404 Not Found
  % curl -v -s --header 'User-Agent:Opera/12.0' https://www.oep.no/pub/report.xhtml?reportId=3 2>&1 |grep '< HTTP'
  < HTTP/1.1 200 OK
  %

Here one can see that the service returns «404 Not Found» for curl with its default settings, while it returns «200 OK» if curl claims to be Opera version 12.0. Offentlig elektronisk postjournal started the blocking 2017-03-02.

The blocking makes it a bit harder to fetch information from oep.no automatically. Could the blocking have been introduced to hinder automated collection of information from OEP, like the collection done by Pressens Offentlighetsutvalg to document how the ministries hinder public access in the report «Slik hindrer departementer innsyn» (How ministries hinder public access), published in January 2017? It seems unlikely, as it is trivial to change the User-Agent to something new.

Is there any legal basis for the public sector to discriminate between web clients the way it is done here, where access is granted or denied depending on what the client claims its own name is? As OEP is owned by DIFI and operated by Basefarm, perhaps there are documents exchanged between those two parties that one could request access to in order to understand what has happened. But the public mail journal of DIFI shows only two documents between DIFI and Basefarm during the last year. Mimes brønn (the Norwegian freedom of information request site) is the next step, I think.
Recently, I needed to automatically check the copyright status of a set of The Internet Movie Database (IMDB) entries, to figure out which of the movies they refer to can be freely distributed on the Internet. This proved to be harder than it sounds. IMDB certainly lists movies without any copyright protection, where the copyright protection has expired, or where the movie is licensed using a permissive license like one from Creative Commons. But these are mixed in with copyright protected movies, and there seems to be no way to separate the classes using the information in IMDB.

First I tried to look up entries manually in IMDB, Wikipedia and The Internet Archive, to get a feel for how to do this. Using these sources it is hard to know for sure, but it should be possible to be reasonably confident that a movie is "out of copyright" with a few hours of work per movie. As I needed to check almost 20,000 entries, this approach was not sustainable. I simply cannot work around the clock for about 6 years to check this data set.

I asked the people behind The Internet Archive if they could introduce a new metadata field in their metadata XML for the IMDB ID, but was told that they leave it completely to the uploaders to update the metadata. Some of the metadata entries had IMDB links in the description, but I found no way to download all the metadata files in bulk to locate those, so I put that approach aside.

In the process I noticed that several Wikipedia articles about movies had links to both IMDB and The Internet Archive, and it occurred to me that I could use the Wikipedia/Wikidata RDF data set to locate entries with both, to at least get a lower bound on the number of movies in The Internet Archive with an IMDB ID. This is useful based on the assumption that movies distributed by The Internet Archive can be legally distributed on the Internet. With some help from the RDF community (thank you DanC), I was able to come up with this query to pass to the SPARQL interface on Wikidata:
SELECT ?work ?imdb ?ia ?when ?label
WHERE
{
  ?work wdt:P31/wdt:P279* wd:Q11424.
  ?work wdt:P345 ?imdb.
  ?work wdt:P724 ?ia.
  OPTIONAL {
    ?work wdt:P577 ?when.
    ?work rdfs:label ?label.
    FILTER(LANG(?label) = "en").
  }
}
If I understand the query right, for every film entry anywhere in Wikipedia, it will return the IMDB ID and The Internet Archive ID, along with the release date and English title when either or both of the latter two are available. At the moment the result set contains 2338 entries. Of course, it depends on volunteers having included both correct IMDB and Internet Archive IDs in the Wikipedia articles for the movies. It should be noted that the result will include duplicates if a movie has entries in several languages. There are some bogus entries, either because The Internet Archive ID contains a typo or because the movie is not available from The Internet Archive. I did not verify the IMDB IDs, as I am unsure how to do that automatically.

I wrote a small python script to extract the data set from Wikidata and check if the XML metadata for each movie is available from The Internet Archive, and after around 1.5 hours it produced a list of 2097 free movies and their IMDB IDs. In total, 171 entries in Wikidata lack the referred Internet Archive entry. I assume the 70 "disappearing" entries (ie 2338-2097-171) are duplicate entries.
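The shape of such a check is roughly as follows. This is a hedged, trimmed-down sketch and not the script mentioned above: it assumes the public Wikidata SPARQL endpoint and the Internet Archive metadata service, and that an empty JSON object from archive.org/metadata/<id> means the item does not exist:

import requests

# The query above, reduced to the two IDs we need for the check.
QUERY = '''
SELECT ?work ?imdb ?ia WHERE {
  ?work wdt:P31/wdt:P279* wd:Q11424.
  ?work wdt:P345 ?imdb.
  ?work wdt:P724 ?ia.
}
'''

rows = requests.get('https://query.wikidata.org/sparql',
                    params={'query': QUERY, 'format': 'json'}
                   ).json()['results']['bindings']

free, broken = 0, 0
for row in rows:
    ia_id = row['ia']['value']
    if requests.get('https://archive.org/metadata/%s' % ia_id).json():
        free += 1
    else:
        broken += 1
print('%d free movies, %d broken Internet Archive references' % (free, broken))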
This is not too bad, given that The Internet Archive reports containing 5331 feature films at the moment, but it also means more than 3000 movies are either missing from Wikipedia or missing the pair of references on Wikipedia.
I was curious about the distribution by release year, and made a little graph showing how the number of free movies is spread over the years:

[Graph: number of free movies in the Internet Archive by release year]

I expect the relative distribution of the remaining 3000 movies to be similar.
If you want to help, and want to ensure Wikipedia can be used to cross reference The Internet Archive and The Internet Movie Database, please make sure entries like these are listed under the "External links" heading of the Wikipedia article for the movie:

* {{Internet Archive film|id=FightingLady}}
* {{IMDb title|id=0036823|title=The Fighting Lady}}

Please verify the links on the final page, to make sure you did not introduce a typo.
Here is the complete list, if you want to correct the 171 identified Wikipedia entries with broken links to The Internet Archive: Q1140317, Q458656, Q458656, Q470560, Q743340, Q822580, Q480696, Q128761, Q1307059, Q1335091, Q1537166, Q1438334, Q1479751, Q1497200, Q1498122, Q865973, Q834269, Q841781, Q841781, Q1548193, Q499031, Q1564769, Q1585239, Q1585569, Q1624236, Q4796595, Q4853469, Q4873046, Q915016, Q4660396, Q4677708, Q4738449, Q4756096, Q4766785, Q880357, Q882066, Q882066, Q204191, Q204191, Q1194170, Q940014, Q946863, Q172837, Q573077, Q1219005, Q1219599, Q1643798, Q1656352, Q1659549, Q1660007, Q1698154, Q1737980, Q1877284, Q1199354, Q1199354, Q1199451, Q1211871, Q1212179, Q1238382, Q4906454, Q320219, Q1148649, Q645094, Q5050350, Q5166548, Q2677926, Q2698139, Q2707305, Q2740725, Q2024780, Q2117418, Q2138984, Q1127992, Q1058087, Q1070484, Q1080080, Q1090813, Q1251918, Q1254110, Q1257070, Q1257079, Q1197410, Q1198423, Q706951, Q723239, Q2079261, Q1171364, Q617858, Q5166611, Q5166611, Q324513, Q374172, Q7533269, Q970386, Q976849, Q7458614, Q5347416, Q5460005, Q5463392, Q3038555, Q5288458, Q2346516, Q5183645, Q5185497, Q5216127, Q5223127, Q5261159, Q1300759, Q5521241, Q7733434, Q7736264, Q7737032, Q7882671, Q7719427, Q7719444, Q7722575, Q2629763, Q2640346, Q2649671, Q7703851, Q7747041, Q6544949, Q6672759, Q2445896, Q12124891, Q3127044, Q2511262, Q2517672, Q2543165, Q426628, Q426628, Q12126890, Q13359969, Q13359969, Q2294295, Q2294295, Q2559509, Q2559912, Q7760469, Q6703974, Q4744, Q7766962, Q7768516, Q7769205, Q7769988, Q2946945, Q3212086, Q3212086, Q18218448, Q18218448, Q18218448, Q6909175, Q7405709, Q7416149, Q7239952, Q7317332, Q7783674, Q7783704, Q7857590, Q3372526, Q3372642, Q3372816, Q3372909, Q7959649, Q7977485, Q7992684, Q3817966, Q3821852, Q3420907, Q3429733, Q774474
As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
The Nikita Noark 5 core project is implementing the Norwegian standard for keeping an electronic archive of government documents. The Noark 5 standard documents the requirements for data systems used by the archives in the Norwegian government, and the Noark 5 web interface specification documents a REST web service for storing, searching and retrieving documents and metadata in such an archive. I've been involved in the project since a few weeks before Christmas, when the Norwegian Unix User Group announced it supported the project. I believe this is an important project, and hope it can make it possible for the government archives in the future to use free software to keep the archives we citizens depend on. But as I do not hold such an archive myself, personally my first use case is to store and analyse public mail journal metadata published by the government. I find it useful to have a clear use case in mind when developing, to make sure the system scratches one of my itches.

If you would like to help make sure there is a free software alternative for the archives, please join our IRC channel (#nikita on irc.freenode.net) and the project mailing list.

When I got involved, the web service could store metadata about documents. But a few weeks ago, a new milestone was reached when it became possible to store full text documents too. Yesterday, I completed an implementation of a command line tool archive-pdf to upload a PDF file to the archive using this API. The tool is very simple at the moment, and finds existing fonds, series and files, asking the user to select which one to use if more than one exists. Once a file is identified, the PDF is associated with the file and uploaded, using the title extracted from the PDF itself. The process is fairly similar to visiting the archive, opening a cabinet, locating a file and storing a piece of paper in it. Here is a test run directly after populating the database with test data using our API tester:
  ~/src//noark5-tester$ ./archive-pdf mangelmelding/mangler.pdf
  using arkiv: Title of the test fonds created 2017-03-18T23:49:32.103446
  using arkivdel: Title of the test series created 2017-03-18T23:49:32.103446

   0 - Title of the test case file created 2017-03-18T23:49:32.103446
   1 - Title of the test file created 2017-03-18T23:49:32.103446
  Select which mappe you want (or search term): 0
  Uploading mangelmelding/mangler.pdf
    PDF title: Mangler i spesifikasjonsdokumentet for NOARK 5 Tjenestegrensesnitt
    File 2017/1: Title of the test case file created 2017-03-18T23:49:32.103446
  ~/src//noark5-tester$
You can see here how the fonds (arkiv) and series (arkivdel) only had one option, while the user needed to choose which file (mappe) to use among the two created by the API tester. The archive-pdf tool can be found in the git repository for the API tester.

In the project, I have been mostly working on the API tester so far, while getting to know the code base. The API tester currently uses the HATEOAS links to traverse the entire exposed service API and verify that the exposed operations and objects match the specification, as well as trying to create objects holding metadata and uploading a simple XML file to store. The tester has proved very useful for finding flaws in our implementation, as well as flaws in the reference site and the specification.
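The core idea of the tester can be illustrated with a small sketch. This is not the actual tester, just a simplified traversal under a few assumptions of mine: the entry point URL is an example, and _links is assumed to be a list of objects with an href member:

import requests

seen = set()
queue = ['http://localhost:8092/noark5v4/']  # assumed entry point
while queue:
    url = queue.pop()
    if url in seen:
        continue
    seen.add(url)
    response = requests.get(
        url, headers={'Accept': 'application/vnd.noark5-v4+json'})
    if response.status_code != 200:
        print('broken link:', url)
        continue
    # Follow every HATEOAS relation advertised by the service.
    for relation in response.json().get('_links', []):
        if 'href' in relation:
            queue.append(relation['href'])
print('visited %d URLs' % len(seen))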
The test document I uploaded is a summary of all the specification defects we have collected so far while implementing the web service. There are several unclear and conflicting parts of the specification, and we have started writing down the questions we get from implementing it. We use a format inspired by how The Austin Group collects defect reports for the POSIX standard with their instructions for the MANTIS defect tracker system, in lack of an official way to structure defect reports for Noark 5. (Our first submitted defect report was a request for a procedure for submitting defect reports. :)

The Nikita project is implemented using Java and Spring, and is fairly easy to get up and running using Docker containers for those who want to test the current code base. The API tester is implemented in Python.
I find it fascinating how many of the people being locked inside the proposed border wall between the USA and Mexico support the idea. The proposal to keep Mexicans out reminds me of the propaganda twist from the East German government, which called the Berlin Wall the "Antifascist Bulwark" after erecting it, claiming that the wall was needed to keep enemies from creeping into East Germany, while it was obvious to the people locked inside that it was erected to keep them from escaping.

Do the people in the USA supporting this wall really believe it is a one way wall, only keeping people on the outside from getting in, while not keeping the people on the inside from getting out?

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
Over the years, administrating thousands of NFS-mounting Linux computers at a time, I often needed a way to detect whether a machine was experiencing an NFS hang. If you try to use df or look at a file or directory affected by the hang, the process (and possibly the shell) will hang too. So you want to be able to detect this without risking that the detection process gets stuck too. It has not been obvious how to do this. When the hang has lasted a while, it is possible to find messages like these in dmesg:

  nfs: server nfsserver not responding, still trying
  nfs: server nfsserver OK

It is hard to know if the hang is still going on, and it is hard to be sure that looking in dmesg is going to work. If there are lots of other messages in dmesg, the lines might have rotated out of sight before they are noticed.
While reading through the NFS client implementation in the Linux kernel code, I came across some statistics that seem to offer a way to detect it. The om_timeouts sunrpc value in the kernel will increase every time the above log entry is inserted into dmesg. And after digging a bit further, I discovered that this value shows up in /proc/self/mountstats on Linux.

The mountstats content seems to be shared between files using the same file system context, so it is enough to check one of the mountstats files to get the state of the mount points for the machine. I assume this will not show lazily umounted NFS points, nor NFS mount points in a different process context (ie with a different file system view), but that does not worry me.

The content for an NFS mount point looks similar to this:
  [...]
  device /dev/mapper/Debian-var mounted on /var with fstype ext3
  device nfsserver:/mnt/nfsserver/home0 mounted on /mnt/nfsserver/home0 with fstype nfs statvers=1.1
   opts: rw,vers=3,rsize=65536,wsize=65536,namlen=255,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,soft,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=129.240.3.145,mountvers=3,mountport=4048,mountproto=udp,local_lock=all
   age: 7863311
   caps: caps=0x3fe7,wtmult=4096,dtsize=8192,bsize=0,namlen=255
   sec: flavor=1,pseudoflavor=1
   events: 61063112 732346265 1028140 35486205 16220064 8162542 761447191 71714012 37189 3891185 45561809 110486139 4850138 420353 15449177 296502 52736725 13523379 0 52182 9016896 1231 0 0 0 0 0
   bytes: 166253035039 219519120027 0 0 40783504807 185466229638 11677877 45561809
   RPC iostats version: 1.0  p/v: 100003/3 (nfs)
   xprt: tcp 925 1 6810 0 0 111505412 111480497 109 2672418560317 0 248 53869103 22481820
   per-op statistics
           NULL: 0 0 0 0 0 0 0 0
        GETATTR: 61063106 61063108 0 9621383060 6839064400 453650 77291321 78926132
        SETATTR: 463469 463470 0 92005440 66739536 63787 603235 687943
         LOOKUP: 17021657 17021657 0 3354097764 4013442928 57216 35125459 35566511
         ACCESS: 14281703 14290009 5 2318400592 1713803640 1709282 4865144 7130140
       READLINK: 125 125 0 20472 18620 0 1112 1118
           READ: 4214236 4214237 0 715608524 41328653212 89884 22622768 22806693
          WRITE: 8479010 8494376 22 187695798568 1356087148 178264904 51506907 231671771
         CREATE: 171708 171708 0 38084748 46702272 873 1041833 1050398
          MKDIR: 3680 3680 0 773980 993920 26 23990 24245
        SYMLINK: 903 903 0 233428 245488 6 5865 5917
          MKNOD: 80 80 0 20148 21760 0 299 304
         REMOVE: 429921 429921 0 79796004 61908192 3313 2710416 2741636
          RMDIR: 3367 3367 0 645112 484848 22 5782 6002
         RENAME: 466201 466201 0 130026184 121212260 7075 5935207 5961288
           LINK: 289155 289155 0 72775556 67083960 2199 2565060 2585579
        READDIR: 2933237 2933237 0 516506204 13973833412 10385 3190199 3297917
    READDIRPLUS: 1652839 1652839 0 298640972 6895997744 84735 14307895 14448937
         FSSTAT: 6144 6144 0 1010516 1032192 51 9654 10022
         FSINFO: 2 2 0 232 328 0 1 1
       PATHCONF: 1 1 0 116 140 0 0 0
         COMMIT: 0 0 0 0 0 0 0 0

  device binfmt_misc mounted on /proc/sys/fs/binfmt_misc with fstype binfmt_misc
  [...]
The key number to look at is the third number in the per-op list. It is the number of NFS timeouts experienced per file system operation, here 22 write timeouts and 5 access timeouts. If these numbers are increasing, I believe the machine is experiencing an NFS hang. Unfortunately the timeout value does not start to increase right away. The NFS operations need to time out first, and this can take a while. The exact timeout value depends on the setup. For example, the defaults for TCP and UDP mount points are quite different, and the timeout value is affected by the soft, hard, timeo and retrans NFS mount options.
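The detection can be scripted. Here is a minimal sketch of the idea: sum the third per-op column for each NFS mount in /proc/self/mountstats, and compare two samples taken some time apart. The parsing details are my reading of the file format shown above:

def nfs_timeouts(path='/proc/self/mountstats'):
    counts = {}
    device = None
    in_perop = False
    for line in open(path):
        words = line.split()
        if line.startswith('device '):
            device = words[1]
            in_perop = False
        elif 'per-op statistics' in line:
            in_perop = True
        elif in_perop and len(words) >= 4 and words[0].endswith(':'):
            # words[3] is the third number: timeouts for this operation.
            counts[device] = counts.get(device, 0) + int(words[3])
    return counts

print(nfs_timeouts())  # rerun later; growing numbers indicate a hang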
The only way I have found to get the timeout count working on Debian and Red Hat Enterprise Linux is to peek in /proc/. But according to ...
Is there a better way to figure out if a Linux NFS client is experiencing NFS hangs? Is there a way to detect which processes are affected? Is there a way to get the NFS mount going again quickly once the network problem causing the NFS hang has been cleared? I would very much welcome some clues, as we regularly run into NFS hangs.
At my nearby maker space, Sonen, I heard the story that it was easier to generate gcode files for their 3D printers (Ultimaker 2+) on Windows and MacOS X than on Linux, because the software involved had to be manually compiled and set up on Linux, while premade packages worked out of the box on Windows and MacOS X. I found this annoying, as the software involved, Cura, is free software and should be trivial to get up and running on Linux if someone took the time to package it for the relevant distributions. I even found a request for adding it to Debian from 2013, which had seen some activity over the years but never resulted in the software showing up in Debian. So a few days ago I offered my help to try to improve the situation.

Now I am very happy to see that all the packages required by a working Cura in Debian are uploaded into Debian and waiting in the NEW queue for the ftpmasters to have a look. You can track the progress on the status page for the 3D printer team.

The uploaded packages are a bit behind upstream, and were uploaded now to get slots in the NEW queue while we work on updating the packages to the latest upstream version.

On a related note, two competitors to Cura, which I found harder to use and was unable to configure correctly for the Ultimaker 2+ in the short time I spent on them, are already in Debian. If you are looking for 3D printer "slicers" and want something already available in Debian, check out slic3r and slic3r-prusa. The latter is a fork of the former.

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
So the new president of the United States of America claims to be surprised to discover that he was wiretapped during the election, before he was elected president. He even claims this must be illegal. Well, doh, if there is one thing the confirmations from Snowden documented, it is that the entire population of the USA is wiretapped, one way or another. Of course the presidential candidates were wiretapped, alongside the senators, judges and the rest of the people in the USA.

Next, the Federal Bureau of Investigation asked the Department of Justice to publicly reject the claims that Donald Trump was wiretapped illegally. I fail to see the relevance, given that I am sure the surveillance industry in the USA believes it has all the legal backing it needs to conduct mass surveillance on the entire world.

There is even the director of the FBI stating that he never saw an order requesting the wiretapping of Donald Trump. That is not very surprising, given how the FISA court works, with all its activity being secret. Perhaps he only heard about it?

What I find saddest in this story is how Norwegian journalists present it. In a news report the other day on the radio from the Norwegian national broadcaster (NRK), I heard the journalist claim that 'the FBI denies any wiretapping', while the reality is that 'the FBI denies any illegal wiretapping'. There is a fundamental and important difference, and it makes me sad that the journalists are unable to grasp it.

Update 2017-03-13: It looks like The Intercept reports that US Senator Rand Paul confirms what I state above.
As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.