1 <?xml version="1.0" encoding="utf-8"?>
2 <rss version='2.0' xmlns:lj='http://www.livejournal.org/rss/lj/1.0/' xmlns:atom="http://www.w3.org/2005/Atom">
3 <channel>
4 <title>Petter Reinholdtsen</title>
5 <description></description>
6 <link>http://people.skolelinux.org/pere/blog/</link>
7 <atom:link href="http://people.skolelinux.org/pere/blog/index.rss" rel="self" type="application/rss+xml" />
8
9 <item>
10 <title>How hard can æ, ø and å be?</title>
11 <link>http://people.skolelinux.org/pere/blog/How_hard_can______and___be_.html</link>
12 <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/How_hard_can______and___be_.html</guid>
13 <pubDate>Sun, 11 Feb 2018 17:10:00 +0100</pubDate>
14 <description>&lt;img src=&quot;http://people.skolelinux.org/pere/blog/images/2018-02-11-peppes-unicode.jpeg&quot; align=&quot;right&quot;/&gt;
15
16 &lt;p&gt;It is 2018, and 30 years have passed since Unicode was introduced.
17 Most of us in Norway have come to expect the use of our alphabet to
18 just work with any computer system. But it is apparently beyond the reach
19 of the computers printing receipts at a restaurant. Recently I visited
20 a Peppes Pizza restaurant, and noticed a few details on the receipt.
21 Notice how &#39;ø&#39; and &#39;å&#39; are replaced with strange symbols in
22 &#39;Servitør&#39;, &#39;Å BETALE&#39;, &#39;Beløp pr. gjest&#39;, &#39;Takk for besøket.&#39; and &#39;Vi
23 gleder oss til å se deg igjen&#39;.&lt;/p&gt;
24
25 &lt;p&gt;I would say that this state of affairs is past sad and well into embarrassing.&lt;/p&gt;
26
27 &lt;p&gt;I removed personal and private information to be nice.&lt;/p&gt;
28 </description>
29 </item>
30
31 <item>
32 <title>Legal to share more than 11,000 movies listed on IMDB?</title>
33 <link>http://people.skolelinux.org/pere/blog/Legal_to_share_more_than_11_000_movies_listed_on_IMDB_.html</link>
34 <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Legal_to_share_more_than_11_000_movies_listed_on_IMDB_.html</guid>
35 <pubDate>Sun, 7 Jan 2018 23:30:00 +0100</pubDate>
36 <description>&lt;p&gt;I&#39;ve continued to track down lists of movies that are legal to
37 distribute on the Internet, and identified more than 11,000 title IDs
38 in The Internet Movie Database (IMDB) so far. Most of them (57%) are
39 feature films from USA published before 1923. I&#39;ve also tracked down
40 more than 24,000 movies I have not yet been able to map to IMDB title
41 ID, so the real number could be a lot higher. According to the front
42 web page for &lt;a href=&quot;https://retrofilmvault.com/&quot;&gt;Retro Film
43 Vault&lt;/a&gt;, there are 44,000 public domain films, so I guess there are
44 still some left to identify.&lt;/p&gt;
45
46 &lt;p&gt;The complete data set is available from
47 &lt;a href=&quot;https://github.com/petterreinholdtsen/public-domain-free-imdb&quot;&gt;a
48 public git repository&lt;/a&gt;, including the scripts used to create it.
49 Most of the data is collected using web scraping, for example from the
50 &quot;product catalog&quot; of companies selling copies of public domain movies,
51 but any source I find believable is used. I&#39;ve so far had to throw
52 out three sources because I did not trust the public domain status of
53 the movies listed.&lt;/p&gt;
54
55 &lt;p&gt;Anyway, this is the summary of the 28 collected data sources so
56 far:&lt;/p&gt;
57
58 &lt;p&gt;&lt;pre&gt;
59 2352 entries ( 66 unique) with and 15983 without IMDB title ID in free-movies-archive-org-search.json
60 2302 entries ( 120 unique) with and 0 without IMDB title ID in free-movies-archive-org-wikidata.json
61 195 entries ( 63 unique) with and 200 without IMDB title ID in free-movies-cinemovies.json
62 89 entries ( 52 unique) with and 38 without IMDB title ID in free-movies-creative-commons.json
63 344 entries ( 28 unique) with and 655 without IMDB title ID in free-movies-fesfilm.json
64 668 entries ( 209 unique) with and 1064 without IMDB title ID in free-movies-filmchest-com.json
65 830 entries ( 21 unique) with and 0 without IMDB title ID in free-movies-icheckmovies-archive-mochard.json
66 19 entries ( 19 unique) with and 0 without IMDB title ID in free-movies-imdb-c-expired-gb.json
67 6822 entries ( 6669 unique) with and 0 without IMDB title ID in free-movies-imdb-c-expired-us.json
68 137 entries ( 0 unique) with and 0 without IMDB title ID in free-movies-imdb-externlist.json
69 1205 entries ( 57 unique) with and 0 without IMDB title ID in free-movies-imdb-pd.json
70 84 entries ( 20 unique) with and 167 without IMDB title ID in free-movies-infodigi-pd.json
71 158 entries ( 135 unique) with and 0 without IMDB title ID in free-movies-letterboxd-looney-tunes.json
72 113 entries ( 4 unique) with and 0 without IMDB title ID in free-movies-letterboxd-pd.json
73 182 entries ( 100 unique) with and 0 without IMDB title ID in free-movies-letterboxd-silent.json
74 229 entries ( 87 unique) with and 1 without IMDB title ID in free-movies-manual.json
75 44 entries ( 2 unique) with and 64 without IMDB title ID in free-movies-openflix.json
76 291 entries ( 33 unique) with and 474 without IMDB title ID in free-movies-profilms-pd.json
77 211 entries ( 7 unique) with and 0 without IMDB title ID in free-movies-publicdomainmovies-info.json
78 1232 entries ( 57 unique) with and 1875 without IMDB title ID in free-movies-publicdomainmovies-net.json
79 46 entries ( 13 unique) with and 81 without IMDB title ID in free-movies-publicdomainreview.json
80 698 entries ( 64 unique) with and 118 without IMDB title ID in free-movies-publicdomaintorrents.json
81 1758 entries ( 882 unique) with and 3786 without IMDB title ID in free-movies-retrofilmvault.json
82 16 entries ( 0 unique) with and 0 without IMDB title ID in free-movies-thehillproductions.json
83 63 entries ( 16 unique) with and 141 without IMDB title ID in free-movies-vodo.json
84 11583 unique IMDB title IDs in total, 8724 only in one list, 24647 without IMDB title ID
85 &lt;/pre&gt;&lt;/p&gt;
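&lt;p&gt;A minimal sketch of how the per-list and total figures in a summary like the one above could be computed. It assumes each source list is already loaded as a list of entries where an optional &#39;imdb&#39; key holds the title ID. That layout and the function name are assumptions made for illustration, not the format used by the scripts in the repository, and the first figure here counts distinct title IDs per list.&lt;/p&gt;

```python
from collections import Counter


def summarize(sources):
    """sources maps a list name to a list of entries (dicts); an entry
    may carry an 'imdb' key holding the IMDB title ID (assumed layout)."""
    # Collect the set of distinct title IDs found in each list.
    ids_per_source = {
        name: {e["imdb"] for e in entries if e.get("imdb")}
        for name, entries in sources.items()
    }
    # Count in how many lists each title ID shows up.
    seen_in = Counter(i for ids in ids_per_source.values() for i in ids)
    rows = {}
    for name, entries in sources.items():
        ids = ids_per_source[name]
        # "Unique" means the title ID appears in this list and no other.
        unique = sum(1 for i in ids if seen_in[i] == 1)
        without = sum(1 for e in entries if not e.get("imdb"))
        rows[name] = (len(ids), unique, without)
    total = len(seen_in)
    only_one = sum(1 for n in seen_in.values() if n == 1)
    return rows, total, only_one
```

&lt;p&gt;Each row holds the number of title IDs in the list, how many of them appear in no other list, and the number of entries without a title ID.&lt;/p&gt;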
86
87 &lt;p&gt; I keep finding more data sources. I found the cinemovies source
88 just a few days ago, and as you can see from the summary, it extended
89 my list with 63 movies. Check out the mklist-* scripts in the git
90 repository if you are curious how the lists are created. Many of the
91 titles are extracted using searches on IMDB, where I look for the
92 title and year, and accept search results with only one movie listed
93 if the year matches. This allows me to automatically use many lists of
94 movies without IMDB title ID references at the cost of increasing the
95 risk of wrongly identifying an IMDB title ID as public domain. So far my
96 random manual checks have indicated that the method is solid, but I
97 really wish all lists of public domain movies would include a unique
98 movie identifier like the IMDB title ID. It would make the job of
99 counting movies in the public domain a lot easier.&lt;/p&gt;
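&lt;p&gt;The acceptance rule described above can be sketched roughly like this. The shape of the search result (a list of title ID and year pairs) and the function name are assumptions made for illustration. The actual lookup lives in the mklist-* scripts and may differ.&lt;/p&gt;

```python
def accept_match(wanted_year, results):
    """Return the title ID if the search gave exactly one movie and its
    year matches, otherwise None.

    results is assumed to be a list of (title_id, year) pairs.
    """
    # More than one hit (or none) is ambiguous, so nothing is accepted.
    if len(results) != 1:
        return None
    title_id, year = results[0]
    # A single hit is only trusted when the year agrees with the list entry.
    return title_id if year == wanted_year else None
```

&lt;p&gt;Rejecting every ambiguous search keeps the risk of false matches down, at the cost of leaving more entries without a title ID.&lt;/p&gt;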
100
101 &lt;p&gt;As usual, if you use Bitcoin and want to show your support of my
102 activities, please send Bitcoin donations to my address
103 &lt;b&gt;&lt;a href=&quot;bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&quot;&gt;15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&lt;/a&gt;&lt;/b&gt;.&lt;/p&gt;
104 </description>
105 </item>
106
107 <item>
108 <title>Comments on «Evaluation of (il)legality» for Popcorn Time</title>
109 <link>http://people.skolelinux.org/pere/blog/Kommentarer_til__Evaluation_of__il_legality__for_Popcorn_Time.html</link>
110 <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Kommentarer_til__Evaluation_of__il_legality__for_Popcorn_Time.html</guid>
111 <pubDate>Wed, 20 Dec 2017 11:40:00 +0100</pubDate>
112 <description>&lt;p&gt;Yesterday I was in Follo District Court as an expert witness and
113 presented my investigations into
114 &lt;a href=&quot;https://github.com/petterreinholdtsen/public-domain-free-imdb&quot;&gt;counting
115 film works in the public domain&lt;/a&gt;, related to
116 &lt;a href=&quot;https://www.nuug.no/&quot;&gt;the association NUUG&lt;/a&gt;&#39;s involvement in
117 &lt;a href=&quot;https://www.nuug.no/news/tags/dns-domenebeslag/&quot;&gt;the case about
118 Økokrim&#39;s seizure and later confiscation of the DNS domain
119 popcorn-time.no&lt;/a&gt;. I talked about several things, but mostly about my
120 assessment of how the film industry has measured how illegal Popcorn
121 Time is. As far as I can see, the film industry&#39;s measurement has been
122 passed on unchanged by the Norwegian police, and the courts have relied on
123 the measurement when assessing Popcorn Time both in Norway and abroad
124 (the 99% figure is also cited in foreign court rulings).&lt;/p&gt;
125
126 &lt;p&gt;Ahead of my testimony I wrote a memo, mostly for myself, with the
127 points I wanted to get across. Here is a copy of the memo I wrote and
128 gave to the prosecution. Oddly enough, the judges did not want the
129 memo, so if I understood the court process correctly, only the
130 histogram graph was entered into the case documentation. The judges
131 were apparently only interested in what I said in court, not in
132 what I had written beforehand. In any case, I assume others besides
133 me may find the text useful, so I am publishing it here.
134 I have attached a transcript of document 09,13, which is the central
135 document I comment on.&lt;/p&gt;
136
137 &lt;p&gt;&lt;strong&gt;Comments on «Evaluation of (il)legality» for Popcorn
138 Time&lt;/strong&gt;&lt;/p&gt;
139
140 &lt;p&gt;&lt;strong&gt;Summary&lt;/strong&gt;&lt;/p&gt;
141
142 &lt;p&gt;The measurement method Økokrim relies on when claiming that 99% of
143 the movies available from Popcorn Time are shared illegally has
144 weaknesses.&lt;/p&gt;
145
146 &lt;p&gt;Whoever assessed whether movies can be legally shared has
147 not succeeded in identifying movies that can be shared legally, and has
148 apparently assumed that only very old movies can be shared legally.
149 Økokrim assumes that there is only one movie, the Charlie
150 Chaplin movie «The Circus» from 1928, that can be shared freely among those
151 observed as available via various Popcorn Time variants.
152 I find three more among the observed movies: «The Brain That
153 Wouldn&#39;t Die» from 1962, «God’s Little Acre» from 1958 and «She Wore a
154 Yellow Ribbon» from 1949. There may well be more. There
155 are thus at least four times as many movies that can be legally shared
156 on the Internet in the data set Økokrim relies on when claiming
157 that less than 1% can be shared legally.&lt;/p&gt;
158
159 &lt;p&gt;Next, the selection made by searching for random words taken from
160 the Dale-Chall word list deviates from the year distribution of the
161 movie catalogues as a whole, which affects the split between
162 movies that can be legally shared and movies that cannot. In
163 addition, picking the top part (the first five) of the search results
164 gives a deviation from the correct year distribution, which affects the share of works
165 in the public domain in the search result.&lt;/p&gt;
166
167 &lt;p&gt;What is measured is not the (il)legality of the use of Popcorn
168 Time, but the (il)legality of the content of BitTorrent movie catalogues
169 that are maintained independently of Popcorn Time.&lt;/p&gt;
170
171 &lt;p&gt;Documents discussed: 09,12, &lt;a href=&quot;#dok-09-13&quot;&gt;09,13&lt;/a&gt;, 09,14,
172 09,18, 09,19, 09,20.&lt;/p&gt;
173
174 &lt;p&gt;&lt;strong&gt;Detailed comments&lt;/strong&gt;&lt;/p&gt;
175
176 &lt;p&gt;Økokrim has explained to the courts that at least 99% of everything
177 available from various Popcorn Time variants is shared illegally on the
178 Internet. I became curious about how they arrived at this
179 number, and this memo is a collection of comments on the measurement
180 Økokrim refers to. Part of the reason I chose to look at
181 the case is that I am interested in identifying and counting how many
182 artistic works have fallen into the public domain or for other reasons can be
183 legally shared on the Internet, and was therefore interested in how they
184 had found the one percent that may be shared legally.&lt;/p&gt;
185
186 &lt;p&gt;The 99% share comes from an uncredited and undated memo that
187 aims to document a method for measuring how (il)legal various
188 Popcorn Time variants are.&lt;/p&gt;
189
190 &lt;p&gt;Briefly summarised, the method document says that because
191 it is not possible to get a complete list of all movie titles
192 available via Popcorn Time, something meant to be a
193 representative sample is made by picking 50 search words longer than three characters from
194 the word list known as Dale-Chall. For each search word a search is made and
195 the first five movies in the search result are collected until 100 unique
196 movie titles are found. If 50 search words were not enough to reach
197 100 unique movie titles, more movies from each search result were added.
198 If this was not enough either, more randomly chosen search words were
199 drawn and searched for until 100 unique
200 movie titles had been identified.&lt;/p&gt;
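&lt;p&gt;A minimal sketch of the first two steps of the sampling procedure summarised above, assuming a hypothetical search function that returns an ordered list of titles for a keyword. The function and parameter names are invented for illustration, and the fallback steps (taking more than five results per keyword, drawing extra keywords) are left out.&lt;/p&gt;

```python
import random


def sample_titles(words, search, keywords=50, per_query=5, target=100, seed=None):
    """Collect up to `target` unique titles from keyword searches.

    `search` is a hypothetical callable returning an ordered list of
    titles for a keyword; it stands in for a catalogue query.
    """
    rng = random.Random(seed)
    # Keep only words longer than three characters, then draw keywords at random.
    candidates = [w for w in words if len(w) > 3]
    chosen = rng.sample(candidates, min(keywords, len(candidates)))
    titles = []
    seen = set()
    for word in chosen:
        # Take the first `per_query` hits for this keyword, skipping duplicates.
        for title in search(word)[:per_query]:
            if title in seen:
                continue
            seen.add(title)
            titles.append(title)
            if len(titles) == target:
                return titles
    return titles
```

&lt;p&gt;Because only the first few hits of each search are kept, the sample depends on how the catalogue orders its search results.&lt;/p&gt;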
201
202 &lt;p&gt;Then, for each movie title, it was «verified whether or not there is a
203 reasonable expectation that the work is copyrighted by checking if they are
204 available on IMDb, also verifying the director, the year when the title was
205 released, the release date for a certain market, the production
206 company/ies of the title and the distribution company/ies» (quoted from
207 document 09,13).&lt;/p&gt;
208
209 &lt;p&gt;The method is reproduced both in the uncredited documents 09,13 and
210 09,19, and described from page 47 of document 09,20, slides dated
211 2017-02-01. The latter is credited to Geerart Bourlon from the Motion
212 Picture Association EMEA. The method appears to have several weaknesses that
213 skew the results. It starts by stating that it is not
214 possible to extract a complete list of all movie titles that are
215 available, and that this is the reason for the choice of method. This
216 premise is not consistent with what is stated in document 09,12, which
217 also lacks a named author and a date. Document 09,12 describes
218 how the entire catalogue content was downloaded and counted. Document
219 09,12 is possibly the same report that was referred to in the ruling from Oslo
220 District Court of 2017-11-03
221 (&lt;a href=&quot;https://www.domstol.no/no/Enkelt-domstol/Oslo--tingrett/Nyheter/ma-sperre-for-popcorn-time/&quot;&gt;case
222 17-093347TVI-OTIR/05&lt;/a&gt;) as the report of 1 June 2017 by Alexander
223 Kind Petersen, but I have not compared the documents word for word
224 to verify this.&lt;/p&gt;
225
226 &lt;p&gt;IMDB is short for The Internet Movie Database, a
227 recognised commercial web service used actively by both
228 the film industry and others to keep track of which feature films (and
229 some other films) exist or are in production, along with
230 information about these films. The data quality is high, with few errors and
231 few missing films. IMDB does not show information about the
232 copyright status of a film on its info page. As
233 part of the IMDB service there are lists of films, made by
234 volunteers, that list what is assumed to be works in the public domain.&lt;/p&gt;
235
236 &lt;p&gt;There are several sources that can be used to find movies that are
237 in the public domain or have terms of use that make it
238 legal for everyone to share them on the Internet. Over the last few weeks I have
239 tried to collect and cross-link these lists in an attempt to count
240 the number of movies in the public domain. Starting from such lists (and
241 published movies in the case of the Internet Archive), I have so far
242 managed to identify more than 11,000 movies, mainly feature films.&lt;/p&gt;
243
244 &lt;p&gt;The vast majority of the entries are taken from IMDB itself, based on the
245 fact that all movies made in the USA before 1923 have fallen into the public domain.
246 The corresponding cut-off date for the United Kingdom is 1912-07-01, but this
247 accounts for only a very small share of the feature films in IMDB (19 in total).
248 Another large share comes from the Internet Archive, where I have
249 identified movies with a reference to IMDB. The Internet Archive, which
250 is based in the USA, has a
251 &lt;a href=&quot;https://archive.org/about/terms.php&quot;&gt;policy of only publishing
252 movies that are legal to distribute&lt;/a&gt;. During this work I have
253 come across several movies that have been removed from
254 the Internet Archive, which leads me to conclude that the people
255 controlling the Internet Archive actively work to only
256 have legal content there, even though it is largely run by
257 volunteers. Another large list of movies comes from the
258 commercial company Retro Film Vault, which sells public domain
259 movies to the TV and film industry. I have also made use of lists
260 of movies claimed to be in the public domain, namely Public
261 Domain Review, Public Domain Torrents and Public Domain Movies (.net
262 and .info), as well as lists of movies with Creative Commons licensing
263 from Wikipedia, VODO and The Hill Productions. I have done some
264 spot checks by assessing movies that are mentioned on only one list. Where
265 I found errors that made me doubt the judgement of those
266 who made the list, I discarded the list completely (this applies to
267 one list from IMDB).&lt;/p&gt;
268
269 &lt;p&gt;By starting from works that can be assumed to be legally shared on
270 the Internet (from, among others, the Internet Archive, Public Domain
271 Torrents, Public Domain Review and Public Domain Movies), and linking
272 them to entries in IMDB, I have so far managed to identify
273 more than 11,000 movies (mainly feature films) that there is reason to believe
274 can be legally distributed by anyone on the Internet. As additional sources,
275 lists of movies assumed or claimed to be in the public domain are used.
276 These sources come from communities working to make available
277 to the public all works that have fallen into the public domain or have
278 terms of use that permit sharing.&lt;/p&gt;
279
280 &lt;p&gt;In addition to the more than 11,000 movies where the IMDB title ID has been
281 identified, I have found more than 20,000 entries where I have not yet
282 had the capacity to track down the IMDB title ID. Some of
283 these are probably duplicates of the IMDB entries identified
284 so far, but hardly all of them. Retro Film Vault claims to have 44,000
285 public domain film works in its catalogue, so it is possible that the real
286 number is considerably higher than what I have managed to identify so
287 far. The conclusion is that 11,000 is a lower bound for how
288 many movies in IMDB can be legally shared on the Internet. According to &lt;a
289 href=&quot;http://www.imdb.com/stats&quot;&gt;statistics from IMDB&lt;/a&gt;, 4.6
290 million titles are registered, of which 3 million are TV series episodes.
291 I have not found out how they are distributed per year.&lt;/p&gt;
292
293 &lt;p&gt;If all the IMDB title IDs claimed to be legally
294 shareable on the Internet are grouped by year, the result is the following histogram:&lt;/p&gt;
295
296 &lt;p align=&quot;center&quot;&gt;&lt;img width=&quot;80%&quot; src=&quot;http://people.skolelinux.org/pere/blog/images/2017-12-20-histogram-year.png&quot;&gt;&lt;/p&gt;
297
298 &lt;p&gt;The histogram shows that the effect of missing registration
299 or renewal of registration is that many movies released in the USA before
300 1978 are in the public domain today. It also shows that there are several
301 movies released in recent years with terms of use that permit sharing,
302 possibly because of the rise of the
303 &lt;a href=&quot;https://creativecommons.org/&quot;&gt;Creative
304 Commons&lt;/a&gt; movement.&lt;/p&gt;
305
306 &lt;p&gt;For automated analysis of the catalogues I have written a small program
307 that connects to the BitTorrent catalogues used by various Popcorn
308 Time variants and downloads the complete list of movies in the
309 catalogues, which confirms that it is possible to fetch a complete
310 list of all available movie titles. I have looked at four
311 BitTorrent catalogues. The first is used by the client available from
312 www.popcorntime.sh and is named &#39;sh&#39; in this document. The
313 second is, according to document 09,12, used by the client available from
314 popcorntime.ag and popcorntime.sh and is named &#39;yts&#39; in this
315 document. The third is used by the web pages available from
316 popcorntime-online.tv and is named &#39;apidomain&#39; in this document.
317 The fourth is used by the client available from popcorn-time.to,
318 according to document 09,12, and is named &#39;ukrfnlge&#39; in this
319 document.&lt;/p&gt;
320
321 &lt;p&gt;The method Økokrim relies on states in its point four that
322 judgement is a suitable way to find out whether a movie can be legally shared
323 on the Internet or not, and says that it was «verified whether or not
324 there is a reasonable expectation that the work is copyrighted». First,
325 it is not enough to establish whether a movie is
326 «copyrighted» to know whether it is legal to share it on the Internet or
327 not, as there are several movies with copyright licence terms
328 that permit sharing on the Internet. Examples are Creative
329 Commons-licensed movies such as Citizenfour from 2014 and Sintel from
330 2010. In addition to these, there are several movies that are now
331 in the public domain because of missing registration
332 or renewal of registration, even though the director, the
333 production company and the distributor all want protection. Examples
334 are Plan 9 from Outer Space from 1959 and Night of the Living
335 Dead from 1968. All movies from the USA that were in the public domain before
336 1989-03-01 remained in the public domain, as the Berne Convention, which took effect in
337 the USA on that date, was not given retroactive effect. If
338 there is anything
339 &lt;a href=&quot;http://www.latimes.com/local/lanow/la-me-ln-happy-birthday-song-lawsuit-decision-20150922-story.html&quot;&gt;the story
340 of the song «Happy birthday»&lt;/a&gt; tells us, where payment for use
341 was collected for decades even though the song was not actually
342 protected by copyright law, it is that each individual work must be assessed
343 carefully and in detail before one can establish whether the work is in the public domain or
344 not; it is not enough to trust self-declared rights holders. More
345 examples of works in the public domain that are misclassified as protected come from
346 document 09,18, which lists search results for the client referred to
347 as popcorntime.sh and which, according to the memo, contains only one movie (The
348 Circus from 1928) that can, with some doubt, be assumed to be in the public domain.&lt;/p&gt;
349
350 &lt;p&gt;On a quick read-through of document 09,18, which contains
351 screenshots from use of a Popcorn Time variant, I found mentions of
352 both the movie «The Brain That Wouldn&#39;t Die» from 1962, which is
353 &lt;a href=&quot;https://archive.org/details/brain_that_wouldnt_die&quot;&gt;available
354 from the Internet Archive&lt;/a&gt; and which
355 &lt;a href=&quot;https://en.wikipedia.org/wiki/List_of_films_in_the_public_domain_in_the_United_States&quot;&gt;according
356 to Wikipedia is in the public domain in the USA&lt;/a&gt; as it was released in
357 1962 without a copyright notice, and the movie «God’s Little Acre» from
358 1958, &lt;a href=&quot;https://en.wikipedia.org/wiki/God%27s_Little_Acre_%28film%29&quot;&gt;which
359 is published on Wikipedia&lt;/a&gt;, where it is stated that the
360 black-and-white version is in the public domain. Document
361 09,18 does not show whether the movie mentioned there is the black-and-white version. For
362 capacity reasons, and because the movie overview in document 09,18
363 is not machine-readable, I have not tried to check all the movies
364 listed there against the list of movies assumed to be legal to
365 distribute on the Internet.&lt;/p&gt;
366
367 &lt;p&gt;In an automated review of the list of IMDB references under
368 the spreadsheet tab &#39;Unique titles&#39; in document 09,14, I additionally found
369 the movie «She Wore a Yellow Ribbon» from 1949, which is probably also
370 misclassified. The movie «She Wore a Yellow Ribbon» is available
371 from the Internet Archive and marked as public domain there. There thus
372 appear to be at least four times as many movies that can be legally shared
373 on the Internet as assumed when claiming that at least
374 99% of the content is illegal. I do not rule out that closer
375 examination could uncover more. The point, in any case, is that the method&#39;s
376 item about a «reasonable expectation that the work is copyrighted»
377 makes the method unreliable.&lt;/p&gt;
378
379 &lt;p&gt;The measurement method in question picks random search terms from
380 the Dale-Chall word list. That word list contains 3000 simple English
381 words that fourth graders in the USA are expected to understand. It is not stated
382 why this particular word list was chosen, and it is unclear to me
383 whether it is suited to producing a representative sample of movies. Many
384 of the words give an empty search result. By simulating similar searches
385 I see large deviations from the distribution in the catalogue for individual measurements.
386 This suggests that individual measurements of 100 movies, done the way the
387 method describes, are not well suited to finding the share of illegal
388 content in the BitTorrent catalogues.&lt;/p&gt;
389
390 &lt;p&gt;This large deviation for individual measurements can be counteracted by doing
391 many searches and combining the results. I have tested this by
392 carrying out 100 individual measurements (that is, measuring 100 x 100 = 10,000
393 randomly chosen movies), which gives a smaller, but still significant,
394 deviation compared with counting movies per year in the whole catalogue.&lt;/p&gt;
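&lt;p&gt;One possible way to put a number on such a deviation is sketched below. The text does not say how the deviation was quantified, so the total variation distance used here and both function names are assumptions chosen for illustration. Both inputs are plain lists of release years and must be non-empty.&lt;/p&gt;

```python
from collections import Counter


def year_shares(years):
    """Turn a list of release years into the share of movies per year."""
    counts = Counter(years)
    total = sum(counts.values())
    return {year: n / total for year, n in counts.items()}


def distribution_gap(catalogue_years, sample_years):
    """Total variation distance between two year distributions.

    0.0 means the sample has the same year distribution as the
    catalogue, 1.0 means the two share no years at all.
    """
    full = year_shares(catalogue_years)
    part = year_shares(sample_years)
    all_years = set(full) | set(part)
    return 0.5 * sum(abs(full.get(y, 0.0) - part.get(y, 0.0)) for y in all_years)
```

&lt;p&gt;Pooling many 100-movie samples before calling distribution_gap corresponds to combining the results of several measurements.&lt;/p&gt;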
395
396 &lt;p&gt;The measurement method picks the top five entries in the search result.
397 The search results are sorted by the number of BitTorrent clients registered
398 as sharers in the catalogues, which can skew the sample towards the
399 movies that are popular among those using the BitTorrent catalogues,
400 without saying anything about which content is available
401 or which content is shared with Popcorn Time clients. I have
402 tried to measure how large such a skew might be by
403 comparing with the distribution obtained by taking the bottom 5 entries in the search result
404 instead. For some catalogues, the difference between these two methods is clearly
405 visible in the histograms. Here are histograms of movies found in the
406 complete catalogue (green line), and movies found by searching for
407 words in Dale-Chall. Graphs marked &#39;top&#39; take the first 5 entries in
408 the search result, while those marked &#39;bottom&#39; take the last 5. The
409 histograms show that the results are significantly affected by whether one looks at
410 the first or the last movies in a search result.&lt;/p&gt;
411
412 &lt;p align=&quot;center&quot;&gt;
413 &lt;img width=&quot;40%&quot; src=&quot;http://people.skolelinux.org/pere/blog/images/2017-12-20-histogram-year-sh-top.png&quot;/&gt;
414 &lt;img width=&quot;40%&quot; src=&quot;http://people.skolelinux.org/pere/blog/images/2017-12-20-histogram-year-sh-bottom.png&quot;/&gt;
415 &lt;br&gt;
416 &lt;img width=&quot;40%&quot; src=&quot;http://people.skolelinux.org/pere/blog/images/2017-12-20-histogram-year-yts-top.png&quot;/&gt;
417 &lt;img width=&quot;40%&quot; src=&quot;http://people.skolelinux.org/pere/blog/images/2017-12-20-histogram-year-yts-bottom.png&quot;/&gt;
418 &lt;br&gt;
419 &lt;img width=&quot;40%&quot; src=&quot;http://people.skolelinux.org/pere/blog/images/2017-12-20-histogram-year-ukrfnlge-top.png&quot;/&gt;
420 &lt;img width=&quot;40%&quot; src=&quot;http://people.skolelinux.org/pere/blog/images/2017-12-20-histogram-year-ukrfnlge-bottom.png&quot;/&gt;
421 &lt;br&gt;
422 &lt;img width=&quot;40%&quot; src=&quot;http://people.skolelinux.org/pere/blog/images/2017-12-20-histogram-year-apidomain-top.png&quot;/&gt;
423 &lt;img width=&quot;40%&quot; src=&quot;http://people.skolelinux.org/pere/blog/images/2017-12-20-histogram-year-apidomain-bottom.png&quot;/&gt;
424 &lt;/p&gt;
425
426 &lt;p&gt;It is worth noting that the BitTorrent catalogues in question were not
427 made for use with Popcorn Time. For example, the YTS catalogue,
428 which is used by the client downloaded from popcorntime.sh, belongs to
429 an independent file sharing related web site, YTS.AG, with a separate
430 user community. The measurement method proposed by Økokrim thus does not measure
431 the (il)legality of the use of Popcorn Time, but the (il)legality of
432 the content of these catalogues.&lt;/p&gt;
433
434 &lt;hr&gt;
435
436 &lt;p id=&quot;dok-09-13&quot;&gt;The method from Økokrim&#39;s document 09,13 in the criminal case
437 about the DNS seizure.&lt;/p&gt;
438
439 &lt;p&gt;&lt;strong&gt;1. Evaluation of (il)legality&lt;/strong&gt;&lt;/p&gt;
440
441 &lt;p&gt;&lt;strong&gt;1.1. Methodology&lt;/strong&gt;&lt;/p&gt;
442
443 &lt;p&gt;Due to its technical configuration, Popcorn Time applications don&#39;t
444 allow to make a full list of all titles made available. In order to
445 evaluate the level of illegal operation of PCT, the following
446 methodology was applied:&lt;/p&gt;
447
448 &lt;ol&gt;
449
450 &lt;li&gt;A random selection of 50 keywords, greater than 3 letters, was
451 made from the Dale-Chall list that contains 3000 simple English
452 words1. The selection was made by using a Random Number
453 Generator2.&lt;/li&gt;
454
455 &lt;li&gt;For each keyword, starting with the first randomly selected
456 keyword, a search query was conducted in the movie section of the
457 respective Popcorn Time application. For each keyword, the first
458 five results were added to the title list until the number of 100
459 unique titles was reached (duplicates were removed).&lt;/li&gt;
460
461 &lt;li&gt;For one fork, .CH, insufficient titles were generated via this
462 approach to reach 100 titles. This was solved by adding any
463 additional query results above five for each of the 50 keywords.
464 Since this still was not enough, another 42 random keywords were
465 selected to finally reach 100 titles.&lt;/li&gt;
466
467 &lt;li&gt;It was verified whether or not there is a reasonable expectation
468 that the work is copyrighted by checking if they are available on
469 IMDb, also verifying the director, the year when the title was
470 released, the release date for a certain market, the production
471 company/ies of the title and the distribution company/ies.&lt;/li&gt;
472
473 &lt;/ol&gt;
474
475 &lt;p&gt;&lt;strong&gt;1.2. Results&lt;/strong&gt;&lt;/p&gt;
476
477 &lt;p&gt;Between 6 and 9 June 2016, four forks of Popcorn Time were
478 investigated: popcorn-time.to, popcorntime.ag, popcorntime.sh and
479 popcorntime.ch. An excel sheet with the results is included in
480 Appendix 1. Screenshots were secured in separate Appendixes for each
481 respective fork, see Appendix 2-5.&lt;/p&gt;
482
483 &lt;p&gt;For each fork, out of 100, de-duplicated titles it was possible to
484 retrieve data according to the parameters set out above that indicate
485 that the title is commercially available. Per fork, there was 1 title
486 that presumably falls within the public domain, i.e. the 1928 movie
487 &quot;The Circus&quot; by and with Charles Chaplin.&lt;/p&gt;
488
489 &lt;p&gt;Based on the above it is reasonable to assume that 99% of the movie
490 content of each fork is copyright protected and is made available
491 illegally.&lt;/p&gt;
492
493 &lt;p&gt;This exercise was not repeated for TV series, but considering that
494 besides production companies and distribution companies also
495 broadcasters may have relevant rights, it is reasonable to assume that
496 at least a similar level of infringement will be established.&lt;/p&gt;
497
498 &lt;p&gt;Based on the above it is reasonable to assume that 99% of all the
499 content of each fork is copyright protected and are made available
500 illegally.&lt;/p&gt;
501 </description>
502 </item>
503
504 <item>
505 <title>Cura, the nice 3D print slicer, is now in Debian Unstable</title>
506 <link>http://people.skolelinux.org/pere/blog/Cura__the_nice_3D_print_slicer__is_now_in_Debian_Unstable.html</link>
507 <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Cura__the_nice_3D_print_slicer__is_now_in_Debian_Unstable.html</guid>
508 <pubDate>Sun, 17 Dec 2017 07:00:00 +0100</pubDate>
509 <description>&lt;p&gt;After several months of working and waiting, I am happy to report
510 that the nice and user friendly 3D printer slicer software Cura just
511 entered Debian Unstable. It consists of six packages,
512 &lt;a href=&quot;https://tracker.debian.org/pkg/cura&quot;&gt;cura&lt;/a&gt;,
513 &lt;a href=&quot;https://tracker.debian.org/pkg/cura-engine&quot;&gt;cura-engine&lt;/a&gt;,
514 &lt;a href=&quot;https://tracker.debian.org/pkg/libarcus&quot;&gt;libarcus&lt;/a&gt;,
515 &lt;a href=&quot;https://tracker.debian.org/pkg/fdm-materials&quot;&gt;fdm-materials&lt;/a&gt;,
516 &lt;a href=&quot;https://tracker.debian.org/pkg/libsavitar&quot;&gt;libsavitar&lt;/a&gt; and
517 &lt;a href=&quot;https://tracker.debian.org/pkg/uranium&quot;&gt;uranium&lt;/a&gt;. The last
518 two, uranium and cura, entered Unstable yesterday. This should make
519 it easier for Debian users to print on at least the Ultimaker class of
520 3D printers. My nearest 3D printer is an Ultimaker 2+, so it will
521 make life easier for at least me. :)&lt;/p&gt;
522
523 &lt;p&gt;The work to make this happen was done by Gregor Riepl, and I was
524 happy to assist him in sponsoring the packages. With the introduction
525 of Cura, Debian is up to three 3D printer slicers at your service,
526 Cura, Slic3r and Slic3r Prusa. If you own or have access to a 3D
527 printer, give it a go. :)&lt;/p&gt;
528
529 &lt;p&gt;The 3D printer software is maintained by the 3D printer Debian
530 team, flocking together on the
531 &lt;a href=&quot;http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/3dprinter-general&quot;&gt;3dprinter-general&lt;/a&gt;
532 mailing list and the
533 &lt;a href=&quot;irc://irc.debian.org/#debian-3dprinting&quot;&gt;#debian-3dprinting&lt;/a&gt;
534 IRC channel.&lt;/p&gt;
535
536 &lt;p&gt;The next step for Cura in Debian is to update the cura package to
537 version 3.0.3 and then update the entire set of packages to version
538 3.1.0, which showed up in the last few days.&lt;/p&gt;
539 </description>
540 </item>
541
542 <item>
543 <title>Idea for finding all public domain movies in the USA</title>
544 <link>http://people.skolelinux.org/pere/blog/Idea_for_finding_all_public_domain_movies_in_the_USA.html</link>
545 <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Idea_for_finding_all_public_domain_movies_in_the_USA.html</guid>
546 <pubDate>Wed, 13 Dec 2017 10:15:00 +0100</pubDate>
547 <description>&lt;p&gt;While looking at
548 &lt;a href=&quot;http://onlinebooks.library.upenn.edu/cce/&quot;&gt;the scanned copies
549 for the copyright renewal entries for movies published in the USA&lt;/a&gt;,
550 an idea occurred to me. The number of renewals is so low per year that it
551 should be fairly quick to transcribe them all and add references to
552 the corresponding IMDB title ID. This would give the (presumably)
553 complete list of movies published 28 years earlier that did _not_
554 enter the public domain for the transcribed year. By fetching the
555 list of USA movies published 28 years earlier and subtracting the movies
556 with renewals, we should be left with movies registered in IMDB that
557 are now in the public domain. For the year 1955 (which is the one I
558 have looked at the most), the total number of pages to transcribe is
559 21. For the 28 years from 1950 to 1978, it should be in the range
560 500-600 pages. It is just a few days of work, and spread among a
561 small group of people it should be doable in a few weeks of spare
562 time.&lt;/p&gt;
563
564 &lt;p&gt;A typical copyright renewal entry looks like this (the first one
565 listed for 1955):&lt;/p&gt;
566
567 &lt;p&gt;&lt;blockquote&gt;
568 ADAM AND EVIL, a photoplay in seven reels by Metro-Goldwyn-Mayer
569 Distribution Corp. (c) 17Aug27; L24293. Loew&#39;s Incorporated (PWH);
570 10Jun55; R151558.
571 &lt;/blockquote&gt;&lt;/p&gt;
572
573 &lt;p&gt;The movie title as well as registration and renewal dates are easy
574 enough to locate by a program (split on first comma and look for
575 DDmmmYY). The rest of the text is not required to find the movie in
576 IMDB, but is useful to confirm the correct movie is found. I am not
577 quite sure what the L and R numbers mean, but suspect they are
578 reference numbers into the archive of the US Copyright Office.&lt;/p&gt;
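&lt;p&gt;The extraction described above can be sketched in a few lines of
Python. This is only a minimal sketch, not a finished tool: the
function name parse_renewal is made up for illustration, and it assumes
every entry has exactly one registration date and one renewal date, that
months use three letter English abbreviations, and that all two digit
years belong to the 1900s.&lt;/p&gt;

```python
import re
from datetime import date

# Dates in the catalog are written as DDmmmYY, for example 17Aug27.
DATE_RE = re.compile(r"\b(\d{1,2})([A-Z][a-z]{2})(\d{2})\b")
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def parse_renewal(entry):
    """Return title, registration date and renewal date of one entry."""
    # The title is everything before the first comma.
    title, rest = entry.split(",", 1)
    # Assumption: two digit years are all in the 1900s.
    dates = [
        date(1900 + int(year), MONTHS[month], int(day))
        for day, month, year in DATE_RE.findall(rest)
        if month in MONTHS
    ]
    if len(dates) != 2:
        raise ValueError("expected a registration and a renewal date")
    return {"title": title.strip(),
            "registered": dates[0],
            "renewed": dates[1]}


entry = ("ADAM AND EVIL, a photoplay in seven reels by Metro-Goldwyn-Mayer "
         "Distribution Corp. (c) 17Aug27; L24293. Loew's Incorporated (PWH); "
         "10Jun55; R151558.")
parsed = parse_renewal(entry)
print(parsed["title"], parsed["registered"], parsed["renewed"])
```

&lt;p&gt;Titles that contain a comma would be cut short by the split, so a
human still needs to confirm each transcribed entry.&lt;/p&gt;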
579
580 &lt;p&gt;Tracking down the equivalent IMDB title ID is probably going to be
581 a manual task, but given the year it is fairly easy to search for the
582 movie title using for example
583 &lt;a href=&quot;http://www.imdb.com/find?q=adam+and+evil+1927&amp;s=all&quot;&gt;http://www.imdb.com/find?q=adam+and+evil+1927&amp;s=all&lt;/a&gt;.
584 Using this search, I find that the equivalent IMDB title ID for the
585 first renewal entry from 1955 is
586 &lt;a href=&quot;http://www.imdb.com/title/tt0017588/&quot;&gt;http://www.imdb.com/title/tt0017588/&lt;/a&gt;.&lt;/p&gt;
587
588 &lt;p&gt;I suspect the best way to do this would be to make a specialised
589 web service to make it easy for contributors to transcribe and track
590 down IMDB title IDs. In the web service, once an entry is transcribed,
591 the title and year could be extracted from the text and a search in IMDB
592 conducted, letting the user pick the equivalent IMDB title ID right
593 away. By spreading out the work among volunteers, it would also be
594 possible to make at least two persons transcribe the same entries to
595 be able to discover any typos introduced. But I will need help to
596 make this happen, as I lack the spare time to do all of this on my
597 own. If you would like to help, please get in touch. Perhaps you can
598 draft a web service for crowd sourcing the task?&lt;/p&gt;
599
600 &lt;p&gt;Note, Project Gutenberg already has some
601 &lt;a href=&quot;http://www.gutenberg.org/ebooks/search/?query=copyright+office+renewals&quot;&gt;transcribed
602 copies of the US Copyright Office renewal protocols&lt;/a&gt;, but I have
603 not been able to find any film renewals there, so I suspect they only
604 have copies of renewals for written works. I have not been able to find
605 any transcribed versions of movie renewals so far. Perhaps they exist
606 somewhere?&lt;/p&gt;
607
608 &lt;p&gt;I would love to figure out methods for finding all the public
609 domain works in other countries too, but it is a lot harder. At least
610 for Norway and Great Britain, such work involves tracking down the
611 people involved in making the movie and figuring out when they died.
612 It is hard enough to figure out who was part of making a movie, but I
613 do not know how to automate such a procedure without a registry of every
614 person involved in making movies and their death year.&lt;/p&gt;
615
616 &lt;p&gt;As usual, if you use Bitcoin and want to show your support of my
617 activities, please send Bitcoin donations to my address
618 &lt;b&gt;&lt;a href=&quot;bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&quot;&gt;15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&lt;/a&gt;&lt;/b&gt;.&lt;/p&gt;
619 </description>
620 </item>
621
622 <item>
623 <title>Is the short movie «Empty Socks» from 1927 in the public domain or not?</title>
624 <link>http://people.skolelinux.org/pere/blog/Is_the_short_movie__Empty_Socks__from_1927_in_the_public_domain_or_not_.html</link>
625 <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Is_the_short_movie__Empty_Socks__from_1927_in_the_public_domain_or_not_.html</guid>
626 <pubDate>Tue, 5 Dec 2017 12:30:00 +0100</pubDate>
627 <description>&lt;p&gt;Three years ago, a presumed lost animation film,
628 &lt;a href=&quot;https://en.wikipedia.org/wiki/Empty_Socks&quot;&gt;Empty Socks from
629 1927&lt;/a&gt;, was discovered in the Norwegian National Library. At the
630 time it was discovered, it was generally assumed to be copyrighted by
631 The Walt Disney Company, and I blogged about
632 &lt;a href=&quot;http://people.skolelinux.org/pere/blog/Opphavsretts_status_for__Empty_Socks__fra_1927_.html&quot;&gt;my
633 reasoning to conclude&lt;/a&gt; that it would enter the Norwegian
634 equivalent of the public domain in 2053, based on my understanding of
635 Norwegian Copyright Law. But a few days ago, I came across
636 &lt;a href=&quot;http://www.toonzone.net/forums/threads/exposed-disneys-repurchase-of-oswald-the-rabbit-a-sham.4792291/&quot;&gt;a
637 blog post claiming the movie was already in the public domain&lt;/a&gt;, at
638 least in the USA. The reasoning is as follows: The film was released in
639 November or December 1927 (sources disagree), and presumably
640 registered its copyright that year. At that time, right holders of
641 movies registered by the copyright office received government
642 protection for their work for 28 years. After 28 years, the copyright
643 had to be renewed if they wanted the government to protect it further.
644 The blog post I found claims such renewal did not happen for this
645 movie, and thus it entered the public domain in 1956. Yet someone
646 claims the copyright was renewed and the movie is still copyright
647 protected. Can anyone help me to figure out which claim is correct?
648 I have not been able to find Empty Socks in Catalog of copyright
649 entries. Ser.3 pt.12-13 v.9-12 1955-1958 Motion Pictures
650 &lt;a href=&quot;http://onlinebooks.library.upenn.edu/cce/1955r.html#film&quot;&gt;available
651 from the University of Pennsylvania&lt;/a&gt;, neither in
652 &lt;a href=&quot;https://babel.hathitrust.org/cgi/pt?id=mdp.39015084451130;page=root;view=image;size=100;seq=83;num=45&quot;&gt;page
653 45 for the first half of 1955&lt;/a&gt;, nor in
654 &lt;a href=&quot;https://babel.hathitrust.org/cgi/pt?id=mdp.39015084451130;page=root;view=image;size=100;seq=175;num=119&quot;&gt;page
655 119 for the second half of 1955&lt;/a&gt;. It is of course possible that
656 the renewal entry was left out of the printed catalog by mistake. Is
657 there some way to rule out this possibility? Please help, and update
658 the Wikipedia page with your findings.&lt;/p&gt;
659
660 &lt;p&gt;As usual, if you use Bitcoin and want to show your support of my
661 activities, please send Bitcoin donations to my address
662 &lt;b&gt;&lt;a href=&quot;bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&quot;&gt;15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&lt;/a&gt;&lt;/b&gt;.&lt;/p&gt;
663 </description>
664 </item>
665
666 <item>
667 <title>Metadata proposal for movies on the Internet Archive</title>
668 <link>http://people.skolelinux.org/pere/blog/Metadata_proposal_for_movies_on_the_Internet_Archive.html</link>
669 <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Metadata_proposal_for_movies_on_the_Internet_Archive.html</guid>
670 <pubDate>Tue, 28 Nov 2017 12:00:00 +0100</pubDate>
671 <description>&lt;p&gt;It would be easier to locate the movie you want to watch in
672 &lt;a href=&quot;https://www.archive.org/&quot;&gt;the Internet Archive&lt;/a&gt;, if the
673 metadata about each movie was more complete and accurate. In the
674 archiving community, a well known saying states that good metadata is a
675 love letter to the future. The metadata in the Internet Archive could
676 use a face lift for the future to love us back. Here is a proposal
677 for a small improvement that would make the metadata more useful
678 today. I&#39;ve been unable to find any document describing the various
679 standard fields available when uploading videos to the archive, so
680 this proposal is based on my best guess and searching through several
681 of the existing movies.&lt;/p&gt;
682
683 &lt;p&gt;I have a few use cases in mind. First of all, I would like to be
684 able to count the number of distinct movies in the Internet Archive,
685 without duplicates. I would further like to identify the IMDB title
686 ID of the movies in the Internet Archive, to be able to look up an IMDB
687 title ID and know if I can fetch the video from there and share it
688 with my friends.&lt;/p&gt;
689
690 &lt;p&gt;Second, I would like the Butter data provider for the Internet
691 Archive
692 (&lt;a href=&quot;https://github.com/butterproviders/butter-provider-archive&quot;&gt;available
693 from github&lt;/a&gt;), to list as many of the good movies as possible. The
694 plugin currently does a search in the archive with the following
695 parameters:&lt;/p&gt;
696
697 &lt;p&gt;&lt;pre&gt;
698 collection:moviesandfilms
699 AND NOT collection:movie_trailers
700 AND -mediatype:collection
701 AND format:&quot;Archive BitTorrent&quot;
702 AND year
703 &lt;/pre&gt;&lt;/p&gt;
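&lt;p&gt;To try the same search by hand, the query above can be turned into
a URL. The sketch below assumes the search is sent to the Internet
Archive advancedsearch.php endpoint with the q, fl[], rows and output
parameters; I have not verified that the Butter plugin builds its
request exactly this way, and the function name search_url is made up
for illustration.&lt;/p&gt;

```python
from urllib.parse import urlencode

# The search terms, copied from the query shown above.
TERMS = [
    "collection:moviesandfilms",
    "NOT collection:movie_trailers",
    "-mediatype:collection",
    'format:"Archive BitTorrent"',
    "year",
]
QUERY = " AND ".join(TERMS)


def search_url(rows=50):
    """Build a search URL asking for identifier and year as JSON."""
    # Assumption: the endpoint and parameter names are as described
    # in the lead-in; adjust them if the archive API differs.
    params = [
        ("q", QUERY),
        ("fl[]", "identifier"),
        ("fl[]", "year"),
        ("rows", str(rows)),
        ("output", "json"),
    ]
    return "https://archive.org/advancedsearch.php?" + urlencode(params)


print(search_url())
```

&lt;p&gt;Dropping the last term, &#39;year&#39;, from the list should make the
entries without a year show up as well, which is a quick way to find
candidates for a metadata fix.&lt;/p&gt;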
704
705 &lt;p&gt;Most of the cool movies that fail to show up in Butter do so
706 because the &#39;year&#39; field is missing. The &#39;year&#39; field is populated by
707 the year part from the &#39;date&#39; field, and should be when the movie was
708 released (date or year). Two such examples are
709 &lt;a href=&quot;https://archive.org/details/SidneyOlcottsBen-hur1905&quot;&gt;Ben Hur
710 from 1905&lt;/a&gt; and
711 &lt;a href=&quot;https://archive.org/details/Caminandes2GranDillama&quot;&gt;Caminandes
712 2: Gran Dillama from 2013&lt;/a&gt;, where the year metadata field is
713 missing.&lt;/p&gt;
714
715 &lt;p&gt;So, my proposal is simply, for every movie in The Internet Archive
716 where an IMDB title ID exists, please fill in these metadata fields
717 (note, they can also be updated long after the video was uploaded, but
718 as far as I can tell, only by the uploader):&lt;/p&gt;
719
720 &lt;dl&gt;
721
722 &lt;dt&gt;mediatype&lt;/dt&gt;
723 &lt;dd&gt;Should be &#39;movie&#39; for movies.&lt;/dd&gt;
724
725 &lt;dt&gt;collection&lt;/dt&gt;
726 &lt;dd&gt;Should contain &#39;moviesandfilms&#39;.&lt;/dd&gt;
727
728 &lt;dt&gt;title&lt;/dt&gt;
729 &lt;dd&gt;The title of the movie, without the publication year.&lt;/dd&gt;
730
731 &lt;dt&gt;date&lt;/dt&gt;
732 &lt;dd&gt;The date or year the movie was released. This makes the movie show
733 up in Butter, makes it possible to know the age of the
734 movie, and is useful to figure out copyright status.&lt;/dd&gt;
735
736 &lt;dt&gt;director&lt;/dt&gt;
737 &lt;dd&gt;The director of the movie. This makes it easier to know if the
738 correct movie is found in movie databases.&lt;/dd&gt;
739
740 &lt;dt&gt;publisher&lt;/dt&gt;
741 &lt;dd&gt;The production company making the movie. Also useful for
742 identifying the correct movie.&lt;/dd&gt;
743
744 &lt;dt&gt;links&lt;/dt&gt;
745
746 &lt;dd&gt;Add a link to the IMDB title page, for example like this: &amp;lt;a
747 href=&quot;http://www.imdb.com/title/tt0028496/&quot;&amp;gt;Movie in
748 IMDB&amp;lt;/a&amp;gt;. This makes it easier to find duplicates and allows for
749 counting the number of unique movies in the Archive. Other external
750 references, like to TMDB, could be added like this too.&lt;/dd&gt;
751
752 &lt;/dl&gt;
753
754 &lt;p&gt;I did consider proposing a custom field for the IMDB title ID (for
755 example &#39;imdb_title_url&#39;, &#39;imdb_code&#39; or simply &#39;imdb&#39;), but suspect it
756 will be easier to simply place it in the links free text field.&lt;/p&gt;
757
758 &lt;p&gt;I created
759 &lt;a href=&quot;https://github.com/petterreinholdtsen/public-domain-free-imdb&quot;&gt;a
760 list of IMDB title IDs for several thousand movies in the Internet
761 Archive&lt;/a&gt;, but I also got a list of several thousand movies without
762 such an IMDB title ID (and quite a few duplicates). It would be great if
763 this data set could be integrated into the Internet Archive metadata
764 to be available for everyone in the future, but with the current
765 policy of leaving metadata editing to the uploaders, it will take a
766 while before this happens. If you have uploaded movies into the
767 Internet Archive, you can help. Please consider following my proposal
768 above for your movies, to ensure that each movie is properly
769 counted. :)&lt;/p&gt;
770
771 &lt;p&gt;The list is mostly generated using Wikidata, which based on
772 Wikipedia articles makes it possible to link between IMDB and movies in
773 the Internet Archive. But there are lots of movies without a
774 Wikipedia article, and some movies where only a collection page exists
775 (like for &lt;a href=&quot;https://en.wikipedia.org/wiki/Caminandes&quot;&gt;the
776 Caminandes example above&lt;/a&gt;, where there are three movies but only
777 one Wikidata entry).&lt;/p&gt;
778
779 &lt;p&gt;As usual, if you use Bitcoin and want to show your support of my
780 activities, please send Bitcoin donations to my address
781 &lt;b&gt;&lt;a href=&quot;bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&quot;&gt;15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&lt;/a&gt;&lt;/b&gt;.&lt;/p&gt;
782 </description>
783 </item>
784
785 <item>
786 <title>Legal to share more than 3000 movies listed on IMDB?</title>
787 <link>http://people.skolelinux.org/pere/blog/Legal_to_share_more_than_3000_movies_listed_on_IMDB_.html</link>
788 <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Legal_to_share_more_than_3000_movies_listed_on_IMDB_.html</guid>
789 <pubDate>Sat, 18 Nov 2017 21:20:00 +0100</pubDate>
790 <description>&lt;p&gt;A month ago, I blogged about my work to
791 &lt;a href=&quot;http://people.skolelinux.org/pere/blog/Locating_IMDB_IDs_of_movies_in_the_Internet_Archive_using_Wikidata.html&quot;&gt;automatically
792 check the copyright status of IMDB entries&lt;/a&gt;, and try to count the
793 number of movies listed in IMDB that are legal to distribute on the
794 Internet. I have continued to look for good data sources, and
795 identified a few more. The code used to extract information from
796 various data sources is available in
797 &lt;a href=&quot;https://github.com/petterreinholdtsen/public-domain-free-imdb&quot;&gt;a
798 git repository&lt;/a&gt;, currently available from github.&lt;/p&gt;
799
800 &lt;p&gt;So far I have identified 3186 unique IMDB title IDs. To gain
801 better understanding of the structure of the data set, I created a
802 histogram of the year associated with each movie (typically release
803 year). It is interesting to notice where the peaks and dips in the
804 graph are located. I wonder why they are placed there. I suspect
805 World War II caused the dip around 1940, but what caused the peak
806 around 2010?&lt;/p&gt;
807
808 &lt;p align=&quot;center&quot;&gt;&lt;img src=&quot;http://people.skolelinux.org/pere/blog/images/2017-11-18-verk-i-det-fri-filmer.png&quot; /&gt;&lt;/p&gt;
809
810 &lt;p&gt;I&#39;ve so far identified ten sources for IMDB title IDs for movies in
811 the public domain or with a free license. These are the statistics
812 reported when running &#39;make stats&#39; in the git repository:&lt;/p&gt;
813
814 &lt;pre&gt;
815 249 entries ( 6 unique) with and 288 without IMDB title ID in free-movies-archive-org-butter.json
816 2301 entries ( 540 unique) with and 0 without IMDB title ID in free-movies-archive-org-wikidata.json
817 830 entries ( 29 unique) with and 0 without IMDB title ID in free-movies-icheckmovies-archive-mochard.json
818 2109 entries ( 377 unique) with and 0 without IMDB title ID in free-movies-imdb-pd.json
819 291 entries ( 122 unique) with and 0 without IMDB title ID in free-movies-letterboxd-pd.json
820 144 entries ( 135 unique) with and 0 without IMDB title ID in free-movies-manual.json
821 350 entries ( 1 unique) with and 801 without IMDB title ID in free-movies-publicdomainmovies.json
822 4 entries ( 0 unique) with and 124 without IMDB title ID in free-movies-publicdomainreview.json
823 698 entries ( 119 unique) with and 118 without IMDB title ID in free-movies-publicdomaintorrents.json
824 8 entries ( 8 unique) with and 196 without IMDB title ID in free-movies-vodo.json
825 3186 unique IMDB title IDs in total
826 &lt;/pre&gt;
827
828 &lt;p&gt;The entries without IMDB title ID are candidates to increase the
829 data set, but might equally well be duplicates of entries already
830 listed with IMDB title ID in one of the other sources, or represent
831 movies that lack an IMDB title ID. I&#39;ve seen examples of all these
832 situations when peeking at the entries without IMDB title ID. Based
833 on these data sources, the lower bound for movies listed in IMDB that
834 are legal to distribute on the Internet is between 3186 and 4713.&lt;/p&gt;
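&lt;p&gt;The upper end of that range appears to be the 3186 unique IMDB
title IDs plus every entry listed without an IMDB title ID. A small
sketch of that arithmetic, with the counts copied from the &#39;make stats&#39;
output above (sources reporting 0 entries without ID are left out, and
the variable names are made up for illustration):&lt;/p&gt;

```python
# Entries without IMDB title ID per source, from the statistics above.
without_id = {
    "free-movies-archive-org-butter.json": 288,
    "free-movies-publicdomainmovies.json": 801,
    "free-movies-publicdomainreview.json": 124,
    "free-movies-publicdomaintorrents.json": 118,
    "free-movies-vodo.json": 196,
}
unique_with_id = 3186

# Best case: every entry without ID is a new movie listed in IMDB.
upper = unique_with_id + sum(without_id.values())
print(unique_with_id, upper)
```

&lt;p&gt;The true number is likely somewhere in between, as some of the
entries without ID are duplicates or lack an IMDB entry.&lt;/p&gt;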
835
836 &lt;p&gt;It would be great for improving the accuracy of this measurement,
837 if the various sources added IMDB title ID to their metadata. I have
838 tried to reach the people behind the various sources to ask if they
839 are interested in doing this, without any replies so far. Perhaps you
840 can help me get in touch with the people behind VODO, Public Domain
841 Torrents, Public Domain Movies and Public Domain Review to try to
842 convince them to add more metadata to their movie entries?&lt;/p&gt;
843
844 &lt;p&gt;Another way you could help is by adding pages to Wikipedia about
845 movies that are legal to distribute on the Internet. If such a page
846 exists and includes a link to both IMDB and The Internet Archive, the
847 script used to generate free-movies-archive-org-wikidata.json should
848 pick up the mapping as soon as Wikidata is updated.&lt;/p&gt;
849
850 &lt;p&gt;As usual, if you use Bitcoin and want to show your support of my
851 activities, please send Bitcoin donations to my address
852 &lt;b&gt;&lt;a href=&quot;bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&quot;&gt;15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&lt;/a&gt;&lt;/b&gt;.&lt;/p&gt;
853 </description>
854 </item>
855
856 <item>
857 <title>Some notes on fault tolerant storage systems</title>
858 <link>http://people.skolelinux.org/pere/blog/Some_notes_on_fault_tolerant_storage_systems.html</link>
859 <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Some_notes_on_fault_tolerant_storage_systems.html</guid>
860 <pubDate>Wed, 1 Nov 2017 15:35:00 +0100</pubDate>
861 <description>&lt;p&gt;If you care about how fault tolerant your storage is, you might
862 find these articles and papers interesting. They have shaped how I
863 think when designing a storage system.&lt;/p&gt;
864
865 &lt;ul&gt;
866
867 &lt;li&gt;USENIX ;login: &lt;a
868 href=&quot;https://www.usenix.org/publications/login/summer2017/ganesan&quot;&gt;Redundancy
869 Does Not Imply Fault Tolerance. Analysis of Distributed Storage
870 Reactions to Single Errors and Corruptions&lt;/a&gt; by Aishwarya Ganesan,
871 Ramnatthan Alagappan, Andrea C. Arpaci-Dusseau, and Remzi
872 H. Arpaci-Dusseau&lt;/li&gt;
873
874 &lt;li&gt;ZDNet
875 &lt;a href=&quot;http://www.zdnet.com/article/why-raid-5-stops-working-in-2009/&quot;&gt;Why
876 RAID 5 stops working in 2009&lt;/a&gt; by Robin Harris&lt;/li&gt;
877
878 &lt;li&gt;ZDNet
879 &lt;a href=&quot;http://www.zdnet.com/article/why-raid-6-stops-working-in-2019/&quot;&gt;Why
880 RAID 6 stops working in 2019&lt;/a&gt; by Robin Harris&lt;/li&gt;
881
882 &lt;li&gt;USENIX FAST&#39;07
883 &lt;a href=&quot;http://research.google.com/archive/disk_failures.pdf&quot;&gt;Failure
884 Trends in a Large Disk Drive Population&lt;/a&gt; by Eduardo Pinheiro,
885 Wolf-Dietrich Weber and Luiz André Barroso&lt;/li&gt;
886
887 &lt;li&gt;USENIX ;login: &lt;a
888 href=&quot;https://www.usenix.org/system/files/login/articles/hughes12-04.pdf&quot;&gt;Data
889 Integrity. Finding Truth in a World of Guesses and Lies&lt;/a&gt; by Doug
890 Hughes&lt;/li&gt;
891
892 &lt;li&gt;USENIX FAST&#39;08
893 &lt;a href=&quot;https://www.usenix.org/events/fast08/tech/full_papers/bairavasundaram/bairavasundaram_html/&quot;&gt;An
894 Analysis of Data Corruption in the Storage Stack&lt;/a&gt; by
895 L. N. Bairavasundaram, G. R. Goodson, B. Schroeder, A. C.
896 Arpaci-Dusseau, and R. H. Arpaci-Dusseau&lt;/li&gt;
897
898 &lt;li&gt;USENIX FAST&#39;07 &lt;a
899 href=&quot;https://www.usenix.org/legacy/events/fast07/tech/schroeder/schroeder_html/&quot;&gt;Disk
900 failures in the real world: what does an MTTF of 1,000,000 hours mean
901 to you?&lt;/a&gt; by B. Schroeder and G. A. Gibson.&lt;/li&gt;
902
903 &lt;li&gt;USENIX ;login: &lt;a
904 href=&quot;https://www.usenix.org/events/fast08/tech/full_papers/jiang/jiang_html/&quot;&gt;Are
905 Disks the Dominant Contributor for Storage Failures? A Comprehensive
906 Study of Storage Subsystem Failure Characteristics&lt;/a&gt; by Weihang
907 Jiang, Chongfeng Hu, Yuanyuan Zhou, and Arkady Kanevsky&lt;/li&gt;
908
909 &lt;li&gt;SIGMETRICS 2007
910 &lt;a href=&quot;http://research.cs.wisc.edu/adsl/Publications/latent-sigmetrics07.pdf&quot;&gt;An
911 analysis of latent sector errors in disk drives&lt;/a&gt; by
912 L. N. Bairavasundaram, G. R. Goodson, S. Pasupathy, and J. Schindler&lt;/li&gt;
913
914 &lt;/ul&gt;
915
916 &lt;p&gt;Several of these research papers are based on data collected from
917 hundreds of thousands or millions of disks, and their findings are eye
918 opening. The short story is simply: do not implicitly trust RAID or
919 redundant storage systems. Details matter. And unfortunately there
920 are few options on Linux addressing all the identified issues. Both
921 ZFS and Btrfs are doing a fairly good job, but have legal and
922 practical issues of their own. I wonder how cluster file systems like
923 Ceph do in this regard. After all, there is an old saying, you know
924 you have a distributed system when the crash of a computer you have
925 never heard of stops you from getting any work done. The same holds
926 true if fault tolerance does not work.&lt;/p&gt;
927
928 &lt;p&gt;Just remember, in the end, it does not matter how redundant, or how
929 fault tolerant your storage is, if you do not continuously monitor its
930 status to detect and replace failed disks.&lt;/p&gt;
931
932 &lt;p&gt;As usual, if you use Bitcoin and want to show your support of my
933 activities, please send Bitcoin donations to my address
934 &lt;b&gt;&lt;a href=&quot;bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&quot;&gt;15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&lt;/a&gt;&lt;/b&gt;.&lt;/p&gt;
935 </description>
936 </item>
937
938 <item>
939 <title>Web services for writing academic LaTeX papers as a team</title>
940 <link>http://people.skolelinux.org/pere/blog/Web_services_for_writing_academic_LaTeX_papers_as_a_team.html</link>
941 <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Web_services_for_writing_academic_LaTeX_papers_as_a_team.html</guid>
942 <pubDate>Tue, 31 Oct 2017 21:00:00 +0100</pubDate>
943 <description>&lt;p&gt;I was surprised today to learn that a friend in academia did not
944 know there are easily available web services for writing
945 LaTeX documents as a team. I thought it was common knowledge, but to
946 make sure at least my readers are aware of it, I would like to mention
947 these useful services for writing LaTeX documents. Some of them even
948 provide a WYSIWYG editor to ease writing even further.&lt;/p&gt;
949
950 &lt;p&gt;There are two commercial services available,
951 &lt;a href=&quot;https://sharelatex.com&quot;&gt;ShareLaTeX&lt;/a&gt; and
952 &lt;a href=&quot;https://overleaf.com&quot;&gt;Overleaf&lt;/a&gt;. They are very easy to
953 use. Just start a new document, select which publisher to write for
954 (i.e. which LaTeX style to use), and start writing. Note, these two
955 have announced their intention to join forces, so soon there will only be
956 one joint service. I&#39;ve used both for different documents, and they
957 work just fine.
958 &lt;a href=&quot;https://github.com/sharelatex/sharelatex&quot;&gt;ShareLaTeX is free
959 software&lt;/a&gt;, while Overleaf is not. According to &lt;a
960 href=&quot;https://www.overleaf.com/help/17-is-overleaf-open-source&quot;&gt;an
961 announcement from Overleaf&lt;/a&gt;, they plan to keep the ShareLaTeX code
962 base maintained as free software.&lt;/p&gt;
963
964 &lt;p&gt;But these two are not the only alternatives.
965 &lt;a href=&quot;https://app.fiduswriter.org/&quot;&gt;Fidus Writer&lt;/a&gt; is another free
966 software solution with &lt;a href=&quot;https://github.com/fiduswriter&quot;&gt;the
967 source available on github&lt;/a&gt;. I have not used it myself. Several
968 others can be found on the nice
969 &lt;a href=&quot;https://alternativeto.net/software/sharelatex/&quot;&gt;AlternativeTo
970 web service&lt;/a&gt;.&lt;/p&gt;
971
972 &lt;p&gt;If you like Google Docs or Etherpad, but would like to write
973 documents in LaTeX, you should check out these services. You can even
974 host your own, if you want to. :)&lt;/p&gt;
975
976 &lt;p&gt;As usual, if you use Bitcoin and want to show your support of my
977 activities, please send Bitcoin donations to my address
978 &lt;b&gt;&lt;a href=&quot;bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&quot;&gt;15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b&lt;/a&gt;&lt;/b&gt;.&lt;/p&gt;
979 </description>
980 </item>
981
982 </channel>
983 </rss>