<link>http://people.skolelinux.org/pere/blog/</link>
<atom:link href="http://people.skolelinux.org/pere/blog/index.rss" rel="self" type="application/rss+xml" />
+ <item>
+ <title>Legal to share more than 3000 movies listed on IMDB?</title>
+ <link>http://people.skolelinux.org/pere/blog/Legal_to_share_more_than_3000_movies_listed_on_IMDB_.html</link>
+ <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Legal_to_share_more_than_3000_movies_listed_on_IMDB_.html</guid>
+ <pubDate>Sat, 18 Nov 2017 21:20:00 +0100</pubDate>
+ <description><p>A month ago, I blogged about my work to automatically check the
+copyright status of IMDB entries, and try to count the number of
+movies listed in IMDB that are legal to distribute on the Internet.
+I have continued to look for good data sources, and identified a few
+more. The code used to extract information from various data sources
+is available in
+<a href="https://github.com/petterreinholdtsen/public-domain-free-imdb">a
+git repository</a>, currently hosted on GitHub.</p>
+
+<p>So far I have identified 3186 unique IMDB title IDs. To gain
+better understanding of the structure of the data set, I created a
+histogram of the year associated with each movie (typically release
+year). It is interesting to notice where the peaks and dips in the
+graph are located. I wonder why they are placed there. I suspect
+World War II caused the dip around 1940, but what caused the peak
+around 2010?</p>
+
+<p><img src="http://people.skolelinux.org/pere/blog/images/2017-11-18-verk-i-det-fri-filmer.png" /></p>
+
+<p>I've so far identified ten sources for IMDB title IDs for movies in
+the public domain or with a free license. These are the statistics
+reported when running 'make stats' in the git repository:</p>
+
+<pre>
+ 249 entries ( 6 unique) with and 288 without IMDB title ID in free-movies-archive-org-butter.json
+ 2301 entries ( 540 unique) with and 0 without IMDB title ID in free-movies-archive-org-wikidata.json
+ 830 entries ( 29 unique) with and 0 without IMDB title ID in free-movies-icheckmovies-archive-mochard.json
+ 2109 entries ( 377 unique) with and 0 without IMDB title ID in free-movies-imdb-pd.json
+ 291 entries ( 122 unique) with and 0 without IMDB title ID in free-movies-letterboxd-pd.json
+ 144 entries ( 135 unique) with and 0 without IMDB title ID in free-movies-manual.json
+ 350 entries ( 1 unique) with and 801 without IMDB title ID in free-movies-publicdomainmovies.json
+ 4 entries ( 0 unique) with and 124 without IMDB title ID in free-movies-publicdomainreview.json
+ 698 entries ( 119 unique) with and 118 without IMDB title ID in free-movies-publicdomaintorrents.json
+ 8 entries ( 8 unique) with and 196 without IMDB title ID in free-movies-vodo.json
+ 3186 unique IMDB title IDs in total
+</pre>
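+
+<p>As a rough illustration of how such per-source counts can be
+derived, here is a minimal sketch. It is not the code from the git
+repository. It assumes each source boils down to a set of IMDB title
+IDs, and that "unique" means IDs found in no other source. The source
+names and IDs below are made up for the example.</p>

```python
def source_stats(sources):
    """Return ({name: (with_id, unique)}, total_distinct_ids).

    `sources` maps a source name to a set of IMDB title IDs.
    Assumption: "unique" means the ID appears in no other source.
    """
    stats = {}
    for name, ids in sources.items():
        # All IDs seen in any *other* source.
        others = set().union(*(v for k, v in sources.items() if k != name))
        stats[name] = (len(ids), len(ids - others))
    total = len(set().union(*sources.values()))
    return stats, total


# Hypothetical example data, not taken from the real JSON files.
example = {
    "source-a": {"tt0000001", "tt0000002", "tt0000003"},
    "source-b": {"tt0000002", "tt0000004"},
    "source-c": {"tt0000002"},
}

stats, total = source_stats(example)
for name, (with_id, unique) in sorted(stats.items()):
    print(f"{with_id:5d} entries ({unique:4d} unique) with IMDB title ID in {name}")
print(f"{total:5d} unique IMDB title IDs in total")
```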
+
+<p>The entries without IMDB title ID are candidates to increase the
+data set, but might equally well be duplicates of entries already
+listed with IMDB title ID in one of the other sources, or represent
+movies that lack an IMDB title ID. I've seen examples of all these
+situations when peeking at the entries without IMDB title ID. Based
+on these data sources, the lower bound for movies listed in IMDB that
+are legal to distribute on the Internet is between 3186 and 4713.</p>
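+
+<p>The 4713 figure appears to follow from the 'make stats' output: it
+is the 3186 known IDs plus every entry without an IMDB title ID,
+assuming in the most optimistic case that each of those is a distinct
+movie listed in IMDB and not already counted. A quick check of the
+arithmetic, with shortened source names used as labels only:</p>

```python
# Entries without IMDB title ID per source, copied from the 'make stats' output.
# Sources reporting 0 such entries are left out.
without_id = {
    "archive-org-butter": 288,
    "publicdomainmovies": 801,
    "publicdomainreview": 124,
    "publicdomaintorrents": 118,
    "vodo": 196,
}

known_unique = 3186  # unique IMDB title IDs in total

# Optimistic case: every entry without an ID is a new, distinct IMDB movie.
upper = known_unique + sum(without_id.values())
print(known_unique, upper)
```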
+
+<p>It would greatly improve the accuracy of this measurement if the
+various sources added IMDB title IDs to their metadata. I have
+tried to reach the people behind the various sources to ask if they
+are interested in doing this, without any positive replies so far.
+Perhaps you can help me get in touch with the people behind VODO,
+Public Domain Torrents, Public Domain Movies and Public Domain Review
+to try to convince them to add more metadata to their movie entries?</p>
+
+<p>Another way you could help is by adding pages to Wikipedia about
+movies that are legal to distribute on the Internet. If such a page
+exists and includes a link to both IMDB and The Internet Archive, the
+script used to generate free-movies-archive-org-wikidata.json should
+pick up the mapping as soon as Wikidata is updated.</p>
+</description>
+ </item>
+
<item>
<title>Some notes on fault tolerant storage systems</title>
<link>http://people.skolelinux.org/pere/blog/Some_notes_on_fault_tolerant_storage_systems.html</link>
</description>
</item>
- <item>
- <title>Simpler recipe on how to make a simple $7 IMSI Catcher using Debian</title>
- <link>http://people.skolelinux.org/pere/blog/Simpler_recipe_on_how_to_make_a_simple__7_IMSI_Catcher_using_Debian.html</link>
- <guid isPermaLink="true">http://people.skolelinux.org/pere/blog/Simpler_recipe_on_how_to_make_a_simple__7_IMSI_Catcher_using_Debian.html</guid>
- <pubDate>Wed, 9 Aug 2017 23:59:00 +0200</pubDate>
- <description><p>On Friday, I came across an interesting article in the Norwegian
-web based ICT news magazine digi.no on
-<a href="https://www.digi.no/artikler/sikkerhetsforsker-lagde-enkel-imsi-catcher-for-60-kroner-na-kan-mobiler-kartlegges-av-alle/398588">how
-to collect the IMSI numbers of nearby cell phones</a> using the cheap
-DVB-T software defined radios. The article referred to instructions
-and <a href="https://www.youtube.com/watch?v=UjwgNd_as30">a recipe by
-Keld Norman on Youtube on how to make a simple $7 IMSI Catcher</a>, and I decided to test them out.</p>
-
-<p>The instructions said to use Ubuntu, install pip using apt (to
-bypass apt), use pip to install pybombs (to bypass both apt and pip),
-and then ask pybombs to fetch and build everything you need from
-scratch. I wanted to see if I could do the same on the most recent
-Debian packages, but this did not work because pybombs tried to build
-stuff that no longer builds with the most recent openssl library or
-some other version skew problem. While trying to get this recipe
-working, I learned that the apt->pip->pybombs route was a long detour,
-and the only piece of software dependency missing in Debian was the
-gr-gsm package. I also found out that the lead upstream developer of
-gr-gsm (the name stands for GNU Radio GSM) project already had a set of
-Debian packages provided in an Ubuntu PPA repository. All I needed to
-do was to dget the Debian source package and build it.</p>
-
-<p>The IMSI collector is a python script listening for packets on the
-loopback network device and printing to the terminal some specific GSM
-packets with IMSI numbers in them. The code is fairly short and easy
-to understand. This works because gr-gsm includes a tool
-to read GSM data from a software defined radio such as a DVB-T USB
-stick, decode it and inject it into a
-network device on your Linux machine (using the loopback device by
-default). This proved to work just fine, and I've been testing the
-collector for a few days now.</p>
-
-<p>The updated and simpler recipe is thus to</p>
-
-<ol>
-
-<li>start with a Debian machine running Stretch or newer,</li>
-
-<li>build and install the gr-gsm package available from
-<a href="http://ppa.launchpad.net/ptrkrysik/gr-gsm/ubuntu/pool/main/g/gr-gsm/">http://ppa.launchpad.net/ptrkrysik/gr-gsm/ubuntu/pool/main/g/gr-gsm/</a>,</li>
-
-<li>clone the git repository from <a href="https://github.com/Oros42/IMSI-catcher">https://github.com/Oros42/IMSI-catcher</a>,</li>
-
-<li>run grgsm_livemon and adjust the frequency until the terminal
-where it was started is filled with a stream of text (meaning you
-found a GSM station),</li>
-
-<li>go into the IMSI-catcher directory and run 'sudo python simple_IMSI-catcher.py' to extract the IMSI numbers.</li>
-
-</ol>
-
-<p>To make it even easier in the future to get this sniffer up and
-running, I decided to package
-<a href="https://github.com/ptrkrysik/gr-gsm/">the gr-gsm project</a>
-for Debian (<a href="https://bugs.debian.org/871055">WNPP
-#871055</a>), and the package was uploaded into the NEW queue today.
-Luckily the gnuradio maintainer has promised to help me, as I do not
-know much about gnuradio stuff yet.</p>
-
-<p>I doubt this "IMSI catcher" is anywhere near as powerful as
-commercial tools like
-<a href="https://www.thespyphone.com/portable-imsi-imei-catcher/">The
-Spy Phone Portable IMSI / IMEI Catcher</a> or the
-<a href="https://en.wikipedia.org/wiki/Stingray_phone_tracker">Harris
-Stingray</a>, but I hope the existence of cheap alternatives can make
-more people realise how easily their whereabouts can be tracked when
-carrying a cell phone. Seeing the data flow on the screen, realizing that
-I live close to a police station and knowing that the police also
-carry cell phones, I wonder how hard it would be for criminals to
-track the position of the police officers to discover when there are
-police nearby, or for foreign military forces to track the location
-of the Norwegian military forces, or for anyone to track the location
-of government officials...</p>
-
-<p>It is worth noting that the data reported by the IMSI-catcher
-script mentioned above is only a fraction of the data broadcast on
-the GSM network. It will only collect one frequency at a time,
-while a typical phone will be using several frequencies, and not all
-phones will be using the frequencies tracked by the grgsm_livemon
-program. Also, there is a lot of radio chatter being ignored by the
-simple_IMSI-catcher script, which would be collected by extending the
-parser code. I wonder if gr-gsm can be set up to listen to more than
-one frequency?</p>
-</description>
- </item>
-
</channel>
</rss>