X-Git-Url: https://pere.pagekite.me/gitweb/homepage.git/blobdiff_plain/ea70595b420dc8bda793ba9a383df1085a9e7c8e..b43b2a6bad499f33c732672e61cbbaea9dc87fad:/blog/index.html
diff --git a/blog/index.html b/blog/index.html
index 5ee011c39b..006bf5f6c3 100644
--- a/blog/index.html
+++ b/blog/index.html
@@ -19,6 +19,78 @@
+
+
Legal to share more than 3000 movies listed on IMDB?
+
18th November 2017
+

A month ago, I blogged about my work to automatically check the +copyright status of IMDB entries, and try to count the number of +movies listed in IMDB that are legal to distribute on the Internet. +I have continued to look for good data sources, and identified a few +more. The code used to extract information from various data sources +is available in +a +git repository, currently available from GitHub.

+ +

So far I have identified 3186 unique IMDB title IDs. To gain a +better understanding of the structure of the data set, I created a +histogram of the year associated with each movie (typically release +year). It is interesting to notice where the peaks and dips in the +graph are located. I wonder why they are placed there. I suspect +World War II caused the dip around 1940, but what caused the peak +around 2010?
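
A minimal sketch of how such a histogram could be built, assuming each entry has been reduced to a (title ID, year) pair. The sample data and the `year_histogram` helper are made up for illustration and are not taken from the git repository.

```python
from collections import Counter


def year_histogram(entries, bucket=10):
    """Count movies per bucket of years (decades by default)."""
    counts = Counter()
    for _title_id, year in entries:
        if year is None:
            continue  # skip entries without a known year
        counts[(year // bucket) * bucket] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample, not real data from any of the sources.
sample = [
    ("tt0000001", 1925),
    ("tt0000002", 1929),
    ("tt0000003", 1941),
    ("tt0000004", 2010),
    ("tt0000005", 2011),
    ("tt0000006", None),
]

for start, count in year_histogram(sample).items():
    print(f"{start}s {'#' * count}")
```

Grouping by decade smooths out single-year noise. Passing `bucket=1` gives a per-year count instead.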

+ +

+ +

I've so far identified ten sources for IMDB title IDs for movies in +the public domain or with a free license. These are the statistics +reported when running 'make stats' in the git repository:

+ +
+  249 entries (    6 unique) with and   288 without IMDB title ID in free-movies-archive-org-butter.json
+ 2301 entries (  540 unique) with and     0 without IMDB title ID in free-movies-archive-org-wikidata.json
+  830 entries (   29 unique) with and     0 without IMDB title ID in free-movies-icheckmovies-archive-mochard.json
+ 2109 entries (  377 unique) with and     0 without IMDB title ID in free-movies-imdb-pd.json
+  291 entries (  122 unique) with and     0 without IMDB title ID in free-movies-letterboxd-pd.json
+  144 entries (  135 unique) with and     0 without IMDB title ID in free-movies-manual.json
+  350 entries (    1 unique) with and   801 without IMDB title ID in free-movies-publicdomainmovies.json
+    4 entries (    0 unique) with and   124 without IMDB title ID in free-movies-publicdomainreview.json
+  698 entries (  119 unique) with and   118 without IMDB title ID in free-movies-publicdomaintorrents.json
+    8 entries (    8 unique) with and   196 without IMDB title ID in free-movies-vodo.json
+ 3186 unique IMDB title IDs in total
+
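
A rough sketch of how numbers like these could be computed. It assumes that "unique" means a title ID found in only one of the sources, and that each source has already been reduced to a list of title IDs with `None` for entries lacking one. Both are guesses on my part; the `source_stats` helper and the sample data are hypothetical and do not come from the repository.

```python
from collections import Counter


def source_stats(sources):
    """sources maps a file name to a list of IMDB title IDs (None if missing)."""
    # Count in how many sources each title ID shows up.
    seen_in = Counter()
    for ids in sources.values():
        seen_in.update({i for i in ids if i is not None})

    stats = {}
    for name, ids in sources.items():
        with_id = {i for i in ids if i is not None}  # duplicates within a source collapse
        without = sum(1 for i in ids if i is None)
        unique = sum(1 for i in with_id if seen_in[i] == 1)
        stats[name] = (len(with_id), unique, without)
    return stats, len(seen_in)


# Hypothetical sample data.
sources = {
    "a.json": ["tt1", "tt2", None],
    "b.json": ["tt2", "tt3"],
}

stats, total = source_stats(sources)
for name, (with_id, unique, without) in stats.items():
    print(f"{with_id:5d} entries ({unique:5d} unique) with and "
          f"{without:5d} without IMDB title ID in {name}")
print(f"{total:5d} unique IMDB title IDs in total")
```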
+ +

The entries without IMDB title ID are candidates to increase the +data set, but might equally well be duplicates of entries already +listed with IMDB title ID in one of the other sources, or represent +movies that lack an IMDB title ID. I've seen examples of all these +situations when peeking at the entries without IMDB title ID. Based +on these data sources, the lower bound for the number of movies listed in IMDB that +are legal to distribute on the Internet lies between 3186 and 4713. + +
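
The two ends of that range can be checked with a few lines of arithmetic. The low end assumes none of the entries without an IMDB title ID adds a new movie, and the high end assumes every one of them does. The counts are copied from the 'make stats' output; the variable names are mine.

```python
# Entries without IMDB title ID per source, from the 'make stats' output.
without_id = {
    "free-movies-archive-org-butter.json": 288,
    "free-movies-archive-org-wikidata.json": 0,
    "free-movies-icheckmovies-archive-mochard.json": 0,
    "free-movies-imdb-pd.json": 0,
    "free-movies-letterboxd-pd.json": 0,
    "free-movies-manual.json": 0,
    "free-movies-publicdomainmovies.json": 801,
    "free-movies-publicdomainreview.json": 124,
    "free-movies-publicdomaintorrents.json": 118,
    "free-movies-vodo.json": 196,
}
total_unique_ids = 3186

low = total_unique_ids                              # no unidentified entry is new
high = total_unique_ids + sum(without_id.values())  # every unidentified entry is new
print(low, high)  # 3186 4713
```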

It would greatly improve the accuracy of this measurement +if the various sources added IMDB title IDs to their metadata. I have +tried to reach the people behind the various sources to ask if they +are interested in doing this, without any positive replies so far. +Perhaps you can help me get in touch with the people behind VODO, +Public Domain Torrents, Public Domain Movies and Public Domain Review +to try to convince them to add more metadata to their movie entries?

+ +

Another way you could help is by adding pages to Wikipedia about +movies that are legal to distribute on the Internet. If such a page +exists and includes links to both IMDB and The Internet Archive, the +script used to generate free-movies-archive-org-wikidata.json should +pick up the mapping as soon as wikidata is updated.
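
To illustrate the kind of lookup involved, here is a sketch of a Wikidata SPARQL query for items carrying both an IMDb ID and an Internet Archive ID, together with a parser for the standard SPARQL JSON result format. I am assuming the property numbers P345 (IMDb ID) and P724 (Internet Archive ID). The `QUERY` string, the `parse_bindings` helper and the sample result are hypothetical and not taken from the script in the repository.

```python
QUERY = """
SELECT ?film ?imdb ?archive WHERE {
  ?film wdt:P345 ?imdb .      # IMDb ID (assumed property number)
  ?film wdt:P724 ?archive .   # Internet Archive ID (assumed property number)
}
"""


def parse_bindings(result):
    """Turn a SPARQL JSON result into a {imdb_id: archive_id} mapping."""
    mapping = {}
    for row in result["results"]["bindings"]:
        mapping[row["imdb"]["value"]] = row["archive"]["value"]
    return mapping


# Hypothetical response in the SPARQL 1.1 JSON results layout.
sample_result = {
    "results": {
        "bindings": [
            {
                "film": {"type": "uri", "value": "http://www.wikidata.org/entity/Q0"},
                "imdb": {"type": "literal", "value": "tt0000001"},
                "archive": {"type": "literal", "value": "example_archive_item"},
            }
        ]
    }
}

print(parse_bindings(sample_result))
```

The query would be sent to a SPARQL endpoint by whatever HTTP client is at hand. That part is left out here to keep the sketch free of network access.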

+
+
+ + + Tags: english, opphavsrett. + + +
+
+
+
Some notes on fault tolerant storage systems
1st November 2017
@@ -750,103 +822,6 @@ brukere med Ring, slik at jeg også bruker
-
-
Simpler recipe on how to make a simple $7 IMSI Catcher using Debian
-
9th August 2017
-

On Friday, I came across an interesting article in the Norwegian -web based ICT news magazine digi.no on -how -to collect the IMSI numbers of nearby cell phones using the cheap -DVB-T software defined radios. The article referred to instructions -and a recipe by -Keld Norman on Youtube on how to make a simple $7 IMSI Catcher, and I decided to test them out.

- -

The instructions said to use Ubuntu, install pip using apt (to -bypass apt), use pip to install pybombs (to bypass both apt and pip), -and then ask pybombs to fetch and build everything you need from -scratch. I wanted to see if I could do the same on the most recent -Debian packages, but this did not work because pybombs tried to build -stuff that no longer builds with the most recent openssl library or -some other version skew problem. While trying to get this recipe -working, I learned that the apt->pip->pybombs route was a long detour, -and the only piece of software dependency missing in Debian was the -gr-gsm package. I also found out that the lead upstream developer of -the gr-gsm (the name stands for GNU Radio GSM) project already had a set of -Debian packages provided in an Ubuntu PPA repository. All I needed to -do was to dget the Debian source package and build it.

- -

The IMSI collector is a python script listening for packets on the -loopback network device and printing to the terminal some specific GSM -packets with IMSI numbers in them. The code is fairly short and easy -to understand. The reason this works is that gr-gsm includes a tool -to read GSM data from a software defined radio like a DVB-T USB stick -and other software defined radios, decode them and inject them into a -network device on your Linux machine (using the loopback device by -default). This proved to work just fine, and I've been testing the -collector for a few days now.
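
To give an idea of what listening on the loopback device can look like, here is a small sketch that only reads the packet header. It assumes the decoded data arrives as GSMTAP packets over UDP port 4729, which is my understanding of how gr-gsm injects them and is not stated above. It is not the code from simple_IMSI-catcher.py, and the function names are made up.

```python
import socket
import struct

GSMTAP_PORT = 4729  # UDP port registered for GSMTAP (assumed transport)


def parse_gsmtap_header(packet):
    """Return basic GSMTAP header fields and the payload following the header."""
    version, hdr_words, ptype, timeslot, arfcn = struct.unpack("!BBBBH", packet[:6])
    header_len = hdr_words * 4  # header length is given in 32-bit words
    return {
        "version": version,
        "type": ptype,
        "timeslot": timeslot,
        "arfcn": arfcn & 0x3FFF,  # the two upper bits are flags
        "payload": packet[header_len:],
    }


def listen(host="127.0.0.1", port=GSMTAP_PORT):
    """Print the header fields of every packet received on the loopback device."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            packet, _addr = sock.recvfrom(4096)
            print(parse_gsmtap_header(packet))
```

Calling `listen()` while grgsm_livemon is running should print one line per received packet, if the assumptions above hold. Picking out specific GSM messages from the payload is left out of this sketch.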

- -

The updated and simpler recipe is thus to

- -
    - -
  1. start with a Debian machine running Stretch or newer,
  - -
  2. build and install the gr-gsm package available from -http://ppa.launchpad.net/ptrkrysik/gr-gsm/ubuntu/pool/main/g/gr-gsm/,
  - -
  3. clone the git repository from https://github.com/Oros42/IMSI-catcher,
  - -
  4. run grgsm_livemon and adjust the frequency until the terminal -where it was started is filled with a stream of text (meaning you -found a GSM station),
  - -
  5. go into the IMSI-catcher directory and run 'sudo python simple_IMSI-catcher.py' to extract the IMSI numbers.
  - -
- -

To make it even easier in the future to get this sniffer up and -running, I decided to package -the gr-gsm project -for Debian (WNPP -#871055), and the package was uploaded into the NEW queue today. -Luckily the gnuradio maintainer has promised to help me, as I do not -know much about gnuradio stuff yet.

- -

I doubt this "IMSI catcher" is anywhere near as powerful as -commercial tools like -The -Spy Phone Portable IMSI / IMEI Catcher or the -Harris -Stingray, but I hope the existence of cheap alternatives can make -more people realise how easily their whereabouts are tracked when -carrying a cell phone. Seeing the data flow on the screen, realizing that -I live close to a police station and knowing that the police also -carry cell phones, I wonder how hard it would be for criminals to -track the position of the police officers to discover when there are -police nearby, or for foreign military forces to track the location -of the Norwegian military forces, or for anyone to track the location -of government officials...

- -

It is worth noting that the data reported by the IMSI-catcher -script mentioned above is only a fraction of the data broadcast on -the GSM network. It will only collect one frequency at a time, -while a typical phone will be using several frequencies, and not all -phones will be using the frequencies tracked by the grgsm_livemon -program. Also, there is a lot of radio chatter being ignored by the -simple_IMSI-catcher script, which would be collected by extending the -parser code. I wonder if gr-gsm can be set up to listen to more than -one frequency?

-
-
- - - Tags: debian, english, personvern, surveillance. - - -
-
-
-
