From: Petter Reinholdtsen Date: Sat, 18 Nov 2017 20:20:11 +0000 (+0100) Subject: Generated. X-Git-Url: http://pere.pagekite.me/gitweb/homepage.git/commitdiff_plain/25481fa63e1db3a6b2d09facf37ab145e2db2250?ds=sidebyside Generated. --- diff --git a/blog/archive/2017/11/11.rss b/blog/archive/2017/11/11.rss index 40c3bb2762..6f80d4d271 100644 --- a/blog/archive/2017/11/11.rss +++ b/blog/archive/2017/11/11.rss @@ -13,11 +13,11 @@ Sat, 18 Nov 2017 21:20:00 +0100 <p>A month ago, I blogged about my work to automatically check the copyright status of IMDB entries, and try to count the number of -movies listed in IMDB where it is legal to distribute it the Internet. -I have continued to look for good data sources, and identified a few +movies listed in IMDB that is legal to distribute on the Internet. I +have continued to look for good data sources, and identified a few more. The code used to extract information from various data sources is available in -<ahref="https://github.com/petterreinholdtsen/public-domain-free-imdb">a +<a href="https://github.com/petterreinholdtsen/public-domain-free-imdb">a git repository</a>, currently available from github.</p> <p>So far I have identified 3186 unique IMDB title IDs. To gain diff --git a/blog/archive/2017/11/index.html b/blog/archive/2017/11/index.html index 5784377e73..860730285e 100644 --- a/blog/archive/2017/11/index.html +++ b/blog/archive/2017/11/index.html @@ -31,11 +31,11 @@

A month ago, I blogged about my work to automatically check the copyright status of IMDB entries, and try to count the number of -movies listed in IMDB where it is legal to distribute it the Internet. -I have continued to look for good data sources, and identified a few +movies listed in IMDB that is legal to distribute on the Internet. I +have continued to look for good data sources, and identified a few more. The code used to extract information from various data sources is available in -a +a git repository, currently available from github.

So far I have identified 3186 unique IMDB title IDs. To gain diff --git a/blog/data/2017-11-18-verk-i-det-fri-filmer.txt b/blog/data/2017-11-18-verk-i-det-fri-filmer.txt new file mode 100644 index 0000000000..3070e14601 --- /dev/null +++ b/blog/data/2017-11-18-verk-i-det-fri-filmer.txt @@ -0,0 +1,62 @@ +Title: Legal to share more than 3000 movies listed on IMDB? +Tags: english, opphavsrett +Date: 2017-11-18 21:20 + +

A month ago, I blogged about my work to automatically check the +copyright status of IMDB entries, and try to count the number of +movies listed in IMDB that are legal to distribute on the Internet. I +have continued to look for good data sources, and identified a few +more. The code used to extract information from various data sources +is available in +a +git repository, currently available from GitHub.

+ +

So far I have identified 3186 unique IMDB title IDs. To gain a +better understanding of the structure of the data set, I created a +histogram of the year associated with each movie (typically release +year). It is interesting to notice where the peaks and dips in the +graph are located. I wonder why they are placed there. I suspect +World War II caused the dip around 1940, but what caused the peak +around 2010?
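A minimal sketch of how such a histogram could be computed, assuming each entry is a small dictionary keyed by IMDB title ID with an optional integer "year" field. That layout, and the function names, are assumptions made for illustration; the JSON files in the repository may be organised differently.

```python
from collections import Counter


def year_histogram(entries):
    """Count movies per year; entries without an integer year are skipped."""
    years = (info.get("year") for info in entries.values())
    return Counter(year for year in years if isinstance(year, int))


def decade_histogram(hist):
    """Collapse a per-year histogram into decades so peaks and dips stand out."""
    decades = Counter()
    for year, count in hist.items():
        decades[year // 10 * 10] += count
    return decades


def text_bars(hist, width=40):
    """Render the histogram as text rows, scaled so the tallest bar is `width` wide."""
    if not hist:
        return []
    tallest = max(hist.values())
    return [
        f"{year} {'#' * max(1, round(count * width / tallest))} {count}"
        for year, count in sorted(hist.items())
    ]
```

Grouping by decade is only a convenience for spotting features like the dip around 1940; the per-year counts are what the histogram described above shows.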

+ +

+ +

I've so far identified ten sources for IMDB title IDs for movies in +the public domain or with a free license. These are the statistics +reported when running 'make stats' in the git repository:

+ +
+  249 entries (    6 unique) with and   288 without IMDB title ID in free-movies-archive-org-butter.json
+ 2301 entries (  540 unique) with and     0 without IMDB title ID in free-movies-archive-org-wikidata.json
+  830 entries (   29 unique) with and     0 without IMDB title ID in free-movies-icheckmovies-archive-mochard.json
+ 2109 entries (  377 unique) with and     0 without IMDB title ID in free-movies-imdb-pd.json
+  291 entries (  122 unique) with and     0 without IMDB title ID in free-movies-letterboxd-pd.json
+  144 entries (  135 unique) with and     0 without IMDB title ID in free-movies-manual.json
+  350 entries (    1 unique) with and   801 without IMDB title ID in free-movies-publicdomainmovies.json
+    4 entries (    0 unique) with and   124 without IMDB title ID in free-movies-publicdomainreview.json
+  698 entries (  119 unique) with and   118 without IMDB title ID in free-movies-publicdomaintorrents.json
+    8 entries (    8 unique) with and   196 without IMDB title ID in free-movies-vodo.json
+ 3186 unique IMDB title IDs in total
+
+ +
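The per-source "unique" figures add up to far less than 3186, which suggests they count IDs found in that source and nowhere else. Under that reading (an assumption, not something the output itself states), the numbers could be derived roughly as below. The function name and the input shape are hypothetical, and the sketch counts distinct IDs, whereas the real 'make stats' target may count entries.

```python
from collections import Counter


def source_stats(sources):
    """Return ({source: (ids_in_source, ids_only_in_this_source)}, total_unique_ids).

    `sources` maps a file name to the list of IMDB title IDs found in it.
    """
    id_sets = {name: set(ids) for name, ids in sources.items()}
    # For every ID, count how many sources mention it.
    seen_in = Counter(i for ids in id_sets.values() for i in ids)
    per_source = {
        name: (len(ids), sum(1 for i in ids if seen_in[i] == 1))
        for name, ids in id_sets.items()
    }
    return per_source, len(seen_in)
```

The total is simply the size of the union of all ID sets, which is why a source can contribute hundreds of entries and still add few or no new titles.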

The entries without IMDB title ID are candidates to increase the +data set, but might equally well be duplicates of entries already +listed with IMDB title ID in one of the other sources, or represent +movies that lack an IMDB title ID. I've seen examples of all these +situations when peeking at the entries without IMDB title ID. Based +on these data sources, the lower bound for movies listed in IMDB that +are legal to distribute on the Internet lies somewhere between 3186 and 4713. + +
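The range 3186 to 4713 can be reproduced from the 'make stats' output: the low end is the number of unique IMDB title IDs, and the high end adds every entry that lacks an ID. The sketch below only redoes that arithmetic; treating every ID-less entry as a new, distinct movie listed in IMDB is the best-case assumption behind the upper figure.

```python
# Entries without IMDB title ID per source, copied from the statistics above.
WITHOUT_ID = {
    "free-movies-archive-org-butter.json": 288,
    "free-movies-archive-org-wikidata.json": 0,
    "free-movies-icheckmovies-archive-mochard.json": 0,
    "free-movies-imdb-pd.json": 0,
    "free-movies-letterboxd-pd.json": 0,
    "free-movies-manual.json": 0,
    "free-movies-publicdomainmovies.json": 801,
    "free-movies-publicdomainreview.json": 124,
    "free-movies-publicdomaintorrents.json": 118,
    "free-movies-vodo.json": 196,
}
UNIQUE_WITH_ID = 3186


def bounds(unique_with_id, without_id):
    """Low end: known IDs only. High end: every ID-less entry is a new IMDB movie."""
    return unique_with_id, unique_with_id + sum(without_id.values())


low, high = bounds(UNIQUE_WITH_ID, WITHOUT_ID)
print(f"between {low} and {high}")
```

The 1527 entries without ID are what separate the two figures, so each one that turns out to be a duplicate or a movie absent from IMDB pulls the real number toward 3186.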

The accuracy of this measurement would improve greatly +if the various sources added IMDB title IDs to their metadata. I have +tried to reach the people behind the various sources to ask if they +are interested in doing this, without any replies so far. Perhaps you +can help me get in touch with the people behind VODO, Public Domain +Torrents, Public Domain Movies and Public Domain Review to try to +convince them to add more metadata to their movie entries?

+ +

Another way you could help is by adding pages to Wikipedia about +movies that are legal to distribute on the Internet. If such a page +exists and includes links to both IMDB and The Internet Archive, the +script used to generate free-movies-archive-org-wikidata.json should +pick up the mapping as soon as Wikidata is updated.
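As an illustration of the kind of lookup involved, the sketch below pairs a SPARQL query with a parser for the standard SPARQL JSON result format. Wikidata has properties for IMDb ID (P345) and Internet Archive ID (P724); whether the script in the repository uses these properties or this query is an assumption, and the function name is hypothetical. The code does no network access; the query would have to be sent to the Wikidata query service separately.

```python
# Items that carry both an IMDb ID (P345) and an Internet Archive ID (P724).
QUERY = """
SELECT ?item ?imdb ?archive WHERE {
  ?item wdt:P345 ?imdb ;
        wdt:P724 ?archive .
}
"""


def imdb_to_archive(response):
    """Map IMDB title IDs to Internet Archive identifiers.

    `response` is a decoded SPARQL JSON result. IMDb IDs that are not
    titles (people, companies and so on) do not start with "tt" and are skipped.
    """
    mapping = {}
    for row in response["results"]["bindings"]:
        imdb = row["imdb"]["value"]
        if not imdb.startswith("tt"):
            continue
        mapping.setdefault(imdb, []).append(row["archive"]["value"])
    return mapping
```

Keeping a list per title allows for items that link to more than one Internet Archive upload of the same movie.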

diff --git a/blog/index.html b/blog/index.html index 006bf5f6c3..8570aa6d9b 100644 --- a/blog/index.html +++ b/blog/index.html @@ -24,11 +24,11 @@
18th November 2017

A month ago, I blogged about my work to automatically check the copyright status of IMDB entries, and try to count the number of -movies listed in IMDB where it is legal to distribute it the Internet. -I have continued to look for good data sources, and identified a few +movies listed in IMDB that is legal to distribute on the Internet. I +have continued to look for good data sources, and identified a few more. The code used to extract information from various data sources is available in -a +a git repository, currently available from github.

So far I have identified 3186 unique IMDB title IDs. To gain diff --git a/blog/index.rss b/blog/index.rss index 12c558eafa..edab48138f 100644 --- a/blog/index.rss +++ b/blog/index.rss @@ -13,11 +13,11 @@ Sat, 18 Nov 2017 21:20:00 +0100 <p>A month ago, I blogged about my work to automatically check the copyright status of IMDB entries, and try to count the number of -movies listed in IMDB where it is legal to distribute it the Internet. -I have continued to look for good data sources, and identified a few +movies listed in IMDB that is legal to distribute on the Internet. I +have continued to look for good data sources, and identified a few more. The code used to extract information from various data sources is available in -<ahref="https://github.com/petterreinholdtsen/public-domain-free-imdb">a +<a href="https://github.com/petterreinholdtsen/public-domain-free-imdb">a git repository</a>, currently available from github.</p> <p>So far I have identified 3186 unique IMDB title IDs. To gain diff --git a/blog/tags/english/english.rss b/blog/tags/english/english.rss index 6c7c1c7b91..f9a2f48c1d 100644 --- a/blog/tags/english/english.rss +++ b/blog/tags/english/english.rss @@ -13,11 +13,11 @@ Sat, 18 Nov 2017 21:20:00 +0100 <p>A month ago, I blogged about my work to automatically check the copyright status of IMDB entries, and try to count the number of -movies listed in IMDB where it is legal to distribute it the Internet. -I have continued to look for good data sources, and identified a few +movies listed in IMDB that is legal to distribute on the Internet. I +have continued to look for good data sources, and identified a few more. The code used to extract information from various data sources is available in -<ahref="https://github.com/petterreinholdtsen/public-domain-free-imdb">a +<a href="https://github.com/petterreinholdtsen/public-domain-free-imdb">a git repository</a>, currently available from github.</p> <p>So far I have identified 3186 unique IMDB title IDs. 
To gain diff --git a/blog/tags/english/index.html b/blog/tags/english/index.html index c09cc83da8..32a1785f52 100644 --- a/blog/tags/english/index.html +++ b/blog/tags/english/index.html @@ -30,11 +30,11 @@

A month ago, I blogged about my work to automatically check the copyright status of IMDB entries, and try to count the number of -movies listed in IMDB where it is legal to distribute it the Internet. -I have continued to look for good data sources, and identified a few +movies listed in IMDB that is legal to distribute on the Internet. I +have continued to look for good data sources, and identified a few more. The code used to extract information from various data sources is available in -a +a git repository, currently available from github.

So far I have identified 3186 unique IMDB title IDs. To gain diff --git a/blog/tags/opphavsrett/index.html b/blog/tags/opphavsrett/index.html index a510f9a9bf..fb4cf94fea 100644 --- a/blog/tags/opphavsrett/index.html +++ b/blog/tags/opphavsrett/index.html @@ -30,11 +30,11 @@

A month ago, I blogged about my work to automatically check the copyright status of IMDB entries, and try to count the number of -movies listed in IMDB where it is legal to distribute it the Internet. -I have continued to look for good data sources, and identified a few +movies listed in IMDB that is legal to distribute on the Internet. I +have continued to look for good data sources, and identified a few more. The code used to extract information from various data sources is available in -a +a git repository, currently available from github.

So far I have identified 3186 unique IMDB title IDs. To gain diff --git a/blog/tags/opphavsrett/opphavsrett.rss b/blog/tags/opphavsrett/opphavsrett.rss index b9a2b3ba4a..21918499bc 100644 --- a/blog/tags/opphavsrett/opphavsrett.rss +++ b/blog/tags/opphavsrett/opphavsrett.rss @@ -13,11 +13,11 @@ Sat, 18 Nov 2017 21:20:00 +0100 <p>A month ago, I blogged about my work to automatically check the copyright status of IMDB entries, and try to count the number of -movies listed in IMDB where it is legal to distribute it the Internet. -I have continued to look for good data sources, and identified a few +movies listed in IMDB that is legal to distribute on the Internet. I +have continued to look for good data sources, and identified a few more. The code used to extract information from various data sources is available in -<ahref="https://github.com/petterreinholdtsen/public-domain-free-imdb">a +<a href="https://github.com/petterreinholdtsen/public-domain-free-imdb">a git repository</a>, currently available from github.</p> <p>So far I have identified 3186 unique IMDB title IDs. To gain