--- /dev/null
+Title: Legal to share more than 11,000 movies listed on IMDB?
+Tags: english, opphavsrett, verkidetfri
+Date: 2018-01-07 23:30
+
+<p>I've continued to track down lists of movies that are legal to
+distribute on the Internet, and have identified more than 11,000 title
+IDs in The Internet Movie Database so far.  Most of them (57%) are
+feature films from the USA published before 1923.  I've also tracked
+down more than 24,000 movies I have not yet been able to map to an IMDB
+title ID, so the real number could be a lot higher.  According to the
+front page of <a href="https://retrofilmvault.com/">Retro Film Vault</a>,
+there are 44,000 public domain films, so I guess there are still some
+left to identify.</p>
+
+<p>The complete data set is available from
+<a href="https://github.com/petterreinholdtsen/public-domain-free-imdb">a
+public git repository</a>, including the scripts used to create it.
+Most of the data is collected using web scraping, for example from the
+"product catalog" of companies selling copies of public domain movies,
+but any source I find believable is used. I've so far had to throw
+out three sources because I did not trust the public domain status of
+the movies listed.</p>
+
+<p>Anyway, this is the summary of the 28 collected data sources so
+far:</p>
+
+<p><pre>
+ 2352 entries (   66 unique) with and 15983 without IMDB title ID in free-movies-archive-org-search.json
+ 2302 entries (  120 unique) with and     0 without IMDB title ID in free-movies-archive-org-wikidata.json
+  195 entries (   63 unique) with and   200 without IMDB title ID in free-movies-cinemovies.json
+   89 entries (   52 unique) with and    38 without IMDB title ID in free-movies-creative-commons.json
+  344 entries (   28 unique) with and   655 without IMDB title ID in free-movies-fesfilm.json
+  668 entries (  209 unique) with and  1064 without IMDB title ID in free-movies-filmchest-com.json
+  830 entries (   21 unique) with and     0 without IMDB title ID in free-movies-icheckmovies-archive-mochard.json
+   19 entries (   19 unique) with and     0 without IMDB title ID in free-movies-imdb-c-expired-gb.json
+ 6822 entries ( 6669 unique) with and     0 without IMDB title ID in free-movies-imdb-c-expired-us.json
+  137 entries (    0 unique) with and     0 without IMDB title ID in free-movies-imdb-externlist.json
+ 1205 entries (   57 unique) with and     0 without IMDB title ID in free-movies-imdb-pd.json
+   84 entries (   20 unique) with and   167 without IMDB title ID in free-movies-infodigi-pd.json
+  158 entries (  135 unique) with and     0 without IMDB title ID in free-movies-letterboxd-looney-tunes.json
+  113 entries (    4 unique) with and     0 without IMDB title ID in free-movies-letterboxd-pd.json
+  182 entries (  100 unique) with and     0 without IMDB title ID in free-movies-letterboxd-silent.json
+  229 entries (   87 unique) with and     1 without IMDB title ID in free-movies-manual.json
+   44 entries (    2 unique) with and    64 without IMDB title ID in free-movies-openflix.json
+  291 entries (   33 unique) with and   474 without IMDB title ID in free-movies-profilms-pd.json
+  211 entries (    7 unique) with and     0 without IMDB title ID in free-movies-publicdomainmovies-info.json
+ 1232 entries (   57 unique) with and  1875 without IMDB title ID in free-movies-publicdomainmovies-net.json
+   46 entries (   13 unique) with and    81 without IMDB title ID in free-movies-publicdomainreview.json
+  698 entries (   64 unique) with and   118 without IMDB title ID in free-movies-publicdomaintorrents.json
+ 1758 entries (  882 unique) with and  3786 without IMDB title ID in free-movies-retrofilmvault.json
+   16 entries (    0 unique) with and     0 without IMDB title ID in free-movies-thehillproductions.json
+   63 entries (   16 unique) with and   141 without IMDB title ID in free-movies-vodo.json
+11583 unique IMDB title IDs in total, 8724 only in one list, 24647 without IMDB title ID
+</pre></p>
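+
+<p>For readers curious how the columns in the summary relate to each
+other, here is a minimal Python sketch of such a tally.  It is not the
+script from the repository: the function name and the in-memory input
+format (a mapping from source name to a list of IMDB title IDs, with
+None for entries that could not be mapped) are assumptions made for
+illustration.  It treats "unique" as a title ID found in no other
+source, which is consistent with the per-source unique counts adding
+up to the 8724 titles found in only one list.</p>
+

```python
from collections import Counter


def summarize(sources):
    """Tally entries per source.

    `sources` is a hypothetical structure: {source name: [title ID or None]}.
    Returns (per-source counts, overall totals).
    """
    # Count in how many different sources each title ID shows up.
    seen_in = Counter()
    for ids in sources.values():
        seen_in.update({i for i in ids if i is not None})

    per_source = {}
    for name, ids in sources.items():
        with_id = {i for i in ids if i is not None}
        per_source[name] = {
            # Assumption: entries are counted as distinct title IDs.
            "with_id": len(with_id),
            # IDs that no other source lists.
            "unique": sum(1 for i in with_id if seen_in[i] == 1),
            "without_id": sum(1 for i in ids if i is None),
        }

    totals = {
        "title_ids": len(seen_in),
        "only_in_one_list": sum(1 for n in seen_in.values() if n == 1),
        "without_id": sum(c["without_id"] for c in per_source.values()),
    }
    return per_source, totals


if __name__ == "__main__":
    # Made-up sample data, not taken from the real lists.
    demo = {"list-a": ["tt0000001", "tt0000002", None],
            "list-b": ["tt0000002", "tt0000003"]}
    per_source, totals = summarize(demo)
    for name, c in sorted(per_source.items()):
        print("%5d entries (%5d unique) with and %5d without IMDB title ID in %s"
              % (c["with_id"], c["unique"], c["without_id"], name))
    print(totals)
```

+
+<p>Counting per-source membership first makes both the per-source
+"unique" column and the "only in one list" total fall out of the same
+counter.</p>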
+
+<p>I keep finding more data sources.  I found the cinemovies source
+just a few days ago, and as you can see from the summary, it extended
+my list with 63 movies.  Check out the mklist-* scripts in the git
+repository if you are curious how the lists are created.  Many of the
+titles are extracted using searches on IMDB, where I look for the
+title and year, and accept search results with only one movie listed
+if the year matches.  This allows me to automatically use many lists
+of movies without IMDB title ID references, at the cost of increasing
+the risk of wrongly identifying an IMDB title ID as public domain.  So
+far my random manual checks have indicated that the method is solid,
+but I really wish all lists of public domain movies would include a
+unique movie identifier like the IMDB title ID.  It would make the job
+of counting movies in the public domain a lot easier.</p>
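+
+<p>The acceptance rule for search results can be sketched roughly
+like this in Python.  This is an illustration and not the code in the
+mklist-* scripts: the function name and the shape of the search
+results (a list of dicts with "id", "kind" and "year" keys) are
+assumptions, and how the results are fetched from IMDB is left
+out.</p>
+

```python
def pick_title_id(results, year):
    """Return a title ID only when the search result is unambiguous.

    `results` is a hypothetical list of dicts such as
    {"id": "tt0000001", "kind": "movie", "year": 1920}.
    """
    # Only consider hits that are movies.
    movies = [r for r in results if r.get("kind") == "movie"]

    # Reject empty and ambiguous searches: exactly one movie must be listed.
    if len(movies) != 1:
        return None

    # Reject the single hit if its year differs from the one we searched for.
    hit = movies[0]
    if hit.get("year") != year:
        return None

    return hit.get("id")
```

+
+<p>Returning None for every ambiguous case keeps the automatic
+mapping conservative, and leaves such titles in the "without IMDB
+title ID" count instead of guessing.</p>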