Title: Metadata proposal for movies on the Internet Archive
Tags: english, opphavsrett, verkidetfri
<p>It would be easier to locate the movie you want to watch in
<a href="https://www.archive.org/">the Internet Archive</a> if the
metadata about each movie was more complete and accurate. In the
archiving community, a well-known saying states that good metadata is a
love letter to the future. The metadata in the Internet Archive could
use a facelift for the future to love us back. Here is a proposal
for a small improvement that would make the metadata more useful
today. I've been unable to find any document describing the various
standard fields available when uploading videos to the archive, so
this proposal is based on my best guess and on searching through several
of the existing movies.</p>
<p>I have a few use cases in mind. First of all, I would like to be
able to count the number of distinct movies in the Internet Archive,
without duplicates. I would further like to identify the IMDB title
ID of the movies in the Internet Archive, to be able to look up an IMDB
title ID and know if I can fetch the video from there and share it
<p>Second, I would like the Butter data provider for The Internet
Archive
(<a href="https://github.com/butterproviders/butter-provider-archive">available
from GitHub</a>) to list as many of the good movies as possible. The
plugin currently does a search in the archive with the following
collection:moviesandfilms
AND NOT collection:movie_trailers
AND -mediatype:collection
AND format:"Archive BitTorrent"
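<p>The search clauses listed above can be combined into a request URL
roughly as sketched below. This is only an illustration, not the code
the Butter provider uses: the <code>advancedsearch.php</code> endpoint,
the requested result fields and the paging values are assumptions made
for the example, and <code>build_search_url</code> is a hypothetical
helper name.</p>

```python
from urllib.parse import urlencode

# The search clauses quoted in the text, joined with AND below.
QUERY_CLAUSES = [
    'collection:moviesandfilms',
    'NOT collection:movie_trailers',
    '-mediatype:collection',
    'format:"Archive BitTorrent"',
]


def build_search_url(clauses, rows=50, page=1):
    """Build a search URL from a list of query clauses (hypothetical helper)."""
    query = " AND ".join(clauses)
    params = [
        ("q", query),
        # Assumption: these are the fields we want back for each hit.
        ("fl[]", "identifier"),
        ("fl[]", "title"),
        ("fl[]", "year"),
        ("rows", rows),
        ("page", page),
        ("output", "json"),
    ]
    # Assumption: the Internet Archive advanced search endpoint.
    return "https://archive.org/advancedsearch.php?" + urlencode(params)


url = build_search_url(QUERY_CLAUSES)
print(url)
```

<p>Fetching that URL and decoding the JSON response is left out on
purpose, so the sketch runs without network access.</p>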
<p>Most of the cool movies that fail to show up in Butter do so
because the 'year' field is missing. The 'year' field is populated from
the year part of the 'date' field, which should state when the movie was
released (date or year). Two such examples are
<a href="https://archive.org/details/SidneyOlcottsBen-hur1905">Ben Hur
<a href="https://archive.org/details/Caminandes2GranDillama">Caminandes
2: Gran Dillama from 2013</a>, where the year metadata field is
So my proposal is simply this: for every movie in The Internet Archive
where an IMDB title ID exists, please fill in these metadata fields
(note that they can also be updated long after the video was uploaded, but
as far as I can tell, only by the uploader):
<dd>Should be 'movie' for movies.</dd>
<dd>Should contain 'moviesandfilms'.</dd>
<dd>The title of the movie, without the publication year.</dd>
<dd>The date or year the movie was released. This makes the movie show
up in Butter, makes it possible to know the age of the
movie, and is useful for figuring out its copyright status.</dd>
<dd>The director of the movie. This makes it easier to know if the
correct movie is found in movie databases.</dd>
<dd>The production company making the movie. Also useful for
identifying the correct movie.</dd>
<dd>Add a link to the IMDB title page, for example like this: <a
href="http://www.imdb.com/title/tt0028496/">Movie in
IMDB</a>. This makes it easier to find duplicates and allows
counting the number of unique movies in the Archive. Other external
references, such as to TMDB, could be added like this too.</dd>
<p>I did consider proposing a custom field for the IMDB title ID (for
example 'imdb_title_url', 'imdb_code' or simply 'imdb'), but I suspect it
will be easier to simply place it in the links free-text field.</p>
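<p>To illustrate how the proposed fields could be put to use, here is a
minimal sketch that checks an item against the proposal, pulls IMDB
title IDs out of a free-text links field and groups items by title ID to
spot duplicates. The dictionary keys (<code>collection</code>,
<code>date</code>, <code>title</code>, <code>links</code>,
<code>identifier</code>), the function names and the item identifiers in
the usage example are assumptions made for the example, not a documented
Internet Archive schema.</p>

```python
import re

# Matches IMDB title URLs such as http://www.imdb.com/title/tt0028496/
IMDB_TITLE_RE = re.compile(r"imdb\.com/title/(tt\d+)")
YEAR_RE = re.compile(r"^\d{4}")
TRAILING_YEAR_RE = re.compile(r"\(\d{4}\)\s*$")


def imdb_title_ids(links_text):
    """Return the distinct IMDB title IDs in a text, in order of appearance."""
    ids = []
    for title_id in IMDB_TITLE_RE.findall(links_text or ""):
        if title_id not in ids:
            ids.append(title_id)
    return ids


def check_metadata(item):
    """List the ways an item (a dict with assumed keys) departs from the proposal."""
    problems = []
    collections = item.get("collection") or []
    if isinstance(collections, str):
        collections = [collections]
    if "moviesandfilms" not in collections:
        problems.append("collection lacks 'moviesandfilms'")
    if not YEAR_RE.match(str(item.get("date") or "")):
        problems.append("date does not start with a release year")
    if TRAILING_YEAR_RE.search(item.get("title") or ""):
        problems.append("title ends with a publication year")
    if not imdb_title_ids(item.get("links")):
        problems.append("no IMDB title link")
    return problems


def group_by_imdb_id(items):
    """Map each IMDB title ID to the identifiers of the items linking to it."""
    groups = {}
    for item in items:
        for title_id in imdb_title_ids(item.get("links")):
            groups.setdefault(title_id, []).append(item["identifier"])
    return groups


# Usage with made-up items; the identifiers are placeholders.
items = [
    {"identifier": "example-item-a", "collection": ["moviesandfilms"],
     "date": "1936", "title": "Example Movie",
     "links": '<a href="http://www.imdb.com/title/tt0028496/">Movie in IMDB</a>'},
    {"identifier": "example-item-b", "title": "Example Movie (1936)",
     "links": "http://www.imdb.com/title/tt0028496/"},
]
groups = group_by_imdb_id(items)
duplicates = {tid: ids for tid, ids in groups.items() if len(ids) > 1}
print(len(groups), "distinct movie(s);", len(duplicates), "with duplicates")
```

<p>With every item carrying an IMDB link, the number of keys in
<code>groups</code> is the number of distinct movies, and any key with
more than one identifier points at a duplicate upload.</p>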
<a href="https://github.com/petterreinholdtsen/public-domain-free-imdb">a
list of IMDB title IDs for several thousand movies in the Internet
Archive</a>, but I also have a list of several thousand movies without
such an IMDB title ID (and quite a few duplicates). It would be great if
this data set could be integrated into the Internet Archive metadata
to be available for everyone in the future, but with the current
policy of leaving metadata editing to the uploaders, it will take a
while before this happens. If you have uploaded movies into the
Internet Archive, you can help. Please consider following my proposal
above for your movies, to ensure that each movie is properly
<p>The list is mostly generated using Wikidata, which, based on
Wikipedia articles, makes it possible to link between IMDB and movies in
the Internet Archive. But there are lots of movies without a
Wikipedia article, and some movies where only a collection page exists
(like for <a href="https://en.wikipedia.org/wiki/Caminandes">the
Caminandes example above</a>, where there are three movies but only
one Wikidata entry).</p>
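<p>A lookup of this kind could be sketched as follows. This is an
assumption about how such a link can be queried, not the script used to
build the list: it relies on the Wikidata properties P345 (IMDb ID) and
P724 (Internet Archive ID) and on the public Wikidata query service, and
<code>build_wikidata_url</code> and <code>parse_bindings</code> are
hypothetical helper names.</p>

```python
from urllib.parse import urlencode

# Assumption: P345 is Wikidata's "IMDb ID" and P724 its "Internet Archive ID".
SPARQL_QUERY = """
SELECT ?item ?imdb ?archive WHERE {
  ?item wdt:P345 ?imdb ;
        wdt:P724 ?archive .
  FILTER(STRSTARTS(?imdb, "tt"))
}
LIMIT 100
""".strip()


def build_wikidata_url(query):
    """Build a request URL for the Wikidata query service (hypothetical helper)."""
    params = {"query": query, "format": "json"}
    return "https://query.wikidata.org/sparql?" + urlencode(params)


def parse_bindings(result):
    """Turn a decoded SPARQL JSON result into (imdb_id, archive_id) pairs."""
    return [
        (row["imdb"]["value"], row["archive"]["value"])
        for row in result["results"]["bindings"]
    ]


# Usage with a hand-written response; the archive identifier is a placeholder.
sample = {"results": {"bindings": [
    {"imdb": {"value": "tt0028496"}, "archive": {"value": "example-item-a"}},
]}}
pairs = parse_bindings(sample)
print(build_wikidata_url(SPARQL_QUERY))
print(pairs)
```

<p>A query like this can only return what Wikidata knows about, so the
movies without a Wikipedia article and the collection-page cases
mentioned above would still have to be matched by other means.</p>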
<p>As usual, if you use Bitcoin and want to show your support of my
activities, please send Bitcoin donations to my address
<b><a href="bitcoin:15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b">15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b</a></b>.</p>