X-Git-Url: http://pere.pagekite.me/gitweb/homepage.git/blobdiff_plain/fa2ae3abb2a07276b3b31bab1ad6185589cd202c..e8e4a69e19ff10d4362af1042919be12be4b2772:/blog/index.rss diff --git a/blog/index.rss b/blog/index.rss index 0c96521c59..294234ed70 100644 --- a/blog/index.rss +++ b/blog/index.rss @@ -6,6 +6,121 @@ http://people.skolelinux.org/pere/blog/ + + Metadata proposal for movies on the Internet Archive + http://people.skolelinux.org/pere/blog/Metadata_proposal_for_movies_on_the_Internet_Archive.html + http://people.skolelinux.org/pere/blog/Metadata_proposal_for_movies_on_the_Internet_Archive.html + Tue, 28 Nov 2017 12:00:00 +0100 + <p>It would be easier to locate the movie you want to watch in +<a href="https://www.archive.org/">the Internet Archive</a> if the +metadata about each movie were more complete and accurate. In the +archiving community, a well-known saying states that good metadata is a +love letter to the future. The metadata in the Internet Archive could +use a face lift for the future to love us back. Here is a proposal +for a small improvement that would make the metadata more useful +today. I've been unable to find any document describing the various +standard fields available when uploading videos to the archive, so +this proposal is based on my best guess and on searching through several +of the existing movies.</p> + +<p>I have a few use cases in mind. First of all, I would like to be +able to count the number of distinct movies in the Internet Archive, +without duplicates. I would further like to identify the IMDB title +ID of the movies in the Internet Archive, to be able to look up an IMDB +title ID and know if I can fetch the video from there and share it +with my friends.</p> + +<p>Second, I would like the Butter data provider for the Internet +Archive +(<a href="https://github.com/butterproviders/butter-provider-archive">available +from GitHub</a>) to list as many of the good movies as possible. 
The +plugin currently does a search in the archive with the following +parameters:</p> + +<p><pre> +collection:moviesandfilms +AND NOT collection:movie_trailers +AND -mediatype:collection +AND format:"Archive BitTorrent" +AND year +</pre></p> + +<p>Most of the cool movies that fail to show up in Butter do so +because the 'year' field is missing. The 'year' field is populated by +the year part from the 'date' field, and should be when the movie was +released (date or year). Two such examples are +<a href="https://archive.org/details/SidneyOlcottsBen-hur1905">Ben Hur +from 1905</a> and +<a href="https://archive.org/details/Caminandes2GranDillama">Caminandes +2: Gran Dillama from 2013</a>, where the year metadata field is +missing.</p> + +So, my proposal is simple: for every movie in the Internet Archive +where an IMDB title ID exists, please fill in these metadata fields +(note that they can also be updated long after the video was uploaded, but +as far as I can tell, only by the uploader): + +<dl> + +<dt>mediatype</dt> +<dd>Should be 'movie' for movies.</dd> + +<dt>collection</dt> +<dd>Should contain 'moviesandfilms'.</dd> + +<dt>title</dt> +<dd>The title of the movie, without the publication year.</dd> + +<dt>date</dt> +<dd>The date or year the movie was released. This makes the movie show +up in Butter, makes it possible to know the age of the +movie, and is useful for figuring out its copyright status.</dd> + +<dt>director</dt> +<dd>The director of the movie. This makes it easier to know if the +correct movie is found in movie databases.</dd> + +<dt>publisher</dt> +<dd>The production company making the movie. Also useful for +identifying the correct movie.</dd> + +<dt>links</dt> + +<dd>Add a link to the IMDB title page, for example like this: &lt;a +href="http://www.imdb.com/title/tt0028496/"&gt;Movie in +IMDB&lt;/a&gt;. This makes it easier to find duplicates and allows +counting the number of unique movies in the Archive. 
Other external +references, such as links to TMDB, could be added like this too.</dd> + +</dl> + +<p>I did consider proposing a custom field for the IMDB title ID (for +example 'imdb_title_url', 'imdb_code' or simply 'imdb'), but suspect it +will be easier to simply place it in the links free text field.</p> + +<p>I created +<a href="https://github.com/petterreinholdtsen/public-domain-free-imdb">a +list of IMDB title IDs for several thousand movies in the Internet +Archive</a>, but I also got a list of several thousand movies without +such an IMDB title ID (and quite a few duplicates). It would be great if +this data set could be integrated into the Internet Archive metadata +to be available for everyone in the future, but with the current +policy of leaving metadata editing to the uploaders, it will take a +while before this happens. If you have uploaded movies into the +Internet Archive, you can help. Please consider following my proposal +above for your movies, to ensure that each movie is properly +counted. :)</p> + +<p>The list is mostly generated using Wikidata, which based on +Wikipedia articles makes it possible to link between IMDB and movies in +the Internet Archive. But there are lots of movies without a +Wikipedia article, and some movies where only a collection page exists +(like for <a href="https://en.wikipedia.org/wiki/Caminandes">the +Caminandes example above</a>, where there are three movies but only +one Wikidata entry).</p> + + + Legal to share more than 3000 movies listed on IMDB? 
http://people.skolelinux.org/pere/blog/Legal_to_share_more_than_3000_movies_listed_on_IMDB_.html @@ -729,54 +844,5 @@ with GNU Radio on raspbian, causing glibc to abort().</p> - - The Data Retention Directive casts shadows over Høyre and Arbeiderpartiet - http://people.skolelinux.org/pere/blog/Datalagringsdirektivet_kaster_skygger_over_H_yre_og_Arbeiderpartiet.html - http://people.skolelinux.org/pere/blog/Datalagringsdirektivet_kaster_skygger_over_H_yre_og_Arbeiderpartiet.html - Thu, 7 Sep 2017 21:35:00 +0200 - <p>A few days ago Jon Wessel-Aas published a blog post about -«<a href="http://www.uhuru.biz/?p=1821">The conclusion on data retention that -the EU Commission did not want us to see</a>». It is an -interesting review of the EU Court of Justice's view on dragnet surveillance -of the population, which makes clear that such surveillance violates -EU law.</p> - -<p>The election campaign is in full swing in Norway, and the deadline -for voting is only a few days away. One thing is certain: Høyre and Arbeiderpartiet -will not get my vote -<a href="http://people.skolelinux.org/pere/blog/Datalagringsdirektivet_gj_r_at_Oslo_H_yre_og_Arbeiderparti_ikke_f_r_min_stemme_i__r.html">this -time either</a>. I have not forgotten that they forced through the law that -would require all data and telecom service providers to monitor all -their customers. A law that was passed, and never repealed.</p> - -<p>It is clear from the debate about borderless digital surveillance -(or "Digital Grenseforsvar", digital border defence, as it is called in Orwellian newspeak) that -neither Høyre nor Arbeiderpartiet has any principled objection to -monitoring the entire population, and the debate so far suggests that several -of the other parties have none either. 
Many of -<a href="https://data.holderdeord.no/votes/1301946411e">those who voted -for the Data Retention Directive in the Storting</a> (64 from Arbeiderpartiet, -25 from Høyre) are still active and still argue for erasing -even more of citizens' privacy.</p> - -<p>When the authorities demonstrate their distrust of the people, I think -the people themselves should put some effort into protecting their privacy, by -adopting end-to-end encrypted communication with their friends and loved ones, -and limiting how much private information is shared with outsiders. -After all, nothing suggests that the authorities are going to protect our -privacy. -<a href="http://people.skolelinux.org/pere/blog/How_to_talk_with_your_loved_ones_in_private.html">There -are many options</a>. Personally I am rather fond of -<a href="https://ring.cx/">Ring</a>, which is based on p2p technology -without central control, is free software, and supports messaging, voice -and video. The system is available out of the box from -<a href="https://tracker.debian.org/pkg/ring">Debian</a> and -<a href="https://launchpad.net/ubuntu/+source/ring">Ubuntu</a>, and -packages exist for Android, MacOSX and Windows. So far Ring has few -users, so I also use -<a href="https://signal.org/">Signal</a> as a browser extension.</p> - - -
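The Butter provider search quoted in the metadata proposal earlier in this diff can be sketched as a query URL. This is only a minimal illustration, not the provider's actual code: the clause list is copied from the post, while the `advancedsearch.php` endpoint, its `q`, `fl[]`, `rows` and `output` parameters, and the helper names `build_query` and `search_url` are assumptions added here.

```python
from urllib.parse import urlencode

# Search clauses exactly as quoted in the blog post.
BUTTER_CLAUSES = [
    "collection:moviesandfilms",
    "NOT collection:movie_trailers",
    "-mediatype:collection",
    'format:"Archive BitTorrent"',
    "year",
]


def build_query(clauses):
    """Join the clauses with AND, as in the query quoted in the post."""
    return " AND ".join(clauses)


def search_url(query, fields=("identifier", "title", "year"), rows=50):
    """Build a search URL (hypothetical helper; endpoint and parameter
    names are assumptions, not taken from the post)."""
    params = [("q", query)]
    params += [("fl[]", field) for field in fields]
    params += [("rows", str(rows)), ("output", "json")]
    return "https://archive.org/advancedsearch.php?" + urlencode(params)


if __name__ == "__main__":
    # Query as used by the provider: items without a 'year' are excluded.
    print(search_url(build_query(BUTTER_CLAUSES)))
    # Dropping the final 'year' clause gives the wider set, which would
    # include movies such as the two examples with a missing year field.
    print(search_url(build_query(BUTTER_CLAUSES[:-1])))
```

Comparing the results of the two URLs would be one way to list the movies that are hidden from Butter only because their 'date' field was never filled in.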