From: Petter Reinholdtsen Date: Sun, 19 Mar 2017 06:52:22 +0000 (+0100) Subject: New post about noark. X-Git-Url: http://pere.pagekite.me/gitweb/homepage.git/commitdiff_plain/53798080ccfe7c0d9ad356a83ded5b219b186c9b?ds=inline New post about noark. --- diff --git a/blog/data/2017-03-19-noark5-nikita.txt b/blog/data/2017-03-19-noark5-nikita.txt new file mode 100644 index 0000000000..7f527affcb --- /dev/null +++ b/blog/data/2017-03-19-noark5-nikita.txt @@ -0,0 +1,91 @@ +Title: Free software archive system Nikita now able to store documents +Tags: english, standard, nuug +Date: 2017-03-19 08:00 + +

The Nikita +Noark 5 core project is implementing the Norwegian standard for +keeping an electronic archive of government documents. +The +Noark 5 standard document the requirement for data systems used by +the archives in the Norwegian government, and the Noark 5 web interface +specification document a REST web service for storing, searching and +retrieving documents and metadata in such archive. I've been involved +in the project since a few weeks before Christmas, when the Norwegian +Unix User Group +announced +it supported the project. I believe this is an important project, +and hope it can make it possible for the government archives in the +future to use free software to keep the archives we citizens depend +on. But as I do not hold such archive myself, personally my first use +case is to store and analyse public mail journal metadata published +from the government. I find it useful to have a clear use case in +mind when developing, to make sure the system scratches one of my +itches.

+ +

If you would like to help make sure there is a free software +alternatives for the archives, please join our IRC channel +(#nikita on +irc.freenode.net) and +the +project mailing list.

+ +

When I got involved, the web service could store metadata about +documents. But a few weeks ago, a new milestone was reached when it +became possible to store full text documents too. Yesterday, I +completed an implementation of a command line tool +archive-pdf to upload a PDF file to the archive using this +API. The tool is very simple at the moment, and find existing fonds, +series and files while asking the user to select which one to use if +more than one exist. Once a file is identified, the PDF is associated +with the file and uploaded, using the title extracted from the PDF +itself. The process is fairly similar to visiting the archive, +opening a cabinet, locating a file and storing a piece of paper in the +archive. Here is a test run directly after populating the database +with test data using our API tester:

+ +

+~/src//noark5-tester$ ./archive-pdf mangelmelding/mangler.pdf
+using arkiv: Title of the test fonds created 2017-03-18T23:49:32.103446
+using arkivdel: Title of the test series created 2017-03-18T23:49:32.103446
+
+ 0 - Title of the test case file created 2017-03-18T23:49:32.103446
+ 1 - Title of the test file created 2017-03-18T23:49:32.103446
+Select which mappe you want (or search term): 0
+Uploading mangelmelding/mangler.pdf
+  PDF title: Mangler i spesifikasjonsdokumentet for NOARK 5 Tjenestegrensesnitt
+  File 2017/1: Title of the test case file created 2017-03-18T23:49:32.103446
+~/src//noark5-tester$
+

+ +

You can see here how the fonds (arkiv) and serie (arkivdel) only had +one option, while the user need to choose which file (mappe) to use +among the two created by the API tester. The archive-pdf +tool can be found in the git repository for the API tester.

+ +

In the project, I have been mostly working on +the API +tester so far, while getting to know the code base. The API +tester currently use +the HATEOAS links +to traverse the entire exposed service API and verify that the exposed +operations and objects match the specification, as well as trying to +create objects holding metadata and uploading a simple XML file to +store. The tester has proved very useful for finding flaws in our +implementation, as well as flaws in the reference site and the +specification.

+ +

The test document I uploaded is a summary of all the specification +defects we have collected so far while implementing the web service. +There are several unclear and conflicting parts of the specification, +and we have +started +writing down the questions we get from implementing it. We use a +format inspired by how The +Austin Group collect defect reports for the POSIX standard with +their +instructions for the MANTIS defect tracker system, in lack of an official way to structure defect reports for Noark 5 (our first submitted defect report was a request for a procedure for submitting defect reports :). + +

The Nikita project is implemented using Java and Spring, and is +fairly easy to get up and running using Docker containers for those +that want to test the current code base. The API tester is +implemented in Python.