BHL
Archive
This is a read-only archive of the BHL Staff Wiki as it appeared on Sept 21, 2018.

Archival_Storage




Archival Storage Ingest Script


The ingestSIP.php script scans the SIP folder and ingests new digital objects into Fedora.
When you add a new object, make sure its name does not start with a . (dot) or with INGESTED_BOOK_.
When the script finishes adding an object, it renames the object's folder by prefixing the original folder name with INGESTED_BOOK_ (INGESTED_BOOK_old_book_folder).
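
The naming rules above can be sketched as follows. This is only an illustration written in Java, not code taken from ingestSIP.php (which is a PHP script); the class and method names (SipNames, isIngestable, ingestedName, selectNew) are hypothetical, and the sketch assumes that a folder whose name breaks the rules is simply left out of the ingest.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating the folder naming rules described above.
class SipNames {
    // Prefix given in the text; it marks folders that were already ingested.
    static final String INGESTED_PREFIX = "INGESTED_BOOK_";

    // A folder counts as a new object only if its name does not start
    // with a dot and does not start with the ingested prefix.
    static boolean isIngestable(String folderName) {
        return !folderName.startsWith(".") && !folderName.startsWith(INGESTED_PREFIX);
    }

    // Name the folder receives once the object has been added.
    static String ingestedName(String folderName) {
        return INGESTED_PREFIX + folderName;
    }

    // Keeps only the folder names that still have to be ingested,
    // preserving their original order.
    static List<String> selectNew(List<String> folderNames) {
        List<String> result = new ArrayList<>();
        for (String name : folderNames) {
            if (isIngestable(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> found = Arrays.asList("OriginOfSpecies", ".git", "INGESTED_BOOK_Flora");
        for (String name : selectNew(found)) {
            System.out.println(name + " -> " + ingestedName(name));
        }
    }
}
```

Keeping the check and the rename in one place makes it harder for the two prefixes to drift apart.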

Requirements


Configuration

Before running the script, you must set three global parameters in the script file (ingestSIP.php):
  1. BHL_NAMESPACE : The namespace used to create the book PID.
  2. BHL_SIP_DIRECTORY : The folder in which the script looks for books.
  3. LOOP_INTERVAL : How long the script sleeps before checking for new books again.

Run It

To launch the script, run this command: "php ingestSIP.php".

SIP Hierarchy

Each SIP must follow a fixed structure. For example:
- a book named OriginOfSpecies must be in a folder that is also named OriginOfSpecies.
- this folder must contain a folder named pages, which holds the page images (jp2 or tiff).
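
A check of this layout could look like the sketch below. It is written in Java for illustration and is not part of ingestSIP.php; the names SipLayout, isPageImage and hasValidLayout are hypothetical. Two points are assumptions rather than rules stated above: the file extension .tif is accepted alongside .tiff, and a SIP is considered valid only if the pages folder holds at least one page image.

```java
import java.io.File;
import java.util.Locale;

// Hypothetical validator for the SIP folder layout described above.
class SipLayout {
    // True for file names ending in .jp2 or .tiff (and, as an assumption, .tif).
    static boolean isPageImage(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        return lower.endsWith(".jp2") || lower.endsWith(".tiff") || lower.endsWith(".tif");
    }

    // True if bookDir is a folder containing a "pages" folder
    // with at least one page image in it (assumed minimum).
    static boolean hasValidLayout(File bookDir) {
        File pagesDir = new File(bookDir, "pages");
        if (!bookDir.isDirectory() || !pagesDir.isDirectory()) {
            return false;
        }
        File[] entries = pagesDir.listFiles();
        if (entries == null) {
            return false; // the folder could not be read
        }
        for (File entry : entries) {
            if (entry.isFile() && isPageImage(entry.getName())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Pass the book folder as the first argument, e.g. OriginOfSpecies.
        File bookDir = new File(args.length > 0 ? args[0] : "OriginOfSpecies");
        System.out.println(bookDir.getName() + " valid: " + hasValidLayout(bookDir));
    }
}
```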

TODO :

Finish the script so that it handles the taxonomy datastreams and the other required datastreams.



Low Level Storage based on Akubra


The low level storage strategy depends on DataStream IDs, which are declared in ${FEDORA_HOME}/server/conf/akubra-llstore.xml.
After Fedora Commons 3.4, Akubra is used as the default low level storage framework (configured in ${FEDORA_HOME}/server/conf/fedora.fcfg).

How to compile

  1. Download the Maven project from the BHL-E GitHub repository;
  2. Run "mvn package".

How to install

  1. Purge all objects in Fedora Repository, then shut down the server;
  2. Place akubra-mux-0.3.jar and bhle-llstore-0.0.1.jar in ${FEDORA_HOME}/tomcat/webapps/fedora/WEB-INF/lib;
  3. Replace akubra-llstore.xml with the one in the install package, and modify the store paths and DataStream IDs according to your needs;
  4. Restart the server.

How it works

A subclass of org.akubraproject.mux.AbstractMuxConnection overrides the getStore method to return a BlobStore chosen according to the DataStream ID keywords in akubra-llstore.xml. The filesystem storage is reused without any modification from akubra-fs (a simple filesystem implementation) and akubra-map (which wraps an existing BlobStore to provide a blob id mapping layer). Therefore, all the path mappings for objects and datastreams are still based on MD5 mapping.
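
The keyword-based selection can be pictured with the plain Java sketch below. It deliberately does not use the Akubra classes, so it is not the bhle-llstore implementation; StoreSelector, route and select are hypothetical names, and stores are represented by simple labels instead of BlobStore instances. It also assumes that datastream blob ids have the form info:fedora/PID/DSID/VERSION and that a keyword matches when the DataStream ID contains it; the datastream IDs JP2 and OCR are made-up examples.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the store selection done in getStore.
class StoreSelector {
    // Keyword -> store label, checked in insertion order.
    private final Map<String, String> storeByKeyword = new LinkedHashMap<>();
    private final String defaultStore;

    StoreSelector(String defaultStore) {
        this.defaultStore = defaultStore;
    }

    // Registers a keyword and the store that should receive matching datastreams.
    StoreSelector route(String keyword, String storeName) {
        storeByKeyword.put(keyword, storeName);
        return this;
    }

    // Extracts the DataStream ID from a blob id, assuming the form
    // info:fedora/PID/DSID/VERSION; returns "" for ids without a DSID part.
    static String datastreamId(URI blobId) {
        String[] parts = blobId.toString().split("/");
        return parts.length >= 3 ? parts[2] : "";
    }

    // Returns the store of the first keyword contained in the DataStream ID,
    // or the default store if no keyword matches.
    String select(URI blobId) {
        String dsId = datastreamId(blobId);
        for (Map.Entry<String, String> entry : storeByKeyword.entrySet()) {
            if (dsId.contains(entry.getKey())) {
                return entry.getValue();
            }
        }
        return defaultStore;
    }

    public static void main(String[] args) {
        StoreSelector selector = new StoreSelector("default")
                .route("JP2", "images")
                .route("OCR", "text");
        System.out.println(selector.select(URI.create("info:fedora/bhle:1/JP2/JP2.0")));
        System.out.println(selector.select(URI.create("info:fedora/bhle:1/DC/DC1.0")));
    }
}
```

In the real setup the keywords and store paths come from akubra-llstore.xml rather than from code, so changing the routing should only require editing that file and restarting the server.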