Archival_Storage_Ingest_Script
common tasks
- ingest books: bhladmin@bhl-test2: php ingestSIP.php
- When you add a new object, make sure its name does not start with a . (dot) or with INGESTED_BOOK_.
- When the script finishes adding an object, it renames the object folder to INGESTED_BOOK_old_book_folder.
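The naming rule above can be sketched in a few lines of Python. This is an illustration only, not the ingest script's own code: the helper names `is_pending` and `mark_ingested` are hypothetical, and it assumes the script skips any folder whose name starts with a dot or with INGESTED_BOOK_.

```python
from pathlib import Path

# Prefix added to a book folder after a successful ingest (from the rule above).
INGESTED_PREFIX = "INGESTED_BOOK_"


def is_pending(folder_name: str) -> bool:
    """Return True if a folder name would count as a new, not yet ingested book.

    Assumption: names starting with "." or INGESTED_BOOK_ are skipped.
    """
    return not (folder_name.startswith(".") or folder_name.startswith(INGESTED_PREFIX))


def mark_ingested(book_dir: Path) -> Path:
    """Rename book_dir to INGESTED_BOOK_<old name> and return the new path."""
    target = book_dir.with_name(INGESTED_PREFIX + book_dir.name)
    book_dir.rename(target)
    return target
```

For example, `is_pending("book_1")` is true, while `is_pending(".tmp")` and `is_pending("INGESTED_BOOK_book_1")` are both false.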
location
- script: ~/test/ingest/ingestScript.php
- books: ~/test/ingest/books
dependencies
- PHP
- Tesseract OCR (optional): the OCR engine used to generate OCR text for each page.
- Kakadu: needed to convert images.
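Before running the ingest, it may help to confirm that the dependencies are on the PATH. The sketch below is one way to do that. The executable names are assumptions: `kdu_compress` is the usual Kakadu conversion tool and `tesseract` the usual Tesseract binary, but the script may call them differently on your machine.

```python
import shutil

# Assumed executable names; adjust them to match the local installation.
REQUIRED = ["php", "kdu_compress"]
OPTIONAL = ["tesseract"]


def check_tools(names):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in check_tools(REQUIRED + OPTIONAL).items():
        label = "optional" if name in OPTIONAL else "required"
        print(f"{name} ({label}): {path or 'NOT FOUND'}")
```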
modifying config files
All config settings are placed at the top of ingestScript.php.
how to test if it works
- Leave just one book with several pages in the books folder, and run the script.
- If the book and its pages are ingested into Fedora successfully, the script works.
login information if necessary
proxy information
hierarchy of book folder
- books
  - book_1
    - pages
      - page_1.tiff (or .jpeg)
      - page_2.tiff
    - TN.jpeg
  - book_2
    - ......
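A quick way to check a book folder against this layout before ingesting is to list the page images it contains. The sketch below is illustrative only: `list_pages` is a hypothetical helper, and it assumes each book folder holds a `pages` subfolder with .tiff or .jpeg files. Any file named TN is treated as a thumbnail and left out.

```python
from pathlib import Path
from typing import List

# Page image extensions named in the layout above.
PAGE_EXTENSIONS = {".tiff", ".jpeg"}


def list_pages(book_dir: Path) -> List[Path]:
    """Return the page images under book_dir/pages, sorted by file name."""
    pages_dir = book_dir / "pages"
    if not pages_dir.is_dir():
        return []
    return sorted(
        p for p in pages_dir.iterdir()
        # Skip non-image files and the thumbnail (TN), wherever it is stored.
        if p.is_file() and p.suffix.lower() in PAGE_EXTENSIONS and p.stem != "TN"
    )
```

An empty result means the folder has no `pages` subfolder or no page images. Sorting is lexicographic, so page_10 sorts before page_2 unless the page numbers are zero-padded.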