Botanicus Ingest Workflow
Botanicus Steps to Add a Volume Scanned Elsewhere
Pre-Ingest
- 1. Inspect submitted files on drive/media
- a. Ensure files are named in the form “itemidentifier_seriesnum.ext”
- i. If not named correctly, then use Adobe Bridge or similar to batch rename.
- b. Ensure JPG or TIF format
- i. If not in correct format, then use Adobe Photoshop or similar to convert
- c. Copy into a folder named “itemidentifier”
- 2. Locate appropriate MARC file for items
- a. Accept from org providing files
- b. Download from MOBOT/BHL partner catalog
- c. Download from OCLC
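The naming and format checks in step 1 can be sketched in a few lines of Python. This is only an illustration, not part of the Botanicus tooling: the function names are hypothetical, and it assumes that “seriesnum” is a run of digits and that .jpeg and .tiff are acceptable spellings of the JPG and TIF extensions.

```python
import re
from pathlib import PurePath

# Assumption: these spellings all count as "JPG or TIF format".
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff"}


def is_valid_name(filename: str, item_identifier: str) -> bool:
    """Return True if filename looks like 'itemidentifier_seriesnum.ext'."""
    path = PurePath(filename)
    if path.suffix.lower() not in ALLOWED_EXTENSIONS:
        return False
    # Assumption: seriesnum is one or more digits.
    stem_pattern = re.escape(item_identifier) + r"_\d+"
    return re.fullmatch(stem_pattern, path.stem) is not None


def find_problem_files(filenames, item_identifier):
    """List the files that would need renaming or converting."""
    return [name for name in filenames if not is_valid_name(name, item_identifier)]
```

Anything returned by find_problem_files would still need to be renamed or converted by hand, as described in steps 1a and 1b.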
Staging
- 3. Convert MARC to MARCXML using MarcEdit
- 4. Edit MARCXML using a simple text editor to include a 945 field for each item to load
- a. Example for a serial: http://www.botanicus.org/MARCXML/b11871283.xml
- b. Example for a monograph: http://www.botanicus.org/MARCXML/b11932077.xml
- c. *MOBOT Specific*
- i. This is where item information is stored in MOBOT’s MARC.
- ii. Not every library uses this field for similar information
- iii. Not every library has this information in MARC
- 1. Sometimes in holdings records
- 2. Sometimes nothing at all
- d. Add the following subfields for each 945:
- i. $a - Item Call Number
- ii. $c - enum & chron (Volume & Date)
- 1. In MOBOT practice this is one field
- 2. In Wonderfetch these are separate, as they should be in the database
- iii. $i - Barcode
- 1. This value must match “itemidentifier” in step 1a. [double check]
- iv. $y - ILS Itemid
- e. Save MARCXML
- 5. Use MARCXMLImport to push info into Botanicus DB schema
- a. Open MarcXMLImport
- b. Browse to MARCXML file saved in 4e
- c. Parse
- d. Double check that titles, authors, subjects, publication details, item info look in order
- e. Submit info to Botanicus
- i. This creates or updates a title record with title, author, publication details, MarcBibID.
- 1. Status set PublishReady=0
- ii. Each 945 is turned into an item record
- 1. Status set Item.StatusID=30 [double check]
- 6. Set up PageConvert on a local PC to operate on files
- a. Change config to point to current location of “itemidentifier” folders
- b. Ensure enough room for temp JP2 files on default location, or change in config
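For readers who would rather script the 945 edit in step 4 than type it by hand, the field could be built roughly as follows. This is a sketch under assumptions: it uses Python's standard xml.etree.ElementTree and the standard MARCXML namespace, leaves both indicators blank, and the function names and sample values in the usage note are hypothetical. The barcode check mirrors the rule in step 4d, which the workflow itself marks as needing a double check.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"  # standard MARCXML namespace
ET.register_namespace("", MARC_NS)


def make_945(call_number, enum_chron, barcode, ils_item_id):
    """Build a 945 datafield with the four subfields listed in step 4d."""
    # Assumption: both indicators are left blank.
    field = ET.Element(
        f"{{{MARC_NS}}}datafield", {"tag": "945", "ind1": " ", "ind2": " "}
    )
    subfields = (
        ("a", call_number),   # Item Call Number
        ("c", enum_chron),    # enum & chron (Volume & Date)
        ("i", barcode),       # Barcode
        ("y", ils_item_id),   # ILS Itemid
    )
    for code, value in subfields:
        sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        sub.text = value
    return field


def barcode_matches_folder(field, folder_name):
    """Check that the barcode subfield equals the 'itemidentifier' folder name."""
    for sub in field.findall(f"{{{MARC_NS}}}subfield"):
        if sub.get("code") == "i":
            return sub.text == folder_name
    return False
```

Calling ET.tostring(make_945("QK1 .A5", "v.1 (1900)", "31753000000001", "i1234567"), encoding="unicode") returns the element as text that can be pasted into the record; all four values here are made up for illustration.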
Ingest
- 7. Run PageConvert
- a. PageConvert will convert each .TIF or .JPG in the “itemidentifier” folder(s) into JP2.
- i. These JP2 files are stored in a default, but configurable location on local hard drive.
- b. Once JP2 conversion is done, PageConvert copies each “itemidentifier” folder to current vault in Botanicus, as indicated by value in Botanicus DB.
- i. \\server\vault\MARCBibID\ItemIdentifier\
- c. Once each volume has been uploaded, its Item.StatusID is set to 30 [double check]
- d. An email is sent confirming that the PageConvert action ran, with success or error reported
- 8. PagePublish runs at scheduled intervals
- a. Looks for any items with Item.StatusID=30
- b. Looks for JP2 in vault\MarcBibID\ItemIdentifier\JP2
- c. Adds a new record to Page table for each JP2
- i. Assumes sequence of files is correct on file system
- ii. Sets each Page to Active
- d. Upon completion of job, sets Item.StatusID=40 & Title.PublishReady=1
- e. Writes out PDF .go & .dat files (that include information for automatic processing).
- f. Writes out OCR .job file.
- 9. Remove TIF or JPG from any temporary local directories that might have been created in any part of above process.
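The vault layout from step 7b and the page ordering that PagePublish relies on in step 8c can be illustrated as below. This is not the PagePublish code; it is a minimal sketch with hypothetical function names, assuming that the "sequence on the file system" means a plain alphabetical sort of file names, which only gives the right page order if series numbers are zero-padded.

```python
from pathlib import PureWindowsPath


def jp2_folder(vault_root, marc_bib_id, item_identifier):
    """Build vault\\MarcBibID\\ItemIdentifier\\JP2 under the given vault root."""
    return PureWindowsPath(vault_root) / marc_bib_id / item_identifier / "JP2"


def page_sequence(filenames):
    """Pair each JP2 file with a 1-based sequence number, in sorted name order.

    Assumption: sorted file names reflect the correct page order.
    """
    jp2_files = sorted(n for n in filenames if n.lower().endswith(".jp2"))
    return list(enumerate(jp2_files, start=1))
```

Because no reordering happens after this point, a misnamed file in step 1a would end up as a misplaced page online.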
Post-Production
- 10. Book is now online & viewable
- a. PDF uses instruction files to create PDF & copy it to: \\server\vault\MarcBibID\ItemIdentifier
- b. OCR uses instruction file to write OCR to \\server\vault\... [double check]
- 11. Imaging Technicians use Paginator to add following details to each page:
- a. Page Prefix
- b. Page number
- c. Page Type
- d. Volume
- e. Issue
- f. PartPrefix
- g. Part
- h. Year
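The per-page details entered in Paginator can be pictured as one small record per page. The sketch below is only a way to visualize that record; the class name, field names, and the choice to store every value as optional text are assumptions, not the actual Botanicus Page schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageDetails:
    """Hypothetical container for the details listed in step 11."""

    page_prefix: Optional[str] = None
    page_number: Optional[str] = None
    page_type: Optional[str] = None
    volume: Optional[str] = None
    issue: Optional[str] = None
    part_prefix: Optional[str] = None
    part: Optional[str] = None
    year: Optional[str] = None

    def missing_fields(self):
        """Return the names of details that have not been filled in yet."""
        return [name for name, value in vars(self).items() if value is None]
```

A technician's partially completed page, for example PageDetails(page_prefix="Page", page_number="12"), would report the remaining six fields as missing.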