This is a read-only archive of the BHL Staff Wiki as it appeared on Sept 21, 2018.

Notes on Metadata from SIL

Tom Garnett, SIL, Notes
Article-Level Access – Too Good to Be True?
SIL Cataloging and Digital Library staff recently met with Dr. Thom Hickey from OCLC ( http://www.oclc.org/research/staff/hickeyt.htm ) to discuss several metadata and BHL-related issues. Thom’s blog, BTW, is at http://connect.educause.edu/aggregator/sources/37 or http://outgoing.typepad.com/outgoing/

We had shared the BHL Prospectus, the draft Union Catalog Document, and some other BHL draft documents with Dr. Hickey, and we discussed the vision further at our meeting. I want to focus on one particular set of issues he raised. We explained that, in our preliminary thinking, BHL members felt that a researcher must be able to get from a citation (often in that cribbed, cryptic format that one finds in many older taxonomic texts) to the relevant text. We had assumed that the Union Catalog must somehow provide article-level access to the BHL content and that there would be article-level metadata to support it. Dr. Hickey cautioned us here.

He said we might end up paying more for the metadata in this case than for the digitized content to which the metadata refers. He also wondered if by insisting on article-level metadata we would be making the Union Catalog so complex as to drag the project on for many, many years. I feel he had a point.

If the BHL project implies that human beings will create article-level metadata before, during, or after the scanning process, it will require so many resources that it will probably never get done. I am of this opinion because SIL catalogers recently conducted a long telephone interview with John Kiplinger of JSTOR about their workflow, which gives an example of the complexity of the issues involved (notes attached). JSTOR has been a model of well-done article-level access for a major digitizing effort.

What are the alternatives?
Have scanners assign a GUID to any article and worry about it later? Maybe create an index compilation of Index Kewensis, Index Animalium, Nomenclator Zoologicus, etc. to which the scanners or someone else can link the GUID? This would probably lack Zoo Record, and the linking would not be simple.
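To make the GUID idea a little more concrete, here is a rough sketch of what "assign now, link later" might look like. Everything in it is an assumption for illustration only: the record fields, the volume identifier, and the index entry text are hypothetical, and nothing here reflects an actual BHL or Union Catalog design.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ScannedArticle:
    """An article known only by its location in a scanned volume (hypothetical record)."""
    volume_id: str      # hypothetical identifier for the scanned volume
    first_page: int
    last_page: int
    # A random (version 4) GUID is assigned at scan time; no other metadata is required.
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))
    # (index name, entry text) pairs attached later by scanners or someone else.
    index_refs: list = field(default_factory=list)


def link_index_entry(article, index_name, entry):
    """Attach an entry from a compiled index to an article that already has a GUID."""
    article.index_refs.append((index_name, entry))


# Scan time: only the page range is known.
article = ScannedArticle(volume_id="vol-0001", first_page=113, last_page=140)

# Later: someone matches an index entry to the GUID.
link_index_entry(article, "Index Animalium", "hypothetical entry text")
```

The point of the sketch is only that the GUID can exist before any descriptive metadata does; the hard part, deciding which index entry belongs to which GUID, is not addressed by it.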

Perform OCR on anything that looks like an article title and allow searching against that dirty (and it would be very dirty) OCR text to locate the article?
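As a minimal sketch of what searching against dirty OCR could mean in practice, the following uses approximate string matching from Python's standard `difflib` module. The sample titles, the GUID keys, the `find_article` helper, and the 0.6 cutoff are all assumptions made up for illustration, not a proposed implementation.

```python
import difflib


def normalize(text):
    """Lowercase the text and replace punctuation with spaces, collapsing runs of spaces."""
    kept = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(kept.split())


def find_article(query, ocr_titles, cutoff=0.6):
    """Return (score, guid) pairs for OCR'd titles that resemble the query, best first."""
    q = normalize(query)
    hits = []
    for guid, title in ocr_titles.items():
        # ratio() is a similarity score between 0.0 and 1.0.
        score = difflib.SequenceMatcher(None, q, normalize(title)).ratio()
        if score >= cutoff:
            hits.append((score, guid))
    return sorted(hits, reverse=True)


# Hypothetical OCR output with typical character confusions (i/1, e/c, l/1).
ocr_titles = {
    "guid-a": "Descr1ptions of new spec1es of Coleoptcra",
    "guid-b": "On the b1rds of the Amazon va11ey",
}

results = find_article("Descriptions of new species of Coleoptera", ocr_titles)
best_score, best_guid = results[0]
```

A clean query can still land on the right title despite several misread characters, but abbreviated citations of the kind found in older taxonomic texts would match far less reliably than this toy case suggests.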

Other ideas?


Tom Garnett, SIL