Ingest Review Strategies
BHL Ingest Criteria Review
Summary for the IC
Collections & Ingest Committee: Bianca Lipscomb, Don Wheeler, Grace Duke, Suzanne Pilsk, Connie Rinaldo, Christine Giannoni, Becky Morin, Judy Warnement, Matt Person
The ongoing ingest of IA “non-biodiversity” collections, i.e. materials brought in as a result of non-BHL member scanning, has resulted in the acquisition of both relevant and irrelevant materials. The BHL Collections Committee has reviewed some of these materials and has identified the following examples of irrelevant content:
More examples:
Relevance and Collections
There is a need to review the Ingest Criteria, a series of LCSH terms and call nos. used to match against the IA corpus, and refine the list to include only those terms and call nos. that yield the highest return of relevant content. The Committee has implemented 4 strategies for review:
Relevant Files for Review
- IAAnalysisSubjectsAndCategories.xls (ca. November 2009)
- Sheet 1: List of LCSH ‘650 a’ used as part of Ingest Criteria; List of LCSH ‘650 v/x’ used to exclude certain titles from the Ingest; List of LC call nos. used as part of the Ingest Criteria
- Sheet 2: Combined list of…
- BHL LCSH 650s (= “BHL” in ‘descrip’ column; ‘Notes’ column indicates where term is 650 z) and counts indicating how often these terms occur in the BHL collection
- LCSH ‘650 a’ terms used as part of Ingest Criteria (= “used” in ‘descrip’ column)
- Subjects in BHL - 2010-04-16.xls
- Sheet 1 = BHL Member Subjects – list of all 650 a LCSH which occur 5 or more times in the BHL collection for BHL member contributed titles only
- Sheet 2 = Non-BHL Member Subjects – list of all 650 a LCSH which occur 5 or more times in the BHL collection for Ingested titles only
- Sheet 3 = Combined List – lists from Sheet 1 and 2 combined with the list of 650 a terms used as part of the Ingest Criteria (see also above)
Review BHL Terms [Don, Matt]
- Look at the list of BHL subject keywords, derived from the ‘650 a’, and identify terms that are:
- “IN” – LCSH terms that we want to match against the ‘650 a’s for “non-biodiversity” titles in the IA corpus
- “OUT” – LCSH terms that, while relevant to biodiversity, we will not use because they are too broad in scope and result in a questionable ROI for the BHL collection.
- For example: BHL should NOT include the LCSH terms “Physiology” and “Reproduction” as part of the Ingest Criteria
- Note: To be on the safe side, Suzanne and I found that it was better to recommend exact LCSH terms and not wildcarded terms like “Bears*” (pulls “Bears on postage stamps”…[yes, a real LCSH]) or “*Ecology” (pulls “Gynecology”).
don_matt2Subjects in BHL - 2010-04-16.xls
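The wildcard problem noted above can be illustrated with a minimal sketch. The sample headings and the function names `wildcard_matches` and `exact_matches` are hypothetical, and shell-style wildcards from Python's `fnmatch` stand in for whatever matching the actual Ingest process uses.

```python
from fnmatch import fnmatchcase

# Hypothetical sample of LCSH '650 a' headings, for illustration only.
HEADINGS = [
    "Bears",
    "Bears on postage stamps",
    "Ecology",
    "Gynecology",
    "Plant ecology",
]


def wildcard_matches(pattern, terms):
    """Return the terms matching a shell-style wildcard pattern, ignoring case."""
    return [t for t in terms if fnmatchcase(t.lower(), pattern.lower())]


def exact_matches(wanted, terms):
    """Return only the terms that equal one of the wanted headings, ignoring case."""
    wanted_set = {w.lower() for w in wanted}
    return [t for t in terms if t.lower() in wanted_set]


print(wildcard_matches("Bears*", HEADINGS))
print(wildcard_matches("*Ecology", HEADINGS))
print(exact_matches(["Bears", "Ecology"], HEADINGS))
```

In this sample the wildcard patterns also pull in “Bears on postage stamps” and “Gynecology”, while the exact list returns only “Bears” and “Ecology”.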
Identify Good Ingest Terms [Connie, Judy]
- What terms are we bringing in as a result of the Ingest completed to date that are NOT already BHL terms and would be good additions to the Ingest Criteria?
- “Wilderness areas” and “Wildlife management” for example
BHL Ingest terms cr jw dw.xls
Identify Bad Ingest Terms [Christine, Grace]
- What terms are we bringing in as a result of the Ingest completed to date that are LCSH terms that we absolutely do not want as part of our criteria? This is an ALL OR NOTHING approach -- these terms would override any other terms also associated with the record.
- For example: BHL should exclude all records with the LCSH term = “God” from the Ingest pool.
Grace_Christine_ListofTermsNotToUseinBHLIngest.doc
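As a rough sketch of the ALL OR NOTHING rule, the check below rejects a record as soon as any of its subject terms is on the exclusion list, even if other terms on the record match the Ingest Criteria. The two term sets are tiny stand-ins taken from the examples on this page, and `should_ingest` is a hypothetical name, not part of any existing BHL code.

```python
# Stand-in term lists; the full lists live in the committee's working files.
INCLUDE_TERMS = {"wilderness areas", "wildlife management"}
EXCLUDE_TERMS = {"god"}


def should_ingest(subjects):
    """Decide whether a record's '650 a' terms qualify it for ingest.

    An excluded term overrides every other term on the record.
    """
    terms = {s.strip().lower() for s in subjects}
    if terms & EXCLUDE_TERMS:
        return False  # ALL OR NOTHING: one bad term rejects the record
    return bool(terms & INCLUDE_TERMS)


print(should_ingest(["Wildlife management"]))
print(should_ingest(["Wildlife management", "God"]))
```

Checking the exclusion list first keeps the override explicit: the order of the two tests is what implements "these terms would override any other terms".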
Call nos. [Suzanne, Becky]
- What are the MARC tags, subfields, and indicators that we want to use for the Ingest Criteria? Suggested (indicators unknown):
- 050 a
- 090 a
- 082 a
- 092 a
- 852 j/h
- 099
- What Dewey nos. do we want to use? Dewey Analysis - BHL Member Libraries.xls
- Decision: Match Deweys to LC subclasses below
- Should/How can we refine the list of LC call nos. we are currently using as part of our Ingest Criteria?
- GC (Oceanography)
- Q [no other letter] (general science) -- Need to refine?!
- QH (natural history)
- QK (botany)
- QL (zoology)
- QR (microbiology) -- Do we want this one? - May 10, 2010
- S [no other letter] (agriculture) -- Need to refine?!
- SB (plant culture)
- SD (forestry)
- SF (animal culture)
- SH (fisheries and related)
- QE 700-999 (Paleontology) -- the rest of QE is NOT relevant
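One possible way to test a call no. against the LC list above is sketched here. It assumes the call no. has already been pulled from one of the suggested MARC fields, that "no other letter" means the class letters must match exactly (so “QA” does not match “Q”), and that QR stays in the list pending the open question. The regular expression is a simplification and `lc_call_no_matches` is a hypothetical name.

```python
import re

# Classes accepted in full, copied from the list above (QR is still under question).
FULL_CLASSES = {"GC", "Q", "QH", "QK", "QL", "QR", "S", "SB", "SD", "SF", "SH"}
# Classes accepted only within a class-number range.
RANGED_CLASSES = {"QE": (700, 999)}

# Leading class letters, then an optional class number such as 541 or 76.73.
CALL_NO = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)?")


def lc_call_no_matches(call_no):
    """Return True if an LC call no. falls inside the accepted classes."""
    m = CALL_NO.match(call_no.upper())
    if not m:
        return False
    letters, number = m.group(1), m.group(2)
    if letters in FULL_CLASSES:
        return True
    if letters in RANGED_CLASSES and number is not None:
        low, high = RANGED_CLASSES[letters]
        # Treat decimal extensions of the top number (e.g. 999.5) as in range.
        return low <= float(number) < high + 1
    return False


for sample in ["QH541 .S5", "QA76.73", "QE 721 .B8", "QE 26 .C5"]:
    print(sample, lc_call_no_matches(sample))
```

Dewey nos. are not handled here; per the decision above they would first need to be mapped to the LC subclasses.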
Review Excluded subject subdivisions [any volunteers?]
- Are there additional subdivisions that we want to exclude? – REMEMBER: the ALL OR NOTHING approach applies here too
- Are there any subdivisions that we would always want regardless of the ‘650 a’? Would it ever be the case that one of these subdivisions would NOT have a ‘650 a’ that’s already part of our criteria?
- May 10, 2010: ruled out as a viable strategy
Code | Subject
v | Anecdotes
v | Directories
v | Drama
v | Folklore
v | Humor
v | Juvenile%
v | Lab manuals
v | Legends%
v | Poetry
v | Popular%
v | Readers
v | Textbooks
x | Fiction
x | Folklore
x | Juvenile%
x | Lab manuals
x | Legends%
x | Marketing
x | Philosophy
x | Poetry
x | Religious aspects
x | Social%
x | Study and teaching
x | Technique
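The Code/Subject list above could be applied roughly as follows. This sketch assumes that a trailing “%” is an SQL LIKE-style wildcard meaning "starts with", and that entries without it must match the whole subdivision; neither is confirmed on this page. `subdivision_excluded` is a hypothetical name.

```python
# (subfield code, pattern) pairs copied from the Code/Subject list above.
EXCLUDED_SUBDIVISIONS = [
    ("v", "Anecdotes"), ("v", "Directories"), ("v", "Drama"), ("v", "Folklore"),
    ("v", "Humor"), ("v", "Juvenile%"), ("v", "Lab manuals"), ("v", "Legends%"),
    ("v", "Poetry"), ("v", "Popular%"), ("v", "Readers"), ("v", "Textbooks"),
    ("x", "Fiction"), ("x", "Folklore"), ("x", "Juvenile%"), ("x", "Lab manuals"),
    ("x", "Legends%"), ("x", "Marketing"), ("x", "Philosophy"), ("x", "Poetry"),
    ("x", "Religious aspects"), ("x", "Social%"), ("x", "Study and teaching"),
    ("x", "Technique"),
]


def subdivision_excluded(code, value):
    """Return True if a 650 subdivision (subfield code plus text) is excluded."""
    text = value.strip().lower()
    for ex_code, pattern in EXCLUDED_SUBDIVISIONS:
        if ex_code != code:
            continue
        if pattern.endswith("%"):
            # Assumed meaning of '%': prefix match, as in SQL LIKE 'Juvenile%'.
            if text.startswith(pattern[:-1].lower()):
                return True
        elif text == pattern.lower():
            return True
    return False


print(subdivision_excluded("v", "Juvenile literature"))
print(subdivision_excluded("x", "Social aspects"))
print(subdivision_excluded("v", "Fiction"))
```

Because each entry is tied to a subfield code, “Fiction” is excluded only when it appears in ‘650 x’, which mirrors the list as written.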
- Apr 21, 2010: I plan to take a look at a little bit of everything