Ingest Decisions
Duplication
Duplicates in BHL collection
Decision:
- They Exist: Duplicates already exist within the BHL collection.
- Gap-Fills: Some of these "duplicates" are in fact gap-fills that need to be merged via portal editing. Portal editing policy decisions will need to be made but, generally, the institution that contributed the first volume of the serial should be considered the "parent" institution, meaning that its MARC record will be considered the representative record for the title in the BHL collection. Where the parent institution's MARC record is insufficient (i.e., it lacks subject classification, OCLC number, etc.), the gap-filling institution can assume the "parent" role.
- Tabled into Portal editing discussion
- Attribute IA ingested content as contributing library = "archive.org: University of Michigan"
- Legitimate Dups: Other duplicates are legitimate and need to be managed as cleanly and realistically as possible to benefit our users
- duplicates among BHL member libraries will need to be investigated on a case by case basis and addressed via Gemini
- duplicates between a BHL member library and IA: ideally, IA copies should be "ranked" below BHL member libraries
- could be addressed manually via portal editing via the 'sort title' field
- could be addressed via an automated process? Can we have IA results always fall under BHL member results?
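The automated option above, having IA results always fall under BHL member results, could be sketched as a stable sort on a single flag. This is only an illustration, not a description of how the portal works: the `SearchResult` class, the `via_ia` field, the `rank_results` function, and the example titles and member library names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    title: str
    contributor: str
    via_ia: bool  # hypothetical flag: True if the item was ingested from IA


def rank_results(results):
    """Place BHL member copies ahead of IA-ingested copies.

    Python's sorted() is stable, so results keep their existing
    relative order (e.g. by sort title) within each group.
    False sorts before True, so member copies come first.
    """
    return sorted(results, key=lambda r: r.via_ia)


# Hypothetical example data
results = [
    SearchResult("Example Flora", "archive.org: University of Michigan", True),
    SearchResult("Example Flora", "Example Member Library A", False),
    SearchResult("Example Fauna", "Example Member Library B", False),
]

ranked = rank_results(results)
for r in ranked:
    print(r.contributor, "-", r.title)
```

Because the sort is stable, any ordering already applied manually through the 'sort title' field would be preserved within the member and IA groups.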
Duplicates in Scanning Workflow
Decision:
Monographs
- Use standard monographic deduping tool as is to identify duplicates as best as can be achieved for content contributed by member libraries and content ingested from IA.
- Prioritize deduping against BHL member libraries
- Investigate and assess deduping against the IA monograph set using the Monographic Deduper. Shall we form a committee?
- Duplicates identified in IA should be marked as "rejected"
Serials
- Serial deduper only a short term solution for BHL member content - continue to use for BHL member libraries
- Short Term option #1: submit list of IA serials to serial deduper (as was performed for monographic deduper)
- Short Term option #2: Serials are treated as "monographs" in the IA data structure anyway, so won't they also turn up as part of the monographic deduper?
- Long Term: BHL-E to develop deduplication tools in the future to address deduplication of monographs & serials in a more robust platform
Contributors
Decision:
- Consolidate contributors via IA ingest as a single entity
- Include list of contributing libraries via IA ingest in BHL About language on public facing wiki
- Allow for search by institution in Advanced search only
Metadata Quality Issues
Decision:
- Short Term: Use Gemini to address metadata concerns on a case by case basis as users inform us
- Long Term: Encourage metadata curation as part of our workflow = hire metadata specialist to address BHL metadata overall
Image Quality Issues
Decision:
- Short Term: Use Gemini to address image quality concerns on a case by case basis as users inform us
- enable insertion of blank "page unavailable" page into IA ingested content
- send item for scanning if in BHL member library holdings
- flag item as incomplete
Resource Allocation
Decision:
- Long Term: Encourage and fund BHL member participation by way of a portal editing role; BHL members should not only scan content, they must also participate in metadata curation
Scope
Decision:
Search / Browse
Decision:
Controlled vocabularies
Decision:
- Incorporate into routine workflow
- need method via Admin Dashboard to merge creators and edit in bulk
- big names addressed on occasion on back end
Ingest Methodology
Decision: