04242009DedupeMtg Notes
Dedupe Group conf. call Notes
4/24/2009
Purpose
Purpose of rough requirements: to guide reconstruction of the master bid list + monographic deduper for BHL-E (they will be funding a new development effort). The purpose is not to reconstruct the existing bid list, unless we are considering re-uploading everything, but we don’t want to lose what we’ve already accomplished!
Address the following:
- What are the immediate problems and workarounds suggested to address these problems?
- Long-term goals?
- Demonstrating where we are for BHL-US and catalyzing vision for BHL-E
Rough requirements will be presented as a prioritized laundry list, due 5/8 (a list of reconstructive elements, not “enhancements” to present tools)
Major Issues
Cataloging issues re: union catalog among (academic/museum*) institutions: some records have OCLC nos. while others have ISSNs; the only reliable connection is the title
- Need a proper programmer; the OCLC algorithm is relatively weak, and there are big companies working on these kinds of issues. How can we fully address automated merging?
- (Are there companies that would be willing to work with us…for cheap…what does IA do to manage duplicates? Probably nothing)
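A minimal sketch of title-based matching, for the case noted above where records share no common identifier and the title is the only connection. The folding rules, the `title_key` and `candidate_groups` names, and the sample records are all assumptions made for illustration; this is not the OCLC algorithm, and a real merge facility would need something stronger.

```python
import re
import unicodedata
from collections import defaultdict


def title_key(title):
    """Fold a title to a crude comparison key (hypothetical rules)."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase, turn punctuation into spaces, collapse runs of spaces.
    return re.sub(r"[^a-z0-9]+", " ", stripped.lower()).strip()


def candidate_groups(records):
    """Group record IDs whose titles fold to the same key."""
    groups = defaultdict(list)
    for rec in records:
        groups[title_key(rec["title"])].append(rec["id"])
    # Only keys shared by two or more records are merge candidates.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Invented sample records, for illustration only.
records = [
    {"id": "a1", "title": "Bulletin de la Société botanique de France"},
    {"id": "b7", "title": "Bulletin de la Societe Botanique de France."},
    {"id": "c3", "title": "Annals of botany"},
]

for key, ids in candidate_groups(records).items():
    print(key, "->", ids)
```

Groups produced this way are only suggestions for manual review; identical titles do not guarantee identical journals.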
Problems are wider than deduping: we have the opportunity to create a journal authority list w/ various access points. Need a JournalBank to support guidelines for a merge facility, i.e. suggesting titles to be merged manually and (ideally) automatically [already exists somewhere?]. Merging on monographs is too big of an issue
(why again?)
Concept
Produce a master bid list + monographic deduper that allows users to work with one discrete set (monographs or serials) at a time; master list with artificially separate sets for monographs and serials; functionality for bidding and merging thus become 2 different levels of activity within each set
- Need to retain ability to bid on serials
- Need ability to automate title merging (for serials only, see above?)
- Ability to upload bids in XL file (for monographs)
- Clarify how this addresses issues re: monographic series
- Present scenarios of use for each set w/in context of master list
MARC Issues
One master list needs automated matching on MARC fields: 440, 490, 780 & 785
MARC leader changes w/ updates to 856 fields and whenever there are updates to the catalog record
· Match of last resort
· Need some ability to match leaders to portal → to think more about how to use them…
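One way the automated matching on 780 and 785 could be sketched: 780 is the preceding-entry field and 785 the succeeding-entry field, so records that name each other there can be gathered into one title family. The sketch assumes records have already been reduced to plain dictionaries holding only the linked titles; real linking entries also carry subfields such as $x (ISSN) and $w (record control number) that would make better match keys. The 440 and 490 series fields are not handled here. The function name and sample data are hypothetical.

```python
def link_by_title_history(records):
    """Group record IDs connected through 780/785 title links."""
    by_title = {rec["title"].lower(): rec["id"] for rec in records}
    parent = {rec["id"]: rec["id"] for rec in records}

    def find(x):
        # Follow parent pointers to the family root, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for rec in records:
        for tag in ("780", "785"):
            for linked_title in rec.get(tag, []):
                other = by_title.get(linked_title.lower())
                if other is not None:
                    # Merge the two families.
                    parent[find(rec["id"])] = find(other)

    families = {}
    for rec in records:
        families.setdefault(find(rec["id"]), []).append(rec["id"])
    return sorted(sorted(ids) for ids in families.values())


# Invented sample records, for illustration only.
records = [
    {"id": "r1", "title": "Journal A", "785": ["Journal B"]},
    {"id": "r2", "title": "Journal B", "780": ["Journal A"]},
    {"id": "r3", "title": "Journal C"},
]

print(link_by_title_history(records))
```

A link is followed only when the linked title is itself in the list, so titles held by no institution simply drop out rather than creating empty families.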
Functionality
· Diacritics to be fixed
· Improved journal search, by keyword for example
· Improve filtering capability, by institution for ex.
· Improve sorting capability, by bid status for ex.
· Filter by subject → multilingual issues, too big to tackle
· Allow tool to be able to import datasets like those that will be used to support material ingest, CDL for ex.
· Allow for automated merging of duplicate records = OCLC algorithm
· Ability to upload in bulk and/or place single manual bids
[need clarification]
Reports
Need ability to link what’s been scanned with bid list → requires an ID that persists throughout the whole process (w/ titles this is easy, but volumes?!)
Intellectual Property Rights
Short-term: Establish communication between the external permissions/portal request system and the master bid-deduper list, from which it grabs titleIDs; requires update announcements about priority titles with regard to permissions, requests, and gap fills.
Long-term: Integration of permissions/portal requests with master bid-deduper list
Other Issues
· Table: bidding ahead → do we want public interface? (current list is public BTW)
· Ingest issues: how to automatically enter titles that are already digitized into the bid list
Synchronization of bid list & deduper
· Item level chronology & enumeration correction?
· Helpful to see vols. & years in bid list
· Can we itemize in bid list to clarify → communication is needed between portal and bid list, for ex. “The following vols. have been scanned”
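The itemized view suggested above could be sketched as a set comparison between the volumes in a bid and the volumes the portal reports as scanned. The function name, the status labels, and the volume numbers are assumptions for illustration; it also presumes that both systems identify volumes the same way, which is exactly the open enumeration question.

```python
def scan_status(bid_volumes, scanned_volumes):
    """Compare the volumes in a bid with the volumes already scanned."""
    bid = set(bid_volumes)
    scanned = set(scanned_volumes)
    return {
        "scanned": sorted(bid & scanned),              # bid and scanned
        "outstanding": sorted(bid - scanned),          # bid, not yet scanned
        "scanned_without_bid": sorted(scanned - bid),  # scanned, never bid
    }


# Invented volume numbers, for illustration only.
status = scan_status(bid_volumes=[1, 2, 3, 4, 5], scanned_volumes=[2, 3, 6])

print("The following vols. have been scanned:",
      ", ".join(str(v) for v in status["scanned"]))
print("Still outstanding:", status["outstanding"])
```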
By Institution:
· MBLWHOI: Matt P. changing bids from full to partial, getting bids in order now
· NHM: does not go back to bid-list to edit bids for in-house rejects or IA rejects – goes to shelf first
· MCZ: Joe goes to shelf-first, bids on what can be scanned, evaluates rejects – IA rejects then goes back to bid list to change
· SIL?
· NYBG? No…
· AMNH: does not bid ahead; bidding consistent with picklist; picklist : packlist :: 1:1
Practically speaking you shouldn’t have to go back to do these kinds of edits
· Explore automation via © status for ea. institution (where MCZ & NHM are different than rest)
o Enter © years – ppl. have different years
o Insert © cut-off bid-holdings → gap-fills
- What are the differences between university and museum libraries in terms of their cataloging practices? (Research unionization of WRLC catalog)
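A rough sketch of the © cut-off idea raised above: given a per-institution cut-off year, split the held volumes into those that fall before the cut-off and those that do not, the latter being candidates for permissions or gap-fill follow-up. This is one reading of the notes, not an agreed rule. The institution labels, the cut-off years, and the holdings are placeholders, not any library's actual policy or data.

```python
def split_by_cutoff(holdings, cutoff_year):
    """Split {volume: publication year} holdings at a © cut-off year."""
    before = sorted(v for v, year in holdings.items() if year < cutoff_year)
    at_or_after = sorted(v for v, year in holdings.items() if year >= cutoff_year)
    return before, at_or_after


# Placeholder cut-off years; each institution would enter its own.
cutoffs = {"LIB_A": 1923, "LIB_B": 1900}

# Invented holdings for one title: volume number -> publication year.
holdings = {1: 1895, 2: 1905, 3: 1922, 4: 1930}

for institution, cutoff in cutoffs.items():
    biddable, gap_fill = split_by_cutoff(holdings, cutoff)
    print(institution, "can bid on:", biddable, "| gap-fill candidates:", gap_fill)
```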