BHL in OCLC
Email 11/20/2017
Hi Bianca: Yes, I’ve confirmed that the BHLMR holdings are on that record you cite. It was created as part of an OCLC process for creation of records for Hathi Trust eBooks. The holding for BHLMR appears legitimate to me and leads to the correct thing.
Current count of BHLMR holdings in WorldCat: 15,315
You can get this by doing a holding library search “li:BHLMR”
The new MODS to MARC conversion and then load will not, if all goes well, create duplicates. Our Data Sync process has extensive matching algorithms built into it; so, ideally, if there is a matching record already in WorldCat, your record will match to it. We know this won’t work perfectly, but it ought to work decently well.
The only number I’m really concerned with are the 99 with BHLMR still listed as the cataloging source. Those are ones we may want to try to match manually after the load. But 99 is doable. ---Cynthia
Convo 11/9/2017
Cynthia has a new employee Brian who can work with converting MODS to MARC which is great news!
She will forward his notes about working with our MODS
[ ] identify ball park of how many duplicates
[ ] send Cynthia copy of BHL OCLC agreement
control numbers = OCLC control numbers
OCLC willing to do custom work for us
may be financial implications
hopefully before end of year to load records!
Robin and Gil found BHLMR record
see OCLC 880502044
http://www.worldcat.org/oclc/880502044
October 27, 2017
followed up with Cynthia
Turns out this doesn't have a BHL holdings symbol on it at all. It's just a Cornell created record that points to a digital book on BHL.
sent Cynthia a note to let her know I was back on 10/2/2017
Email exchange on May 5 2017
Summary: OCLC still does not have a MODS to MARC process secured but they hope to have one by the end of the calendar year. Worst case, Mike has suggested a plan B to convert MODS to MARCXML but this is not without potential problems. Bianca to revisit status with OCLC post maternity leave (estimated October 2017).
Subject: Re: MODS to MARC
Sent: Tuesday, June 06, 2017 12:44 PM
Bianca,
There are tools for programmatically converting MARCXML to MODS and MODS to MARCXML available from Library of Congress (https://www.loc.gov/standards/mods/mods-conversions.html). I expect that we could not just use them as-is without evaluating them first. It is likely that we would have to do a fair bit of testing, and possibly modify the tools a bit, to ensure that the MARCXML produced was at least “good enough”.
I am not volunteering to do that evaluation/testing/conversion, unless instructed to. However, be aware that the option exists, if all else fails.
MIKE
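The kind of "good enough" testing Mike describes could start with a simple automated check of the converted MARCXML. The following is a minimal sketch, not an agreed test plan: the two required fields (245 $a for the title, 856 $u for the link to the digital copy) are assumptions chosen because missing 856 URLs are the drawback noted elsewhere in this thread, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML ("MARC21 slim") namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}
RECORD_TAG = "{http://www.loc.gov/MARC21/slim}record"

# Assumed minimum for a usable record: a title and a URL to the digital copy.
REQUIRED = [("245", "a", "title"), ("856", "u", "URL")]


def has_subfield(record, tag, code):
    """True if the record has a non-empty subfield `code` in field `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return any((sf.text or "").strip() for sf in record.findall(path, MARC_NS))


def check_collection(xml_text):
    """Return {record id: [problems]} for every record that fails a check."""
    root = ET.fromstring(xml_text)
    report = {}
    for position, record in enumerate(root.iter(RECORD_TAG), start=1):
        control = record.find("marc:controlfield[@tag='001']", MARC_NS)
        has_id = control is not None and (control.text or "").strip()
        record_id = control.text.strip() if has_id else f"record #{position}"
        problems = [
            f"missing {tag} ${code} ({label})"
            for tag, code, label in REQUIRED
            if not has_subfield(record, tag, code)
        ]
        if problems:
            report[record_id] = problems
    return report
```

Running this over a sample of converted records would give a quick count of how many lack a title or a link before anything is sent to OCLC.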
Sent: Tuesday, June 06, 2017 10:54 AM
Hi Cynthia,
Thanks for the update. It’s too bad that you received false information but it is understandable. Conversion of MODS files is really the best option for us.
Our MARCXML is used as the basis for *sourcing* metadata for the titles digitized into our database. Therefore the MARCXML is based on the print versions and not the updated digital versions we hold in our library. These records would lack the 856 URLs to their digital surrogates for example which would be a major drawback. It’s also somewhat tricky for us to provide access to our MARCXML as it is stored more deeply than our MODS (Mike can explain better if details are necessary).
I appreciate you continuing to work on this and by the end of the calendar year is not a problem. We’d prefer to find a practical, correct, long term solution to updating our records in OCLC and are willing to wait. If you hear anything further, please feel free to keep us posted — even if I am on leave. I will check in with you when I am back in the office. I anticipate being out from the end of July to the end of September.
Thank you,
Bianca
Date: Monday, June 5, 2017 at 17:52
Hi Bianca:
Apologies for not responding sooner. I keep hoping I’ll have better news for you.
So, here’s the story. The person who told me that conversion of MODS files could happen in our Leiden office was mistaken. I am sorry I gave you incorrect information.
The developers who work on our new Data Ingest system have long been promising that we would be able to load MARCXML soon, though soon keeps taking longer than I expect. Am I correctly remembering that you have MARCXML available? If yes, I am hopeful we will be able to load that sometime this summer.
That may well not happen before your maternity leave. I am sincerely sorry that this has turned into such a long and drawn out process.
We will reach the end of it! And, before the end of the calendar year.
Sincerely,
Cynthia
March 22 2017 OCLC call (WorldCat KB)
I learned today that in Feb 2010, Tom Garnett and Graham Higley signed an agreement with OCLC to allow BHL metadata to be indexed within their systems. This agreement allows BHL metadata to be a part of:
- WorldCat bib catalog (MARC)
- WorldCat knowledge base (KBART)
For the record, the document is attached and on the LAN here: S:\ISD\BHL\Metadata\OCLC\METADATA AGREEMENT FULLY-EXECUTED 20100304 (1).pdf
1) WorldCat Bib Catalog (MARC)
- I am working with Cynthia Whitacre to update BHL’s MARC records
- Most of the old BHL MARC records loaded back from 2010 have been eliminated
- Challenges remain for getting new MARC records loaded into the system as BHL provides MODS, not MARC data exports
- Hoping to work with OCLC to figure out how to parse MODS to create the MARC needed (better option), or direct OCLC to original MARC from which BHL records are based (decent option)
2) WorldCat knowledge base (KBART)
- I spoke to Stephanie Doellinger and Tim Martin today about updating the BHL.ebook collection records in the WorldCat KB
- Current records are problematic because they are incomplete: they were sourced from our data export title.txt file (http://www.biodiversitylibrary.org/data/title.txt), which causes some bad links between WorldCat KB records and BHL content
- Upon request for removal of the records, Stephanie proposed a “deprecate and replace” plan which will flag existing records to show they are no longer being updated until BHL is ready to submit KBART files - completed as of April 2017
- 92 libraries currently subscribe to BHL KB records via WorldCat
- OCLC looking forward to receiving BHL KBART records and offered us assistance reviewing our files
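As a rough illustration of what preparing KBART files from a title export might involve, here is a minimal sketch. It assumes a tab-delimited input with header columns named TitleID, FullTitle and TitleURL (hypothetical names, not checked against the real title.txt), and it writes only a small subset of KBART field names; a real KBART file needs the full column set in the order the recommended practice specifies. Rows with missing values are skipped and reported, since incomplete metadata is what caused the bad links noted above.

```python
import csv
import io

# Subset of KBART field names, for illustration only.
KBART_COLUMNS = ["publication_title", "title_url", "title_id",
                 "coverage_depth", "access_type"]


def titles_to_kbart(title_tsv):
    """Convert a tab-delimited title export to a partial KBART table.

    Returns (kbart_text, skipped_ids). Input column names are assumptions.
    """
    reader = csv.DictReader(io.StringIO(title_tsv), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=KBART_COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    skipped = []
    for row in reader:
        title = (row.get("FullTitle") or "").strip()
        url = (row.get("TitleURL") or "").strip()
        title_id = (row.get("TitleID") or "").strip()
        if not (title and url and title_id):
            # Incomplete rows would produce bad links, so leave them out.
            skipped.append(title_id or "(no id)")
            continue
        writer.writerow({
            "publication_title": title,
            "title_url": url,
            "title_id": title_id,
            "coverage_depth": "fulltext",
            "access_type": "F",  # F = free to read
        })
    return out.getvalue(), skipped
```

The skipped list gives a ready-made worklist of titles to fix before a file goes to OCLC for review.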
Notes from call on 3/22/17
with Stephanie Doellinger and Tim Martin
- wrong links
- knowledge base - how was BHL data pulled for this?
- removing data
1) based on identifiers, if incomplete metadata then bad links
OCLC active w/ KBART std
deprecated ==> would show as flagged for no longer updated
better to have titles & URLs replaced once ready to send an e.g. rather than starting over completely
OCLC has 115K BHL titles at present
how many libraries would be impacted? 92
2) original company Openly Informatics 2010 or 2012
taking from /data/title.txt
in 2014 BHL ok'd agreement
3) rather than removal, mark as deprecated records
OCLC via Stephanie's team has offered to help with our KBART review when we are ready
(kb-data@oclc.org )
Jan 2016 Tech call
ML's concerns:
- we have MARCXML at the item level so if a title has items from multiple libraries there are multiple MARC records for that one title
- 1500 titles w/out MARC records at all
- we’d need to add 856s to point to the digital copies - how would we do this?
- wouldn’t we want records to represent the digital copies instead of the print copies
- Could we construct MARC records from our title level records in our DB?...
- parsing MODS would be easier technically
- is it MODS in general or BHL’s MODS?
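On the question of how 856s could be added: if MARCXML were the route taken, one option is to append the field programmatically. The sketch below is an assumption about how that might look, not an agreed approach. It uses first indicator 4 (HTTP) and defaults the second indicator to 1 (a version of the resource), which suits a print-based record pointing at a digital copy; a record describing the digital copy itself would use 0. The function name and example URL are hypothetical.

```python
import xml.etree.ElementTree as ET

MARC_URI = "http://www.loc.gov/MARC21/slim"
MARC_NS = {"marc": MARC_URI}


def add_856(record, url, ind2="1"):
    """Append an 856 $u to a MARCXML <record> unless the URL is already there.

    ind2 "1" = version of the resource (print record, digital link);
    ind2 "0" = the resource itself (record describes the digital copy).
    Returns True if a field was added, False if the URL was already present.
    """
    path = "marc:datafield[@tag='856']/marc:subfield[@code='u']"
    existing = {(sf.text or "").strip() for sf in record.findall(path, MARC_NS)}
    if url in existing:
        return False
    # Appended at the end of the record; fields are not re-sorted by tag.
    field = ET.SubElement(
        record,
        f"{{{MARC_URI}}}datafield",
        {"tag": "856", "ind1": "4", "ind2": ind2},
    )
    subfield = ET.SubElement(field, f"{{{MARC_URI}}}subfield", {"code": "u"})
    subfield.text = url
    return True
```

This does not address the other concerns in the list (multiple item-level records per title, or titles with no MARC at all); it only shows that adding the link itself is mechanically simple.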
Nov 2016
via phone call w/ Cynthia Whitacre
How many records are left? CW to get a list
OCLC loading records via a new sync process to load MARCXML
Testing by end of year/Jan 2017
transitioning all loading projects - can load MARC now but not MARCXML - anticipate ability to load MARCXML by early January 2017
had trouble with MODS but MARC (.mrc) is possible
focus on title records
how many titles do we add per week/month?
BHL to decide about sending records in some interval
FTP account to set up for us
load one big XML file w/ all records
data sync processing
Aug 2016
via email from Cynthia
OCLC has deleted 221,942 records
1038 records could not be deleted due to other holdings.
BHLMR_not_deleted.txt
You can decide what you would like to do with those records. You can leave them as is or you may decide to undertake a project to upgrade them using the MODS records you have created. Your best bet for doing that is to work online, using Connexion, since batch matching of your new records to these records is not possible on OCLC’s end. While 1,038 is a large number, it is quite manageable compared to over 220,000!
Identified holdings at COO, LGG, gbt, HLQ so far and contacted Cornell...
Aug 3, 2016
Suzanne has helped determine that we still do not need a login for BHLMR; the SIL login to Connexion is sufficient for "matching" records
July 2016
Bianca talked to Cynthia Whitacre and Carly Bogen of OCLC 7/15/16
Notes
OCLC Digital Content Gateway records do not match or deduplicate with the rest of the WorldCat system
this is how many BHL records were created (if not all)
Goal to delete all records for BHLMR (BHLMR is holdings symbol) -- Carly will work on deleting all those records
after confirming that this is done we can proceed with loading fresh records
depends on Patty’s workload, normally within a week or 2
could be done by the end of the calendar month
then clean slate to load fresh data into WorldCat
Title level records
MODS is easiest for BHL to provide to OCLC
not through Digital Content Gateway, instead use batch project loading process which will put BHL records through traditional matching
Cynthia needs to check about whether or not we can load MODS
set holdings symbol for BHLMR
Caveat: can’t delete BHLMR records that have holdings from other libraries
can delete records w/out holdings
Originally recs brought in via Digital Content Gateway
Can’t take links to electronic records and auto associate them with print records
222,960 records - most classified as “downloadable archival material” which is default via the Digital Content Gateway
but BHL only has approx. 110,000 Title records right now
sounds like a duplication problem!
OCLC to proceed with the deletions they can do and report on the quantities deleted and what they needed to retain
[X] send Cynthia a dozen records of what we would prefer to send instead so that she can decide what is the best strategy to load records
informed her that we load new records on a weekly basis
batch load project is best way to go
for future loading of BHL records we have to send them on a periodic basis
batch loading not set up for automated processing
Yes, proceed with deletion!
software linking print records to electronic records runs automatically on OCLC's side, strictly for Google and Hathi - not available for others :(
June 2016
Suzanne contacted OCLC June 2016
- Now Martin is the contact for BHLMR - our account number is 01OCLC20095062.
- Meanwhile, we do NOT have a log on account and password to catalog directly into OCLC.
- Do we want to arrange to get one? I'm not sure what it will entail.
See also
BHL OCLC Synchronization