BHL OCLC Synchronization
Metadata Issues
Records in BHL that do NOT have OCLC numbers (as of June 2010)
TitlesWithoutOCLC_06_2010.xls
June 04, 2010
Questions for OCLC and Mike as we move forward.
May 12, 2010
Phone call with Diana D, Joe DV, Chris C, Bianca L, and Suzanne P. Discussed the needs of the Provider Neutral record and the DLF requirements.
Mapping of BHL data to Provider Neutral
here
Provider Neutral Monograph documentation
http://www.loc.gov/catdir/pcc/bibco/PN-Guide.pdf
Digital Master Registry Information:
http://www.diglib.org/collections/reg/reg.htm and benchmarks:
http://www.diglib.org/standards/bmarkfin.htm
Spring 2010
OCLC BHL Holdings Symbol: BHLMR
NUC MARC symbol: DcWaBHL
Phone Call:
Attendance: Marianne (OCLC’s Data Ingest Person); Bill (OCLC’s Manager / Contact); Susan (OCLC’s Metadata / tags person); BHL people: Chris; Mike; Bianca; Suzanne
1. Loading records from BHL to WorldCat: MARC records would make it fast and simple. If not, there are other processes. One option is for Chris/Mike to take the MARCXML and run it through MARCEdit to produce MARC records. Thumbs up from Marianne. Will the records have a proper 856 link to BHL? Is there a problem with the leader if we add an 856?
Susan suggests URL in the 856 40 $u
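A minimal sketch of Susan's suggestion, assuming the records are handled as MARCXML before conversion. The helper name `add_856` is hypothetical, and the URL is the BHL bibliography link quoted elsewhere in these notes; nothing here reflects how OCLC or BHL actually built the field.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML ("MARC21 slim") namespace.
MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)


def add_856(record: ET.Element, url: str) -> ET.Element:
    """Append an 856 field with indicators 4 and 0 whose $u carries the URL.

    Hypothetical helper: it appends at the end of the record and does not
    re-sort fields into tag order.
    """
    field = ET.SubElement(
        record,
        f"{{{MARC_NS}}}datafield",
        {"tag": "856", "ind1": "4", "ind2": "0"},
    )
    subfield_u = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": "u"})
    subfield_u.text = url
    return field


# Empty record used only for illustration.
record = ET.Element(f"{{{MARC_NS}}}record")
add_856(record, "http://www.biodiversitylibrary.org/bibliography/9241")
print(ET.tostring(record, encoding="unicode"))
```

Adding a field changes the record length, so a tool that serializes to binary MARC would need to recompute the leader; that step is outside this sketch.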
Question: Other data to include? Chris asked about putting in the sponsor of the scan, etc.?
Hathi and Google are using a set of fields. Hathi Trust records are registered as certified digital masters, with a commitment to preserving them. The set of fields: 042 DLR; others are 506, 533, 538, 583. Susan will provide a link to the specifications. It would be good if BHL provided some fields; others OCLC can batch-add to our records. OCLC is supposed to let us know what they can’t do.
Instead of the initial plan of using WorldCat records and making BHL records from them, the synchronization approach has changed. Originally Bill had planned to use BHL records as a trigger to find a print record and create a digital record. If BHL provides MARC for the digital manifestation, OCLC would use it if it is good enough.
Note: BHL is a blended library of data. Some records are excellent, some are not. BHL records could be reviewed by OCLC, with some designated full level and some minimal.
BHL staff will need to talk to determine the best approach. Write out MARC records as digital objects with key metadata about the digital version that does not come from the print edition.
Question: 260? Publisher – public domain BHL publisher? If we use the “OCLC Power Loader”, it does take the print.
Marianne is the source to let BHL know what OCLC can and cannot do with the loading. She is the point person to cull the records for minimal/excellent/problem records
Approximately sending 30,000 records.
Records to have: BHL holdings, URL, digital record coding, register of digital masters,
Fixed fields, 042 DLR: key off some values in the fixed fields. Marianne will make recommendations on which things. Marianne can give feedback with a test load (that could be the entire set rather than a subset, since it is only 30,000).
After the initial load: issues about getting records on an ongoing basis (Chris). Marianne can set something up, possibly having BHL send files.
Question from Chris: Are digital OCLC numbers and print OCLC numbers connected? Answer from OCLC: Not done in the batch process. The 776 $w field holds the print OCLC number. Hathi is supplying the number. Grab the WorldCat record and then flip it to digital. This also touches the print record, adding a 776 $w for the digital.
If not batching, Marianne can put the 776 $w in for us. The definition is okay for that. We can tell them where the OCLC number is and where we want to put it.
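As a rough illustration of the 776 $w linkage discussed above, the sketch below adds a 776 field to a MARCXML record for the digital version. Only the use of 776 $w for the print OCLC number comes from the call; the indicator values, the $i label, the `(OCoLC)` prefix, the helper name, and the sample number are all assumptions.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"


def add_print_link(record: ET.Element, print_oclc_number: str) -> ET.Element:
    """Append a 776 field pointing at the print record's OCLC number."""
    # Indicators "0"/"8" and the $i label are assumptions, not from the call.
    field = ET.SubElement(
        record,
        f"{{{MARC_NS}}}datafield",
        {"tag": "776", "ind1": "0", "ind2": "8"},
    )
    label = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": "i"})
    label.text = "Print version:"
    link = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": "w"})
    # Assumed convention: record control numbers from OCLC carry "(OCoLC)".
    link.text = f"(OCoLC){print_oclc_number}"
    return field


digital_record = ET.Element(f"{{{MARC_NS}}}record")
field_776 = add_print_link(digital_record, "1234567")  # hypothetical number
print(ET.tostring(digital_record, encoding="unicode"))
```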
Customizable fields: a consultative process; Marianne is good at helping with details. Hathi Library has a “contributing library” value. Holding symbol for owning library? Hathi doesn’t want that. University of California DOES want that. So they are doing that, BUT in a separate place related to the MARC record, not in MARC: registry entries. Give name of library, address, etc. A number in the registry data pairs the records. Probably not in the MARC record. Why are people doing this? PR? Political? Necessity for xxxx?
Send/ Create what BHL wants digital records to look like – Samples. Send to Marianne
Steps: Marianne will work with Mike; either OCLC comes and gets the file, or we put it on the FTP site. The file comes to OCLC, which evaluates it. A manual look at some records catches things better than the tools do. Based on the tools for MARC: set up and customize the 007 standard, retag fields or subfields and indicator values. Run records against the setup and refine.
For example: OCLC no in 035 sometimes 001 – Need to tell where the OCLC number is located
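One way to picture the "where is the OCLC number" problem is a lookup that checks both places. This is a sketch under assumptions: records are MARCXML, an 035 $a value is recognized by an `(OCoLC)` prefix, and an 001 value counts only when it carries an `ocm`, `ocn`, or `on` prefix. The function name is hypothetical; the sample number is the WorldCat record mentioned in the January 6, 2010 entry.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def find_oclc_number(record: ET.Element) -> Optional[str]:
    """Return the OCLC number as bare digits, or None if none is found."""
    # First choice: any 035 $a carrying the (OCoLC) prefix.
    path = "marc:datafield[@tag='035']/marc:subfield[@code='a']"
    for sub in record.findall(path, NS):
        match = re.match(
            r"\(OCoLC\)\s*(?:ocm|ocn|on)?0*(\d+)", (sub.text or "").strip()
        )
        if match:
            return match.group(1)
    # Fallback: an 001 that looks like an OCLC control number.
    ctrl = record.find("marc:controlfield[@tag='001']", NS)
    if ctrl is not None:
        match = re.match(r"(?:ocm|ocn|on)0*(\d+)", (ctrl.text or "").strip())
        if match:
            return match.group(1)
    return None


SAMPLE_001 = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">ocn469678813</controlfield>
</record>"""

print(find_oclc_number(ET.fromstring(SAMPLE_001)))
```

Records whose 001 holds a local system number without one of these prefixes return None here, which is the case where BHL would need to tell OCLC explicitly where to look.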
2. Provider Neutral records are coming to OCLC - add records – a process will come around automated – like editions of a digital book – all these are the same – rolled into a single record with links to BHL, Hathi, IA, Google with holdings for each digital project.
Old OCLC numbers never die - 019 in the new record holds old OCLC number. Resolution Table handles.
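The idea that old numbers live on in the 019 can be sketched as a simple lookup table. This is not OCLC's Resolution Table, only an illustration of the concept; the function names and the sample numbers are hypothetical, and the input is assumed to be pairs of a current OCLC number and the list of 019 values found on that record.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def build_resolution_table(
    records: Iterable[Tuple[str, List[str]]]
) -> Dict[str, str]:
    """Map every current and superseded OCLC number to the current one."""
    table: Dict[str, str] = {}
    for current, old_numbers in records:
        table[current] = current
        for old in old_numbers:
            # Each 019 value is a merged-away number that now resolves here.
            table[old] = current
    return table


def resolve(table: Dict[str, str], number: str) -> Optional[str]:
    """Return the current OCLC number for any known number, else None."""
    return table.get(number)


# Hypothetical data: record 300 absorbed old records 100 and 200.
table = build_resolution_table([("300", ["100", "200"]), ("400", [])])
print(resolve(table, "100"))
```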
Send along – small sample or all – Marianne will take it from there.
Mike is the BHL contact, working with Marianne or her staff to create the setup, etc.
Updates – if you send through the same record. Marianne will explain how. What happens when the provider neutral happens? What is the flow then? Still unclear.
Hathi does use the registry for preservation elements. The $5 institution in the preservation fields is retained and does stay in the neutral record.
Susan will send DLR information. Does NOT have provider neutral record.
PCC site has neutral record specs.
Marianne and Mike email connect – Bill will send along.
Timing issues. As soon as possible. Marianne schedules the batch load. Date will be generated. How often we put up to update. Monthly 1st day export. Either pick up or FTP.
January 22, 2010
Tom G, Bianca, Suzanne had phone conversation with Bill Carney to detail exactly what the BHL's membership MOU covered regarding metadata vs images. Bill will be contacting OCLC's legal staff and reporting back with language that would be needed to be secured for moving forward.
January 6, 2010
Bianca found in WorldCat: OCLC number 469678813
http://www.worldcat.org/oclc/469678813&referer=brief_results
$3 Full text available from Biodiversity Heritage Library $u http://www.archive.org/details/exoticmicrolepid02meyr $z Connect to full text.
Suzanne changed it to this
$3 Full text available from Biodiversity Heritage Library $u http://www.biodiversitylibrary.org/bibliography/9241 $z Connect to full text.
Question: How did this record get put in? Why did it say BHL but pointed to IA? Can we identify more in WorldCat that say they are BHL but point to IA?
October 7, 2009
Mike L. sent Bill Carney at OCLC data for matching in OCLC to create digital records for BHL titles.
From: Mike Lichtenberg
Sent: Tuesday, October 06, 2009 4:32 PM
To: Carney,Bill
Attached is a ZIP archive containing all of the MARC records for titles currently in BHL, as well as a spreadsheet that cross-references each of the MARC records with the title’s BHL Url.
May 7, 2009
OCLC: Susan Westberg and Bill Carney
BHL: Mike Lichtenberg, Chris Freeland, Phil Cryer, Bianca Lipscomb, Tom Garnett and Suzanne Pilsk
1. OCLC synchronization with BHL records.
- BHL will provide MARC XML or MARC to OCLC for all materials (monographs and serials) to be included into WorldCat.
- OCLC will use the Macro route (instead of batch processing) to identify an OCLC record for the print material. Using the WorldCat record, OCLC will derive a new record for the digital manifestation and assign a new OCLC number.
- Following the PCC (Program for Cooperative Cataloging) guidelines, the record will be a provider-neutral digital record.
- BHL will get an OCLC symbol that will be used on the new digital record. Bill will be sending information to Becky Hurley at OCLC to get the BHL symbol registration process started. OCLC will contact Tom Garnett for answers to questions related to registering the symbol. This will include basic contact information for BHL. Suggested symbol was BHLMR for Biodiversity Heritage Library Metadata Repository.
- If a record is not found in WorldCat, OCLC will use the metadata provided by BHL to create the new digital manifestation record.
- The digital manifestation record will have an 856 with indicators 4 and 0 to reflect that the URL in subfield u ($u) is the link to the actual resource. BHL will provide the title-level URL for the works. It will be recorded in the 856 $u. The $z will have Biodiversity Heritage Library.
- Question: does the $z have any other wording? Sponsored by? Hosted by? Look here? The wording should carefully differentiate BHL from the "contributing library" as described in 533 $c
- Question: we decided that for now we will not have a $3 in the 856 to record volume and chronology information. Right? BL: Correct, it was decided that recording this information will be complicated especially in light of addressing gap-fill issues down the road. A focus on the title is, at this point, easiest.
- BHL will provide OCLC with the BHL identifier number. This number will be key in the reporting back from OCLC to BHL of the new digital manifestation record in WorldCat.
- Question: The BHL Sponsoring Library needs to be recorded in a consistent manner. Possibly this could be the 533 $c. If that tag is already used in the record does this cause a problem? Should the Sponsoring Library be recorded somewhere else? Should it be paired with the BHL id number?
- BHL will provide a sample of 100 records. OCLC will send back a sample for BHL to review. See the OCLC Sample Page for an itemized list of proposed examples.
- OCLC would like to know if BHL makes any changes to records after the initial submission. OCLC would request only the changed records. OCLC may not incorporate any changes that BHL sends. OCLC’s database is a “vendor” neutral record system, and the record is based on an OCLC record, not a BHL record. Richard Roberts is the contact at OCLC for Macro loading and any changes to records. He will potentially have questions once the sample is sent.
- OCLC will report back to BHL a “cross reference report”: a two column report that includes the BHL identifier and the new OCLC digital manifestation number.
- Initially, the sample will be a zipped up file. In the future, BHL will provide a service that writes out files. It will be a self service operation for OCLC to come and get the records.
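The cross-reference report described in the list above could be consumed on the BHL side roughly as follows. The notes only say it is a two-column report (BHL identifier, new OCLC number); the tab-delimited layout, the lack of a header row, and the function name are assumptions. The sample row pairs the BHL bibliography ID and OCLC number that appear in the January 6, 2010 entry.

```python
import csv
import io
from typing import Dict, Iterable


def load_cross_reference(lines: Iterable[str]) -> Dict[str, str]:
    """Read two-column rows into a {BHL identifier: OCLC number} mapping."""
    mapping: Dict[str, str] = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        bhl_id, oclc_number = (cell.strip() for cell in row)
        if bhl_id and oclc_number:
            mapping[bhl_id] = oclc_number
    return mapping


# Assumed layout: BHL identifier <TAB> OCLC number, no header row.
report = io.StringIO("9241\t469678813\n\n")
xref = load_cross_reference(report)
print(xref)
```

Keying the mapping on the BHL identifier fits the stated purpose of the report: letting BHL record the new digital manifestation number against each of its own titles.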
BHL needs to clarify where each data element is located exactly:
contributing library name
the BHL unique id number (could be sent as part of MARC 999 for ex.)
the OCLC number
2. Bowker ISBNs
OCLC is still working out the details of a contract with Bowker. OCLC has an April 23, 2008 mapping of MARC to the Bowker required fields.
BHL will tell OCLC which records will go forward to Bowker for ISBN numbers based on gift criteria = Eng lang, public domain, held by a BHL library. OCLC will use this as part of the pilot project to see how this works. A small contract will need to be signed with BHL libraries and OCLC. Once all is in place, using the synchronized data from BHL in WorldCat, OCLC will submit records to Bowker and get ISBNs. OCLC will put the ISBN in the digital manifestation record in the 020. They will let us know the new ISBN. Details to be worked out later.
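The gift criteria above (English language, public domain, held by a BHL library) amount to a simple filter over title records. The sketch below is illustrative only: the `TitleInfo` structure, its field names, and the sample data are hypothetical and do not describe BHL's actual data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TitleInfo:
    """Hypothetical summary of one BHL title, for illustration only."""
    bhl_id: str
    language: str              # MARC language code, e.g. "eng"
    public_domain: bool
    held_by_bhl_library: bool


def eligible_for_isbn(title: TitleInfo) -> bool:
    """Apply the three gift criteria from the notes."""
    return (
        title.language == "eng"
        and title.public_domain
        and title.held_by_bhl_library
    )


titles: List[TitleInfo] = [
    TitleInfo("A", "eng", True, True),
    TitleInfo("B", "fre", True, True),   # not English
    TitleInfo("C", "eng", False, True),  # not public domain
]
to_bowker = [t.bhl_id for t in titles if eligible_for_isbn(t)]
print(to_bowker)
```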
3. Open URL
BHL wants its material to be hit from an OpenURL resolver. OCLC’s Openly Informatics, now known as OCLC New Jersey, would be interested in getting BHL information into the knowledge base. Bill will find out who to contact there. Brian Cananon, OCLC’s third-party database organization manager, might be the contact. Possibly he or Mark Blanchard could provide information regarding the schema needed for article-level data to be ingested. OCLC has article information partitioned off from WorldCat, though each article does receive an OCLC Number. Bill will contact Chris with information. Chris will provide Bill with a little more detail about BHL’s Cite.
4. Implications
- Integration with OCLC allows BHL to play in the library world
- Likely to increase traffic to Portal as WorldCat shares records with Google
- Potential for print-on-demand services