This is a read-only archive of the BHL Staff Wiki as it appeared on September 21, 2018.

Collections Analysis



Collections analysis to date

9:45-10:50
Marcia Adams, SI - Review of data collected from SIL, MoBot, and NHLondon, and best practices suggestions
Bill Carney, OCLC - Data analysis and review, BHL metadata and linking to OpenWorldCat

Handouts:

Comparison of SIL, MoBot, and NHL extracts (Comparison of extracts.xls)
OCLC outline of metadata submitted (LavoieOutlineOCLCDeliveries.doc)
Best Practices for BHL Data Extract.doc

Notes:

Marcia Adams began by pointing out the documents in the packets.

Obvious decisions on best practices for the initial extracts from participating libraries, and potentially for new libraries/institutions joining BHL:

Format type of records: MARC
The group needs to decide on subject coverage: broad sweeps of participating libraries' collections, or a more targeted approach?
Should we include microfilm? Maps? London Natural History Museum excluded maps; there are many issues surrounding maps (e.g., whether oversized maps fit the scanners), but each library can decide what works best for it.
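
Since MARC was chosen as the record format, the monograph/serial distinction that surfaces in the statistics below can be read straight off each record's leader. A minimal sketch (this is illustrative, not BHL's actual tooling, and the sample leaders in the usage note are made up):

```python
def classify_marc_leader(leader: str) -> str:
    """Classify a MARC 21 bibliographic record from its 24-character leader.

    Leader position 7 (0-indexed) is the bibliographic level:
    'm' = monograph/item, 's' = serial; anything else is bucketed as other.
    """
    if len(leader) != 24:
        raise ValueError("a MARC 21 leader is exactly 24 characters")
    return {"m": "monograph", "s": "serial"}.get(leader[7], "other")
```

For example, `classify_marc_leader("01041cam a2200265 a 4500")` returns `"monograph"`, since position 7 of that leader is `m`.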

There was discussion about whether the BHL Metadata Repository is supposed to be a listing of all relevant material, even if a subset of the titles "will never be scanned". The BHL currently has soft edges, without distinct "boundaries".

The larger, overall discussion centered on whether the project will be "stuff" driven or "data" driven. With the scanners already in place, this turned out not to be a real dichotomy. The "chicken v. egg" syndrome appeared, with varying perspectives on what "driven" actually meant: the "stuff" feeds the scanning machines, but all "stuff" must have some metadata associated with it or it is lost. The "bucket" of metadata must be there to pull from at the time of scanning.

It was decided that after OCLC's initial analysis, the beginning scanning decisions (topics by which institution) can be done informally. Once the BHL is up and active, the decision on what should be scanned where will become more important.

The workflow issues were apparent. Large collections are not barcoded as neatly and comprehensively as MoBot's. This will need to be worked out in some detail for each institution.

The metadata granularity of monographs, multivolume monographs, and serials remains a MAJOR issue to be resolved.

Martin Kalfatovic (Smithsonian) drew a simple chart showing how the individual ILSs feed the Metadata Repository that stores the data the scanner pulls from, following the current Internet Archive Z39.50 model.

Bill Carney (OCLC) drew a chart illustrating how the five large institutions (grouped as the "G5") are working with Google, and how OCLC handles the workflow of exchanging data. Google makes an executive decision on what it is actually going to scan. OCLC communicates back to Google the corresponding links to WorldCat; these links provide the "Find It In Your Library" button on Google.

(Is this right?) OCLC also marks titles as "intended to scan".

The statistics from OCLC were just a quick look at the three libraries that had submitted data:
605,000 records
400,000 bib-level monographs
200,000 serials
5,000 other

Theses and Dissertations - These numbers are NOT right - please, someone, edit this page and record what these really might be!
12,000 English
390,000 Non English?
215,000??

134,000 titles pre-1923
350,000 titles 1923 and later
121,000 titles with no date

134,000 Print
6,600 Microform
1,900 Other
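
The pre-/post-1923 split above mattered because 1923 was the US public-domain cutoff at the time these notes were written. A sketch of how records might be bucketed by publication year (the helper and the sample data are illustrative, not the OCLC analysis):

```python
from collections import Counter
from typing import Optional

def copyright_bucket(year: Optional[int], cutoff: int = 1923) -> str:
    """Bucket a record by publication year against the US public-domain
    cutoff (1923 at the time of these notes)."""
    if year is None:
        return "no date"
    return "pre-1923" if year < cutoff else "1923 and later"

# Illustrative sample, not real extract data.
sample_years = [1856, 1901, 1930, None, 1955]
counts = Counter(copyright_bucket(y) for y in sample_years)
```

Running this over a full extract would yield exactly the three-way breakdown reported above (pre-1923 / 1923 and later / no date).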

OCLC is interested in following the Google model, with a "Find It In Your Library" link in the final BHL.
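
The "Find It In Your Library" behavior rests on resolvable WorldCat links. Assuming the standard public WorldCat permalink pattern (the production button Google and OCLC use may carry extra parameters), such a link can be built from an OCLC control number:

```python
def worldcat_link(oclc_number: int) -> str:
    """Build a WorldCat permalink from an OCLC control number.

    Uses the public permalink pattern; this is a sketch, not necessarily
    the exact URL exchanged in the G5 workflow.
    """
    return f"https://www.worldcat.org/oclc/{oclc_number}"
```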

Discussion highlighted the need to establish a "rights" working group to work out the details of the "not for commercial use" language used in OCLC's metadata records versus the Open Content Alliance's open-access policies.

Decisions:

Extraction of data from Libraries to OCLC - please send data as soon as possible to OCLC for analysis

Subject Areas
Formats/material types
Exclude
Item data
Serials Holdings Data
Documentation

Action Items:
Libraries that are having problems should contact members of BHL that have similar systems.


Suzanne will send out a message to the libraries that still need to send data to help move the process along.

Bill will talk to Brian about the discussions that took place at the meeting.