BHL Institutional Council Meeting March 22, 2010 MINUTES
AGENDA
BHL Institutional Council Meeting March 22, 2010 AMNH
1. Welcome and introduction – 10 minutes Higley
Thanks to Tom B. for setting up the meeting. Introductions
Graham Higley, NHM, EOL Steering Committee
Bianca Lipscomb, BHL Collections Coordinator
Connie Rinaldo, Harvard MCZ, Ernst Mayr Library
Susan Fraser, NYBG Library
Judy Warnement, Harvard Botany Libraries
Christine Giannoni, FMNH Library
Nancy Gwinn, Smithsonian Libraries
Martin Kalfatovic, Smithsonian Libraries
Doug Holland, MOBOT Library
Chris Freeland, Tech Director BHL, MOBOT
Eileen Mathias, ANSP Library
Jean Farrington, CAS Library
Chris Mills, Kew Gardens Library
Tom Garnett, Program Director, BHL
Tom Baione, AMNH Library
Cathy Norton (on the phone), MBL/WHOI Library
2. BHL Collections Report - 15 minutes Lipscomb
BHL-IC_Bianca.ppt
BHLStaffReport.docx
40,000 titles, 28 mill pages
Monographs=93% (number of titles): exported data
62% of collection is BHL members, 38% ingested
80% titles in English, 8% German, 6% French, 2% Latin
November is the ingest spike--content from IA. Now ingested weekly.
20% of titles match exactly (duplicates). May be duplicates that aren't exact matches (not different editions)
Have a mechanism to merge duplicated titles. This is a good benchmark to move forward.
Don Wheeler, Connie Rinaldo, Christine Giannoni, Matt Person, Becky Morin, Grace Duke, Judy, Bianca: Collections committee, weekly conference call. Drafted mission statement: who, what, why; flexible (on wiki)
Tom noted that the strategic plan does not have a mission statement--can discuss further during strategic plan discussion. Mission statement came out of the collections group because collections group needed boundaries.
Needed to define the terms "Biodiversity" and "Heritage" to really define the collection. Some reason to retain duplicates. Ingested content will be reviewed for relevance if attention is called to an item. But will not go through items scanned by partners: if it is in, it is in. Items that are removed from BHL are still in IA.
Visual for collections--core, supporting, outliers, excluded.
Permissions: 39 publishers have signed agreements, 3 with moving walls; 95 titles with permission to scan post-1923.
Using Gemini (issue tracking system) to coordinate among each other so that permissions titles are prioritized, particularly for gap-filling.
Content providers (orange bag): self-sufficient major providers (OAI compliant)-1; providers requiring assistance-16/22 titles; self-sufficient minor providers-3; 2 book-like items; 2 others.
Bianca is working on drafting language about what the requirements are, a realistic view of how long it will take, and what content providers should expect once stuff is in BHL. Need to have clear expectations. For instance, transcriptions will be added to BHL but not inserted as OCR behind a particular volume.
Drafting language about due diligence processes
3. BHL Staff Activity Update – 30 minutes Lipscomb
Handout
Gemini issue processing; post processing work; "report an error" button
875 issues entered
612 since the button: 71% resolved 29% unresolved; roughly 3 issues a day
26% collections issues, 22% portal editing issues (resequencing, for instance), 20% pdf issues
58 messages of praise (Martin suggested that Bianca put these on the testimonial page)
Working to improve issue response--need fewer unresolved issues!
Problem: users use pdf generator to download whole book, 100 pages at a time
Bianca proposes a "buddy system" for BHL libraries to improve issue resolution: Will post proposal document on wiki
Documentation for portal editing should help improve the data. Graham suggested that we bring in some of the BHL-Europe partners, e.g. Francisco Welter
Fulfilling user requests: FedEx acct for supporting gap fulfillment; Pagination: needed but not enough resources
Can we legally share metadata with others?: This is an issue for IC. Can't stop them from going to IA. Need some clarity.
ACTION: Bianca will draft wording for review on monthly call with language to take back to our directors/lawyers
QA: post scanning, retrospective and just-in-time but not enough time and resources for all member libraries to participate
Addressing user requests to scan; need a formal queue; Judy noted that they monitor ILL and add appropriate items to scan list
ACTION: develop a "suggest a title" form with appropriate fields
COMMUNICATION: successes=public-facing wiki, video tutorials, survey (jointly with Europe)--421 surveys by March 21 (6 languages): targeted to BHL members, posted on BHL, questions get returned a survey, listservs, directed at scientific audiences. There is money for another survey in BHL Europe in a year.
BHL staff are successful in collaboration and communication.
Committees need regular conference calls--need to be sure we have a call line. Graham said if there is an issue, we can work through NHM
Decision documentation needs to be more clearly communicated. Not just wiki--need a push mechanism (Tom's regular report?)
ACTION: EC needs to review how to address the issues of passing issues up the chain of command
SUMMARY
Chain of command
Should buddy system be mandated?
Who should be able to become a member of the public-facing wiki? Since it is public facing, Tom thinks that access should be more restricted. Need to define content for public-facing wiki. No one monitors regularly. Publicity/outreach. Graham suggested that perhaps one of the European partners should monitor. Chris noted that Tech work needs to be done by Tech team; list of permissions, due diligence, collection development policy. Much of this is BHL Classic--does it make sense to have the Europeans do this? Graham thinks that someone who doesn't know the detail may provide very important feedback. Could be helpful.
ACTION: Bianca will extract the defined content for public wiki.
4. BHL Technical Development - 30 minutes Freeland
PPT:
2010_3_BHLIC_2.ppt
Chris skypes a lot, all over the world.
scanned content: 25-30 TB
botanicus yet to be loaded: 12-15 TB
ingest: 22-25 TB
TOTAL!: 59-70 TB!!
143 BHLs would fit in a sperm whale
4000 visitors a day (unique) (spends 5 min on the site)
679,967 unique visitors
Australia, Brazil, US, but users all over the world
Tropicos is now the number 1 referrer to BHL
PDF articlizing stats: 25,000 pdfs served since August (10,000 have metadata); up to around 700 weekly
EOL is now #3
UCAL has most items contributed to BHL, BHL libraries generally show more names
Taxonomic density--MOBOT is the highest
How many species have been reported only once? 1.5 mill unique (70 mill name strings in 28 mill pages; 58 mill verified); 329,000 found on only one page in BHL.
Suggests that there are a lot of species that there is very little information about. This doesn't prove it but supports a hunch. Should look at 2-3 occurrences within a book and this may be more diagnostic. Without the BHL, it would be difficult to find these names.
New since Nov: new color scheme, content from IA, names indexing for ingest, APIs (ways for machines to talk to BHL), OAI interface, Darwin's library annotations (Mike L working on this but don't have enough money!), primary/secondary titles enhancements, orange bag solution testing, working with EOL on nomenclatural acts
Consumers: EarthCape, BioGuid (Rod Page), BioSTOR (Rod Page), JSTOR (in discussion)
Research projects: BREC-NSF (Bryan Heidorn), Conjecturator-NSF, Hong Cui at UArizona-NSF, Darwin Library
OCR correction using WikiSource: can log in and correct text, especially scientific names (crowd sourcing); can we reincorporate into portal? Can it handle 28 million pages? How will it work with other content?
Partnership statement: Do we need an agreement with parties using BHL materials? We need a document that is more explicit about relationships, not to restrict or control. Similar to metadata question--need a boilerplate response including expectation setting. Recognize that we have an asset and need to define appropriate use and compensation and licensing (like NLM). ACTION: review NLM language. Graham noted that at NHM, grant bids that include database work must review tariff with Library. Must also review privacy policy, copyright language.
WH Cluster: 22 TB transferred, lots more to go. Should be complete by May. 17,000 for 2 nodes, 133 TB
5. Coffee Break – 10 minutes
6. Publicity - Outreach – marketing; public library community – 30 minutes Fraser
OCLC understanding-- there will be a BHL symbol; kind of like CRL
Notes that NYPG has public library activity and lots of ILL requests from folks who did not know about BHL. How do we get the word out?
Need to do a press release, brochures, listservs, feeds, science news sites. Do ILL folks know about BHL? Lots of reference requests are failed ILL requests. Need links back in own catalogs.
Brazil didn't know about BHL when we attended. Send posters to scientific group. Send posters and cards to organizations; email blast, LAPI and JSTOR. JSTOR wants content that overlaps with us. Need a press package--contract out. This will result in a higher-quality piece. JUNE 27--ALCTS award for Outstanding Collaboration
"Fieldnotebooks"-- with pens, post it notes, pencils-- where do the swag suggestions go? Task group finish in a month
ACTION: Form a short-term working group to pursue other venues and press package with recommended areas of responsibility. {Jean, Susan, Tom B., Connie} Communications plan--get some talking points for staff
Talk to EOL, plan panels to target groups; need a new/additional informational poster; IBC in Melbourne; Library of Congress is working with China and they want BHL to present; ILL conference. Send suggestions to Susan for working group.
Review what professional meetings/panels should be targeted.
Suzanne Walters: Library Marketing that works
7. Financial Report – 15 minutes Garnett, Kalfatovic
Spending MacArthur money--we can propose reasonable reallocation for unspent money for year 4. See financial review pdf. MacArthur money runs out in July 2012.
Nancy asked about the oversight fee--Tom noted that any subaward cost gets 12% added for overhead. Smithsonian central takes 3%. Nancy asked where the overhead costs go. NYBG--overhead goes to institution, AMNH Library gets some back for direct costs; MOBOT--into institutional fund.
Need to do our best to get Years 4 and 5 right. Reallocations may need input from EOL Steering Comm or Jim unless they are small. Money is primarily going to salaries and after that, scanning.
Tom organized financial report by central and then by institution.
ACTION: Tom will organize financials by categories (scanning, salaries etc).
Martin: put together Member Contribution document. He wants to know if it makes sense and asks that we send suggestions about how and what to report.
Graham noted that this is very useful because we can present internally and to EOL to show how much we have contributed (needs to be ready for April). This is calendar 2009 since we all have different FY. Tom and Bianca are not counted.
Chris noted that we need to show the total costs. Martin and Susan have a more detailed, hourly tracking concept for coming up with a scanning cost.
Get your stat. collector in touch with Martin.
Nancy: Also need a summary of all the money raised outside of MacArthur.
Graham noted that we are providing 100% added value. People costs (nonfunded) versus people costs (funded) would be a good comparison.
8. Global BHL Developments - Revised Governance - 45 minutes Garnett, Higley
Implications for governance--not to solve but to review. BHL probably should not follow the proposed EOL model.
Principles of proposed BHL governance structure are accepted and liked because they are simple, but not sure it will work. It is based on a global BHL steering committee. The other partners must review global structure and everyone must agree that their roles and contributions are adequately reflected. Do we like this structure enough to throw it out to everyone else? Is this a good enough working document to open the global dialog?
Tom thinks the other groups would be open to this.
Nancy noted that it should be a "Coordination Committee" not a steering committee.
Tom noted that some projects are bringing major resources and others not so much. Thus balancing the coordination is important. 2 reps might be needed when institution has content and major tech contributions. Tom noted that he would not like to see technical pieces separated out.
Bianca would like to see the content committee as less of an island and see it involved as significantly as technology.
Be somewhat flexible about who is part of the committee. Who pays for attendees? TBD. Preference is for self-funding but will look for alternative funding. Don't want to embed governance structure in the MOU. Can use letters of agreement.
AGREED: Coordinating (rather than steering); number of reps should be qualified
ACTION: Graham and Tom will revise and clean up draft.
ACTION: Plan to engage others on governance
BHL-China, BHL-Europe, BHL-North America, BHL-Australia
9. Lunch – catered. one hour
10. Resource Needs, technical and programmatic – 30 minutes; Freeland, Garnett
CHRIS F: BHL Europe, China, Australia, Brazil
JSTOR, SciELO--auto ingest, but smaller publishers need help, and individual users that have individual content--lots of investment and tending including communication
PubMed Central, PLoS, JSTOR: all have 10 to dozens of staff to handle inquiries, ingest. BHL has 1 technician paid from MacArthur funds. People can't really do this as self-service. Except for the large publishers, manual intervention necessary.
2 years to make Citebank better OR bigger--can't do both. Biblio is a good start
Taxonomic Literature 3---beyond TL2---and make a Global Reference Index to Biodiversity
Fits in with WP2 (Europe)
Citebank won't actually solve all of our problems. What can we do? Twitter survey--everyone said CONTENT not services.
Chris recommends: use Moore funds leftover to get a consultant to expand CiteBank; reallocate half funds to scanning and half to content assistant; PLoS may be building a Citebank-like repository--find out Thursday. Seek funding for TL3 (GRIB). CONTENT IS KING.
TOM: some quick figures: during the first 2.5 years--1.3 mill on staff and more than that on scanning (52% scanning); 70% staff, 30% salary....then year 5: 100% staff. (MacArthur funding). Could ramp up scanning if the money was there. If we cut scanning, can't cut staff to save money. There is a minimum staff needed even in a static environment. There are groups that have content but getting it in could cost quite a lot. May not be a better deal to take content than to scan it. Even if no one gives us content, we are creating articles (12,000 user-created pdfs). Tom wants to be sure there is enough money invested in development process to support the content going into Citebank. Need to do a better job of outreach and marketing. As we get more global, more technical coordination will be needed. We hope that technical development can occur in this environment. Not happening now.
GRAHAM: Pages scanned, Pages ingested (IA), Pages waiting to be ingested. What are the costs per page for the different needs? What are the overhead costs? What is the yield per investment? There is a tradeoff between management/tech salary and scanning to get more pages. The route to more pages might be to spend time getting agreements rather than scanning. Program Director and content coordinator---not strictly overhead because they are bringing in content. Put the money where we get the most pages for the least cash.
Chris is suggesting that we have to pull back on plans for Citebank--need to keep it going for articlizing and for big publisher content like JSTOR and SciELO. We work towards a broad Citebank. Need to bring technical developments supported by MacArthur to completion by the end of the grant.
SUMMARY: Need a well-defined, limited Citebank by June. Need to be thinking about TL3 (new name). Spec out opportunities once Citebank has been released.
11. Review and revision of BHL Strategic Plan including discussion of the reallocation proposal – 1 hour Gwinn
Henning has plan to bring in European content in 2012, China and Australia also have plan.
How much biodiversity literature are we providing, not just how much we are scanning. Numbers on spreadsheet need to be revisited. Anticipate how much it will cost and then fund-raise to add content.
Bidding list moved to Vienna and will be up March 2010; another tool in process by Dec 2010 (tool kit, master list, and enabling users to request items).
Improved scanning workflow management tools.
ACTION: Tom will revise 1.5 in strategic plan
ACTION: Judy will revise 1.4 in strategic plan
ACTION: Strategic plan: Chris will get count for articles in citebank, also 3.1, 3.2.4, 3.26--more information
ACTION: Tom will fill in 2.4 (now 3.2.4)
Did a good job of accomplishing our objectives. Are there other goals to add? Collection Development Policy, TL3
Metadata improvement workflow (number of titles merged, titles paginated): improving data accuracy, responding to user feedback; responding to survey
12. Coffee – 10 minutes
13. Sustainability – 30 minutes Garnett
MacArthur funding: 2 mill left until July 2012
Moore 1/2012, 400,000
SIL Seidel 1.4 mill until 2014
Funding requests submitted: BREC, IMLS grant for field notebooks, others: bid for artistic materials for Europeana for high-end scanning, also BHLEurope 2, (funded Vibrant....NHM is lead partner and will have some impact on BHL), maybe 2011 for TL3
Sources of future funding: NSF (dimensions of biodiversity-first solicitation not an easy BHL fit. AToL, CyberInfrastructure), IMLS?, Mellon, Moore, Sloan, Others--need compelling vision
Private fundraising (donors)
Federal base funding
Charge model--are there value-added services we can charge for? Scanned content has to be open access because that is what we were funded for.
print on demand, test with kirtas, foldouts can come as separate pages and other options; also high quality super high resolution prints of images
Outsource as much as possible; storage at IA and PubMed, integrate Citebank with PLoS Hub, replicate content at BHL, global partner content, caretaker mode still costs money, make BHL indispensable, subsume BHL into some larger program (OTSP)
low funding: Need better coordination than now exists, is this realistic?, lead project "repository" institution--different than content repository? What does this mean: who do you call when you want to talk to BHL? Should we assign a host institution?
MOU?
Who are we indispensable to? We are indispensable to our institutions! We have created a digital library of our stuff. We should become a consortium and thus we have to think about paying for membership in the consortium. How do we sustain ourselves? Membership fees.
In an age of declining budgets...we still think about BHL as a project with an end. Need to get everyone to understand that this will go on. Resource sharing is the reason behind consortium--education, work together, paid-to-staff consortium; exec. director, secretary, auditors; 501(c)(3) to write checks and have payrolls.
How many natural history libraries in the world? maybe 100 with small staffs. (IAMSLIC 300-400--everyone joined)
Library self-funding
Consortium
Liberate content and let anyone take it on
Graham noted that it may not take that many institutions to keep the BHL going. It is a gain of efficiency--better science done more quickly. Our institutions should want to pay for it. If they don't, then it isn't working. This might play into governance--who are the organizations willing to put money into it?
We need to prepare and demonstrate the benefits to our institutions so that there can be central funding (Program and Technical Directors). What is the BHL operational budget?
Need to ask the institutions and sell the idea that we have a responsibility to maintain this long-term. BHL has become part of our mission.
Build endowment to generate enough interest to sustain. JSTOR may be willing to help with sustainability plan.
Preservation must be part of sustainability plan.
Need to keep the "outside academia" audience and this may be a compelling reason for someone to donate.
ACTION: Tom will draft sustainability plan by June 1, 2010.
14. Election: Cathy Norton
Doug proposes that we postpone the election and keep current Exec. committee for 6 months or 1 year. Set up a governance and by-laws task force. For instance, election of 2 board members should be in alternate years. We will have to rework the by-laws as global BHL expands. Review in 6 month call (September?) and resolve by 1 year.
Agreed to postpone.
15. Other business: Google patent on serials parsing
Principle is so wide ranging that it is probably already documented. Requires research. Probably can be challenged in Europe. Lee Giles student's work. How to organize the challenge? BHL can't challenge.
ACTION: TOM contact Fred von Lohmann at EFF for Google challenge
ILL agreement: ACTION: ALL read and review for decision on next phone call.
OCLC ready to mount our data but need signature. But we are all OCLC libraries so they have signed documents: governed by agreements already in place; grants no more rights than in existing agreements.
16. Action items; closing - 10 minutes Higley
ACTION: review NLM language. Graham noted that at NHM grant bids that include database work, must review tariff with Library. Must also review privacy policy, copyright language.
ACTION: Graham and Tom will revise and clean up draft.
ACTION: Plan to engage others on governance
ACTION: Bianca will extract the defined content for public wiki.
ACTION: Form a short-term working group to pursue other venues and press package with recommended areas of responsibility. {Jean, Susan, Tom B., Connie}
ACTION: Bianca and Tom will draft wording for review on monthly call with language to take back to our directors/lawyers
ACTION: develop a "suggest a title" form with appropriate fields--staff plus Chris
ACTION: EC needs to review how to address the issues of passing issues up the chain of command
ACTION: Tom will organize financials by categories (scanning, salaries etc).
ACTION: Strategic plan: Judy will revise 1.4
ACTION: Strategic plan: Chris will get count for articles in citebank, also 3.1, 3.2.4, 3.26--more information
ACTION: Strategic plan: Tom: 1.5, fill in 2.4 but is now 3.2.4
ACTION: Strategic plan: Martin: 6.5.
ACTION: Strategic plan: Connie and Nancy will reformat spreadsheet
ACTION: Tom will draft sustainability plan by June 1, 2010.
ACTION: TOM contact Fred von Lohmann at EFF to challenge Google on serials parsing patent
ACTION: ALL read and review for decision on next phone call.
Documents for the Meeting