BHL Institutional Council Conference Call September 17, 2008
On the call: Graham Higley (chair), Connie Rinaldo (minutes), Matthew Bolin (in for Tom Baione), Jim Edwards, Susan Fraser, Tom Garnett, Nancy Gwinn, Doug Holland, Judy Warnement. (Cathy Norton, Elizabeth Babcock and Chris Mills were unavailable.)
A. • Financial Report – Tom Garnett 15 minutes
Handouts: Spreadsheet and highlighted document.
The money is in two project-year intervals. Almost all of the first-year money has been expended, and some subawards have already begun using second-year funds. Money from the MacArthur Foundation was allocated through Smithsonian. Smithsonian has two roles: dispensing subawards, and spending a subaward itself as a BHL member. The spreadsheet has a section for central costs and then sections for the subawards. NHM money is funneled through MBL. A subaward allows an institution to incur expenses and then submit invoices to Smithsonian (direct costs plus an overhead calculation of 12%). How much is left to spend? It depends: all but $29,000 is encumbered, but there is still $1 million (more than half) to spend. This is due to a variety of delays in getting scanning going, but spending is increasing.
Questions? Judy noted that she did not learn in a timely fashion that the money originally set aside for HU Botany had been spent elsewhere; there was no communication. Tom noted that communication needs to be clearer as BHL responds to changing conditions. Susan asked for clarification on the project years: the first is July 2007-June 2008 and the second is July 2008-June 2009. A new budget request will be made, and any allocation would be for July 2009-June 2011. Any subaward allocations for the period beginning July 2009 should be worked out at the March 2009 IC meeting. The new submission needs totals, milestones and deliverables, but we do not need details of who does what until later. Connie noted that there should be a clear, formal process for subaward assignments. Nancy hoped that some allocation suggestions will be made before the face-to-face meeting. Tom noted that he needs estimates from us to have some idea of what each library can do.
Graham summarized: Focus on planning and communication. For libraries that have a subaward, the invoice processing is going smoothly. Please let Tom know if there are problems.
B. • BHL Portal Development – Chris Freeland 10 minutes
Tom noted that portal development has been very busy. We are lucky to have Chris! BHL is just shy of 8 million pages (4 million at the beginning of June). Development is broken down into completed and prioritized work (handout copied below).
COMPLETED
1. Network Upgrade: Network upgraded and traffic to and from BHL prioritized. Still delays in loading page images from IA.
2. Automated Ingest from IA
3. Automated Names Harvesting from uBio
4. Title relationships: metadata from different libraries
5. Exports of Titles, Items, Pages, Names: allows us to grab all titles from BHL; more services are coming, but for now it is a tab-delimited export.
6. Name Finding Evaluation: 8-week evaluation of name-finding (TDWG presentation); interesting findings about OCR errors and comparison with Golden Gate; TaxonFinder is more accurate than Golden Gate.
Data-focused since launch in Feb 08; next year will get more into the interface.
PRIORITIZED
1. Articles Interface: number one criticism.
2. Citation Resolver: once we have the interface, we need a citation resolver to allow others to link in to both articles and monographs.
3. Fedora integration with IA: Now we have full runs, a digital library with cover-to-cover digital page turning, but Fedora has an interface that accommodates articles. We want to take PLOS tools and Fedora and make it the place for article storage and delivery. The end goal is to move away from .net, but we are not sure if Fedora is robust enough. Services on top of Fedora: OAI.
4. Distributing content: getting scans and files mirrored is a big challenge because this is 25+ terabytes of huge files, which are hard to shuttle around on commercial networks. There are discussions with potential partners: Smithsonian data center and an NSF project called "DataNet"; others are being explored as well. There are technical and political issues.
5. Ingest from other data providers: books, articles, metadata XML, PDFs
6. Metadata workflow
Jim E. asked if things are set up so that third-party developers might be able to collaborate.
Chris noted that we are not formally running the BHL portal as an open source project, but we can accommodate other interested parties. Jim: Rather than putting all the development with MOBOT and other partners, others could write applications. Are there plans to put BHL into the open source realm? Chris: The data is open and available; the interface is not currently open but could be. Jim: Will you be using the TDWG meeting to look for additional partners? Chris: Yes, and we are working with a variety of communities already.
C. • Open Access Policy draft – Tom Garnett 10 minutes
At the last call, we identified the need for a clear statement on the portal about open access on BHL. Tom sent a draft for comment. Nancy asked about commercial use: do we allow commercial use of data? The wording in the copyright section of the portal is the original Botanicus wording and should be rewritten. There is a different Creative Commons license for commercial use. Jim Edwards noted that pre-1923 material is out of copyright, but if we have other agreements for more recent material, we may not be able to allow commercial use. Tom noted that this is covered in the policy. Connie noted that she thinks we should have notification and discussion for commercial use even for public domain material. Tom noted that we need to identify who would be the contact. Currently we pass decision making on to the library that contributed, and some libraries may have different policies.
Jim has a concern about asserting a non-commercial license over material in the public domain: by doing so, are we implying that library X owns that material and should get money for commercial use? Tom noted that the agreement with publishers does not prohibit commercial use, although some publishers have added this back in. Nancy noted that we have to do something without a lot of exceptions because this is such a big database. Nancy also said we should allow non-commercial use and ask that people request permission for commercial use. It would give us an opportunity to see what use can be made. Tom noted that the MOU says that for all information in the public domain, BHL or providers will not seek to assert intellectual property rights. Graham suggested a comment about public domain and then a statement about other types of material. Tom will revise and send out another draft for discussion.
D. • Selection and the Collection Status report – Doug Holland, Connie Rinaldo - 20 minutes
Report is a separate handout. Doug noted that he and Connie completed the OCLC training.
One thing that came out of the training was that there is no de-dupe algorithm, although we thought there was one. That is why the data is skewed. There is also a lack of granularity. The tool can't really be used for selection but can generate lists of specific titles. It is good for predicting trends: general reporting, pulling out bar charts and numbers. It is good for rough measurements by language or date, and possibly for grant proposals. Connie noted that the OCLC tool is a significant annual expense ($9000) and asked whether it is worth it and who will pay. Tom noted that outside funding was used to pay for the first year. Doug noted that the data is not live; it is updated quarterly. Graham noted that there is a staff meeting coming up and perhaps we should ask staff if the tool is useful. Chris will add it to the agenda. Tom talked a bit about this staff meeting: in Woods Hole, early November, BHL funded. Doug noted that the recommendation to improve de-duping tools is really important and that this will be discussed at the November staff meeting. Connie noted that one recommendation discussed but not explicitly stated is that we need to work more with the taxonomic community.
Tom took over to talk about species verification. Some of these groups have a bibliographic component, e.g. the Decapod community. For Decapods, there is a community-vetted list of species names and core lists of the most important texts for these species descriptions. Chris has analyzed how much we have already done, how much is serial literature, how much is in copyrighted material, etc. We see this as a great tool for ensuring that we are providing what the taxonomic community and EOL need. There is a very complete Solanaceae list. This can refine our selection process. We can get measurable targets that specifically inform user groups. Tom wants to put this on the agenda for the November meeting. Graham suggested that the summary should be on the staff agenda and that we have a call to make some of these decisions. We could then write an informed paper. If we need more resources, we need to be ready to scope the resources. A new document should come out of this shortly after the November staff meeting, with a conference call in later November to allow revisions. Connie volunteered to continue working on this topic. Jim E. urged that we come to decisions as early as possible. One of the themes in the MacArthur proposal is better integration among EOL components. This relates directly to that. An early decision would help.
E. • Strategic Planning Process – (Nancy Gwinn, Susan Fraser) 20 minutes
Susan noted that she and Nancy both expressed a need for this process to happen. We need to clarify what the BHL is and what the goals are. Goals are not well defined and we have so many activities. If we prioritize goals, it would help inform the budgeting process. Nancy agreed and noted that the goals in the long list are not defined as long or short term. Judy also agreed and asked how we would fund all of these initiatives if we add new partners. Susan noted that it would help at the local level to clarify workflow. Nancy also said that this kind of document would help with funding requests. Tom agreed we need this process but said it will take time. He suggested monthly conference calls and a process where a group works now on a strategic plan draft to be reviewed at the March meeting. Graham thinks we need an outline by December. Jim suggested separating the strategic plan from the strategic planning process. Graham noted that working documents thus far reflect money but not where we are going. Tom summarized that we need prioritization for the MacArthur proposal. Graham suggested that the Exec group focus on pulling together a process in the next few calls.
F. • BHL decision-making and communication 10 minutes
Tom said some of this was discussed in the financial report in response to Judy's concern. We need to do a better job of communicating. Connie reminded us that Tom had suggested increasing BHL IC calls to monthly, and that Tom should also communicate more frequently with directors directly. Nancy noted that this would take a lot of time. Connie thinks both are important. Doug thought more frequent group calls would be good, but not individual ones. Judy agreed and isn't so interested in individual calls. Tom asked if monthly is enough for a group call. Nancy thought monthly group calls would be a good place to start. The EOL board, strategic plan and feedback on the selection process are already on the agenda. Judy asked if the March IC meeting dates have been selected; the answer was no. Nancy said it would help to set dates as soon as possible. Tom will set up a time for monthly conference calls and the March meeting.
G. • Bowker and ISBNs – Tom Garnett, Chris Freeland 5 minutes
Tom noted that we are digitizing books without ISBNs. Bowker is willing to give us a range of ISBNs for free. Several products use ISBNs to identify content. It is another way to get content out there. It involves very little work beyond signing a contract, and no money. Connie noted that the Harvard lawyers expressed some concern about the risk in declaring material public domain and out of copyright. How is public domain defined? Tom will check, since Harvard defines public domain as pre-1909 for non-American items.
H. • New Directions – Tom Garnett 15 minutes
Chris and Tom will meet with EOL component leaders and discuss how BHL supports the rest of EOL and what the directions will be for BHL. Tom wants feedback on his draft new directions slide reproduced here:
What Can the BHL Be?
– One-stop shopping for scientific biodiversity texts.
– Digital Preservation of the scholarly record in the field of biodiversity.
– A suite of services to facilitate the reuse of the BHL content especially by the EOL
– Embedding of textual content in the emerging knowledge ecology.
Three sets of players – practicing biologists, libraries, publishers.
Slide 8 New Directions
– BHL Portal must allow article-level access; ability to ingest, display, and download article level content.
– Field notebooks, gray literature and archival material need to be included.
– Selection of BHL scanning must be better coordinated with taxonomic community.
– Coordinated, public copyright verification process.
– Ability to interoperate with other major digitizing providers especially in Europe and Asia.
– Captcha experiment for OCR cleanup
– PLOS-like reviews/comments/edits of articles.
– Social networking tools to correct OCR
– Improved data mining
– Accelerate permissions from copyright holders
– Increase level of activity with publishers for article-level access and permissions.
– Renewed effort to gather more current materials.
Nancy noted that this is what sparked the strategic planning point. We should be focusing on scanning before moving in new directions. Graham partly agrees with Nancy, but we might need to note upcoming new directions for the report to MacArthur. Graham also suggested that we vote 0-10 using importance and urgency as the two voting categories. This kind of vote might help us narrow the focus quickly. It doesn't mean tossing out the others. Nancy noted that we are already working on some of these (article level), so we have tacit agreement on at least this. Judy noted that there are different levels and audiences. How do we improve content for users? What are all the new, cool things? What is needed for the libraries to continue?
Jim noted that the strategic planning process will help us prioritize. Add another dimension: integration with other components of EOL. A Booz and Associates consultant is helping EOL think through critical path analysis. What does EOL need to do to fulfill milestones and goals, and how do the parts fit together to make that happen? To aid this process, the consultant has sent some PowerPoint slides asking what a particular component would need to do to make a specific scenario happen. This should help to identify integration needs.
Graham asked what Tom needs to make progress. Tom can reformat the list to identify importance and urgency. Nancy noted that focusing on scanning, improving the interface, and articles are pretty urgent. This is about identifying priorities. Connie asked if we can add the integration idea to the list. Jim noted that a third column could be added: where does this fit in the integration of EOL components?
Nancy asked what EOL is expecting from us. Jim said first, CONTENT, but not miscellaneous content. We need targeted content. EOL needs to identify some areas for depth and then return to BHL. The second thing is the capability for people to use what we have digitized in species pages: we need a real capability to know what is available without having to look at each reference. We need a presentation that developers can use ("articalization" and other tools). Nancy noted that this points again to creating a really usable database as first priority. Jim noted that it isn't clear that anything on Tom's list goes beyond this except adding field notes and social networking tools. New directions?
I. • Next IC meeting – 5 minutes
Anything to add to previous comments: Lock down a date in March (2 days): a full day for IC and another day for BHL Architecture. Monthly conference calls will start in October. We might need to allocate more time for the March meeting if we want to have a face-to-face strategic planning meeting.
J. • Other Business
Judy noted that she and Connie are funded for the IMLS grant. They want to add a wiki section for managing the grant, but this would require adding some non-BHL members to the wiki. They requested permission. The group agreed that this is acceptable.