bhlstaffcallnov212013
Dial 1-877-860-3058 and enter the passcode 961479
Agenda:
- Gemini updates & upcoming training sessions proposed for: (Jackie & Bianca)
- Monday Nov 25 at 10am ET
- Monday Nov 25 at 11am ET
- Tuesday Nov 26 at 2pm ET
- Tuesday Nov 26 at 3pm ET
- Round Robin (all)
- Scheduling for staff calls Dec-Feb (Bianca)
- Art of Life (Trish)
- Technical update (William)
- Clustering of Article Data - Mike has been fine-tuning the clustering algorithm that takes the titles and groups them with related ones. See the "Same As:" in http://www.biodiversitylibrary.org/part/4925#/summary
- View Taxa Name Sources: In the Bibliography by Taxon page, we now have a link to Name Sources at the top. The services we are providing now allow us to indicate where the names come from. For example, look at http://www.biodiversitylibrary.org/namedetail/Inga_alba
- OAI-PMH: The Open Archives Initiative Protocol for Metadata Harvesting is being implemented for our portal, so information from other repositories, like Pensoft, AMNH, SciELO and Real Jardín Botánico, could be integrated.
- OLEF: Thanks to Mike Lichtenberg's effort and Wolfgang Koller's assistance, BHL US/UK has now added OLEF as a format output to our existing OAI-PMH API to generate and serve the metadata on the fly. This should facilitate our metadata harmonization with BHL Global Partners. You can try the following examples (from the Developer Tools page in the public wiki: http://biodivlib.wikispaces.com/Developer+Tools+and+API ). Thanks to our colleagues from BHL-Europe and BHL-Egypt, who kindly contributed by developing and improving this format to help promote our common interest of sharing content among us, BHL Nodes.
- http://www.biodiversitylibrary.org/oai?verb=ListIdentifiers&metadataPrefix=olef&set=title&from=2009-02-01&until=2009-02-04
- http://www.biodiversitylibrary.org/oai?verb=ListRecords&metadataPrefix=olef&set=title&from=2009-02-01&until=2009-02-04
- http://www.biodiversitylibrary.org/oai?verb=GetRecord&metadataPrefix=olef&identifier=oai:biodiversitylibrary.org:title/2
- http://www.biodiversitylibrary.org/oai?verb=ListRecords&metadataPrefix=olef&set=part&from=2013-08-27&until=2013-08-28
- Code on Github: Administrative site has been separated from the main code, so Portal code is now available in Github (https://github.com/gbhl/bhl-us). Working on a new API version.
- Presentations
- Participating in the TDWG Meeting by the end of this month.
- DLF Presentation/mini-Hackathon by Conni
- Coming up:
- NESCent-EOL-BHL Research Sprint - http://eol.org/info/509
Mining EoL (http://eol.org) and the BHL (http://www.biodiversitylibrary.org) to address questions about the ecology and evolution of biodiversity. Teaming up biologists with informaticians in teams of 2 to work at the National Evolutionary Synthesis Center for 4 days on Feb. 3-7.
- Duplicated authors - Gemini issue 31808. Working on the issue of apparently duplicated authors. For example, see the few extra Jonstonus's in the BHL record at: www.biodiversitylibrary.org/bibliography/64232.
- Collections Committee: Figure out what we want to display and what the values in the different 700 fields mean. Jackie and Suzanne will meet with BHL staffers about how we want this to look, come up with specs, and think about search.
- Tech Team: We will turn off the duplication check while we work on the definitive solution by end of January.
- Final answer: In January, we will work with Australia to update the duplicated metadata in place and then make the changes to the DB to allow keeping it in sync. Then turn the "dup safety" back on to ensure data integrity.
- GNA - As you might recall, at our September meeting we explained the problems we were having with on-the-fly queries of the services for mining scientific names in a page (bhlstaffcallsep192013). They have installed new servers, giving us a dedicated installation, and we are now testing their services on these new servers.
- IMLS - Just met for a pre-kick-off meeting, coming up: converting those nice paragraphs and blocks and arrows into real algorithms.
- Reminder: 9th Int'l Conference on Open Repositories (OR2014), 9-13 June 2014 in Helsinki, Finland; proposals due Feb 3, 2014
Attending
Tomoko Steen, Michael Neubert, Diana Duncan, Diana Shih, Matt Person, Don Wheeler, Jackie Chapman, David Iggulden, Randy Smith, Chris Cardin and Joe DeVeer, Trish Rose-Sandler, Becky Morin, Alison Harding, Carolyn Sheffield, Daria Wingreen-Mason, Martin Kalfatovic, John Mignault, Cathy Buckwalter, JJ Ford, Marty Schlabach, William Ulate, Kevin Nolan, Bianca Crowley
Notes
Gemini (Jackie)
Please take a look at the open Gemini issues, sorted by creation date.
Some issues from 2010 and 2012 are still open, so please start by taking a look at those.
If you can take care of some of those that are assigned to you, great. Otherwise, please re-assign to Bianca and Jackie.
A new release of IA scanning funds came out a couple of weeks ago, and those funds are available on a first-come, first-served basis. As of November 20, there was $17,551.72 available, and the period of performance for the contract runs through July 31, 2014. Carolyn will add this information to the monthly highlights and reminders that she sends out.
Gemini Training Sessions (Bianca)
Next session will be scheduled for Monday morning at 10am
Round Robin
Field Diana – no report, volunteer still paginating
MBLWHOI Matt – working on Gemini things
Harvard Bot JJ – HUH is good. Tagging along w/ MCZ, putting together a shipment in the next couple of weeks. Also doing Flickr uploads. And pagination w/ help from Keiko(?).
Kew David – added some items to spreadsheet and arranging when to send over to NHM
Cornell Marty – we’re continuing to select items to scan in areas of entomology, sending offsite, almost ready to send 100 volumes. Hoping to get into Gemini issues after holidays. Excited about moving forward with portion of IMLS grant. Planning to send seed catalogs to vendor in Montreal for scanning; also investigating possibility of scanning in-house using tabletop IA Scribe... Working on est. relationship w/ IA. Gemini requests for special handling items = on campus digitization and expensive, requesting funding for this
MBG Randy – pretty much status quo. Two people scanning, going through Gemini requests, and still progressing with Engelmann correspondence scans. About halfway through now
MCZ Joe and Chris – doing well, same as last month. Combining shipments with HUH. Participating in IMLS = Work with Brewster field notebooks and start using crowdsourcing for transcription
AMNH Diana Shih – We sent a shipment in Oct; it has been scanned and is on its way back. Gathering for another shipment soon. About 125 volumes in previous shipment.
LC Michael and Tomoko – Made a spreadsheet to figure out whether to scan items or not, and whether BHL already has them. MN attended IA meeting in Boston: the new tabletop version of the Scribe is a work in progress. Basic idea: you buy a device that communicates through wifi and can upload the images it produces; processing is all done by folks in San Fran, at a per-page price that is a fraction of what we pay now. Interest in having devices used in more places; devices in place now don’t move easily. Still trying to figure out how to handle in-copyright materials for which BHL has permission. Google is in bounds for fair use.
Marty – IA tabletop scanner is of interest to us, too. An in-library group charges to do scanning for us.
Don, NYBG – updating copyright statements for 400 items that didn’t have statements. Working on sending shipment. About 38 volumes, might grow. Also participating in IMLS with nursery catalogs, cataloging all of these as serials with holdings statements; over 6000 already in the NYBG catalog and about 1700 will be scanned.
Marty – when we were putting together this grant proposal, there were maybe two dozen in BHL, and since then there are more than 500, ingested from NAL material that is in IA. Hoping to set up a collection in BHL for these materials
Kevin – 600 seed catalogs on NYBG site
CAS Becky – Has a dozen items ready to go and two dozen rescans. They’ll be opening a small scanning center in San Fran since the scanning center burnt down. So waiting til the week after Thanksgiving to hear what they can accommodate
NHM Alison – Just going along as normal. Cleaning up some metadata now due to new IA scanning tech errors and should be back to scanning this week.
SIL Jackie – Scribe in house now up and running. Monique and Stefan are helping with scanning. Cataloging some serials to go into BHL; working on US Ex Ex, some of which is already in BHL now.
ANSP Cathy – ANSP sent out a shipment and when those come back we’ll be sending out more. Some staff changes but back on track. Looking forward to Macaw. SIL reports that ANSP items are making it through their queue.
Beta testing of web-based Macaw to happen at NYBG and MCZ soon...
Logistics
Canceling December Staff Call
Will resume in January
Carolyn will lead January staff call, Diana the February. They’ll be sending out reminders so keep an eye out for those.
Art of Life (Trish)
We had a two-day face-to-face meeting on Art of Life. Hopefully in the next month we’ll know how many images we have, and we have space on the IA server. Then we’ll do some basic grouping and classification of those pages with images: is it a map, a chart, a color illustration, and so forth. We’re thinking maybe over a million may have images. The aim is to help users browse. Working with Joel, who has built a separate module in Macaw; maybe in January, with some training, we’ll be pushing out to a crowdsourcing platform. Still deciding on criteria for classifying, maybe some tagging parties. Not sure how many simultaneous users Macaw can have. If you or your volunteers are interested, we’re happy to get you involved. Trish will send out an email for others to distribute.
One thing we’ll be able to do in Macaw is correct false positives: when the algorithm marks something as an image when it’s not, we can create
Tech Update (William)
(See also Technical Update in Agenda section of this page)
Mike has worked on the clustering algorithm which takes titles and groups them into related ones. Grabs titles that seem related by name for BioStor content ingested already. You can see an example in the Summary page of segment 4925 (http://www.biodiversitylibrary.org/part/4925#/summary). There is an example on the wiki. Information comes from BioStor, now we can look at those that are related.
We are currently looking into the case of other titles, for example, titles of sections like "Proceedings of Learned Societies" (http://www.biodiversitylibrary.org/search?searchTerm=Proceedings+of+Learned+Societies#/sections) that seem the same but have totally different page numbers, volumes and dates and might be in different books or items but are related because of the title. So those also will come up eventually.
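As a rough illustration of grouping sections "because of the title", here is a minimal sketch that clusters sections whose titles match after simple normalization. It is not Mike's clustering algorithm, which these notes do not describe; the function names, the normalization rule, and the section IDs are all hypothetical.

```python
import re
from collections import defaultdict


def normalize_title(title):
    """Lowercase the title, replace punctuation with spaces, trim the ends."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", title.lower())
    return cleaned.strip()


def cluster_by_title(sections):
    """Group (section_id, title) pairs whose normalized titles are identical."""
    clusters = defaultdict(list)
    for section_id, title in sections:
        clusters[normalize_title(title)].append(section_id)
    # Keep only groups that actually relate two or more sections.
    return {key: ids for key, ids in clusters.items() if len(ids) > 1}


# Hypothetical section IDs; only the first title comes from the notes above.
sections = [
    (1, "Proceedings of Learned Societies"),
    (2, "Proceedings of learned societies."),
    (3, "Notes on Inga alba"),
]
related = cluster_by_title(sections)
print(related)
```

Exact matching on a normalized key only catches near-identical titles; titles that are related but worded differently would need some form of fuzzy matching.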
Bibliography of a Taxon page – we now have a link to name sources. The work we did with the Global Name Architecture project allows us to show what were the sources of names along with some information that new services are providing. For example, look at the bibliography page of a species like Inga alba (http://www.biodiversitylibrary.org/name/Inga_alba) where you will see at the top of page, next to the link with the icon of EOL, the option to "View Name Sources" which will take you to the new page (see http://www.biodiversitylibrary.org/namedetail/Inga_alba).
Providers for OAI-PMH harvesting – trusted sources like SciELO.
Thanks to Mike Lichtenberg and BHL Europe for facilitating the generation of metadata in the OLEF format. Updated BHL external wiki page to show new formats. From our own harvesting services.
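The OLEF example requests listed in the agenda all follow the standard OAI-PMH pattern of a verb plus arguments. A minimal sketch of building those URLs, assuming the endpoint and argument values shown in the agenda examples; the helper name `build_oai_url` is hypothetical, and no request is actually sent:

```python
from urllib.parse import urlencode

# Endpoint taken from the agenda examples.
BASE_URL = "http://www.biodiversitylibrary.org/oai"


def build_oai_url(verb, arguments):
    """Return an OAI-PMH request URL for a verb and its arguments."""
    query = {"verb": verb, **arguments}
    # Leave ':' and '/' unescaped so identifiers read like the agenda examples.
    return f"{BASE_URL}?{urlencode(query, safe=':/')}"


# Titles changed in a date range, in OLEF format.
list_records_url = build_oai_url(
    "ListRecords",
    {"metadataPrefix": "olef", "set": "title",
     "from": "2009-02-01", "until": "2009-02-04"},
)

# One record by its OAI identifier.
get_record_url = build_oai_url(
    "GetRecord",
    {"metadataPrefix": "olef",
     "identifier": "oai:biodiversitylibrary.org:title/2"},
)

print(list_records_url)
print(get_record_url)
```

Changing `set` to `part` or the verb to `ListIdentifiers` gives the other request shapes from the agenda.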
Put code of portal in GitHub
Still working on new API version but that will come later next year.
Coming up, there is a NESCent-EOL-BHL research sprint that will look at mining BHL and EOL. Teams will work to answer big research questions using mined data. Proposals are being reviewed.
Working on deduplicating authors. Meeting with Suzanne and Jackie to look at that. The idea is to figure out how we want to display the different values that 700 fields have, so that they are not repeated one after the other in BHL. In the meantime, the check for duplicated authors is turned off. Hope to restore it in January.
Conference on Open Repositories in Helsinki in June (Trish)
Connie, Jane, and Trish were interested. Jane might be able to attend. Trish will coordinate with them and hopefully whoever can attend will take the lead on that proposal. Would be a good fit for BHL.