Sept 30 2013
Meeting Sept 30 2013
Present: Trish, Mike L, William, Bianca
Art of Life
Running algorithm
MBL – figured out a workaround: not using ImageMagick, using Kakadu instead, writing out to a file and reading from there – might be slower.
IA – William has draft back to Hank at IA and is having Kyle review first
IMA – Trish sent email to Kyle to see if they can run
If we split work between 3 places, how should we assign pages to the 3 institutions?
Run it on all BHL pages; each page has to have OCR.
What happens if it finds a field notebook? (Everything has OCR, even handwritten texts.)
41 million pages? The list of all the identifiers is online and at item level. How is it ordered? Create a list of items and split it into 3 segments.
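One possible way to do the 3-way split, as a minimal sketch only: it assumes the item-level identifier list has already been loaded into a Python list in whatever order the online list uses. The function name and the placeholder identifiers are made up for illustration. Note that it balances item counts, not page counts, so the page load per institution would still vary.

```python
def split_items(item_ids, institutions):
    """Split an ordered list of item identifiers into contiguous,
    near-equal segments, one per institution."""
    n = len(institutions)
    base, extra = divmod(len(item_ids), n)
    segments = {}
    start = 0
    for i, name in enumerate(institutions):
        # The first `extra` institutions each take one additional item.
        size = base + (1 if i < extra else 0)
        segments[name] = item_ids[start:start + size]
        start += size
    return segments


# Placeholder identifiers, not real BHL item IDs.
item_ids = [f"item{i:04d}" for i in range(10)]
segments = split_items(item_ids, ["MBL", "IA", "IMA"])

for name, ids in segments.items():
    print(name, len(ids), ids[0], ids[-1])
```

Contiguous segments keep each institution's share easy to describe (first ID to last ID). If the online list turns out to be ordered in a way that clusters large items together, a round-robin assignment might balance better.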
Face to face meeting – William’s attendance is confirmed
Who from Woods Hole is attending? Trish to follow up with Rob on flight.
From Kyle we need the code and instructions for installing it. Need a meeting this week with Kyle, Trish and William to talk about next steps.
Joel/Macaw – needs latest JSON files from IMA.
Meeting last week with MBL
They will work on the algorithm ONLY until the face to face meeting in November. We would need more money for them if we needed them to work past November 12th
Global Names - they were surprised Dima was spending time on this. Both Martin and Bob expressed lack of interest in GNS. Requires some of Anthony’s time.
EOL and BHL agreed to work together to get projects e.g. Field Guide for ? Moving cluster to Smithsonian. MBL plan on turning that off eventually.
Budget scenarios – EC has upcoming meeting Oct 7th, final approval by members in ?
Send email to BHL staff with quick blurb on details of IMLS.
Bianca wants narrative and budget docs
Bianca
User-generated PDFs – are we still working on that? Mike will work on it when he has time. If anyone has feedback on the author question, let Mike know.
Metadata file from Alvin
Mike was able to match up all BHL titles and volumes against Alvin’s dataset. Next step: run it through the OpenURL procedures to resolve pages that way. Bianca sent email to Martin asking what the bigger picture is for why we are doing this. Probably only useful for the DOI info. There are probably other BHL partners who would want to send us their stuff this way, e.g. AMNH Novitates. Question to consider: do we let BioStor provide it to us, or do we want to set up a process for other partners to give us article-ized content?
DLF mini hackathon – what data is still needed?
William contacted folks with info on titles and taxa, e.g. ZooBank, IPNI, Index Fungorum, EOL common names, Plazi (treatments). What are the subject gaps in BHL? Compare what we have in the collection with the subjects that define the collection. Problem: is there no published content in that area, or have we not digitized it? So if we have bibliographies for all published literature, that helps.
LCSH subject headings only go to the level of genus, not species. Haven’t gotten data back yet. Deadline? Need it in the next week. William will volunteer to compile all this data together.
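The gap question above (published but not digitized) could be checked roughly as follows. This is a sketch under assumptions: it supposes a bibliography of published titles for a subject and a list of titles already in BHL are both available as plain lists of strings, and that a case-insensitive exact match is good enough. Real title matching would likely need something fuzzier. The function name and example titles are placeholders.

```python
def find_gaps(bibliography_titles, digitized_titles):
    """Return bibliography titles that have no match among digitized titles.

    Matching is a naive normalised comparison (trimmed, lower-cased).
    """
    digitized = {t.strip().lower() for t in digitized_titles}
    return [t for t in bibliography_titles if t.strip().lower() not in digitized]


# Placeholder titles for illustration only.
bibliography = ["Example Title A", "Example Title B", "Example Title C"]
in_bhl = ["example title a", "Example Title C "]

gaps = find_gaps(bibliography, in_bhl)
print(gaps)
```

An empty result for a subject would suggest the published literature is covered. A subject with no bibliography entries at all would point to the other case, where nothing was published in that area.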
Trish’s list of North American trees
Latest plan is outlined in Google doc and just bring datasets that are available
Trees of NA list is great
Once we get full list from providers we can extract just Trees of North America areas or Marine Mammals
One of the datasets we need is a list of all volumes/items that have at least one page containing species name “Pinus banksiana”
Title ID/TitleName/Volume/Date/SpeciesName/nameBankID/subject headings
Title and subject for volumes with “Pinus banksiana”
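A rough sketch of how that dataset could be pulled together, assuming the providers' data arrives as page-level records with one row per name occurrence. The field names follow the column list above but their exact spelling, the function name and all sample values are assumptions for illustration, not real BHL data.

```python
COLUMNS = ["TitleID", "TitleName", "Volume", "Date",
           "SpeciesName", "NameBankID", "SubjectHeadings"]


def items_with_species(page_records, species_name):
    """Return one row per volume/item that has at least one page
    containing the given species name."""
    seen = set()
    rows = []
    for rec in page_records:
        if rec["SpeciesName"] != species_name:
            continue
        key = (rec["TitleID"], rec["Volume"])
        if key in seen:
            continue  # this volume is already listed
        seen.add(key)
        rows.append({col: rec[col] for col in COLUMNS})
    return rows


# Made-up page-level records; IDs and headings are placeholders.
pages = [
    {"TitleID": 1, "TitleName": "Example Title A", "Volume": "v.1", "Date": "1900",
     "SpeciesName": "Pinus banksiana", "NameBankID": "placeholder-1",
     "SubjectHeadings": "Trees", "PageID": 101},
    {"TitleID": 1, "TitleName": "Example Title A", "Volume": "v.1", "Date": "1900",
     "SpeciesName": "Pinus banksiana", "NameBankID": "placeholder-1",
     "SubjectHeadings": "Trees", "PageID": 102},
    {"TitleID": 2, "TitleName": "Example Title B", "Volume": "v.3", "Date": "1912",
     "SpeciesName": "Pinus banksiana", "NameBankID": "placeholder-1",
     "SubjectHeadings": "Forests", "PageID": 201},
    {"TitleID": 2, "TitleName": "Example Title B", "Volume": "v.4", "Date": "1913",
     "SpeciesName": "Pinus strobus", "NameBankID": "placeholder-2",
     "SubjectHeadings": "Forests", "PageID": 301},
]

rows = items_with_species(pages, "Pinus banksiana")
for row in rows:
    print(row["TitleID"], row["TitleName"], row["Volume"], row["SubjectHeadings"])
```

The same function could be run over a list of names (for example a Trees of North America list) to build the title and subject summary per species.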