May 5 2014
Tech Review meeting 2014_05_05
Present: Trish, William, Mike
Absent: Bianca
Art of Life
The to-do actions from last week’s meeting are still on Kyle and Joel’s plates. The main item is to resolve the issue with IA, which has been down for short periods, though this doesn’t seem to have affected the running algorithms.
Trish is revising the classification documentation. 2 volunteers have now started classifying. About 6k pages done. Will upload the 2nd batch of 10k to Macaw this week.
Purposeful Gaming
Heard back from Games Learning Society – they are interested so that makes 8 companies that will bid.
Didn’t hear back from Jiri or Simon on ideas for other companies. Trish still has to follow up with Zooniverse. Rob Guralnick suggested contacting Laura Whyte (laura@zooniverse.org).
FromThePage – Mike L. worked with Mike W. last week and worked through the problems he was having – it should be ready this week. Next steps – Mike L. said he would transcribe a few pages and test the exporter this week.
OCR generation from Botanicus. We need to confirm whether it’s working (Mike L. indicated it has been giving errors for quite a while now) and can be used for PG. Mike L. did some investigation but hasn’t been able to verify things with Mike B., who has been out of the office. It ran for 2 weeks in mid-April, but we’re not sure beyond that. Maybe somewhat functional.
Mining Biodiversity
Meeting this Friday. Meeting in London last week for UK folks with their funder, JISC. They will present a schedule this week. Haven’t heard from the Canadians, though, who are working on the n-gram algorithm to improve the OCR text. Contract with SIL – William still needs to contact Cyndy Parr (from EOL) and do a purchase order. A schedule is needed first to define the timeline for the Community Manager and Science Advisor.
BHL
RJB OAI feed – William talked with them about creating OAI sets to remove external content. They are looking into it but haven’t given a date. William hasn’t asked them to separate titles yet. If they can get external content separated out, how do we harvest it? Do we have to delete stuff? Mike says that when the new sets are available he could search for things we got before but no longer get, and “unpublish” them.
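A minimal sketch of the comparison Mike described, assuming we have the identifiers harvested earlier and the identifiers in the new sets. The function name and the sample identifiers are hypothetical, for illustration only; the actual harvester may work differently.

```python
def find_items_to_unpublish(previously_harvested, currently_in_sets):
    """Return identifiers we harvested before that no longer appear in the new sets."""
    # Set difference: everything seen earlier minus everything still offered.
    return sorted(set(previously_harvested) - set(currently_in_sets))


# Hypothetical identifiers, not real RJB records.
before = ["rjb-001", "rjb-002", "rjb-003"]
now = ["rjb-001", "rjb-003"]

for identifier in find_items_to_unpublish(before, now):
    print("unpublish:", identifier)
```

Items flagged this way would be unpublished rather than deleted, so the step could be reversed if a record reappears in a later harvest.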
Status of tasks from TAG meeting in STL in early April
- Citebank sunsetting plan – done. Trish has started testing importer.
- Full text search – William is working on a plan. Waiting for Mike W. to give a timeline on the Solr setup tomorrow.
- Article metadata into Macaw – nothing from Joel on this
- Australia copy of database – done
- Contact Rod Page about keeping user generated PDFs – done
- Project outcomes document – page for each project that summarizes goals, timeline, BHL benefit, etc. (due May 31st) Trish and William will do a prototype for one project this week to determine structure
BHL OAI sets (filtering out externally hosted content) - Mike is about to deploy
User generated PDFs – got Rod’s feedback and sent it to the larger TAG team. Rod didn’t seem to think keeping PDFs was all that useful. Currently we keep them for 60 days. This topic caused confusion in the TAG meeting – the discussion was supposed to be about the Citebank contributed content and not whether to store user generated PDFs. Mike’s questions are more about preservation of Citebank PDFs – storing and backing up. William will set up a call with the TAG group to work through the issues.
Hiccup last Friday on the PDF Generator. Jackie reported it this morning. Not sure whether other stuff was running on the server. Mike W. is on vacation today. William will check with Bill Behrns. Mike L. looked at the server logs but couldn’t find anything.
Gemini update – June 16th date for upgrade