BHL
Archive
This is a read-only archive of the BHL Staff Wiki as it appeared on Sept 21, 2018.

Penn State's Serial Solving Status

Notes from the BHL/IA/Penn State phone call, 15 January 2007
Brewster K. (IA), Tom G. (BHL), Xiaonan L. (Penn State), Cathy N. (MBLWHOI/BHL), John M. (NYBG), Suzanne P. (SI), Casey N. (IA)

Tomorrow's meeting (January 16th/17th) with IA, Chris F. (BHL/MoBot), and Martin K. (SI) will cover scanning and workflow topics.

Phone Discussion Points:
Download problems seem to have improved, but response times for DjVu and PDF files may still be slow.

Page numbers and page counts: IA has files that Xiaonan Lu can use to verify page counts and page numbers (image counts and printed page numbering). An example:
http://ia350635.us.archive.org/zipview.php?zip=/1/items/journalofnatural01lond/scandata.zip
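
As a rough illustration of how such a file might be used to check image counts against printed page numbering, here is a minimal Python sketch. It assumes the zip contains a scandata XML file with a declared leaf count and one page element per scanned image. The element and attribute names used below (bookData, leafCount, pageData, page, leafNum, pageNumber) are assumptions about that layout, and the sample data is hand-written for illustration, not taken from the journalofnatural01lond item.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample. The element names are assumptions
# about the scandata layout, not a confirmed schema.
SAMPLE_SCANDATA = """<book>
  <bookData>
    <leafCount>4</leafCount>
  </bookData>
  <pageData>
    <page leafNum="0"><pageType>Cover</pageType></page>
    <page leafNum="1"><pageType>Title</pageType></page>
    <page leafNum="2"><pageType>Normal</pageType><pageNumber>1</pageNumber></page>
    <page leafNum="3"><pageType>Normal</pageType><pageNumber>2</pageNumber></page>
  </pageData>
</book>"""


def summarize_scandata(xml_text):
    """Return the declared leaf count, the number of page elements
    (scanned images), and a map of leaf number -> printed page number."""
    root = ET.fromstring(xml_text)
    declared = root.findtext("bookData/leafCount")
    pages = root.findall("pageData/page")
    printed = {}
    for page in pages:
        number = page.findtext("pageNumber")
        # Leaves such as covers may carry no printed page number.
        if number is not None and number.strip():
            printed[int(page.get("leafNum"))] = number.strip()
    return {
        "declared_leaf_count": int(declared) if declared else None,
        "image_count": len(pages),
        "printed_numbers": printed,
    }


def counts_agree(summary):
    """True when the declared leaf count matches the number of page elements."""
    return summary["declared_leaf_count"] == summary["image_count"]


if __name__ == "__main__":
    summary = summarize_scandata(SAMPLE_SCANDATA)
    print(summary)
    print("counts agree:", counts_agree(summary))
```

Printed page numbers are kept as strings because serial volumes often carry roman numerals or other non-numeric page labels.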

A workflow needs to be set up so that Xiaonan knows which files to pull from IA for her system besides the djvu.xml files. SI will work with Penn State to figure out how to get more files tested.

Review of article recognition quality has begun, and examples of problems were sent to Xiaonan. SI will finish the document and look at the form Xiaonan sent earlier. More data will be sent, and another file will be identified for testing.

Tom mentioned that BioOne has agreed to share data with BHL and that the data will follow the NLM DTD. We should aim to make this information compatible, either through the same mapping or a mappable one. There was basic agreement that output from Penn State should eventually follow the NLM DTD.
(http://dtd.nlm.nih.gov/)

LC is interested in structural data of monographs. They will have one scribe station where they will begin to test some ideas on capturing data. SI will follow up with LC to find out who is working on this and share the information from BHL. Brewster mentioned that Microsoft and Google are doing some structure recognition on monographs.


Brewster feels that the material is not being used enough through the IA site. He would like people to start working on ideas and activities to increase traffic to the site and get the material used more.

*Schedule for the next call to be determined by email.

Older material:
PennState_IA_report_0328.ppt
PennState_3908800908003701smitrich_djvu.xml.meta (this is an XML file)
PennState_FormatedXML.doc
July 25, 2007, from Xiaonan Lu [xlu@cse.psu.edu]
http://powderhorn.ist.psu.edu/IA_demo/index.php
The above address went down in August.
Starting in September, the following address is being used:

Wang19.ist.psu.edu/IA_demo

Penn State Style Form
Data that Penn State would like to receive for serials that produce errors when we attempt to submit them:
PennStateStyle_form.txt

Testing at Internet Archive:
http://www.archive.org/details/annalsmagazineof15lond
Click the 'contents' link in the left navigation bar (under PDF).