Penn State Serial Solving Status
Notes from the BHL/IA/Penn State phone call, 15 January 2007
Brewster K. (IA), Tom G. (BHL), Xiaonan L. (Penn State), Cathy N. (MBLWHOI/BHL), John M. (NYBG), Suzanne P. (SI), Casey N. (IA)
Tomorrow's meeting (January 16th/17th) with IA, Chris F. (BHL/MoBot), and Martin K. (SI) will cover scanning and workflow topics.
- issues relating to scanning BHL materials (IA)
- metadata issues on scanned materials (all, but historically largely Smithsonian)
Phone Discussion Points:
- issues relating to getting the structural metadata for journals (Penn State)
Download problems seem to have improved, but response times for DjVu and PDF files may still be slow.
Page numbers and page counts: IA has files that Xiaonan Lu can use to verify page counts and page numbers (image counts and printed page numbering). An example:
http://ia350635.us.archive.org/zipview.php?zip=/1/items/journalofnatural01lond/scandata.zip
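The page check described above could be sketched roughly as follows. This is only an illustration, not Penn State's or IA's actual code: the sample XML is made up, and the element names (pageData, page, leafNum, pageNumber) are an assumption about what the scandata file looks like and should be checked against a real file from the zip.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down sample. The element names are an assumption
# about the scandata layout, not copied from a real IA file.
SAMPLE_SCANDATA = """<book>
  <pageData>
    <page leafNum="0"><pageType>Cover</pageType></page>
    <page leafNum="1"><pageType>Title</pageType></page>
    <page leafNum="2"><pageType>Normal</pageType><pageNumber>1</pageNumber></page>
    <page leafNum="3"><pageType>Normal</pageType><pageNumber>2</pageNumber></page>
    <page leafNum="4"><pageType>Normal</pageType><pageNumber>4</pageNumber></page>
  </pageData>
</book>"""


def summarize_pages(scandata_xml):
    """Return (image_count, {leaf_number: printed_page_label})."""
    root = ET.fromstring(scandata_xml)
    pages = root.findall("./pageData/page")
    printed = {}
    for page in pages:
        label = page.findtext("pageNumber")
        if label is not None and label.strip():
            printed[int(page.get("leafNum"))] = label.strip()
    return len(pages), printed


def find_gaps(printed):
    """List (leaf, expected, found) wherever a numeric printed page number
    does not follow the previous numbered leaf by exactly one."""
    gaps = []
    previous = None
    for leaf in sorted(printed):
        label = printed[leaf]
        if not label.isdigit():
            # Roman numerals or other labels reset the running sequence.
            previous = None
            continue
        current = int(label)
        if previous is not None and current != previous + 1:
            gaps.append((leaf, previous + 1, current))
        previous = current
    return gaps


image_count, printed_numbers = summarize_pages(SAMPLE_SCANDATA)
print(image_count, printed_numbers, find_gaps(printed_numbers))
```

The image count can then be compared with the number of pages found in the djvu.xml file, and any reported gaps reviewed by hand, since a gap may be a missing scan or simply an unnumbered plate.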
A workflow needs to be set up so that Xiaonan knows which files to pull from IA besides the djvu.xml files for her system. SI will work with Penn State to figure out how to proceed in getting more files tested.
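One possible starting point for such a workflow is a small helper that lists the candidate files for an item. This is a sketch under stated assumptions: the download base URL and the file-name suffixes below are guesses at IA naming conventions and were not agreed on the call, and the example earlier in these notes shows the scan data delivered as scandata.zip instead.

```python
from urllib.parse import quote

# Assumed base URL for item downloads; confirm with IA before relying on it.
DOWNLOAD_BASE = "http://www.archive.org/download"

# Hypothetical list of per-item files Penn State might want to pull.
CANDIDATE_SUFFIXES = ["_djvu.xml", "_scandata.xml", "_meta.xml"]


def candidate_urls(identifier, suffixes=CANDIDATE_SUFFIXES):
    """Build one candidate URL per suffix for the given item identifier."""
    safe_id = quote(identifier, safe="")
    return [f"{DOWNLOAD_BASE}/{safe_id}/{safe_id}{suffix}" for suffix in suffixes]


for url in candidate_urls("journalofnatural01lond"):
    print(url)
```

Keeping the suffix list in one place means the set of files to pull can be changed once SI and Penn State settle on what is needed.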
Article recognition quality review has begun and examples of problems were sent to Xiaonan. SI will finish the document and look at the form Xiaonan sent before. More data will be sent and another file will be identified to test.
Tom mentioned that BioOne has agreed to share data with BHL, and that this data will follow the NLM DTD. We should aim to make our information compatible: either the same mapping or mappable. There was basic agreement that output from Penn State should eventually follow the NLM DTD.
(http://dtd.nlm.nih.gov/)
LC is interested in structural data for monographs. They will have one Scribe station where they will begin testing some ideas on capturing data. SI will follow up with LC to find out who is working on this and share the information from BHL. Brewster mentioned that Microsoft and Google are doing some structure recognition on monographs.
- presentation issues (I hope this is a bigger issue!)
Brewster feels that the material is not being used enough through the IA site. He would like people to start working on ideas and activities to increase traffic to his site and get material used more.
*Next call: schedule to be determined by email.
Older material:
PennState_IA_report_0328.ppt
PennState_3908800908003701smitrich_djvu.xml.meta (This is an xml file)
PennState_FormatedXML.doc
July 25, 2007, from Xiaonan Lu
[xlu@cse.psu.edu]
http://powderhorn.ist.psu.edu/IA_demo/index.php
The above address went down in August.
Starting in September the following address is being used:
Wang19.ist.psu.edu/IA_demo
- username: bhl
- password: secret0802
Penn State Style Form
Data that Penn State would like to receive about serials that produce errors when we attempt to submit them
PennStateStyle_form.txt
Testing at Internet Archive:
http://www.archive.org/details/annalsmagazineof15lond
Click the 'contents' link in the left nav bar (under PDF).