Content Enhancement Tasks
See also
Gaming page on public wiki
BHL Content Enhancement Tasks
Pagination*
BHL receives page-level metadata through its scanning collaboration with Internet Archive and the institutions that share their scans with BHL. In some cases the metadata contains no page numbers and no indication of the page type (e.g. Table of Contents, Text, Cover, Illustration, Map, Blank), which makes navigating those books nearly impossible without a time-consuming page-by-page review.
Current Contributors: JJ, Gilbert Borrego
Tools used for task: Macaw, Paginator
Benefit: improves end user searching within BHL portal
The Intern/Volunteer discussion of 2/29/2012 decided that pagination was the best focus of work; see Intern Volunteer Coordination
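As a rough illustration of the gap described above, the sketch below scans a list of page records and reports which ones lack a page number or a recognized page type. The record layout (`sequence`, `page_number`, `page_type`) and the `find_gaps` helper are hypothetical simplifications, not the actual BHL, Macaw, or Internet Archive schema.

```python
from dataclasses import dataclass
from typing import Optional

# Page types named in the task description above.
PAGE_TYPES = {"Table of Contents", "Text", "Cover", "Illustration", "Map", "Blank"}

@dataclass
class PageRecord:
    sequence: int                # position of the scan within the book
    page_number: Optional[str]   # printed page number, e.g. "iv" or "12"
    page_type: Optional[str]     # one of PAGE_TYPES, or None if unknown

def find_gaps(pages):
    """Return the scan positions that are missing a page number or a known type."""
    no_number = [p.sequence for p in pages if not p.page_number]
    no_type = [p.sequence for p in pages if p.page_type not in PAGE_TYPES]
    return {"no_number": no_number, "no_type": no_type}

# Made-up sample records for illustration only.
sample = [
    PageRecord(1, None, "Cover"),
    PageRecord(2, None, None),
    PageRecord(3, "1", "Text"),
    PageRecord(4, "2", None),
]
gaps = find_gaps(sample)
print(gaps)
```

Covers and blank pages often have no printed number, so a reviewer would probably treat the `no_number` list as candidates to check rather than as errors.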
Article-ization
BHL already has a UI that allows users to select non-contiguous pages from a scanned volume and bundle those pages into a PDF that is created on the fly and delivered to them via e-mail notification. This is conceptually similar to Table of Contents rekeying, but differs because not all published journals have Tables of Contents, and in historic literature not all pages of an article were printed together (plates were often printed at the end of an issue).
Current Contributors: any BHL user can generate articles and have them added to Citebank, as long as they supply a Title and Author for the article; Rod Page, via uBio
Tools used for task: PDF Generator, uBio
Benefit: provides end users with article level access that they desire, connects content in Citebank with content in BHL
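To make the idea of a non-contiguous selection concrete, here is a minimal sketch of an article record whose pages need not be adjacent, such as text pages plus plates printed at the end of an issue. The `Article` class and the `parse_page_selection` helper are hypothetical and are not part of the BHL PDF Generator; the required title and author mirror the Citebank condition noted above, and the sample values are invented.

```python
from dataclasses import dataclass, field

def parse_page_selection(spec):
    """Expand a selection such as "12-15, 201, 203" into a sorted page list."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"bad range: {part}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)

@dataclass
class Article:
    title: str
    author: str
    pages: list = field(default_factory=list)

    def __post_init__(self):
        # Both fields are mandatory before an article can be recorded.
        if not self.title.strip() or not self.author.strip():
            raise ValueError("an article needs both a title and an author")

# Text on scans 12-15, plates at scans 201 and 203 (made-up example).
article = Article("On a new species", "A. Author",
                  parse_page_selection("12-15, 201, 203"))
print(article.pages)
```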
Image identification & extraction
BHL has coordinate-based OCR for nearly all of its scanned pages. We'd like to automatically identify the objects within a scanned page that are a "visual resource" (e.g. figures, plates, illuminated texts, tables) and then provide a way to rekey the caption or other descriptive information.
Current Contributors: Gilbert Borrego
Tools used for task: Paginator
Benefit: would provide an easier way to browse illustrations and plates in BHL books and journals than the very manual process of browsing page by page in the current viewer. Images could be more easily extracted from the BHL portal and incorporated into other image-related portals like Flickr. Flickr is a great way to reach new audiences for BHL, particularly those in non-science disciplines who are heavy users of illustrations.
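One simple first-pass heuristic that coordinate-based OCR makes possible is to flag pages where recognized words cover very little of the page area, since such pages are more likely to hold a plate or figure. The sketch below assumes word boxes given as (left, top, right, bottom) pixel tuples and an arbitrary 10% threshold; both are illustrative assumptions rather than BHL's method, overlaps between boxes are ignored, and locating the individual objects within a page would need more than this.

```python
def text_coverage(word_boxes, page_width, page_height):
    """Fraction of the page area covered by OCR word boxes (overlaps ignored)."""
    page_area = page_width * page_height
    if page_area <= 0:
        raise ValueError("page dimensions must be positive")
    covered = sum(max(0, r - l) * max(0, b - t) for (l, t, r, b) in word_boxes)
    return min(1.0, covered / page_area)

def looks_like_illustration(word_boxes, page_width, page_height, threshold=0.10):
    """True when text covers less than `threshold` of the page (assumed cutoff)."""
    return text_coverage(word_boxes, page_width, page_height) < threshold

# Made-up pages, both 1200 x 2000 pixels.
plate_page = [(100, 1900, 400, 1940)]  # a single caption line
text_page = [(100, y, 1100, y + 40) for y in range(100, 1900, 60)]  # 30 text lines

print(looks_like_illustration(plate_page, 1200, 2000))
print(looks_like_illustration(text_page, 1200, 2000))
```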
Adding scientific names & common names
Related to the two tasks above, BHL staff manually identify illustrations and set page types for particular books of interest and upload those images to Flickr, where they can be tagged & indexed by others within the Flickr community. Of particular interest to our users is being able to find illustrations by scientific name and by common name. When we add a "machine tag" with a scientific name to an image in Flickr, that image is then indexed by the Encyclopedia of Life and made available to its large community of users.
Current Contributors: Gilbert Borrego, Flickr users
Tools used for task: Flickr
About BHL stream and instructions for contributing
http://www.flickr.com/people/biodivlibrary/
BioDivLibrary’s photostream
http://www.flickr.com/photos/biodivlibrary/sets/
Benefit: further promotion of BHL content, connects BHL content with content from EOL
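Flickr machine tags follow a namespace:predicate=value pattern. The sketch below formats such a tag for a scientific or common name. The `taxonomy:binomial` and `taxonomy:common` names used here are assumptions that should be checked against the contribution instructions linked above, and `machine_tag` is a hypothetical helper, not part of any Flickr library.

```python
def machine_tag(namespace, predicate, value):
    """Format a Flickr-style machine tag: namespace:predicate=value."""
    value = " ".join(value.split())  # collapse stray whitespace
    if not (namespace and predicate and value):
        raise ValueError("namespace, predicate and value are all required")
    if " " in value:
        # Multi-word values are quoted so they stay a single tag when typed in.
        value = f'"{value}"'
    return f"{namespace}:{predicate}={value}"

print(machine_tag("taxonomy", "binomial", "Danaus plexippus"))
print(machine_tag("taxonomy", "common", "monarch"))
```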
Related
Wikimedia Commons (related to image identification above)
http://commons.wikimedia.org/wiki/Commons:Biodiversity_Heritage_Library
This is a project initiated outside of BHL by Gaurav Vaidya, a graduate student at the University of Colorado, Boulder, working on biodiversity informatics
http://www.ggvaidya.com/
He has tagged 200 images from BHL (as of 2/23/12), fixed their copyright status, and linked them back to the BHL website (linking at the item level, not the page level). It is described as: "This project hopes to facilitate a partnership between the BHL and the Wikimedia Commons to their mutual benefit. In order to convince the BHL to commit scarce resources to this task, we want to start by showing them the value of the Wikimedia Commons as a facilitator in making their content widely available and as a way to enhance their brand."
The files:
http://commons.wikimedia.org/wiki/Category:Files_from_the_Biodiversity_Heritage_Library
Template for tagging BHL files:
http://commons.wikimedia.org/wiki/Template:Biodiversity_Heritage_Library
Current Contributors: Gaurav Vaidya
Benefit: further promotes BHL content, puts BHL images into other portals, may drive more traffic back to BHL
Wikipedia
BHL & Wikipedia
Wikipedia links to BHL content
http://en.wikipedia.org/w/index.php?title=Special:LinkSearch&target=http%3A%2F%2F*.biodiversitylibrary.org&limit=500&offset=0
http://linkypedia.info/websites/34/pages/
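The first link above is a Special:LinkSearch query with its parameters URL-encoded. As a small sketch, the same kind of query can be assembled for any target pattern or result offset with Python's standard library. The `linksearch_url` helper is hypothetical; the parameter names (`title`, `target`, `limit`, `offset`) are taken from the URL above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def linksearch_url(target, limit=500, offset=0, wiki="http://en.wikipedia.org"):
    """Build a Special:LinkSearch URL for pages linking to `target`."""
    query = urlencode({
        "title": "Special:LinkSearch",
        "target": target,
        "limit": limit,
        "offset": offset,
    })
    return f"{wiki}/w/index.php?{query}"

url = linksearch_url("http://*.biodiversitylibrary.org")
print(url)

# Decoding the query string recovers the original parameters.
params = parse_qs(urlsplit(url).query)
print(params["target"][0])
```

Raising `offset` in steps of `limit` would page through results beyond the first 500.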
Engaging Wikipedians
GLAM-Wikian: Sarah Stierch, formerly Wikipedian in Residence at the Archives of American Art and now at the Smithsonian Archives (StierchS@SI.EDU).
http://en.wikipedia.org/wiki/User:SarahStierch
She is part of the GLAM-Wiki initiative
http://en.wikipedia.org/wiki/Wikipedia:GLAM/US
Ways of engaging Wikipedians: edit-a-thons
http://www.nypl.org/locations/tid/55/node/134716?lref=55%2Fcalendar
Questions
What about folks uploading citations to Citebank? Does this activity fit in with content enhancement?