Purposeful gaming IMLS grant project page
This is the internal staff wiki page for the IMLS grant called Purposeful Gaming, which runs Dec 1, 2013-Nov 30, 2015.
For the public wiki page for Purposeful Gaming grant see
http://biodivlib.wikispaces.com/Purposeful+Gaming
listserv
Send messages to
bhl-purposeful-gaming-l@cornell.edu
Current members of this e-list:
All Project Staff Monthly Meeting minutes
meeting minutes 2015_10_14.docx
meeting minutes 2015_09_09.docx
meeting minutes 2015_08_25.docx
meeting minutes 2015_06_10.docx
face to face meeting May 12 2015
meeting minutes 2015_04_08.docx
meeting minutes 2015_03_11.docx
meeting minutes 2015_02_11.docx
meeting minutes 2015_01_13.docx
meeting minutes 2014_12_10.docx
meeting minutes 2014_11_12.docx
meeting minutes 2014_10_08.docx
meeting minutes 2014_09_10.docx
meeting minutes 2014_07_09.docx
meeting minutes 2014_06_11.docx
meeting minutes 2014_05_14.docx
meeting minutes 2014_04_09.docx
meeting minutes 2014_03_05.docx
face to face meeting 2014_02_11 final notes.docx
meeting minues 2014_01_08.docx
meeting minutes 2013_12_10.docx
meeting minutes 2013_11_20.docx
NYBG and Cornell (& NAL) meetings
seed & nurs cat digitizing mtg 2015.08.04.doc
seed & nurs cat digitizing mtg 2015.02.10.doc
seed & nurs cat digitizing mtg 2014.12.9.doc
seed & nurs cat digitizing mtg 2014.10.14.doc
seed & nurs cat digitizing mtg 2014.07.08.doc
seed & nurs cat digitizing mtg 2014.01.21.doc
Tiltfactor and MOBOT meetings
Dartmouth Interim Narrative report Year 1.pdf
Tiltfactor meeting 2014_08_20.docx
Meeting with Tiltfactor 2014_08_07.docx
Outreach & Communications Plan
PurposefulGamingOutreachCommunicationPlan.docx
- last updated 12/15/14 by Patrick Randall
7 Gaming Companies that expressed interest and were sent RFP
Transcription Tools
Transcription Tool Assessments
A transcription tool task force was formed to assess the tools above and make a final decision. The team included: William Ulate, Mike Lichtenberg, Trish Rose-Sandler, John Mignault, Joel Ricard, Joe DeVeer. Francis and Jenna from the TAG team were asked to join, but Jenna said she did not have time and there was no response from Francis.
Meeting minutes:
transcription tool meeting 2014_03_26.docx
TRAnscription tool meeting 2014_03_19.docx
Transcription tool meeting 2014_03_12.docx
transciption tool meeting 2014-03-05.docx
Transcription meeting 2014_02_26.docx
Smithsonian Transcription Tool
https://transcription.si.edu/
Creator/Organization: Smithsonian Institution
Advantages:
- Good navigation
- User friendly
- Books go through a defined process (Start, Review, Complete)
- Stats and progress nicely visualized
- Registered users can both transcribe and review (project staff make final approval)
Limitations:
- Code not yet released for general distribution
- No text markup
FromThePage
http://fromthepage.com/
Creator/Organization: Ben Brumfield (benwbrum@gmail.com)
Documentation:
https://github.com/benwbrum/fromthepage/wiki/_pages
Code URL:
http://github.com/benwbrum/fromthepage/wiki
Platform: Ruby on Rails
Advantages:
- Can easily import page images from Internet Archive to hosted platform
- Good navigation
- Semantic mark-up for indexing and creation of subject tags (useful to us?)
- Visualization graphic based on relationships between subject tags
- XML exporting
- TEI supported for transcription exports
- Used by SDNHM and MVZ
Limitations:
- Can host up to 100 images for free after which there is a charge. Charge determined on an individual basis
- No formal review process enabled by tool
Open source: yes
Crowdsourcing: yes
User-friendly: yes
Admin oversight: yes
Admin editing: yes
Page coordinate data: no
Sustainability: questionable (run by a single person), but currently under active development
Transcribe Bentham
http://www.transcribe-bentham.da.ulcc.ac.uk/td/Transcribe_Bentham
Creator/Organization: University of London Computer Centre/UCL Bentham Project
Contact: Dr. Tim Causer (t.causer@ucl.ac.uk)
Code URL:
https://github.com/onothimagen/cbp-transcription-desk
Platform: MediaWiki
Advantages:
- Full TEI mark-up support and toolbar
- Rotate images in viewer
- Extensive revision history
Limitations:
- Navigation not as intuitive as it could be
- Code difficult to install
- Can't get structured data out
- Poor review process - via email to editors
Open source: yes
Crowdsourcing: yes
User-friendly: yes, but not as intuitive as some of the other tools
Admin oversight: yes, though some manual checking and updating of the website is required
Admin editing: yes
Page coordinate data: no, but working on this for another project called tranScriptorium (http://transcriptorium.dsic.upv.es/), which is developing software to automatically transcribe handwritten manuscripts
Sustainability: Has support of University of London Computer Centre. Project has been successfully crowdsourcing since October 2010 (over 6,700 complex manuscripts transcribed by volunteers).
Atlas of Living Australia Biodiversity Volunteer Portal
http://volunteer.ala.org.au/
Creator/Organization: Atlas of Living Australia/Australian Museum
Contact: Paul Flemons (paul.flemons@austmus.gov.au)
Code URL:
https://code.google.com/p/ala-volunteer/source/checkout
Platform: Postgres/Java/Grails/Apache
Advantages:
- Special fields for species occurrence data (useful to us?)
- Rotate images in viewer
- Tool supports formal review process
Limitations:
- Navigation within books a bit tricky
- No text markup
- Transcription box is below page image - not as user friendly
Open source: yes
Crowdsourcing: yes
User-friendly: yes, but navigation within books a bit tricky
Admin oversight: yes
Admin editing: yes
Page coordinate data: no
Sustainability: Supported by Australian Museum and implemented by Atlas of Living Australia
Transcribr (National Archives Transcription Pilot Project)
http://www.archives.gov/citizen-archivist/transcribe/
Creator/Organization: NARA
Code URL:
https://drupal.org/project/transcribe_distribution
Platform: Drupal
Advantages:
- Difficulty rating (beginner, intermediate, advanced)
- Browse for documents by difficulty, year, and transcription status
Limitations:
- Not a limitation of the tool itself, but there is currently very little content available on the site for trial purposes
Open source: yes
Crowdsourcing: yes
User-friendly: yes
Admin editing: yes
Page coordinate data: no
Sustainability: Developed by U.S. National Archives. Under active development. 20 sites using the current distribution
T-PEN
http://t-pen.org/TPEN/
Creator/Organization: St. Louis University Center for Digital Theology
Code URL:
https://github.com/jginther/T-PEN
Platform: Java/Javascript
Advantages:
- Line/column page coordinates?
- Each line can be appended with notes
- Editor has tool for insertion of unusual characters
- Work is auto-saved on the fly
- PDF and XML exports
Limitations:
- Uploads to hosted instance must be under 200mb
- Can't handle vertical writing
- The least intuitive of all the tools investigated - considerable learning curve
- Charge by SLU for further development
Open source: yes
Crowdsourcing: yes
User-friendly: less so than the other tools, steeper learning curve
Page coordinate data: possibly at line-level (tool has ability to link transcription data to lines of text in image)
Sustainability: Developed and maintained at St. Louis University, project funded by Mellon and NEH
Text to Image Linking Tools
These are tools that will give coordinate information to help associate a transcribed word on a page to where it is located within the page image.
Mike L's General notes:
- Any tool that requires manual linking of words in transcriptions to parts of a page scan will be VERY time-consuming to use. I think it is a safe assumption that this includes anything that TiltFactor might develop for us.
- There is a lot of uncertainty around the first option, but it is likely that using any of the tools will require preprocessing of the transcriptions (to get them into the appropriate format for the image-linking tool) and post-processing of the tool output (to get it into the appropriate game input format).
Text to Image Linking Tool (TILT2)
Tiltfactor's assessment
● Automatic
● In development by University of Queensland Australia
● May or may not be ready in time
● Output format unknown
● The idea won a competition at British Library
● Can be tested here:
http://ecdosis.net/tilt/test/post
The Text to Image Linking Tool (TILT2) is an automated method of linking text to scanned facsimiles. They are about a month into development now, and have a working prototype that can be run on some test digitizations; it can be found here: http://ecdosis.net/tilt/test/post. It’s still a little rough (the linking doesn’t work perfectly), but it’s very likely that the tool will become more reliable and robust over the coming months. There aren’t any samples of what the output might look like.
TILT project:
http://bltilt.blogspot.co.uk/
They have a working demo of stuff they have uploaded:
http://ecdosis.net/tilt/test/post
And a github page:
https://github.com/AustESE-Infrastructure/TILT
Mike L's assessment
- Automatic linking of images and text!
- Produces polygons rather than rectangular bounding boxes. (Probably why TiltFactor was pushing for polygons in the game inputs.)
- Very rough. Their demo site didn’t work great, but it is better than nothing. I envision workflow to be 1) auto-process page, 2) manually clean up mistakes in automated output, 3) save/export output.
- Import and export formats are undefined.
- Shows promise. IF this is ready on time, and IF the imports/exports are easily usable, it looks like it will be a good tool.
- Core of the tool appears to be written in Java, but the GitHub site hasn’t been updated in more than two years.
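The polygon point above can be illustrated with a minimal sketch. TILT2's export format is undefined, so this assumes a polygon arrives as a list of (x, y) pixel pairs; the function name and data shape are hypothetical. It reduces a polygon to the rectangle that encloses it, in case the game inputs end up needing rectangular bounding boxes instead:

```python
def polygon_to_bbox(points):
    """Return (left, top, right, bottom) enclosing a polygon.

    `points` is an assumed shape: a list of (x, y) pairs. TILT2's real
    output format is not known.
    """
    if not points:
        raise ValueError("polygon has no points")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

Going this direction loses the polygon's shape, so it would only make sense if the game cannot accept polygons directly.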
TextGrid
Tiltfactor's assessment
● Manual text/image linking
● Requires downloading software and creating accounts
● Difficult to use
● Difficult to work with output data
TextGrid’s Text Image Link Editor (TILE) is a manual text/image linking editor. In order to use it you need to set up TextGrid, for which you must have an account manually verified. The tool itself is somewhat difficult to use, and the data output format is not easily manipulable.
TextGrid’s TILE:
https://dev2.dariah.eu/wiki/display/TextGrid/Text+Image+Link+Editor
Mike L's assessment
- Initially difficult to understand how to use the tool.
- Allows linking of individual words to parts of a page scan.
- Input text must be in XML format. FromThePage output is already in an XML format; ALA output is not.
- The tool modifies the input files, adding “anchor” tags. It then outputs a separate file that contains bounding box information that relates to the “anchor” tags added to the input file.
- The bounding box coordinates do not use the same unit of measurement as the OCR outputs (DJVU and Tesseract). It is not clear what unit is being used. Needs to be understood before useful game inputs can be produced.
- I envision the workflow to be 1) Convert input files to XML (if necessary), 2) Load into the tool, 3) Use the tool to manually tie words to the image, 4) Output the data from the tool, 5) Transform the tool outputs into the game input format.
- Proprietary desktop application.
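Once the tool's unit of measurement is understood, the mismatch with the OCR coordinates might be handled along these lines. This is a sketch only: it assumes the tool's coordinates are relative to some known page width and height (for example percentages of the page), which has not been confirmed, and the function name is made up for illustration.

```python
def rescale_bbox(bbox, src_size, dst_size):
    """Map (left, top, right, bottom) from one coordinate space to another.

    src_size and dst_size are the (width, height) of the same page in the
    tool's units and in OCR pixels respectively. Both are assumed to be
    known; the tool's actual units are still unclear.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    if src_w <= 0 or src_h <= 0:
        raise ValueError("source size must be positive")
    sx = dst_w / src_w  # horizontal scale factor
    sy = dst_h / src_h  # vertical scale factor
    left, top, right, bottom = bbox
    return (round(left * sx), round(top * sy),
            round(right * sx), round(bottom * sy))
```

For instance, if the tool turned out to use percentages, rescale_bbox(box, (100, 100), (page_width_px, page_height_px)) would give pixel coordinates comparable to the DJVU and Tesseract outputs.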
TILE Text Image Linking Environment
Tiltfactor's assessment
● Manual tool
● Easy to use, easy to access
● Exports to JSON
● Does not easily link selected text from manuscript to image areas; instead links LINES, and allows ANNOTATION of individual areas
Text-image linking environment is a tool that comes out of a collaboration between faculty at Maryland Institute for Technology in the Humanities and Indiana University. This is a manual text image linking tool, but unlike TextGrid’s TILE, text-image linking environment is easily accessible and built in javascript. The main drawbacks of text-image linking environment appear to be that it’s built to identify lines and annotate individual locations in the facsimile, not link them to individual words.
text-image linking environment:
http://mith.umd.edu/tile/
Mike L's assessment
- Input text must be transformed into a format that the tool understands (TEI XML or JSON). No documentation of these formats is provided (though TEI is a standard). The transformation could be challenging.
- The JSON output of the tool would need to be transformed to the game input format. This looks like it would be relatively simple.
- Not clear what units are used for bounding box coordinates.
- Can be installed locally.
- Only allows lines to be tied to an image, not individual words. We would need to modify the tool in order to use it to identify words.
- Likely workflow… 1) Convert input files to TEI/JSON, 2) Load into the tool, 3) Use the tool to manually tie lines (words?) to the image, 4) Output the data from the tool, 5) Transform the tool outputs into the game input format.
- PHP and Javascript
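The final workflow step (transforming the tool's JSON output into the game input format) might look roughly like the sketch below. Neither format is documented, so every field name here ("lines", "text", "shape", "x", "y", "width", "height", "bbox") is an assumption made for illustration, not the tool's actual schema.

```python
import json


def tile_lines_to_game_input(tile_json):
    """Convert assumed TILE-style line records into assumed game records.

    Input (hypothetical):  {"lines": [{"text": ..., "shape": {x, y, width, height}}]}
    Output (hypothetical): [{"text": ..., "bbox": [left, top, right, bottom]}]
    """
    data = json.loads(tile_json)
    records = []
    for line in data.get("lines", []):
        shape = line.get("shape")
        if shape is None:
            continue  # skip lines that were never linked to the image
        left, top = shape["x"], shape["y"]
        records.append({
            "text": line["text"],
            "bbox": [left, top, left + shape["width"], top + shape["height"]],
        })
    return records
```

This works at the line level only, matching what the tool provides today; word-level boxes would still require modifying the tool as noted above.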
Aletheia
Recommended by Tim Causer
t.causer@ucl.ac.uk
Our colleagues on the tranScriptorium project have been working on the capturing of co-ordinates for particular image sectors, and have created a dataset which is now publicly downloadable from the project website at http://transcriptorium.dsic.upv.es. This hasn’t yet been incorporated into our crowdsourcing testing platform, but should be in the next couple of months.
They use a tool called Aletheia, developed by the University of Salford, which lets you mark up manuscript images in a format called PAGE XML, which may be of some interest. From memory, I believe this is an open-source tool, and is extremely user-friendly.
Notes from Nature tool
doi: 10.3897/zookeys.209.3472
Games We Like
https://itunes.apple.com/us/app/type-rider/id667443268
(Trish reviewed - this mobile app is beautifully designed and is aimed at teaching the history of typography. It's mostly about moving 3 little balls around in different landscapes and helping them get to safety. You are given access to books about typography when you reach certain levels. The audio and visuals really come together in this game and keep me wanting to play. I also like that there are different control settings for moving the objects; I like the tilt option, where you tilt your screen in the direction you want it to go.)