This is a read-only archive of the BHL Staff Wiki as it appeared on Sept 21, 2018.

Draft use cases ideas from the Berlin meeting

UseCases

Side meeting use cases, Berlin, Wednesday 24 Feb 2010

- Prepare, discuss and review use cases until 10 March (and collect other use cases)
- Publish on Wiki
- Set up 2 small expert groups (technology and taxonomy)
- Discuss on Day 2 in Vienna
- Align with User Survey results

Methodology

Prerequisite
Action
Expected outcome
Exceptions

User Groups

Librarians –
Checking if scan is there
Ingesting

Scientists
Login
Anonymous

General Public

Administrator
System
Functional

Europeana and other Portals

Active programmers

who - what - why

Cases


· admin-sys
catalogue contains more than just biodiversity literature - how to filter
Justification: BHL-E only needs biodiversity literature
- fwelter, Jun 10, 2010: I object here, for various reasons. The GRIB should contain the full records of all library catalogues, including literature that is not marked as biodiversity related.
(1) Much information was published in non-biodiversity related literature sources. These are usually the most difficult sources for the scientist to find, and removing them from the GRIB will create much additional workload. The GRIB needs to contain non-biodiversity related literature.
(2) Marking literature as biodiversity related is highly error-prone, because this work is often done by medium-skilled and low-paid staff. Removing large sections of library catalogues to exclude them from the GRIB will inevitably mean that a number of biodiversity related sources are removed as well.
(3) If users know that the catalogue is incomplete and they do not find what they were looking for, they will always feel that they cannot rely on the GRIB. Frequent users will quickly have this experience (a source not found in BHL but found in the original library catalogue of a BHL content provider) and understand the reason, and some might then begin telling others that the BHL catalogue is not reliable.
(4) Much unnecessary workload for the BHL team would be created by friendly users telling BHL that some "non-biodiversity" sources they found should be added to the GRIB.
(5) It is my experience that it is not a problem to search in a catalogue that contains 95 % non-biodiversity related literature. You usually search with biodiversity related search terms, which rarely hit non-biodiversity related literature in a results set.

· admin-sys
Feedback to the original data providers' catalogues on which literature has already been scanned or bid on
Justification: content providers need to avoid double work

· admin-sys
Usage statistics (web stats) at the level of the whole system, content provider, theme, and digitized work
Justification: periodic reporting duties of content providers; reporting of usage of system

· admin-sys / admin-funct / sci
Data quality control and assurance (GRIB)
Justification: reliability of source

· admin-sys / admin-funct
Search functionality of portal – Google-like search
Justification:

· admin-sys
Full automatic implementation of processes
Justification: too much time needed to manually update content (metadata, digital objects, etc.)

· gen / sci
All content (metadata and digitized – OCR text) that can be retrieved from the portal should be indexed by search engines (e.g. Google)
Justification: most customers find information via search engines; large projects (BHL, Catalogue of Life, GBIF) have not implemented such facilities – e.g. FishBase
adaptation to changing search strategies

· gen / sci / admin-funct
download of content in a simple way independent of the size of the package; this shall include metadata, different formats and resolution of images
Justification: make data available offline, and enabling the reuse in local desktop applications and transfer in local information systems

· gen / sci / admin-funct
quick information on whether a work has been digitized or is in the process of digitization
Justification: encourage process for digitization and make activities visible

· gen / sci / lib
search for Images i.e. figures, plates, illustrations and easy download
Justification: course materials for teaching (university and school levels)

· gen / sci / lib
Efficient and quick visualisation of digitized content online
text-only files within 1 second,
display of high-res images in less than 5 seconds;
several zoom options
Justification:

· gen / sci / lib
one portal for searching on digital content
Justification: users do not want to search in more than one place

· prog / sci
Stable and unique IDs for each digitized work / article / page
Justification: reference of special portals to digitized content of BHL-E

· sci / gen
metadata enhancement and social tagging
Justification: controlled access for accepted authorities to enhance metadata

· sci
feedback and error reporting
Justification: reliability of resource

· sci
names of taxa which are published in a certain work in original spelling, corrected spelling and reference to protologues / original publication
Justification:

· sci / lib
Access on article level
Justification: content of volumes is mixed but scientists need access to individual domain content

· sci / admin-funct
Names of Persons in full length not abbreviated
Justification: merging of databases, recognition of individual person

· sci / admin-sys / admin-funct
Intelligent linking capabilities / Interface to external global initiatives (Nomenclators – IPNI, Index Fungorum, Catalogue of Life, EoL, CBoL, etc.) (e.g. open URL – webservices)
Justification: visibility of BHL-E

· sci / admin-sys / admin-funct
stable URLs for Scan / Item
Justification: external websites link directly to BHL content

· sci / gen
Page level access data and images, including metadata and complete page numbers and plate numbers
Justification: daily taxonomic work - protologue basic reference

· sci / lib
Full citation of titles (i.e. serial names and monograph), standard abbreviation and variant citations
Justification: reliability of source
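The cases above ask for stable and unique IDs for each digitized work / article / page and for stable URLs that external websites can link to. A minimal sketch of one possible scheme follows. The base URL, the `work/…/article/…/page/…` layout, and the names `StableId` and `parse_stable_id` are assumptions made for illustration only; the notes do not specify an identifier format.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical base URL; the notes do not name one.
BASE_URL = "https://example.org/bhl-e"


@dataclass(frozen=True)
class StableId:
    """Hierarchical identifier: a work, optionally narrowed to an article and/or page."""
    work: str
    article: Optional[str] = None
    page: Optional[int] = None

    def __str__(self) -> str:
        parts = [f"work/{self.work}"]
        if self.article is not None:
            parts.append(f"article/{self.article}")
        if self.page is not None:
            parts.append(f"page/{self.page}")
        return "/".join(parts)

    def url(self) -> str:
        # The URL is derived only from the identifier, so it stays stable
        # as long as the identifier itself does not change.
        return f"{BASE_URL}/{self}"


def parse_stable_id(text: str) -> StableId:
    """Parse 'work/<id>[/article/<id>][/page/<n>]' back into a StableId."""
    tokens = text.strip("/").split("/")
    if len(tokens) % 2 != 0:
        raise ValueError(f"malformed identifier: {text!r}")
    fields = dict(zip(tokens[0::2], tokens[1::2]))
    if "work" not in fields or set(fields) - {"work", "article", "page"}:
        raise ValueError(f"malformed identifier: {text!r}")
    page = int(fields["page"]) if "page" in fields else None
    return StableId(fields["work"], fields.get("article"), page)
```

Because the identifier can be parsed back from its string form, an external portal could store only the string and still resolve it to a work, article, or page.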





linking from outside to bhl-content - search route via name finding tools

relevant initiatives that will make direct use of the bhl-content
GPI (Global Plants Initiative ) http://www.bores.org
GBIF (Global Biodiversity Information Facility) http://www.gbif.org
EOL (Encyclopedia of Life) http://www.eol.org
KeyToNature http://www.keytonature.eu/
TROPICOS http://www.tropicos.org
IPNI http://www.ipni.org
Catalogue of Life http://www.catalogueoflife.org/search.php

providing webservices
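
The cases list mentions OpenURL as one way for external initiatives to link into BHL content. As a rough sketch, a link of that kind could be built as below. The resolver address and the function name `build_openurl` are hypothetical; the `url_ver`, `rft_val_fmt` and `rft.*` keys follow the OpenURL (Z39.88-2004) key/value format for journals, and the notes do not say which keys a BHL-E resolver would accept.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; the notes do not name one.
RESOLVER_BASE = "https://example.org/openurl"


def build_openurl(journal_title, volume=None, start_page=None, year=None):
    """Build an OpenURL query for a journal citation, omitting unknown parts."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.jtitle", journal_title),
    ]
    optional = {"rft.volume": volume, "rft.spage": start_page, "rft.date": year}
    # Only include the citation parts the caller actually knows.
    params += [(key, str(value)) for key, value in optional.items() if value is not None]
    return f"{RESOLVER_BASE}?{urlencode(params)}"
```

A name-finding tool that knows only a journal title and a page could call `build_openurl("Example Journal", start_page=12)` and leave the rest to the resolver.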



webportal

fulltext search

page level access

export functionality - DC / MARC / etc.

duplicate catalogue entries (duplication on purpose) for one
expose only one instance of a journal or monograph - but show the instances in the libraries with their varying titles and indicate available digitized versions – portal must deal with visualization !
differing editions of the same work (example: Species plantarum ed 1 Uppsala, Wien, etc…) - frbr approach !!
some thoughts, sources, and implementations:
http://www.frbr.org/
http://www.frbr.org/categories/ifla
http://www.bnf.fr/pages/version_anglaise/normes/no-acFRBR_gb.htm
https://www.gbv.de/vgm/info/mitglieder/02Verbund/01Erschliessung/01Aktuelles/2008/2008_3612
Open Library (http://openlibrary.org/); note, the current version does not show the grouping of various editions under titles; this new display will be available in later February
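
The idea of exposing only one instance of a work while still showing the library instances behind it can be sketched as a simple grouping step. This is only an illustration under strong assumptions: the record fields (`author`, `title`, `library`, `digitized`) and the function names are hypothetical, and matching on a normalised author/title key is a stand-in for real FRBR work identification, which would also have to cope with genuinely varying titles and editions.

```python
import re
from collections import defaultdict


def work_key(author: str, title: str) -> str:
    """Crude matching key: lower-case, drop punctuation, collapse whitespace."""
    def norm(text: str) -> str:
        return " ".join(re.sub(r"[^\w\s]", "", text.lower()).split())
    return f"{norm(author)}|{norm(title)}"


def group_by_work(records):
    """Collapse catalogue records into one entry per work, keeping all instances."""
    grouped = defaultdict(list)
    for record in records:
        grouped[work_key(record["author"], record["title"])].append(record)
    return [
        {
            "display_title": instances[0]["title"],
            "instances": instances,  # per-library records, titles kept as catalogued
            "digitized": any(r.get("digitized", False) for r in instances),
        }
        for instances in grouped.values()
    ]


records = [
    {"author": "Linnaeus, C.", "title": "Species plantarum", "library": "A", "digitized": True},
    {"author": "Linnaeus C", "title": "Species Plantarum.", "library": "B", "digitized": False},
    {"author": "Linnaeus, C.", "title": "Systema naturae", "library": "A", "digitized": False},
]
works = group_by_work(records)
```

The portal would then list each entry in `works` once and show its `instances` on request, flagging the entry when any instance has a digitized version.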



document handling and usage

embed OCR text into the digitized page so that a user can copy and paste the text for reuse

handling of bad-quality OCR - feedback mechanisms?

Crowd sourcing – social tagging

attach information to a text page, image, or document (cf. comment pages in AnimalBase)

add publication dates to fascicles where a publication was issued in parts
edition in fascicles - missing wrappers in bound item

Inserting additional pages in case those are missing in originally digitized work
reasons:
missing parts – e.g. text scanned but plates missing
hand coloured plates (different quality of coloration) differing versions of coloration in different copies - only add plates / issue of differing metadata ?


reporting

feedback forms - multilingual aspects - librarians working things out (!?)


images

Key to Nature has two major use cases - http://www.keytonature.eu/wiki/Key_to_Nature_-_BHL_Europe_use_cases
transcription of the text from their website as follows
  1. a user of an identification key has a function: "find further images", displaying images from a reasonably reliable resource (not Google Images).
    • Here we need either very reliable metadata for the image, or need to display the image in the context of the original page, so that a possible mismatch will be evident to non-expert users.
  2. a creator of an identification key needs to illustrate a key and searches for additional or better images.
    • Here we want to integrate content from BHL/BHL-Europe with appropriate attribution, into identification keys. We are therefore very interested in open content licensing (much of our own content is or will be cc-by-sa).
comments:



added value

integration with ontological services – Finnish ontology - http://www.seco.tkk.fi/ontologies/


integration of external services

PersonNameDatabases -
establishing / initiating collaboration with authority institutions
NACO (Library of Congress Initiative) http://www.loc.gov/catdir/pcc/naco/instappl.html
Personennamendatei (PND) http://www.d-nb.de/standardisierung/normdateien/pnd_info.htm
ask them to set up Z39.50 / SOAP / JSON services

Red Lists – IUCN Red List