TechCall_09may2016
Agenda Items
- SciELO asked for "Graphic" page type
- Further discussion pertaining to changing the Contributor field
- BHL Move Update
- Methods or schemas for importing Segment metadata into Macaw (CSV/XML/JATS)
- Hyperlinking items in a monographic series
- New fields in Item table
- Priority review (EABL, DB changes, Full Text Search, Taking over the world)
Notes
SciELO asked for "Graphic" page type
First question is, do we want to consider this as part of our current tech priorities?
Currently, 4 or 5 page types exist in Macaw and not in BHL.
This would add Graphic to the existing:
white card
color card
specimen
bibliography
suppressed = don’t show it at all; IA might be picking up on suppressed already
BHL currently does nothing with these when they come in
how did SciELO define the Graphic page type?
No definition provided
Joel will see how often people are using specimen and bibliography
Joel to see if we can get some examples from SciELO
Further discussion pertaining to changing the Contributor field
Susan had an initial conversation with Bianca following our last call, where we discussed adding Contributor-related fields at the Item and Segment level. For segments, we discussed the need for something parallel to the Item-level contributors, with support for multiple contributors (e.g., BHL Australia and BioStor) and a way to distinguish between the roles of each contributor. Susan will follow up with Bianca to propose names and definitions for the roles.
We'll revisit the Contributor at the Title level next week after we've all reviewed the email that Mike sent.
BHL Move Update
Joel submitted the last pieces of requested information. Currently waiting for them to complete review of that new information. No known changes at this point.
Methods or schemas for importing Segment metadata into Macaw (CSV/XML/JATS)
Received some change requests to Macaw for segments
When partners are importing new items to BHL and already have segment metadata available, how to handle?
Trish suggested csv and xml
On the fence about developing/providing new functionality for this because it's still unclear how many would actually be using it. BHL Australia is one. Are there others?
Susan - None of the contributors have expressed an interest in this
Trish - at last EABL meeting, in Patrick's summary he mentioned 5 or 6 that have articles but not sure what was meant by that. Article metadata for content already in BHL? Scanned at article level? New content for which article metadata is already available?
Marty has asked several times how to get article metadata into BHL
Susan and Trish have been discussing this with Rod: if someone has a spreadsheet of articles (as a csv, for example), Rod can process it through BioStor and send it to BHL
Mike - just use a csv, it's simplest and Rod has said he can work with that
Should Joel implement a csv import first?
Agree that we don’t think we need xml, could always use a transform anyway
Mike -- might consider using tab-separated rather than comma-separated, because commas in the data can get messy
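To illustrate Mike's point about commas in the data, here is a minimal sketch using Python's standard csv module. The field names (title, author, start_page) and the sample row are assumptions made up for illustration, not a Macaw or BioStor schema. It shows that a comma-separated file still works when fields are quoted and parsed properly, while a naive split on commas breaks, and that a tab-separated file avoids the problem for this kind of data.

```python
import csv
import io

# Hypothetical article row; field names are illustrative only, not a Macaw schema.
fields = ["title", "author", "start_page"]
rows = [
    {"title": "Notes on ferns, mosses, and lichens", "author": "Smith, J.", "start_page": "12"},
]

def round_trip(rows, delimiter):
    """Write rows with the given delimiter, then parse them back with the csv module."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, delimiter=delimiter)
    writer.writeheader()
    writer.writerows(rows)
    text = buf.getvalue()
    buf.seek(0)
    parsed = [dict(r) for r in csv.DictReader(buf, delimiter=delimiter)]
    return text, parsed

csv_text, csv_rows = round_trip(rows, ",")
tsv_text, tsv_rows = round_trip(rows, "\t")

# Naive splitting on the delimiter, as a quick script or spreadsheet export might do.
naive_csv = csv_text.splitlines()[1].split(",")
naive_tsv = tsv_text.splitlines()[1].split("\t")

print(len(naive_csv), "pieces from a naive comma split")
print(len(naive_tsv), "pieces from a naive tab split")
```

Either format round-trips correctly through a real parser; the tab-separated version is simply more forgiving when a partner's file is produced or read by tools that do not quote fields.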
Susan had some thoughts, Trish sent over Australia’s article level metadata.
Turns out most of what they have is articles in English, all have one author, metadata is very stripped down.
Hard to support article metadata fields; the full set of fields is so complex.
For someone with more complex metadata (e.g., foreign languages, diacritics, multiple authors), using the GUI and interacting with Macaw to enter the full title can be very cumbersome and prone to introducing incorrect data
Beginning to think we don’t need to support all of that from the GUI and should rather focus on low-hanging fruit: what can we already accomplish with what we have? Australia is the simplest use case imaginable
Many can be taken from title level info from Macaw
As far as container goes
Optional ones explode the number of fields we have
Australia metadata is pretty minimal
Maybe we see what we absolutely need
Asking someone to accurately enter the name of a scientific article that includes scientific names
Many opportunities to introduce mistakes
Minimal support for articles in Macaw
Using a csv file and going through BioStor
If a csv file exists it is probably from a professional indexer or bibliographer or scientist who has invested a lot of time to get the title right
At NYBG, Macaw data enterers are primarily volunteers
It may be up to each institution how much citation detail it wants to enter; we would need something to identify
Joel will lay out changes he would need to do in Macaw, how they’d be represented in UI
Would only be about 5-7 fields. Also could have some of the optional fields available too.
DOIs need to be available, too. So container info would need to be entered for things that do not exist in BHL.
They’re entering this as something being added to Macaw
What if we could upload container info first, then start adding articles that become affiliated with the container?
Containers have to exist in Macaw first, through whatever means.
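The "container first, then articles" idea could be sketched roughly as below. This is only an illustration of the rule that a container must exist before articles are attached to it; the Container class, the container_id key, and the sample data are all hypothetical and do not reflect Macaw's actual data model.

```python
from dataclasses import dataclass, field

@dataclass
class Container:
    # Hypothetical stand-in for a container (journal volume, item) already in Macaw.
    container_id: str
    title: str
    articles: list = field(default_factory=list)

def attach_articles(containers, article_rows):
    """Attach each article row to its container.

    Rows that point at a container that does not exist yet are returned
    so they can be reported instead of being silently imported.
    """
    rejected = []
    for row in article_rows:
        container = containers.get(row["container_id"])
        if container is None:
            rejected.append(row)
        else:
            container.articles.append(row)
    return rejected

# Made-up sample data for illustration.
containers = {"c1": Container("c1", "Example Journal, v.1")}
article_rows = [
    {"container_id": "c1", "title": "First article"},
    {"container_id": "c9", "title": "Article with no container yet"},
]

rejected = attach_articles(containers, article_rows)
print(len(containers["c1"].articles), "attached;", len(rejected), "rejected")
```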
Susan -- one example to look at might be Pensoft — Zookeys, all xml files available online
Joel will take a look
Trish - talked to Rod about how often he downloads BHL data. He started with a complete download and now just grabs the ones he's interested in via the API. He looks on Sundays when new items are loaded. There are lots of items he considers irrelevant to his needs, so he doesn’t really look at those.
Susan - I think we need to ask more questions about potential for duplication. I don’t think he’s making decisions in absence of data. He pulls in what he doesn’t have but needs. I'd still want to ask about potential duplication.
Hyperlinking items in a monographic series
AMNH interested in treating field notes like a multi-volume set
Basically a series-like grouping that collects them all together
Series - used for statistics, Judy's preference. She felt that using a collection didn’t accomplish what she needed
Table this until we can get some clarification on what people would like to do and why they prefer certain strategies over others
New fields in Item table
Do we need to add field or fields to keep track?
Mike’s email suggests some possible ways to capture this.
Currently, it's impossible to parse. This extra field would track when it is populated and when it is not, when a flag is set, and how reliable that is.
Priority review (EABL, DB changes, Full Text Search, Taking over the world)
How to prioritize our technical initiatives, i.e. for changes to contributors and items and segments and titles, how urgent are those changes in relation to full text?
Are these needed by EABL and if so by when?
Contributor changes are a higher priority than full text search.
Needs to be implemented within a couple of months. Will help prevent extra work on Mariah's part.
Full text search — what level of research is needed at this stage?
Are there any parameters, deliverables, or outcomes that could fit into a schedule for either or both of you?
We need to figure out how to ingest stuff, what data or structure it is, size needs, disk needs. Will need to break it down a little more than that. At this point, we're at the stage where we really need to start trying it in order to get a better sense of what might be involved and needed.
Contributor Display
For single volume monographs - The contributor will be shown
For serials or multi-volume monographs - we will show "Contributed by Multiple Institutions"
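The display rule above could be expressed as a small function. This is a sketch only: the function name, the boolean parameter, and the sample contributor name are assumptions for illustration, not taken from the BHL codebase.

```python
def contributor_display(is_single_volume_monograph, contributor):
    """Return the contributor text to show for a title.

    Single-volume monographs show the contributor itself; serials and
    multi-volume monographs show the generic label from the notes.
    """
    if is_single_volume_monograph:
        return contributor
    return "Contributed by Multiple Institutions"

# "Example Library" is a made-up contributor name.
single = contributor_display(True, "Example Library")
serial = contributor_display(False, "Example Library")
print(single)
print(serial)
```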