discussion
Suzanne,
Don’t know if you have heard from anyone else, but I’ll take a brief shot at these.
First, one that you missed…
BHLFEED-57279 Retrospective Title Clean Up (removing extra data)
Discovery Layer Working Group
KBART Needs:
publication_title
Incorrectly includes 245 $c (Statement of responsibility). Serials example: “Annual report of the Laguna Marine Laboratory / Pomona College.” Monograph example: “Genera of European and Northamerican Bryineae (Mosses) /synoptically disposed by N. C. Kindberg.”
Retrospective Cleanup
Go back to ingest database where title field is available.
Challenge is we will also need to re-register the DOIs to change the titles; DOI update process would need to be created, since it does not currently exist.
This is fairly straightforward. I see two tasks that should be done in the order listed here:
1) Modify the DOI assignment process to handle updates to existing DOIs.
2) Remove the 245c portion of the title from those titles that include it.
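For task 2, here is a minimal sketch of the string cleanup. It assumes the statement of responsibility always follows a slash that is preceded by a space, as in both examples above. The function name is hypothetical, and pulling the clean title field from the ingest database, as the ticket suggests, is probably more reliable than parsing the stored string.

```python
import re

# Assumption: 245 $c was appended after a slash that has whitespace
# before it (" / " or " /"). A slash inside a word, such as "and/or",
# has no preceding whitespace and is left alone.
_SOR_SEPARATOR = re.compile(r"\s+/\s*")


def strip_statement_of_responsibility(title: str) -> str:
    """Return the title with any trailing ' / ...' portion removed."""
    return _SOR_SEPARATOR.split(title, maxsplit=1)[0].rstrip()


if __name__ == "__main__":
    for sample in (
        "Annual report of the Laguna Marine Laboratory / Pomona College.",
        "Genera of European and Northamerican Bryineae (Mosses) "
        "/synoptically disposed by N. C. Kindberg.",
    ):
        print(strip_statement_of_responsibility(sample))
```

Any title that legitimately contains a space followed by a slash would be truncated by this rule, so the output would need a manual spot check before DOIs are re-registered.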
Moving on to the ones you included…
BHLFEED-57286 Documenation/Requirements for Metadata -KBart Needs
This is pretty good as-is. It probably needs a link to the Working Group’s documentation, and should be assigned to someone other than (or in addition to) Joel and me.
BHLFEED-57285 Indication of Publication type requirement
See my previous email for many comments about this one.
BHLFEED-57280 First issue/ Last issue Format and information
BHLFEED-57281 Formating and Recording Issues online
BHLFEED-57282 Break out Serial volume numbers for first and last
BHLFEED-57284 Serial data capture for Gap filling
There is a whole lot of overlap here, and I’m not quite sure even where to start. The problem isn’t so much the way you wrote them up, Suzanne, as it is the nature of this problem. Four things simply do not map well to KBART’s “holdings” requirements: BHL data (messy; parsing/automation very hard), the data model (lacking a number of fields that would make holdings identification easier), processes (weekly automated ingests can throw off the order of volumes, which is crucial to determining “first” and “last” holdings), and available resources (BHL has 57000+ volumes in 3800+ serials, and a lot of manual work will be needed). I do not see an easy remedy.
MIKE
From: Pilsk, Suzanne [mailto:PilskS@si.edu]
Sent: Thursday, February 11, 2016 12:36 PM
To: Mike Lichtenberg; 'Adam L. Chandler'; Crowley, Bianca; Michael Loran; William Ulate; Matt Person; Diana Duncan; Lynch, Susan
Cc: Richard, Joel M
Subject: RE: BHL Discovery Tools Task Force
This is EXACTLY the feedback that is needed as I took my first stab at creating these tickets.
What I do not know is how the Tech Team/Tech Committee works, so I don’t know what should be sent to them to figure out the outcome and what counts as this committee’s recommendation for an action to take place. I don’t know whether the committee’s recommendations are an action, an outcome, or just a recommendation. I also am confused as to when and by whom the recommendations are accepted.
See where my confusion lies?
Should all of the recommendations be just a need for policy decisions on yes/no and then turned into actionable tickets? And I have no idea who to assign a policy question to regarding these things.
It could be that the Tech Committee needs to review all these tickets (I believe there are only 6?) and tease out which tickets are actionable items, which need to be reassigned to others for a decision, etc. I was told to at least put Joel and Mike on them so that they could be vetted.
Sounds like I need to tease all these apart more. Which is good feedback.
Where should we put the committee’s final recommendation paper so that it is not lost? I am happy to load it up and attach it to each of the Gemini tickets, or is that redundant and not really kosher?
Below is some text from the tickets as they stand now. This might be better for the committee to see, so they can indicate what they think should be separate tickets for me to adjust, add, etc.
What say you all?
BHLFEED-57280 First issue/ Last issue Format and information
Discovery Layer Working Group
KBART Needs:
date_first_issue_online
date_last_issue_online
in ISO 8601 format (YYYY-MM-DD)
Retrospectively:
3818 serials that need some sort of combination attack of scripting and manual clean up.
Complications are with merging titles and knowing when more records need to be examined.
Estimated 2 minutes per title.
We need to identify volunteers who can take on some of this project, or have it as a data clean up assignment for internships, or some other creative idea.
Cornell has offered some assistance.
BHLFEED-57281 Formating and Recording Issues online
Discovery Layer Working Group
KBART Needs:
date_first_issue_online
date_last_issue_online
in ISO 8601 format (YYYY-MM-DD)
Going forward
Data providers need to be trained and materials need to be documented about the format of this data. Going forward, we may need to run periodic reports for clean up. If this is the path then we need to gather lists before they are too dauntingly long.
We need to investigate what to do with serial gaps
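A periodic clean-up report could start from a simple format check. The sketch below is only an illustration: the function name and the `allow_year_only` switch are hypothetical, and whether a bare year is acceptable is a policy question rather than something this ticket settles.

```python
import re
from datetime import date

_FULL_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")
_YEAR_ONLY = re.compile(r"\d{4}")


def is_valid_issue_date(value: str, allow_year_only: bool = False) -> bool:
    """Check a date_first_issue_online / date_last_issue_online value.

    Accepts YYYY-MM-DD for real calendar dates. A bare YYYY is accepted
    only when allow_year_only is True (an assumption, not a stated rule).
    """
    if allow_year_only and _YEAR_ONLY.fullmatch(value):
        return True
    if not _FULL_DATE.fullmatch(value):
        return False
    try:
        # Rejects impossible dates such as 1890-02-30.
        date.fromisoformat(value)
    except ValueError:
        return False
    return True
```

Running this over the two date columns and listing the rows that fail would give the kind of report mentioned above before the list gets dauntingly long.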
BHLFEED-57282 Break out Serial volume numbers for first and last
Discovery Layer Working Group
KBART Needs:
num_first_vol_online,
num_last_vol_online
KBART needs one number, not a MARC holdings statement. Serials only. For example, for this entry “v.1:no.1 (1890:March)” KBART needs only: “1”.
This is expected to be 3818 serials. Combination of scripted and manual cleanup.
Going forward we need a way for this to be automagically completed and stored to be mapped appropriately
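The scripted part of that cleanup might look something like the sketch below. It assumes the volume designation is written as "v." or "vol." followed by digits, as in the example above. The function name is hypothetical, and anything it cannot parse is returned as None so it can go to manual review.

```python
import re
from typing import Optional

# Assumption: volume is marked "v.", "v", "vol." or "vol" followed by
# digits. Other captions (e.g. "t.", "Bd.", "no.") are not handled here.
_VOLUME = re.compile(r"\bv(?:ol)?\.?\s*(\d+)", re.IGNORECASE)


def first_volume_number(holdings: str) -> Optional[str]:
    """Return the first volume number in a holdings-style string, if any."""
    match = _VOLUME.search(holdings)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(first_volume_number("v.1:no.1 (1890:March)"))
```

Returning None instead of guessing keeps the unparsed entries visible, which gives a count of how much manual work remains.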
BHLFEED-57284 Serial data capture for Gap filling
BHL Feedback
Created 10-Feb-2016 21:23 → Suzanne Pilsk
Revised 10-Feb-2016 21:29 → Suzanne Pilsk
Elapsed 0d 21h 2m
Discovery Layer Working Group
KBART Needs:
date_first_issue_online Should be ISO 8601 format (YYYY-MM-DD)
num_first_vol_online KBART needs one number, not a MARC holdings statement. Serials only. For example, for this entry “v.1:no.1 (1890:March)” KBART needs only: “1”.
date_last_issue_online Should be ISO 8601 format (YYYY-MM-DD)
num_last_vol_online
A review of BHL workflow for “gap filling.” We need a scheduled database review. Quarterly? After X ingest amount? Working from reports. Alternatively, we could rethink how we add data to the database: use a database format instead of library holdings format. For the date_ fields, YYYY is adequate.
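As an illustration of what a scheduled gap report could compute, here is a minimal sketch. It assumes volume numbers have already been extracted as integers and that a serial numbers its volumes consecutively; neither is guaranteed for real holdings data. The function name is hypothetical.

```python
from typing import Iterable, List


def missing_volumes(volumes_online: Iterable[int]) -> List[int]:
    """List volume numbers absent between the lowest and highest held.

    Assumption: consecutive integer numbering with no restarts,
    new series, or combined volumes.
    """
    present = set(volumes_online)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1)
            if n not in present]


if __name__ == "__main__":
    print(missing_volumes([1, 2, 4, 7]))
```

The order of the input does not matter, so this check would not be thrown off by the order in which volumes were ingested.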
BHLFEED-57285 Indication of Publication type requirement
Discovery Layer Working Group
KBART Needs:
publication_type
KBART states that this field should be coded as either serial or monograph. We have those plus some other types:
Over 3000 are coded with values that KBART does not recognize.
Collection: 881
Monographic component part: 252
NA: 1,920
Serial component part: 3
monograph: 90,330
serial: 3,818
We will include only monograph and serial types in the KBART feed (policy). Over time we will make policy decisions or do clean up to include more of the content. For example, some records could be retrospectively cleaned up or mapped to the monograph(e.g., map thesis to monograph) or serial type. Map Monographic component part: to monograph. Map Serial component part to serial (policy).
N.B.: Others that don’t have a MARC record we will ignore. They will not be part of our KBART mapping and sharing with other services/vendors/etc.
Going forward we need this to be a value that is mandatory, part of the training documentation, or some how handled so that we continually insure that the value is provided and correct.
See Working Group's documentation for specifics.
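The mapping policy described in this ticket can be sketched as a lookup table. This is an illustration only: the dictionary and function names are hypothetical, and whether the mapping runs at ingest or only when the KBART feed is generated is left open here.

```python
from typing import Optional

# Types from the counts above that the ticket proposes to keep or map.
# "Collection" and "NA" are deliberately absent, so they fall out of the feed.
KBART_TYPE_MAP = {
    "monograph": "monograph",
    "serial": "serial",
    "monographic component part": "monograph",
    "serial component part": "serial",
}


def kbart_publication_type(bhl_type: Optional[str]) -> Optional[str]:
    """Return 'monograph' or 'serial', or None to exclude the record."""
    if bhl_type is None:
        return None
    return KBART_TYPE_MAP.get(bhl_type.strip().lower())
```

Keeping the table separate from the function means a later policy decision (for example, mapping thesis to monograph) would be a one-line change.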
BHLFEED-57286 Documenation/Requirements for Metadata -KBart Needs
Discovery Layer Working Group
KBart Needs:
Overall, the documentation, training materials, and requirements specifications for metadata need to be reviewed to help keep the required data in the proper fields and formatted appropriately.
See the Working Group's documentation for details.
From: Mike Lichtenberg [mailto:mike.lichtenberg@mobot.org]
Sent: Thursday, February 11, 2016 12:47 PM
To: Pilsk, Suzanne <PilskS@si.edu>; 'Adam L. Chandler' <alc28@cornell.edu>; Crowley, Bianca <CrowleyB@si.edu>; Michael Loran <m.loran@nhm.ac.uk>; William Ulate <william.ulate@mobot.org>; Matt Person <mperson@mbl.edu>; Diana Duncan <dduncan@fieldmuseum.org>; Lynch, Susan <slynch@nybg.org>
Subject: RE: BHL Discovery Tools Task Force
I think we need to review them. In general, I feel that more specificity is needed, because I am not always sure what is the actionable item (or items). I think a review may help us tease out the actionable tasks better than reviewing the original document did. Speaking just for myself, I am seeing a lot of things now after reviewing the Gemini issues that I did not see when looking at the document.
As an illustration, see issue #57285. I understand the general idea: we need to map our titles with non-“Monograph” and non-“Serial” bibliographic types to “Monograph” and “Serial” for inclusion in the KBART feed. But, I do not see many actionable tasks for achieving this. Sorry Suzanne, I have a lot of questions… specifically:
Suzanne wrote: “We will include only monograph and serial types in the KBART feed (policy). Over time we will make policy decisions or do clean up to include more of the content. For example, some records could be retrospectively cleaned up or mapped to the monograph(e.g., map thesis to monograph) or serial type. Map Monographic component part: to monograph. Map Serial component part to serial (policy).”
My questions are:
- “make policy decisions or do clean up” Which will we do? Just one, or could we do both?
- “policy decisions” What are the decisions that need to be made and who is making them?
- “For example, some records could be… cleaned… or mapped…” This is just a suggestion of something that might happen. Not actionable.
- “Map Monographic component part…” and “Map Serial component part…” Those look like specific actionable tasks, but as written they might simply be continuations of the “For example” thought that comes immediately before. If they are specific tasks, then should this mapping be done on data ingest so that the BHL title records are never assigned “component part” types at all, or should the mapping be done for the KBART feed only?
Suzanne wrote: “Going forward we need this to be a value that is mandatory, part of the training documentation, or some how handled so that we continually insure that the value is provided and correct.”
My questions are:
- What is the actionable item here? It reads “we need this, this, or this”. Which is it? All of the suggestions? Just one (which one)? Two of them?
- Joel and I have not been involved in preparing training documentation. If a specific suggestion is to update training materials, then I think we need another name or two attached. What are the updates that are needed?
- “some how handled so that we continually insure that the value… is correct.” What does “some how” mean? It could be suggesting that we want a report of titles by publication type (Bibliographic Level) for periodic review. If so, “Build a report that takes parameters A, B, and C and shows 1, 2, and 3 so that BHL staff can review all titles to confirm that they are assigned a correct publication type” would be a better, more specific, actionable task. But, maybe “some how” means something else? I do not recall a report being discussed.
Suzanne wrote: “See Working Group's documentation for specifics.”
My question is:
- Do we have a link for this documentation? If not, it may be hard to find in the future.
I apologize for taking this apart as extensively as I have, but I am worried that without more specific details these issues will not be clear when it comes time to work on them.
MIKE
From: Pilsk, Suzanne [mailto:PilskS@si.edu]
Sent: Thursday, February 11, 2016 10:37 AM
To: 'Adam L. Chandler'; Mike Lichtenberg; Crowley, Bianca; Michael Loran; William Ulate; Matt Person; Diana Duncan; Lynch, Susan
Subject: RE: BHL Discovery Tools Task Force
If people can respond via email that they really did look at my work, then we might not need to talk unless I made some completely wrong assumptions.
Possibly just email conversations will work between now and then.
I did not put in a ticket for JATS.
Should I? Or Susan, are you? Or ummm yeah, uh?
Suz
From: Adam L. Chandler [mailto:alc28@cornell.edu]
Sent: Thursday, February 11, 2016 10:19 AM
To: Pilsk, Suzanne <PilskS@si.edu>; Mike Lichtenberg <mike.lichtenberg@mobot.org>; Crowley, Bianca <CrowleyB@si.edu>; Michael Loran <m.loran@nhm.ac.uk>; William Ulate-Rodriguez <william.ulate@mobot.org>; Matt Person <mperson@mbl.edu>; Diana Duncan <dduncan@fieldmuseum.org>; Lynch, Susan <slynch@nybg.org>
Subject: RE: BHL Discovery Tools Task Force
Thank you, Suzanne!!! So the only remaining work for our task force, I think, is to add a ticket for the JATS export, plus collectively review the content of your tickets?
Suzanne, do you think it is necessary for us to meet and review the content of your tickets together?
Adam
From: Pilsk, Suzanne [mailto:PilskS@si.edu]
Sent: Thursday, February 11, 2016 9:44 AM
To: Adam L. Chandler <alc28@cornell.edu>; Mike Lichtenberg <mike.lichtenberg@mobot.org>; Crowley, Bianca <CrowleyB@si.edu>; Michael Loran <m.loran@nhm.ac.uk>; William Ulate-Rodriguez <william.ulate@mobot.org>; Matt Person <mperson@mbl.edu>; Diana Duncan <dduncan@fieldmuseum.org>; Lynch, Susan <slynch@nybg.org>
Subject: RE: BHL Discovery Tools Task Force
Adam and Team,
I have put in Gemini tickets for things I think we wanted from the document below:
https://docs.google.com/document/d/14xQz5mTdg4y9v_12tqk-J0bXOCii0y4HGAsUhh_JdrM/edit#heading=h.9p37z1xt69g8
I recorded the Gemini ticket numbers in the spreadsheet as well.
Each ticket has the same phrasing:
“Discovery Layer Working Group
KBart Needs:”
This is to facilitate finding them again. For now they are all assigned to Joel and Mike, except for one where the note said that Cornell volunteered to help with something, so I tagged Marty.
Time for more eyes now. Please review my work. Review the document and see if the Gemini tickets convey enough information that down the road … oh say 6 months to a year or more (I sure hope not!) the tickets will make sense.
And whether more need to be made!
Adam, I hope all is well with you soon. Take care all.
Suzanne
-----Original Appointment-----
From: Adam L. Chandler [mailto:alc28@cornell.edu]
Sent: Thursday, February 11, 2016 9:21 AM
To: Pilsk, Suzanne; Mike Lichtenberg; Crowley, Bianca; Michael Loran; William Ulate-Rodriguez; Matt Person; Diana Duncan; Lynch, Susan
Subject: Canceled: BHL Discovery Tools Task Force
When: Wednesday, February 17, 2016 10:30 AM-11:30 AM (UTC-05:00) Eastern Time (US & Canada).
Where: webex
Importance: High
Hi all,
Due to a family medical event I need to rearrange my schedule next week. Let’s shoot for March 2 being our last meeting.
Thank you,
Adam