Monograph Dedup Tool
De-Duper Instructions:
MonographicDe-duper.pdf
Overview
The MBLWHOI Library has been working on a tool that will assist with deduping the monographs that BHL members are sending to IA for scanning. While still in an early stage, the application is ready for use and its development will benefit from usage and feedback.
The application is entirely web-based, requiring no client or user configuration, and is temporarily located at
http://dedup.mblwhoilibrary.org/
Features
- NEW! If duplicates are found, picklists can be edited online and downloaded as a .csv file.
- Separate accounts for each institution
- Designed to ingest packlists in Excel (.xls) format
- Ability to track your institution’s packlist upload activity
- If duplicates are found, you’ll be able to see which institution scanned each item, and when
NOTICE
This tool now requires picklists to contain the following column header names (NOT including punctuation marks): "Local Number", "OCLC", "Title", "Author", "Volume", "Chronology", "Call Number", "Publisher", "Publisher Place". These are the standard names that we agreed to. Please note that your picklist can contain other information, but the tool will ignore it.
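Before uploading, it may help to check a picklist's header row against the required names listed above. The following is a minimal sketch, not part of the tool itself: the function names are hypothetical, and it assumes the tool matches header names exactly (same spelling and capitalization), which has not been confirmed.

```python
# The nine required column headers, as listed in the notice above.
REQUIRED_HEADERS = [
    "Local Number", "OCLC", "Title", "Author", "Volume",
    "Chronology", "Call Number", "Publisher", "Publisher Place",
]


def missing_headers(header_row):
    """Return the required headers absent from header_row, in standard order.

    Assumption: matching is exact apart from surrounding whitespace.
    """
    present = {str(cell).strip() for cell in header_row}
    return [name for name in REQUIRED_HEADERS if name not in present]


def add_missing_headers(header_row):
    """Return a copy of header_row with any missing required headers appended.

    The new columns carry no data; extra columns already present are kept.
    """
    return list(header_row) + missing_headers(header_row)


if __name__ == "__main__":
    # Hypothetical header row read from the first line of a picklist.
    row = ["Local Number", "OCLC", "Title", "Notes"]
    print("Missing:", missing_headers(row))
```

The header row itself would come from whatever you use to read the spreadsheet; this sketch only covers the comparison.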
Usage
- Select your institution and log in.
- The application accepts Excel (.xls) files. This format was thought to be best because deduping takes place at the end of a selection cycle, when our packlists are already in place.
- Follow the steps to upload a new picklist. The tool requires the following column headers: "Local Number", "OCLC", "Title", "Author", "Volume", "Chronology", "Call Number", "Publisher", "Publisher Place". If your packlist is missing any of these columns, it’s best to create the column header anyway and leave the column without data.
- The .xls file is parsed and all records are added to the database. The next screen displays the duplicates found by five SQL queries. If you have ideas for better ones, just let me know. Right now we're looking for: Duplicates By OCLC And Volume, Duplicates By OCLC Only, Duplicates By Title And Author, Duplicates By Title And Chronology, Duplicates By Title.
One note here: we've not yet tested many different versions of Excel with this tool. So, if the upload croaks, please let me know what version, platform, etc. you're using. Remember to upload a .xls file; I believe that Excel 2007 saves as .xlsx by default.
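To make the five duplicate checks named under Usage more concrete, here is a rough sketch of how they might be expressed as SQL. It is an illustration only: the table name, column names, sample record, and the `find_duplicates` helper are all assumptions, and the tool's actual schema and queries may differ. It uses Python's built-in sqlite3 module and an in-memory database.

```python
import sqlite3

# Hypothetical schema; the tool's real table and column names are not documented here.
SCHEMA = """
CREATE TABLE records (
    institution TEXT, oclc TEXT, title TEXT, author TEXT,
    volume TEXT, chronology TEXT, scanned_on TEXT
)
"""

# One entry per check: the columns that must all match for a duplicate.
CHECKS = {
    "Duplicates By OCLC And Volume": ["oclc", "volume"],
    "Duplicates By OCLC Only": ["oclc"],
    "Duplicates By Title And Author": ["title", "author"],
    "Duplicates By Title And Chronology": ["title", "chronology"],
    "Duplicates By Title": ["title"],
}


def find_duplicates(conn, uploaded_rows):
    """Map each check name to (title, institution, scanned_on) matches."""
    results = {}
    for name, cols in CHECKS.items():
        where = " AND ".join(f"{c} = :{c}" for c in cols)
        sql = f"SELECT institution, scanned_on FROM records WHERE {where}"
        hits = []
        for row in uploaded_rows:
            params = {c: row.get(c) for c in cols}
            if not all(params.values()):
                continue  # a blank field cannot establish a match
            for institution, scanned_on in conn.execute(sql, params):
                hits.append((row.get("title"), institution, scanned_on))
        results[name] = hits
    return results


# Made-up sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.execute(
    "INSERT INTO records VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("Library A", "12345", "Flora of Example", "Doe, J.", "v.1", "1900", "2009-01-15"),
)
uploaded = [{"oclc": "12345", "title": "Flora of Example",
             "author": "Doe, J.", "volume": "v.2", "chronology": "1901"}]
results = find_duplicates(conn, uploaded)
```

In this sketch the uploaded row shares an OCLC number, title, and author with the stored record but has a different volume and chronology, so it would be reported by three of the five checks. Exact string equality is a simplification; titles and authors that differ in case or punctuation would not match.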