Presentation Title: Automating XML remediation with Python’s lxml package and schematron
Presenter: Jeremy Bartczak – Metadata Librarian
Affiliation: University of Virginia
Abstract: The University of Virginia (UVa.) contributes thousands of digitized photographs to the Digital Public Library of America (DPLA). Plans are underway to submit additional objects from multiple legacy digital conversion projects. These projects were implemented in MODS over the course of several years. As local policies evolved, descriptive metadata practices diverged across collections. The UVa. Library’s Metadata Analysis and Design team is now in the midst of a large-scale project to remediate this data. Thanks to detailed online documentation of the DPLA’s metadata application profile and helpful analysis from DPLA staff, a strategy has been implemented to ensure consistent metadata display for UVa. content. Remediation is performed with the lxml package for Python, and the results are validated against a custom Schematron file. This lightning talk will present some of the changes required for the remediation and review how lxml and Schematron automated the process.
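The abstract does not list the specific changes made, so the following is only a minimal sketch of the general remediate-then-validate pattern. It assumes a hypothetical fix (normalizing `typeOfResource` values) and a made-up sample record. To stay dependency-free it uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml, and a plain Python function as a stand-in for Schematron assertions; the names `TYPE_FIXES`, `remediate` and `check` are illustrative, not taken from the talk.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
NS = {"mods": MODS_NS}

# Hypothetical legacy record with an inconsistently entered typeOfResource.
SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:titleInfo><mods:title>Sample photograph</mods:title></mods:titleInfo>
  <mods:typeOfResource>Still Image </mods:typeOfResource>
</mods:mods>"""

# Hypothetical mapping from legacy spellings to the controlled value.
TYPE_FIXES = {"still image": "still image", "photograph": "still image"}


def remediate(root):
    """Normalize typeOfResource values in place; return the number of changes."""
    changed = 0
    for el in root.findall("mods:typeOfResource", NS):
        key = (el.text or "").strip().lower()
        fixed = TYPE_FIXES.get(key)
        if fixed is not None and el.text != fixed:
            el.text = fixed
            changed += 1
    return changed


def check(root):
    """Plain-Python stand-in for Schematron asserts; returns failure messages."""
    problems = []
    title = root.find("mods:titleInfo/mods:title", NS)
    if title is None or not (title.text or "").strip():
        problems.append("record must have a non-empty title")
    allowed = set(TYPE_FIXES.values())
    for el in root.findall("mods:typeOfResource", NS):
        if el.text not in allowed:
            problems.append(f"unexpected typeOfResource: {el.text!r}")
    return problems


root = ET.fromstring(SAMPLE)
before = check(root)       # failures found in the legacy record
changes = remediate(root)  # apply the fix
after = check(root)        # re-validate the remediated record
```

Running the check both before and after the fix mirrors the idea of using validation to confirm that an automated change did what was intended.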
Presentation Title: Overcoming the Challenges of Implementing Standardized Metadata Practices in a Digital Repository
Presenter: Sai Deng – Metadata Librarian
Affiliation: University of Central Florida
Abstract: A Metadata Librarian often wants, as a matter of professional conscience, to apply standards when cataloging digital collections, but doing so can be a challenge when the system is not built to accommodate standardized practices. This kind of dilemma is not uncommon in the metadata and digital repository arena. This presentation will address the various challenges of working with metadata in digital repositories, such as name authority control for authors, departments, and colleges; type value selection; keyword and subject choices; whether to add linked data URIs to various fields in the records; and data discrepancies when harvesting data into OCLC’s Digital Collection Gateway. Sometimes trying to follow controlled vocabularies or standardized metadata practices seems to be at odds with what the system can accommodate or what many non-catalogers prefer. This presentation will discuss how the Metadata Librarian, Digital Initiatives staff, and other librarians work together to make careful, practical, and conscientious choices.
Presentation Title: Using MarcEdit to retool existing MARC records of paper maps for use in an online geoportal
Presenter: Tim Kiser – Special Materials Catalog Librarian
Presenter: Nicole Smeltekop – Special Materials Catalog Librarian
Affiliation: Michigan State University
Abstract: The Michigan State University Libraries recently joined the Big Ten Academic Alliance Geoportal, a consortial online discovery tool for maps and geographic data. Contributing our scanned paper maps to the geoportal required submission of metadata suitable for the generation of ISO 19115-compliant records. To accomplish this, we devised a workflow using MarcEdit to convert our existing MARC records for paper maps to MARC records for digital maps — which could then be delivered to the geoportal as MARCXML records. This lightning talk will outline our considerations for the project and the steps taken to accomplish it.
Presentation Title: Metadata Migration to Leverage Linked Data in an Institutional Repository
Presenter: Brian Luna Lucero – Digital Repository Coordinator
Affiliation: Columbia University
Abstract: This talk will present the project of migrating records to a new cataloging tool for Academic Commons, Columbia’s institutional repository, with an emphasis on metadata modeling for the new application and transformation of the subjects for all records from the ProQuest vocabulary to FAST.
Over the last year, Columbia University Libraries has supported development of a new cataloging tool, codenamed Hyacinth, for digital collections in order to unify the workflows of several departments and ease the demands for maintenance of multiple platforms. Hyacinth also provides an upgrade over older tools by operating on Hydra architecture and incorporating linked data at its core. Creating one tool that suits the cataloging needs of different departments and projects presented its own technical challenges, however.
Hyacinth serializes records in MODS XML, but was designed to be scheme-agnostic. Achieving this aim required input from metadata experts familiar with the various projects and materials that would be handled by Hyacinth. Normalizing labels for names, genres, academic units, and subjects across numerous projects and departments also presented a challenge. This led to the creation of a URI service that is integral to Hyacinth. The URI service can pull information from external authorities as well as mint local URIs for entities not identified elsewhere.
The migration of Academic Commons records also required a transformation of subjects for approximately 20,000 records to the FAST vocabulary in order to capitalize on Hyacinth’s linked data architecture. We used OpenRefine and a mapping table to replace ProQuest subjects with equivalent FAST terms and add FAST URIs to the records. We also piloted text-matching processes to see whether any could automatically suggest FAST subjects that match keywords in abstracts. These experiments have produced mixed results.
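The mapping-table step was done in OpenRefine; purely as an illustration of the same idea, here is a minimal Python sketch. The column names, the sample rows, and the `example.org` URIs are all placeholders (not real FAST identifiers), and `load_mapping` and `map_subjects` are hypothetical helper names.

```python
import csv
import io

# Hypothetical mapping table; the URIs are placeholders, not real FAST URIs.
MAPPING_CSV = """proquest_subject,fast_label,fast_uri
Computer science,Computer science,https://example.org/fast/0000001
American history,United States--History,https://example.org/fast/0000002
"""


def load_mapping(text):
    """Read the mapping table into a dict keyed by normalized source subject."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["proquest_subject"].strip().lower(): (row["fast_label"], row["fast_uri"])
        for row in reader
    }


def map_subjects(subjects, mapping):
    """Split subjects into mapped (label + URI) and unmapped (left for review)."""
    mapped, unmapped = [], []
    for subject in subjects:
        hit = mapping.get(subject.strip().lower())
        if hit:
            mapped.append({"label": hit[0], "uri": hit[1]})
        else:
            unmapped.append(subject)
    return mapped, unmapped


mapping = load_mapping(MAPPING_CSV)
mapped, unmapped = map_subjects(["American History", "Folklore"], mapping)
```

Keeping the unmapped subjects in a separate list makes it easy to route them to a person for review instead of silently dropping them.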
Presentation Title: Metadata Librarian’s Little Helper: OpenRefine Reconciliation Services
Presenter: Greer Martin – Discovery & Metadata Librarian
Affiliation: Illinois Institute of Technology
Abstract: OpenRefine has many vocabulary reconciliation options, not only with Library of Congress Authorities and VIAF, but also with homegrown data such as a local authority file. With unruly legacy metadata, reconciliation was a major chapter in the story of our records migration to ArchivesSpace. Taking a systematic approach to our vocabulary reconciliation and using OpenRefine’s reconciliation services allowed non-catalogers to assist in this crucial stage of metadata cleanup. This lightning talk will explain how two OpenRefine reconciliation services were incorporated into our migration workflow, with special attention paid to Reconcile-csv, which resolves to a CSV file.
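To give a rough sense of what reconciling against a local CSV authority file involves, the sketch below scores a legacy value against each row of a small, made-up CSV using approximate string matching from Python's standard library. This is an assumption-laden illustration: it is not Reconcile-csv's own matching method, and the sample data, the 0.8 threshold, and the `candidates` function are hypothetical.

```python
import csv
import io
from difflib import SequenceMatcher

# Hypothetical local authority file.
AUTHORITY_CSV = """id,name
a1,Department of Physics
a2,Department of Chemistry
"""


def candidates(query, rows, threshold=0.8):
    """Return (score, id, name) tuples at or above threshold, best match first."""
    scored = []
    for row in rows:
        score = SequenceMatcher(None, query.lower(), row["name"].lower()).ratio()
        if score >= threshold:
            scored.append((score, row["id"], row["name"]))
    return sorted(scored, reverse=True)


rows = list(csv.DictReader(io.StringIO(AUTHORITY_CSV)))
# A legacy value with a transposition typo.
result = candidates("Departmnet of Physics", rows)
```

Returning scored candidates instead of a single answer leaves the final decision to a reviewer, which suits a workflow where non-catalogers confirm matches.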
Presentation Title: Git a Grip: Using GitHub to Manage your Metadata Application Profile
Presenter: Anne Washington – Metadata Librarian
Affiliation: University of Houston
Abstract: Local Metadata Application Profiles and input guidelines are always evolving. GitHub provides a simple way to manage metadata documentation with the added benefit of versioning. This allows metadata specialists to see changes in practice over time. Learn how University of Houston Libraries is using GitHub to create and manage their Metadata Application Profile.
Meeting Type: Discussion/Interest Group
Interests: Cataloging, Digital Libraries, Guidelines and Standards, Metadata, RDA
Library Type: Academic, Consortium, Corporate, Federal, Government, Law, Medical, Public, Regional System, Research Library, School/Media Center, Undergraduate
Sponsors: ALCTS, ALCTS_CMMS, ALCTS_CRS
Cost: Included with full conference registration.