Intro Diglibstandards Ala07
Digital Libraries:
Implementing METS, MODS,
PREMIS and MIX:
Introduction
Rebecca Guenther
Library of Congress
LITA Standards IG Program, ALA Annual
2007
Program overview
• Introduction To METS, MODS, PREMIS and MIX
(Guenther)
• Using METS and MODS for presentations of LC
content (Cundiff, Trail)
• Using METS in special collections at CDL
(Tingle)
• Creating rich shareable metadata: the DLF
Aquifer MODS implementation guidelines
(Shreeves)
• METS, MODS and PREMIS, Oh My!: Integrating
digital library standards for interoperability
and preservation (Habing)
• MODS as metadata Hub (Olson)
Metadata standards in digital
libraries
• XML is the de facto standard for metadata descriptions
on the Internet
• Interoperability and object exchange require the use of
established standards
• Many digital objects are complex and consist of
multiple files
• Complex digital objects require many more forms of
metadata than analog objects for their management and use:
– Descriptive
– Technical
– Digital provenance/events
– Structural
– Rights/Terms and conditions
Descriptive metadata: MARCXML
• Millions of rich descriptive records in MARC
systems: can be reused in an XML
environment using MARCXML
• MARCXML uses the MARC data element set in
an XML syntax
• Allows interoperability with other XML schemes
by taking advantage of free XML tools
• Allows for collaborative use of metadata for
access (e.g. OAI)
• Provides continuity with current data and
flexible transition options
MARC 21 evolution to XML
MARCXML
• MARCXML record
– XML exact equivalent of MARC (2709)
record
– Lossless/roundtrip conversion to/from MARC
21 record
– Simple flexible XML schema, no need to
change when MARC 21 changes
– Presentations using XML stylesheets
– LC provides converters (open source)
• http://www.loc.gov/standards/marcxml
• Music record in MARCXML
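As an illustration of how "free XML tools" can reuse MARC data, the following minimal sketch reads a MARCXML record with Python's standard library. The sample record and the `subfields` helper are hypothetical and are not taken from the presentation; only the MARCXML namespace and the `datafield`/`subfield` structure come from the MARCXML schema.

```python
import xml.etree.ElementTree as ET

# Namespace used by the MARCXML ("MARC 21 slim") schema.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical sample record, invented for this sketch.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000ncm a2200000 a 4500</leader>
  <controlfield tag="001">example-0001</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Example, Composer</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Example sonata :</subfield>
    <subfield code="b">for piano</subfield>
  </datafield>
</record>"""


def subfields(record, tag, code):
    """Return the text of every subfield `code` in datafields with `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [sf.text for sf in record.findall(path, MARC_NS)]


record = ET.fromstring(SAMPLE)
# Field 245 holds the title statement: $a is the title, $b the remainder.
title = " ".join(subfields(record, "245", "a") + subfields(record, "245", "b"))
print(title)
```

Because MARCXML keeps the MARC tags, indicators, and subfield codes as attributes, the same helper works for any field without schema changes.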
What is MODS?
• Metadata Object Description Schema
• An XML descriptive metadata standard
• A derivative of MARC
– Uses language-based tags
– Contains a subset of MARC data elements
– Repackages elements to eliminate redundancies
• MODS does not assume the use of any specific rules
for description
• Element set is particularly applicable to digital
resources
Uses of MODS
• Extension schema to METS
– Rich description works well with hierarchical METS
objects
• To represent metadata for harvesting (OAI)
– Language-based tags are more user friendly
• As a specified XML format for SRU
• As a core element set for convergence
between MARC and non-MARC XML
descriptions
• For original resource description in XML syntax
that is simpler than full MARC
MODS high-level elements
• Title Info
• Name
• Type of resource
• Genre
• Origin Info
• Language
• Physical description
• Abstract
• Table of contents
• Target audience
• Note
• Subject
• Classification
• Related item
• Identifier
• Location
• Access conditions
• Part
• Extension
• Record Info
• Authenticity:
– Is the digital object what it purports to be?
• Rights Management:
– What IPR must be observed?
Makes digital objects self-documenting across time
Guiding principles and assumptions …
• “Implementable, core, preservation metadata”:
– “Preservation metadata”: maintain viability, renderability,
understandability, authenticity, identity in a preservation context
– “Core”: What most preservation repositories need to know to
preserve digital materials over the long-term
– “Implementable”: rigorously defined; supported by usage
guidelines/recommendations; emphasis on automated workflows
• Implementation neutral:
– No assumptions on specific implementation
– Promote flexibility/interoperability
– Focus on semantic units: what you need to know
(implementation-neutral) vs. metadata elements: how you
record it (implementation-specific)
– Information that needs to be “recoverable” from the digital
archiving system, independent of local implementation
Scope
• What PREMIS is:
– Common data model for organizing/thinking about
preservation metadata
– Guidance for local implementations
– Standard for exchanging information packages between
repositories
• What PREMIS is not:
– Out-of-the-box solution: need to instantiate as metadata
elements in repository system
– All needed metadata: excludes business rules, format-
specific technical metadata, descriptive metadata for access,
non-core preservation metadata
– Lifecycle management of objects outside repository
– Rights management: limited to permissions regarding
actions taken within repository
PREMIS data model
[Diagram: PREMIS data model entities: Intellectual Entities, Objects, Events, Agents, Rights]
Semantic units pertaining to
objects: technical metadata
• objectIdentifier
• preservationLevel
• objectCategory
• objectCharacteristics
• creatingApplication
• originalName
• storage
• environment
• signatureInformation
• relationship
• linkingEventIdentifier
• linkingIntellectualEntityIdentifier
• linkingPermissionStatementIdentifier
Semantic units pertaining to Events:
provenance and preservation activity
• eventIdentifier
• eventType
• eventDateTime
• eventDetail
• eventOutcome
• eventOutcomeDetail
• linkingAgentIdentifier
• linkingObjectIdentifier
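Since PREMIS defines semantic units rather than a fixed encoding, one possible local instantiation is sketched below as plain Python dataclasses. The field names mirror the Event and Agent semantic units; the `record_fixity_check` helper, the identifiers, and the "fixity check" and "pass"/"fail" values are illustrative assumptions, not prescribed by the slides.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Agent:
    # Field names follow the PREMIS Agent semantic units.
    agentIdentifier: str
    agentName: str
    agentType: str


@dataclass
class Event:
    # Field names follow the PREMIS Event semantic units.
    eventIdentifier: str
    eventType: str
    eventDateTime: str
    eventDetail: str = ""
    eventOutcome: str = ""
    eventOutcomeDetail: str = ""
    linkingAgentIdentifier: List[str] = field(default_factory=list)
    linkingObjectIdentifier: List[str] = field(default_factory=list)


def record_fixity_check(event_id, object_id, agent, expected, actual):
    """Hypothetical helper: record the outcome of comparing two checksums."""
    passed = expected == actual
    return Event(
        eventIdentifier=event_id,
        eventType="fixity check",
        eventDateTime=datetime.now(timezone.utc).isoformat(),
        eventDetail="Compared stored and recomputed checksums",
        eventOutcome="pass" if passed else "fail",
        eventOutcomeDetail="" if passed else f"expected {expected}, got {actual}",
        linkingAgentIdentifier=[agent.agentIdentifier],
        linkingObjectIdentifier=[object_id],
    )


agent = Agent("agent-001", "Hypothetical checksum tool", "software")
event = record_fixity_check("event-001", "object-001", agent, "abc123", "abc123")
print(event.eventType, event.eventOutcome)
```

The linking identifiers are what tie an Event to the Objects it affected and the Agents that performed it, which is how the data model records digital provenance.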
Semantic units pertaining to Rights:
terms and conditions
• permissionStatement
• permissionStatementIdentifier
• relatedObject
• grantingAgent
• grantingAgreement
• permissionGranted
• act
• restriction
• termOfGrant
• permissionNote
Semantic units pertaining to
Agents
• agentIdentifier
• agentName
• agentType
PREMIS maintenance activities
• First revision of Data Dictionary (PREMIS 2.0)
– Documenting errata and proposed revisions to Data
Dictionary (feedback through PIG list)
– http://www.loc.gov/standards/premis/changes.html
• PREMIS Implementers’ Registry
– http://www.loc.gov/standards/premis/premis-registry.html
• Consultancies (funded by Library of Congress):
– Rights issues for digital preservation (Karen Coyle)
– PREMIS implementation guidelines and recommendations
(Deborah Woodyard-Robinson)
• PREMIS Tutorials:
– Glasgow, Boston, Stockholm, Albuquerque, Washington
What is MIX?
• Metadata For Images in XML
• An XML Schema designed for expressing technical
metadata for digital still images
• Based on the NISO Z39.87 Data Dictionary –
Technical Metadata for Digital Still Images
• Used to express attributes of digital images such as
file format, file size, dimensions, resolution,
compression, etc.
• Version 1.0 (recently released) includes support for
GIS images and JPEG 2000 images; data element
names harmonized with PREMIS
• Can be used standalone or as an extension schema
with METS
How do these standards work
together for digital libraries?
• A container format such as METS allows for packaging
together forms of metadata with objects or pointers to
objects
• About five years of experience have been gained
experimenting with METS in combination with other standards
for managing and using digital objects in digital libraries
• These standards are all freely available
• METS profiles detail how METS is used for particular
object types or applications
• Best practices are needed (and being developed) for
use of PREMIS with METS and MIX
• Using METS, MODS, PREMIS and MIX:
http://www.loc.gov/premis/louis.xml
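The container idea described above can be sketched as follows: a METS document wraps descriptive metadata (MODS) in a `dmdSec`, technical (MIX) and provenance (PREMIS) metadata in an `amdSec`, and points to the file itself from the `fileSec`. The helper names, IDs, and example URL are hypothetical; the element names and `MDTYPE` values are assumed from the METS schema, and the output is not validated against it.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)


def m(parent, name, **attrs):
    """Append a METS-namespaced child element and return it."""
    return ET.SubElement(parent, f"{{{METS}}}{name}", attrs)


def wrap(parent, section, sec_id, mdtype, payload):
    """Add a metadata section whose mdWrap/xmlData embeds `payload`."""
    sec = m(parent, section, ID=sec_id)
    m(m(sec, "mdWrap", MDTYPE=mdtype), "xmlData").append(payload)
    return sec


def build_mets(file_url, mods, mix=None, premis=None):
    """Package one file with its MODS and optional MIX/PREMIS metadata."""
    mets = ET.Element(f"{{{METS}}}mets")
    wrap(mets, "dmdSec", "DMD1", "MODS", mods)
    adm_ids = []
    if mix is not None or premis is not None:
        amd = m(mets, "amdSec")
        if mix is not None:
            wrap(amd, "techMD", "TECH1", "NISOIMG", mix)
            adm_ids.append("TECH1")
        if premis is not None:
            wrap(amd, "digiprovMD", "PROV1", "PREMIS", premis)
            adm_ids.append("PROV1")
    file_el = m(m(m(mets, "fileSec"), "fileGrp"), "file", ID="FILE1")
    if adm_ids:
        file_el.set("ADMID", " ".join(adm_ids))
    # A pointer to the object rather than the embedded object itself.
    m(file_el, "FLocat", LOCTYPE="URL").set(f"{{{XLINK}}}href", file_url)
    div = m(m(mets, "structMap"), "div", DMDID="DMD1")
    m(div, "fptr", FILEID="FILE1")
    return mets


# Hypothetical descriptive record and file location.
mods = ET.Element("{http://www.loc.gov/mods/v3}mods")
mets_doc = build_mets("http://example.org/image.tif", mods)
print(ET.tostring(mets_doc, encoding="unicode"))
```

The `DMDID`, `ADMID`, and `FILEID` attributes are what link the structural map and the file entries to their metadata sections, so each standard keeps its own schema while METS supplies the packaging.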