UTK MODS to RDF Mapping¶
Style Guide¶
This document aims to provide all of the information a member of Digital Initiatives needs to transform UT’s existing MODS XML to RDF, regardless of the platform chosen. In order to achieve this goal in a consistent and accessible manner, we will compose the mapping document according to the following practices.
The document will be structured according to MODS top-level elements and provide examples of use cases associated with each element. For each use case, example XML for the element being mapped, along with a link to the full MODS record, should be included. Turtle notation, with semicolons separating multiple statements about the same subject, will be used as the RDF serialization format. All of the namespaces used as prefixes in the example Turtle should be declared so that it can be validated. RDF should use example.org “URIs” that include a number (use /1 for the first instance of a minted URI in each example, as though each individual example section is independent of all the others). For more complex examples, graphs illustrating the RDF should also be included to make the relationships more easily understood. More complex examples might include those with minted objects or elements that include several relationships (like a geographic subject with coordinates). Graphs should not be necessary for the RDF representing flat elements, like abstract.
Beyond the focused attention on individual elements, the document will also include broader examples and information. A section listing all prefixes used will be present, as well as a complete example of a single MODS record transformed according to the guide’s specifications. The elements are listed in the order outlined in the DLTN technical documentation.
Simple Example¶
Example record - knoxgardens:115.
<abstract>Photograph slide of the Tennessee state tree, the tulip tree</abstract>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:abstract "Photograph slide of the Tennessee state tree, the tulip tree" .
Namespaces¶
| Vocabulary Name | Prefix | URI |
|---|---|---|
| BIBFRAME | bf | |
| Classification Schemes | classSchemes | |
| DBpedia Ontology | dbo | http://dbpedia.org/ontology/ |
| DCMI Metadata Terms | dcterms | http://purl.org/dc/terms/ |
| Dublin Core Metadata Element Set, Version 1.1 | dce | |
| Europeana Data Model | edm | |
| IDS Information Model | iim | |
| MARC Code List for Relators | relators | http://id.loc.gov/vocabulary/relators/ |
| Opaque Namespace | opaque | http://opaquenamespace.org/ns/ |
| RDA Unconstrained | rdau | |
| SKOS Simple Knowledge Organization System | skos | |
| Standard Identifier Scheme | identifiers | http://id.loc.gov/vocabulary/identifiers/ |
| WGS84 Geo Positioning | wgs | http://www.w3.org/2003/01/geo/wgs84_pos# |
Mapping¶
identifier¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| dbo:isbn | Literal | Use for identifiers with type="isbn" |
| dbo:issn | Literal | Use for identifiers with type="issn" |
| dbo:oclc | Literal | Use for identifiers with type="oclc" |
| identifiers:ark | Literal | Use for ARKs. |
| identifiers:local | Literal | Use for the majority of identifiers (all those that do not fit into other categories) |
| opaque:accessionNumber | Literal | Use for identifiers with type="acquisition" |
Local Identifiers¶
Use Case¶
This is the catch-all category for identifiers that are important to keep but that do not need to be separated into individual categories for discovery. UT’s adminDB values as well as a range of different locally created identifiers are present. Many of the values were initially created by Special Collections in finding aids, for instance identifiers with a type attribute of “slide number”, “archival number”, “cw”, and “film number”. An identifier with a type attribute of “opac” means that the resource also has a full MARC record in the Alma catalog. The string values for opac identifiers are fourteen to sixteen digits long, and the last five digits are always “02311”. The PID value is the main identifier within the Islandora7 platform and is present in the records of collections that have undergone remediation. Collections that were migrated from Omeka to Islandora7 often include identifiers with a type of “spc”. These collections include the Anna Catherine Wiley Sketches, Images of East Tennessee, and Photographs of the Ruskin Cooperative Association.
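The shape described for opac identifiers (fourteen to sixteen digits ending in 02311) can be checked mechanically before migration. The following is a minimal sketch; the function name and the decision to trim surrounding whitespace are assumptions made for illustration and are not part of the mapping itself.

```python
import re

# Assumed pattern: 14 to 16 digits in total, i.e. 9 to 11 leading digits
# followed by the fixed five-digit suffix "02311".
OPAC_PATTERN = re.compile(r"\d{9,11}02311")


def looks_like_opac_identifier(value: str) -> bool:
    """Return True when the value has the opac shape described above."""
    # Trimming whitespace first is an assumption, not a documented rule.
    return OPAC_PATTERN.fullmatch(value.strip()) is not None
```

A check like this could be used to flag identifiers whose type attribute is “opac” but whose value does not follow the expected pattern.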
Justification¶
These values are being kept because they may be helpful to users in finding specific materials. For instance, while @type="pid"
identifiers will no longer be the primary identifiers on UT’s next digital collections platform, they could be used to
identify cited resources that have broken links. Many of the identifiers associated with Special Collections allow users
to see how the same resource might be referenced within finding aids. Having @type="opac" identifiers helps staff at UT
know immediately whether a resource has a MARC record, which could prove useful if descriptive metadata is needed in this
form. Overall, little effort is needed to keep all of these values, and they all have the potential to be helpful
in the future.
XPath¶
identifier[@type="Vendor ID"] OR
identifier[@type="archival number"] OR
identifier[@type="catalog"] OR
identifier[@type="circular"] OR
identifier[@type="cw"] OR
identifier[@type="document ID"] OR
identifier[@type="documentID"] OR
identifier[@type="filename"] OR
identifier[@type="film number"] OR
identifier[@type="legacy"] OR
identifier[@type="local"] OR
identifier[@type="original ID"] OR
identifier[@type="photograph number"] OR
identifier[@type="slide number"] OR
identifier[@type="pid"] OR
identifier[@type="opac"] OR
identifier[@type="spc"]
Decision¶
Example of a record with a PID identifier - egypt:8
<identifier type="pid">egypt:8</identifier>
@prefix identifiers: <http://id.loc.gov/vocabulary/identifiers/> .
<https://example.org/objects/1>
identifiers:local "egypt:8" .
Exception that requires pre-pending a string - agrutesc:
<identifier type="circular">79</identifier>
@prefix identifiers: <http://id.loc.gov/vocabulary/identifiers/> .
<https://example.org/objects/1>
identifiers:local "Circular 79" .
Acquisition Identifier¶
Use Case¶
Several of UT’s collections come from institutions outside the library and include identifiers assigned by those institutions. The McClung Museum of Natural History and Culture on campus is one of these institutions. In the Nineteenth and Early Twentieth Century Images of Egypt collection shared by McClung, traditional museum acquisition numbers consisting of three numbers separated by periods (year.acquisition group.item) are present.
Justification¶
Both OpaqueNamespace and CIDOC-CRM properties were considered for mapping these values. Both opaque:accessionNumber and crm:E8 (Acquisition) were defined appropriately for UT’s use cases. Because CIDOC-CRM is particularly used in a museum context, we decided to use opaque:accessionNumber as it is arguably more flexible. This allows us to use the same property for accession numbers from a wide variety of institutions. Both properties support content negotiation.
XPath¶
identifier[@type="acquisition"]
Decision¶
The property opaque:accessionNumber was selected.
<identifier type="acquisition">1996.10.1</identifier>
@prefix opaque: <http://opaquenamespace.org/ns/> .
<https://example.org/objects/1>
opaque:accessionNumber "1996.10.1" .
OCLC numbers¶
Use Case¶
Records from the Tennessee Documentary History collection include OCLC identifiers. These values can be used to identify corresponding records in WorldCat.
Justification¶
OCLC identifiers could be useful if these materials are ever shared with HathiTrust, as this value is a requirement for
submission. Only one suitable property, dbo:oclc, was identified, and it aligns with our philosophy guidelines.
XPath¶
identifier[@type="oclc"]
Decision¶
<identifier type="oclc">44394278</identifier>
@prefix dbo: <http://dbpedia.org/ontology/> .
<https://example.org/objects/1>
dbo:oclc "44394278" .
ISSNs¶
Use Case¶
Approximately 10% of our records describe periodicals. Effort has been invested in establishing official e-ISSNs for several titles through the Library of Congress. These titles include:
Agricultural & Home Economics News
Agricultural & Home Economics Packet
Agricultural News
Alumnus
Circular
Farm News
Phoenix
Special Circular
Tennessee Farm and Home News
Tennessee Farm and Home Science
Tennessee Farm News
Torchbearer
Note: Some resources within the Children’s Defense Fund collection have both an ISSN and an ISBN.
More information on assigning an e-ISSN can be found here - https://www.loc.gov/issn/basics/basics-brochure-eserials.html.
UT currently has a specific Solr field for publication identifiers (ISBNs and ISSNs) so that these identifiers can be displayed and searched for separately: utk_mods_publication_identifier_ms.
Justification¶
As these identifiers have meaning outside of the context of UT and might be used by patrons in a search to find these materials, it is important that we continue to support a unique field for these values rather than including them in a generic identifier category with other types of identifier values. In addition, having a persistent link for resources with a particular ISSN is essential to the Libraries’ HathiTrust submission records. A title-level MARC XML record with a link to all issues with the same ISSN is shared for this purpose.
Properties for ISSN values are established in DBpedia and the Standard Identifiers Scheme. Both follow our philosophy guidelines and could be used to accurately represent the ISSN values. Ultimately we decided to use DBpedia because it is a widely used core ontology whereas the Standard Identifiers Scheme is more library specific.
XPath¶
identifier[@type="issn"]
Decision¶
Example record - agrutesc:2130
<identifier type="issn">2687-7325</identifier>
@prefix dbo: <http://dbpedia.org/ontology/> .
<https://example.org/objects/1>
dbo:issn "2687-7325" .
ISBNs¶
Use Case¶
International Standard Book Numbers are present as identifier values in the Children’s Defense Fund collection. UT currently has a specific Solr field for publication identifiers (ISBNs and ISSNs) so that these identifiers can be displayed and searched for separately: utk_mods_publication_identifier_ms.
Note: Wikidata splits this field into two properties: wikidata:P212 and wikidata:P957.
Justification¶
As these identifiers have meaning outside of the context of UTK and might be used by patrons in a search to find these materials,
it is important that we continue to support a unique field for these values. Properties for ISBN values are established
in DBpedia and the Standard Identifiers Scheme. Because preference is given to core ontologies rather than library specific
ones, we selected dbo:isbn.
XPath¶
identifier[@type="isbn"]
Decision¶
<identifier type="isbn">0938008501</identifier>
@prefix dbo: <http://dbpedia.org/ontology/> .
<https://example.org/objects/1>
dbo:isbn "0938008501" .
ARKs¶
Use Case¶
Some works have a minted ARK in their MODS at identifier[@type="ark"].
Justification¶
The ARK represents a persistent identifier and is leveraged by HathiTrust for referring to our works rather than the current URL. These need to be migrated to a dedicated field in our next system, separate from other local identifiers, so that this practice can continue.
XPath¶
identifier[@type="ark"]
Decision¶
<identifier type="ark">ark:/87290/v8pv6hjx</identifier>
@prefix identifiers: <http://id.loc.gov/vocabulary/identifiers/> .
<https://example.org/objects/1>
identifiers:ark "ark:/87290/v8pv6hjx" .
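The identifier decisions above amount to a dispatch on the type attribute. The sketch below is one way to express that dispatch, assuming un-namespaced MODS fragments like the examples in this document. The function name `map_identifier` and the whitespace trimming are illustrative assumptions, and sending every unlisted type to identifiers:local follows the "all those that do not fit into other categories" note rather than an explicit rule.

```python
import xml.etree.ElementTree as ET

# Types that have a dedicated predicate; every other type is treated as local.
TYPE_TO_PREDICATE = {
    "isbn": "dbo:isbn",
    "issn": "dbo:issn",
    "oclc": "dbo:oclc",
    "ark": "identifiers:ark",
    "acquisition": "opaque:accessionNumber",
}


def map_identifier(element):
    """Return a (predicate, literal value) pair for one <identifier> element."""
    id_type = element.get("type", "")
    value = (element.text or "").strip()
    predicate = TYPE_TO_PREDICATE.get(id_type, "identifiers:local")
    if id_type == "circular":
        # Exception shown above: circular numbers get a "Circular " prefix.
        value = f"Circular {value}"
    return predicate, value
```

For example, `map_identifier(ET.fromstring('<identifier type="circular">79</identifier>'))` yields the identifiers:local predicate with the value "Circular 79", matching the exception shown in the Local Identifiers decision.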
titleInfo¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| dcterms:title | Literal | A name given to the resource. If multiple titleInfo elements are present, the supplied title is displayed as the title. |
| dcterms:alternative | Literal | An alternative name for the resource. This property is used if there is more than one title given. |
titleInfo - one titleInfo element¶
Use Case¶
This category refers to records with a single titleInfo element. All records within UT’s collections contain at
least one title value. Typically, in the case of traditional bibliographic materials, this value is transcribed
directly from the source (title page, etc.). In UT’s collections, titleInfo/title is not restricted to transcribed
titles only and also contains supplied title strings constructed by the cataloger.
Justification¶
Titles are required values for DPLA and are used as the main way of identifying a resource within Islandora, PrimoVE, and
WorldCat, so it is essential that these values are kept. This mapping document consistently designates the displayed
title as the primary title rather than privileging transcribed titles. Currently within Islandora, the fgsLabel is by
default associated with the value within titleInfo/title. Looking to possible future platforms, the equivalent
property for the title which is given preference by default in display is dcterms:title.
XPath¶
titleInfo/title
Decision¶
The string within titleInfo/title can easily translate to the dcterms:title property. In the case below, the single
title value given is a supplied value (since there is no writing on the actual resource to transcribe). This shows the
inconsistency with which @supplied="yes" is used.
<titleInfo>
<title>Pencil drawn portrait study of woman</title>
</titleInfo>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:title "Pencil drawn portrait study of woman" .
titleInfo - single titleInfo element having a supplied attribute of yes¶
Use Case¶
This category refers to a single titleInfo element with the attribute supplied="yes". titleInfo[@supplied="yes"]
is used currently to indicate that a title is constructed by a cataloger rather than transcribed from the source. As mentioned
previously, this is not consistently used to indicate whether a title is supplied or not, particularly when the only title
value has to be supplied because the materials being described have no linguistic content to transcribe.
Justification¶
While the title values themselves need to be retained, it was decided that it is not important to keep values within
titleInfo[@supplied="yes"] separate from values within titleInfo without the attribute value. Therefore both
single title values are mapped to the same property - dcterms:title. In traditional MARC records and in Samvera’s mapping,
brackets are used to wrap title strings that are supplied as a way to distinguish supplied and transcribed titles within the
same field. The decision to not use brackets was made because these characters do not have intuitive meaning to users. This
decision is supported by the Digital Public Library of America’s Aggregation Overview document
that recommends contributors do “not have brackets or ending periods” in their title values.
XPath¶
titleInfo[@supplied="yes"]/title
Decision¶
Supplied titles will be represented as dcterms:title. Supplied titles will not be distinguished from transcribed titles
by using brackets. It is felt that this convention focuses more on cataloging conventions than on users’ needs.
<titleInfo supplied="yes">
<title>Coprinus notebook 1</title>
</titleInfo>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:title "Coprinus notebook 1" .
titleInfo - Multiple titleInfo elements with one having a supplied attribute of yes¶
Use Case¶
This category is defined by the presence of multiple titleInfo elements, one of which has the attribute supplied="yes".
Multiple titleInfo/title values are typically present for materials where a title can be transcribed, but an additional
value is desired for display purposes. This is particularly prevalent for serial publications, in which titles often change
over time.
Justification¶
For consistency within collections, the best title to display for users is the supplied title. In current practice, collections
with supplied titles require that the fgsLabel be updated following ingest so that the value within titleInfo[@supplied="yes"]/title
shows while browsing. It was decided to map these supplied titles to dcterms:title rather than dcterms:alternative so
that additional actions like fgsLabel updates are not necessary and to make description practices more easily align with
display practices.
XPath¶
titleInfo[@supplied="yes"]/title AND
titleInfo/title
Decision¶
When supplied="yes" is present on one titleInfo element, the titleInfo[@supplied="yes"]/title value will be used as dcterms:title and the remaining title as dcterms:alternative.
<titleInfo>
<title>Swimming 1969: The University of Tennessee </title>
</titleInfo>
<titleInfo supplied="yes">
<title>University of Tennessee Swimming-Diving media guide, 1969</title>
</titleInfo>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1>
dcterms:title "University of Tennessee Swimming-Diving media guide, 1969" ;
dcterms:alternative "Swimming 1969: The University of Tennessee " .
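The choice between dcterms:title and dcterms:alternative can be sketched as a small selection routine. This is a hedged illustration only: it assumes un-namespaced MODS, the function name `map_titles` is hypothetical, and the tie-breaking rule (the first non-alternative titleInfo wins when none is supplied) is an assumption. Title text is passed through unchanged, as in the example above where the trailing space is preserved.

```python
import xml.etree.ElementTree as ET


def map_titles(mods):
    """Return (predicate, text) pairs for every titleInfo under a record."""
    infos = mods.findall("titleInfo")
    # titleInfo[@type="alternative"] can never be the main title.
    candidates = [t for t in infos if t.get("type") != "alternative"]
    # Prefer the supplied title when one exists among the candidates.
    supplied = [t for t in candidates if t.get("supplied") == "yes"]
    main = (supplied or candidates or [None])[0]
    triples = []
    for info in infos:
        predicate = "dcterms:title" if info is main else "dcterms:alternative"
        triples.append((predicate, info.findtext("title", default="")))
    return triples
```

Applied to the Swimming-Diving record above, the supplied title becomes dcterms:title and the transcribed title becomes dcterms:alternative.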
titleInfo - titleInfo has partName sub-element¶
Use Case¶
This category consists of records containing a titleInfo element and sub-element of partName.
The Sanborn Fire Insurance Maps collection contains the only records with partName.
Justification¶
The values in partName are essential to keep as they uniquely distinguish each map, but they do not need to be kept
distinct from the title. While they were historically separated because MODS had the granularity to define these values as
distinct from yet related to the title, this separation does not serve any practical purpose. For sharing with DPLA,
titleInfo/title has to be concatenated to partName. It therefore makes sense to remove this granularity
in UT’s data itself to make it easier to share. Consistent with previous UT descriptive practices, commas rather than
periods will be used to indicate enumeration of an object within a string.
XPath¶
titleInfo/partName
Decision¶
In these cases the string contained in partName will be appended to the title. A ‘,’
character followed by a space will be used as glue when concatenating the strings.
<titleInfo>
<title>Knoxville -- 1917</title>
<partName>Sheet 56</partName>
</titleInfo>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:title "Knoxville -- 1917, Sheet 56" .
titleInfo - titleInfo has partNumber sub-element¶
Use Case¶
This category consists of 39 records that contain titleInfo/partNumber. These records are all from the Phoenix collection.
Values within partNumber share volume and issue numbers of the periodical.
Justification¶
Values within partNumber should not be treated the same as partName because titleInfo/title values
within the Phoenix collection already include a season and year to enumerate them. Phoenix is an odd collection that includes
both volume/number and season/year. The volume/issue number is not included with the title because there are several
known instances where the numbers printed on the issue are inaccurate. Still, this information could be useful in identifying
an issue. Ultimately these values should be moved so that they are part of an alternative title for the resource - either
through remediation or during migration.
XPath¶
titleInfo/partNumber
Decision¶
<titleInfo supplied="yes">
<title>Phoenix, fall 1968</title>
<partNumber>volume 10, number 1</partNumber>
</titleInfo>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:alternative "Phoenix, volume 10, number 1" .
titleInfo - titleInfo has nonSort sub-element¶
Use Case¶
This category consists of records with a titleInfo element and sub-element of nonSort. The nonSort
sub-element is used in MODS to mirror how the second indicator in a MARC title statement (245) is used to document nonfiling
characters (“A”, “The”, etc.). This removes definite or indefinite articles at the start of a title so that only significant
content within the string is used for sorting purposes.
Justification¶
The use of nonSort is historical and the values do not need to be retained separately in a modern repository. Stop words
like “A” and “The” can be recognized for sorting purposes without being in a separate element. As the values present within
nonSort are also part of the official title, when they are separated out into a sub-element within UT’s repository,
work must be done to concatenate them to titleInfo/title when sharing. This work is unnecessary and therefore
we will not retain nonSort elements moving forward.
XPath¶
titleInfo/nonSort
Decision¶
The string contained within the nonSort element will be prepended to the title value.
Example record from volvoices:2890
<titleInfo>
<nonSort>The </nonSort>
<title>Guard at the Mountain Branch of the National Home for Disabled Volunteer Soldiers</title>
</titleInfo>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:title "The Guard at the Mountain Branch of the National Home for Disabled Volunteer Soldiers" .
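The nonSort and partName decisions both reduce to string concatenation around titleInfo/title. A minimal sketch follows, assuming un-namespaced MODS; the helper name `build_title` is hypothetical. It also assumes that nonSort carries its own trailing space, as in the example above, so no separator is inserted between nonSort and title.

```python
import xml.etree.ElementTree as ET


def build_title(title_info):
    """Build the title string for one <titleInfo> element."""
    non_sort = title_info.findtext("nonSort", default="")
    title = title_info.findtext("title", default="")
    part_name = title_info.findtext("partName")
    # nonSort is prepended as-is; it is assumed to end with a space.
    text = non_sort + title
    if part_name:
        # partName is appended with a comma and a space as glue.
        text = f"{text}, {part_name}"
    return text
```

With the Sanborn example, a title of "Knoxville -- 1917" and a partName of "Sheet 56" are joined as "Knoxville -- 1917, Sheet 56".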
titleInfo - Multiple titleInfo elements with one having a type of alternative¶
Use Case¶
This category consists of records with two titleInfo elements and one having an attribute of type="alternative".
This situation occurs when a resource has more than one title that can be transcribed from it.
Justification¶
Resources are often known by more than one title, so including all known titles will help with discovery. It is important for the title that is displayed as the main title to be separate from any secondary titles, so both need their own properties.
XPath¶
titleInfo AND
titleInfo[@type="alternative"]
Decision¶
titleInfo elements with @type="alternative" will be mapped to dcterms:alternative.
<titleInfo>
<title>Prussian heroes march</title>
</titleInfo>
<titleInfo type="alternative">
<title>Prussian heroes: Prussen helden march</title>
</titleInfo>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1>
dcterms:title "Prussian heroes march" ;
dcterms:alternative "Prussian heroes: Prussen helden march" .
@displayLabel additional example record - womenbball:653
<titleInfo supplied="yes">
<title>Tennessee Lady Volunteers basketball media guide, 1984-1985</title>
</titleInfo>
<titleInfo type="alternative" displayLabel="Cover Title">
<title>Tennessee Lady Vols 1984-85: reaching for the Summitt of women's basketball</title>
</titleInfo>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1>
dcterms:title "Tennessee Lady Volunteers basketball media guide, 1984-1985" ;
dcterms:alternative "Tennessee Lady Vols 1984-85: reaching for the Summitt of women's basketball" .
abstract¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| dcterms:abstract | Literal | Use for all mods:abstracts that are not blank nodes |
Abstracts that are not Blank Nodes¶
Use Case¶
If a record has an abstract or many abstracts, they will each be mapped to dcterms:abstract as long as the abstract
does not have an empty text node.
Justification¶
Regardless of the number, the value has the same semantic relationship to the object as it did in MODS. When more than
one abstract value is present, these values will be kept as separate strings associated with dcterms:abstract.
This separation is desired because often the separate abstract values contain information structured differently
from one another or information that comes from different sources (one abstract may be transcribed from the source while
another is supplied by the cataloger).
XPath¶
abstract[text()]
Decision¶
If it has one abstract like gamble:124, map to dcterms:abstract.
<abstract>
Prosecutor John Keker gives his closing statement to the jury, explaining Col. John North's involvement in the Iran-Contra affair even though the majority of his statement is censored due to classified information.
</abstract>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:abstract "Prosecutor John Keker gives his closing statement to the jury, explaining Col. John North's involvement in the Iran-Contra affair even though the majority of his statement is censored due to classified information." .
If it has more than one abstract like 1001:1,
we will still map to dcterms:abstract.
<abstract>
Postcard with handwritten note sent from Knoxville to Miss Virginia Bogart, Loudon, Tennessee on March 2, 1944 for a postage of 1 cent.
</abstract>
<abstract>
The hardwood forest of America, and probably of the entire world, originated in the Great Smoky Mountains, where remains the nation's largest body of virgin hardwood forest, and the world's greatest variety of trees, flowering shrubs and wild flowers.
</abstract>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:abstract "Postcard with handwritten note sent from Knoxville to Miss Virginia Bogart, Loudon, Tennessee on March 2, 1944 for a postage of 1 cent.", "The hardwood forest of America, and probably of the entire world, originated in the Great Smoky Mountains, where remains the nation's largest body of virgin hardwood forest, and the world's greatest variety of trees, flowering shrubs and wild flowers." .
Blank Abstracts¶
Use Case¶
UT has a fair number of records with empty abstracts. These likely were unintentionally added while using Islandora
forms or transforming XML with XSLT.
Justification¶
When an abstract is an empty node, do not map it. The value of the text node has no semantic meaning or value so there is no content to retain.
XPaths¶
abstract[string()=""]
Decision¶
Don’t map!
Example record - roth:1595 <https://digital.lib.utk.edu/collections/islandora/object/roth%3A1595/datastream/MODS/view>
<abstract></abstract>
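Taken together, the two abstract rules say: emit one dcterms:abstract per abstract that has content, and skip empty ones. The sketch below assumes un-namespaced MODS, and `map_abstracts` is a hypothetical name. Collapsing internal whitespace and treating whitespace-only abstracts as blank are assumptions that go slightly beyond the abstract[string()=""] XPath; the collapsing mirrors the Turtle examples above, which drop the line breaks around the text.

```python
import xml.etree.ElementTree as ET


def map_abstracts(mods):
    """Return one (predicate, text) pair per non-empty <abstract>."""
    triples = []
    for abstract in mods.findall("abstract"):
        # Collapse runs of whitespace and trim the ends (an assumption).
        text = " ".join("".join(abstract.itertext()).split())
        if text:
            triples.append(("dcterms:abstract", text))
    return triples
```

Multiple abstracts stay as separate strings, so a record with two populated abstracts and one empty abstract yields two pairs.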
tableOfContents¶
Use Case¶
The following collections include tableOfContents - David Van Vactor Music Collection, Tennessee Farm and Home Science,
The Arrow of Pi Beta Phi. There are a total of 455 unique values. This element contains the names of individually titled
parts that make up the larger resource. It is used to provide more detailed information on the content of a resource in
a non-structured way. Note that punctuation separating part titles varies depending on the string values being separated.
The following punctuation is present in UT’s tableOfContents elements: “ -- ”, “ - ”, and “;”.
Justification¶
This information aids keyword discovery by adding more text to the record and providing users with a listing of parts within the larger resource.
XPath¶
tableOfContents
Decision¶
Below are examples showing the punctuation variations present in this element’s values.
Example record with “;” as separators - arrow:305.
<tableOfContents>Library Fund Honors Marian; Noted Craftsman Lauds Arrowmont; Gatlinburg Residents Enjoy Craft Courses;
Tennessee Gammas Honor Prof. Heard</tableOfContents>
Example record with “-” as separators - agrtfhs:2119.
<tableOfContents>Snap beans: machine vs. hand harvest - New bulletins - Protein with high silage rations -- dairy
- Pepper yields and fertility, plant spacing - Stripping vs. spindle picking of 4 cottons - Personnel changes -
Soybean irrigation - Alfalfa crown rot - Bedding for better cotton stands - Controlling bagworms -
Nitrogen on shade trees</tableOfContents>
Example record with “ -- ” as separators - vanvactor:15772.
<tableOfContents>Preface -- David Van Vactor: life and works -- David Van Vactor: catalog of manuscripts --
Catalog of books, scores, and manuscripts in Special Collections -- Books and scores in the George F. DeVine Music
Library -- Sound recordings, 1942-1979</tableOfContents>
All values within tableOfContents will be mapped to RDF in the same way. Below is a representation of arrow:305.
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1>
dcterms:tableOfContents "Library Fund Honors Marian; Noted Craftsman Lauds Arrowmont; Gatlinburg Residents Enjoy Craft Courses; Tennessee Gammas Honor Prof. Heard" .
name¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| relators:[term] | Literal or URI | Use with a role from the MARC Code List for Relators role terms. Value is either text or a URI from a controlled vocabulary (like the Library of Congress Name Authority File). |
Leverage Marc Relators for Name RDF Property and Relationship to the Digital Object¶
Use Case¶
A name/namePart value shares the name of an individual who is related to the digital object. All instances of name
have a role/roleTerm that can be leveraged to determine the name’s particular relationship to the object. In some cases,
there is a roleTerm/@valueURI, but this is not always the case.
Justification¶
Names are important access points for users. The relator terms are also essential to retain because they indicate how a name is relevant to the object.
XPaths¶
name/namePart OR
name[@valueURI!=""]
Decisions¶
For all instances of name, leverage the marcrelator value found in its role/roleTerm for
associating the name with the digital object.
A lookup table is included as an appendix to help with this.
If the name has a valueURI attribute, use it for the object of the triple. If it does not, use
the text value of name/namePart.
When you have a name with a valueURI attribute like tdh:8803:
<name valueURI="http://id.loc.gov/authorities/names/n2017180154">
<namePart>White, Hugh Lawson, 1773-1840</namePart>
<role>
<roleTerm authority="marcrelator" valueURI="http://id.loc.gov/vocabulary/relators/crp">
Correspondent
</roleTerm>
</role>
</name>
Leverage the @valueURI and make it the object of the triple:
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
<https://example.org/objects/1>
relators:crp <http://id.loc.gov/authorities/names/n2017180154> .
When there is no name/@valueURI, use the string literal from name/namePart. cDanielCartoon:1000
is an example record containing a name value missing a @valueURI:
<name type="personal">
<namePart>Daniel, Charles R. (Charlie), Jr., 1930-</namePart>
<role>
<roleTerm type="text" authority="marcrelator" valueURI="http://id.loc.gov/vocabulary/relators/cre">Creator</roleTerm>
</role>
</name>
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
<https://example.org/objects/1>
relators:cre "Daniel, Charles R. (Charlie), Jr., 1930-" .
If there is a name/@valueURI but it’s empty, use the string literal instead. volvoices:2495
is an example of this:
<name authority="naf" type="corporate" valueURI="">
<namePart>Bemis Bro. Bag Company</namePart>
<role>
<roleTerm authority="marcrelator" type="text" valueURI="http://id.loc.gov/vocabulary/relators/asn">Associated name</roleTerm>
</role>
</name>
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
<https://example.org/objects/1>
relators:asn "Bemis Bro. Bag Company" .
Names with Multiple Role Terms¶
Use Case¶
Occasionally, a name will have multiple roles. For instance, a person might be both the “Copyright holder” and
the “Photographer”.
Justification¶
In order to not lose any information, it is essential that all the relationships between people and our digital object are kept.
This means that the same namePart value may be present more than once to account for the variety of ways in which
it may be related to the object being described.
XPaths¶
count(name/role)>1
Decision¶
Example record - harp:1 MODS record:
<name authority="naf" valueURI="http://id.loc.gov/authorities/names/no2002022963">
<namePart>Swan, W. H. (William H.)</namePart>
<role>
<roleTerm authority="marcrelator" valueURI="http://id.loc.gov/vocabulary/relators/cmp">
Composer
</roleTerm>
</role>
<role>
<roleTerm authority="marcrelator" valueURI="http://id.loc.gov/vocabulary/relators/com">
Compiler
</roleTerm>
</role>
</name>
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
<https://example.org/objects/1>
relators:cmp <http://id.loc.gov/authorities/names/no2002022963> ;
relators:com <http://id.loc.gov/authorities/names/no2002022963> .
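The name decisions can be sketched as one routine that picks the object (URI or literal) once and then emits a triple per role. This is an illustration under stated assumptions, not the project's implementation: MODS is un-namespaced, `map_name` and `turtle_literal` are hypothetical helper names, the literal is taken from the first namePart without a type attribute, and roles lacking a relator URI are skipped here, whereas the guide points to the appendix lookup table for those.

```python
import xml.etree.ElementTree as ET

RELATORS_NS = "http://id.loc.gov/vocabulary/relators/"


def turtle_literal(text):
    """Quote a string as a Turtle literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def map_name(name):
    """Return (predicate, object) pairs for one <name> element."""
    value_uri = (name.get("valueURI") or "").strip()
    if value_uri:
        obj = f"<{value_uri}>"
    else:
        # Missing or empty valueURI: fall back to the namePart string.
        parts = [p for p in name.findall("namePart") if p.get("type") is None]
        if not parts:
            return []
        obj = turtle_literal((parts[0].text or "").strip())
    triples = []
    for role_term in name.findall("role/roleTerm"):
        role_uri = (role_term.get("valueURI") or "").strip()
        if role_uri.startswith(RELATORS_NS):
            code = role_uri[len(RELATORS_NS):]
            # One triple per role, so a name with two roles appears twice.
            triples.append((f"relators:{code}", obj))
    return triples
```

For the harp:1 example, the same authority URI is returned twice, once with relators:cmp and once with relators:com.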
Do Not Keep Any Other Values Associated with a Name¶
Use Case¶
There are other XPaths in our system that are associated with names that are no longer needed. Information present in these
XPaths includes the nationality of a named individual as well as their birth and/or death dates or dates of artistic activity.
The Archivision collection includes the most added sub-elements within name. All of those not mentioned previously
will be dropped.
Justification¶
In an RDF-based system that leverages linked data, it’s unnecessary to keep traditional name information
like authority, displayForm, type, or description. Authorities are present in the URI itself and information such as
description or displayForm is available from the class our object refers to. We recognize that type is not available
and are willing to lose this information in the interest of making our data more manageable.
XPaths¶
name/role/roleTerm/@authority OR
name/@authority OR
name/role/roleTerm/@authorityURI OR
name/@type OR
name/displayForm OR
name/description
Decision¶
Several of these values which will be dropped are illustrated in this example record - archivision:1959
<name type="personal" authority="ulan" valueURI="http://vocab.getty.edu/ulan/500009663">
<namePart>Burgee, John Henry</namePart>
<displayForm>John Henry Burgee</displayForm>
<namePart type="date">born 1933</namePart>
<description>American</description>
<role>
<roleTerm type="text" authority="marcrelator" valueURI="http://id.loc.gov/vocabulary/relators/cre">Creator</roleTerm>
</role>
</name>
originInfo¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| dcterms:created | Literal or URI | The date a resource was created, formatted as an EDTF string. |
| dcterms:issued | Literal or URI | The date a resource was issued, formatted as an EDTF string. |
| dcterms:date | Literal or URI | An unspecified date associated with a resource, formatted as an EDTF string. |
| relators:pbl | Literal or URI | The publisher associated with the resource. |
| relators:pup | Literal or URI | A place associated with the publication of the resource. |
originInfo/dateCreated¶
Use Case¶
dateCreated captures dates and date ranges identifying or approximating when the physical object was created. Most of
UT’s records currently have both a human-readable date and a machine-readable date (following the Extended Date/Time Format, EDTF).
Justification¶
dateCreated values provide important access points for users and can be easily mapped to an equivalent property -
dcterms:created. This mapping allows dateCreated values to remain distinct from other types of date values.
XPath¶
originInfo/dateCreated OR
originInfo/dateCreated[@encoding='edtf'] OR
originInfo/dateCreated[@encoding='edtf'][@keyDate='yes'] OR
originInfo/dateCreated[@encoding='edtf'][@keyDate='yes'][@point='end'] OR
originInfo/dateCreated[@encoding='edtf'][@keyDate='yes'][@point='end'][@qualifier='approximate'] OR
originInfo/dateCreated[@encoding='edtf'][@keyDate='yes'][@point='end'][@qualifier='inferred'] OR
originInfo/dateCreated[@encoding='edtf'][@keyDate='yes'][@point='start'] OR
originInfo/dateCreated[@encoding='edtf'][@keyDate='yes'][@point='start'][@qualifier='approximate'] OR
originInfo/dateCreated[@encoding='edtf'][@keyDate='yes'][@point='start'][@qualifier='inferred'] OR
originInfo/dateCreated[@encoding='edtf'][@keyDate='yes'][@point='start'][@qualifier='questionable'] OR
originInfo/dateCreated[@encoding='edtf'][@keyDate='yes'][@qualifier='approximate'] OR
originInfo/dateCreated[@encoding='edtf'][@keyDate='yes'][@qualifier='inferred'] OR
originInfo/dateCreated[@encoding='edtf'][@keyDate='yes'][@qualifier='questionable'] OR
originInfo/dateCreated[@encoding='edtf'][@point='end'] OR
originInfo/dateCreated[@encoding='edtf'][@point='end'][@qualifier='approximate'] OR
originInfo/dateCreated[@encoding='edtf'][@point='end'][@qualifier='inferred'] OR
originInfo/dateCreated[@encoding='edtf'][@point='start'] OR
originInfo/dateCreated[@encoding='edtf'][@point='start'][@keyDate='yes'] OR
originInfo/dateCreated[@encoding='edtf'][@point='start'][@keyDate='yes'][@qualifier='approximate'] OR
originInfo/dateCreated[@encoding='edtf'][@point='start'][@qualifier='approximate'] OR
originInfo/dateCreated[@encoding='edtf'][@point='start'][@qualifier='inferred'][@keyDate='yes'] OR
originInfo/dateCreated[@encoding='edtf'][@qualifier='approximate'] OR
originInfo/dateCreated[@encoding='edtf'][@qualifier='approximate'][@keyDate='yes'][@point='start'] OR
originInfo/dateCreated[@encoding='edtf'][@qualifier='approximate'][@point='end'] OR
originInfo/dateCreated[@encoding='edtf'][@qualifier='inferred'][@keyDate='yes'][@point='start'] OR
originInfo/dateCreated[@encoding='edtf'][@qualifier='inferred'][@point='end'] OR
originInfo/dateCreated[@encoding='w3cdtf'][@keyDate='yes'][@point='start'] OR
originInfo/dateCreated[@encoding='w3cdtf'][@point='start'][@keyDate='yes'] OR
originInfo/dateCreated[@point='end'] OR
originInfo/dateCreated[@qualifier='approximate'] OR
originInfo/dateCreated[@qualifier='approximate'][@encoding='edtf'][@keyDate='yes'] OR
originInfo/dateCreated[@qualifier='approximate'][@encoding='edtf'][@keyDate='yes'][@point='end'] OR
originInfo/dateCreated[@qualifier='approximate'][@encoding='edtf'][@keyDate='yes'][@point='start'] OR
originInfo/dateCreated[@qualifier='inferred'] OR
originInfo/dateCreated[@qualifier='inferred'][@encoding='edtf'][@keyDate='yes'][@point='start'] OR
originInfo/dateCreated[@qualifier='questionable'] OR
originInfo/dateCreated[@qualifier='questionable'][@encoding='edtf'][@keyDate='yes']
Decisions¶
We will convert w3cdtf to edtf values as part of our migration process; additionally, we will integrate EDTF Level 2 features where necessary. The dcterms:created property was selected.
<originInfo>
<dateCreated qualifier="inferred">1955</dateCreated>
<dateCreated encoding="edtf" keyDate="yes">1955</dateCreated>
</originInfo>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:created "1955", "1955~" .
Example record - volvoices:3849
<originInfo>
<dateCreated>approximately between 1940 and 1950</dateCreated>
<dateCreated encoding="edtf" keyDate="yes" point="start" qualifier="approximate">1940</dateCreated>
<dateCreated encoding="edtf" keyDate="yes" point="end">1950</dateCreated>
</originInfo>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:created "approximately between 1940 and 1950", "1940~/1950" .
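The way @point and @qualifier values combine into a single EDTF interval can be sketched as follows. This is a hedged illustration: the names `edtf_value` and `edtf_interval` are hypothetical, and only the `approximate` and `questionable` qualifiers are mapped. How `inferred` should be rendered is left out here as an assumption, since the guide does not state a rule for it.

```python
# EDTF markers for the MODS qualifiers that have a direct equivalent.
# Assumption: "inferred" is left unmarked because this guide does not
# state an EDTF rendering for it.
QUALIFIER_SUFFIX = {"approximate": "~", "questionable": "?"}


def edtf_value(date, qualifier=None):
    """Append the EDTF marker for a MODS @qualifier, if there is one."""
    return date + QUALIFIER_SUFFIX.get(qualifier, "")


def edtf_interval(start, end, start_qualifier=None, end_qualifier=None):
    """Join @point='start' and @point='end' values into one EDTF interval."""
    return edtf_value(start, start_qualifier) + "/" + edtf_value(end, end_qualifier)


print(edtf_interval("1940", "1950", start_qualifier="approximate"))  # 1940~/1950
```

The human-readable dateCreated string is carried over unchanged as a second literal, so only the encoded values pass through a helper like this.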
originInfo/dateIssued¶
Use Case¶
dateIssued captures dates and date ranges identifying or approximating when the physical object was issued. Typically
“issued” is associated with the act of publication. Serials, sheet music, and other published materials will have a dateIssued
value rather than a dateCreated value.
Justification¶
dateIssued values provide important access points for users and can be easily mapped to an equivalent property -
dcterms:issued. This mapping allows dateIssued values to remain distinct from other types of date values.
XPaths¶
originInfo/dateIssued OR
originInfo/dateIssued[@encoding='edtf'] OR
originInfo/dateIssued[@encoding='edtf'][@keyDate='yes'] OR
originInfo/dateIssued[@encoding='edtf'][@keyDate='yes'][@point='end'][@qualifier='inferred'] OR
originInfo/dateIssued[@encoding='edtf'][@keyDate='yes'][@point='start'] OR
originInfo/dateIssued[@encoding='edtf'][@keyDate='yes'][@point='start'][@qualifier='inferred'] OR
originInfo/dateIssued[@encoding='edtf'][@keyDate='yes'][@qualifier='approximate'] OR
originInfo/dateIssued[@encoding='edtf'][@keyDate='yes'][@qualifier='inferred'] OR
originInfo/dateIssued[@encoding='edtf'][@keyDate='yes'][@qualifier='questionable'] OR
originInfo/dateIssued[@encoding='edtf'][@point='end'] OR
originInfo/dateIssued[@encoding='edtf'][@point='start'] OR
originInfo/dateIssued[@encoding='edtf'][@point='start'][@keyDate='yes'] OR
originInfo/dateIssued[@point='end'] OR
originInfo/dateIssued[@qualifier='approximate'] OR
originInfo/dateIssued[@qualifier='approximate'][@encoding='edtf'][@keyDate='yes'] OR
originInfo/dateIssued[@qualifier='inferred'] OR
originInfo/dateIssued[@qualifier='inferred'][@encoding='edtf'][@keyDate='yes'][@point='end'] OR
originInfo/dateIssued[@qualifier='inferred'][@encoding='edtf'][@keyDate='yes'][@point='start']
Decision¶
We will integrate EDTF Level 2 features where applicable. The dcterms:issued property was selected.
Example record - volvoices:2993
<originInfo>
<dateCreated>1948-01</dateCreated>
<dateCreated encoding="edtf" keyDate="yes">1948-01</dateCreated>
<dateIssued encoding="edtf" keyDate="yes" qualifier="approximate">1948</dateIssued>
</originInfo>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:created "1948-01", "1948-01" ;
dcterms:issued "1948~" .
originInfo/dateOther¶
Use Case¶
dateOther captures other significant dates associated with the resource. In UT’s data it is primarily present in
collections that have not been fully remediated. When UT’s metadata was migrated from Dublin Core to MODS and the standard
LoC transform was applied, all dates were set to dateOther because it was impossible to individually distinguish whether
dateIssued or dateCreated would be accurate.
Justification¶
While some of the values within dateOther may be ultimately better assigned to dateIssued or dateCreated,
in migrating to a new system and RDF we can only aim to keep the accuracy we already have. Some date values, like those given
in the example below, will always be distinct from dateIssued or dateCreated, so a separate category is
needed.
XPath¶
originInfo/dateOther OR
originInfo/dateOther[@encoding='edtf'] OR
originInfo/dateOther[@encoding='edtf'][@point='end'] OR
originInfo/dateOther[@encoding='edtf'][@point='start']
Decisions¶
As part of leveraging the EDTF format, some conversion will be necessary; e.g. translating date strings to EDTF values as in the following example. The dcterms:date property was selected.
<originInfo>
<dateIssued>Jun 30, 1965</dateIssued>
<dateIssued encoding="edtf">1965-06-30</dateIssued>
<dateOther encoding="edtf">1964/1965</dateOther>
<place>
<placeTerm valueURI="http://id.loc.gov/authorities/names/n80003889">University of Tennessee, Knoxville</placeTerm>
</place>
<publisher>University of Tennessee Theatre Department </publisher>
</originInfo>
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
<https://example.org/objects/1> dcterms:issued "Jun 30, 1965", "1965-06-30" ;
dcterms:date "1964/1965" ;
relators:pbl "University of Tennessee Theatre Department" ;
relators:pup <http://id.loc.gov/authorities/names/n80003889> .
originInfo/place/placeTerm¶
Use Case¶
This XPath identifies a place associated with the publication or creation of the resource. Some values follow a controlled vocabulary while others do not.
Justification¶
Values in place/placeTerm share origin information that is distinct from geographic subjects that describe places
the resource is “about.” For those researching publishing in particular regions, place/placeTerm values will be
very helpful. Note that whether or not the place of publication was supplied will not be retained in migration, though
the value itself will be regardless of the presence of @supplied.
XPath¶
originInfo/place/placeTerm[@text] OR
originInfo/place/placeTerm[@text][@valueURI] OR
originInfo/place[@supplied]/placeTerm[@text][@valueURI]
Decision¶
The majority of the applicable values are associated with a @valueURI. The relators:pup property was selected.
<originInfo>
<place supplied="yes">
<placeTerm type="text" valueURI="http://id.loc.gov/authorities/names/n79072935">Meadville (Crawford County, Pa.)</placeTerm>
</place>
<publisher>Keystone View Company</publisher>
<dateCreated>between 1890 and 1930?</dateCreated>
<dateCreated encoding="edtf" keyDate="yes" point="start" qualifier="questionable">1890</dateCreated>
<dateCreated encoding="edtf" keyDate="yes" point="end">1930</dateCreated>
</originInfo>
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> relators:pbl "Keystone View Company" ;
relators:pup <http://id.loc.gov/authorities/names/n79072935> ;
dcterms:created "between 1890 and 1930?", "1890?/1930" .
Empty placeTerm elements will be ignored.
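A minimal sketch of this handling is shown below. The function name `place_objects` is hypothetical, the XML is un-namespaced as in the examples, and falling back to the text value when no @valueURI is present is an assumption based on the "Literal or URI" value type listed for relators:pup.

```python
import xml.etree.ElementTree as ET


def place_objects(origin_info):
    """Collect relators:pup objects from originInfo/place/placeTerm."""
    objects = []
    # place/@supplied is not inspected, so it is not retained.
    for term in origin_info.findall("place/placeTerm"):
        uri = (term.get("valueURI") or "").strip()
        text = (term.text or "").strip()
        if uri:
            objects.append(("uri", uri))
        elif text:
            objects.append(("literal", text))
        # Empty placeTerm elements fall through and are ignored.
    return objects


sample = """
<originInfo>
  <place supplied="yes">
    <placeTerm type="text" valueURI="http://id.loc.gov/authorities/names/n79072935">Meadville (Crawford County, Pa.)</placeTerm>
  </place>
  <place><placeTerm type="text"></placeTerm></place>
</originInfo>
""".strip()

print(place_objects(ET.fromstring(sample)))
```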
originInfo/publisher¶
Use Case¶
Identifies a publisher associated with the resource. Note that while many of the publishers are associated with controlled
vocabularies and have URIs, MODS 3.5 does not support @valueURI on publisher. Therefore only strings will
be migrated.
Justification¶
publisher values share important information about who produced a publication. They will be treated similarly to
the name/namePart values described earlier. relators:pbl can be used to show that the values identify the corporations responsible
for the publication of a resource.
XPath¶
originInfo/publisher
Decision¶
The relators:pbl property was selected.
Example record -:
<originInfo>
<place>
<placeTerm valueURI="http://id.loc.gov/authorities/names/n79006530">Baltimore (Md.)</placeTerm>
</place>
<publisher>Frederick D. Benteen</publisher>
</originInfo>
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
<https://example.org/objects/1> relators:pbl "Frederick D. Benteen" ;
relators:pup <http://id.loc.gov/authorities/names/n79006530> .
originInfo/issuance¶
Use Case¶
This XPath provides details for how the resource was published. All 4207 of our instances of issuance have the value “serial”.
Currently this is not displayed in facets or the “Click for Details” section. These values are also not shared with DPLA.
Justification¶
As UT is not actively using these values for search and discovery and the element is only selectively applied to a particular set of records, these values should be dropped.
XPath¶
originInfo/issuance
Decision¶
We will not be migrating issuance values. Here’s an example record with this element - agrutesc:2439:
<issuance>serial</issuance>
physicalDescription¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| dcterms:abstract | Literal | Use for form values with @type="material". |
| edm:hasType | URI or Literal | Use for form values without @type="material", whether or not a @valueURI is present. |
| rdau:P60550 | Literal | Use for all extent values. |
| skos:note | Literal | Use for notes nested within physicalDescription. |
digitalOrigin¶
Use Case¶
Currently there are 28,137 records that have a digitalOrigin value. This value is absent from 23,190 records. While present
in the MODS record, these values (UT metadata contains “born digital”, “digitized other analog”, and “reformatted digital”)
are not publicly displayed anywhere. These values communicate the “method by which a resource achieved digital form.”
Justification¶
We have decided for a number of reasons that migrating our digitalOrigin values is not beneficial. As mentioned above,
these values are not currently viewable by users. Arguably, these values will also already be apparent from the technical
metadata and do not need to be captured in the descriptive metadata. In addition, we are unaware of any backend technical
use case for this data at present. While knowing if something is “born digital” might be useful, all of the content within
Digital Collections is curated and meets our technical expectations. A “born digital” label would be more actionable for
resources gathered outside of the Digital Collections creation process. These born digital resources from “the wild” would
likely not be on the same platform as Digital Collections resources.
XPath¶
physicalDescription/digitalOrigin
Decision¶
We have decided not to migrate these values, as justified above. Here’s an example record - voloh:10
<digitalOrigin>born digital</digitalOrigin>
note¶
Use Case¶
Two collections, the Botanical Photography of Alan S. Heilman and the William Derris Film Collection, include note elements
within physicalDescription. These values are of two types. The majority of the values communicate camera settings for the
Heilman collection, while a smaller number of values share the “Film type” that was used to produce the print that was
digitized. Below is a small sample of these values:
Camera setting: 7@50 on 25; with filter
0.18x magnification, 100 Velvia
Film type: Kodachrome Transparency
zoomA -> 70 [A], Auto f16E100s
Film type: GEMounts
These values are somewhat problematic because they do not describe the digitized resource, but instead provide information about
the process that created these resources. This is useful information to know, but it is not tied directly to the resource, making
the inclusion of the values within physicalDescription inaccurate.
Justification¶
Since UT does not use physicalDescription/note regularly, it would streamline the data if these values could be
appropriately placed elsewhere. An attempt was made to match film type values (“GEMounts” and “Kodachrome Transparency”) with AAT
terms, but it was not possible to find anything appropriate for “GEMounts.” The accuracy of some of this information is questionable
(for instance, GEMounts are likely a brand instead of a film type), but without access to the actual materials during the quarantine, it is
impossible to make an informed judgement on what should be changed. To retain this contextual information that might
prove useful to researchers interested in photographic processes and techniques, it seems best to simply put these values
in a generic note field. If additional attention can be given to these two collections in the future, we can remediate
the metadata following migration with the benefit of having access to the physical materials.
XPath¶
physicalDescription/note
Decision¶
All values will be moved to a generic note field.
<physicalDescription>
<form authority="aat" valueURI="http://vocab.getty.edu/aat/300127478">transparencies</form>
<digitalOrigin>digitized other analog</digitalOrigin>
<note>Film type: GEMounts</note>
<note>Camera setting: 10@50 at 4ft</note>
</physicalDescription>
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
<https://example.org/objects/1>
skos:note "Film type: GEMounts", "Camera setting: 10@50 at 4ft" .
extent¶
Use Case¶
The extent element includes values that indicate time and physical dimensions. Time is consistently shared in hours, minutes
and seconds. Physical dimensions are most consistently represented in inches and feet, but cm are also used for smaller
items that might benefit from a more granular measurement.
Justification¶
While this kind of information has historically been included in MARC records to ensure that books are not larger than the shelf height, extent values can also provide important contextual information that is relevant to better understanding resources in a digital environment. Particularly in the case of photography, the dimensions can be used to help determine the type of film.
The working group’s shared philosophies were influential in deciding on the best property to use for extent values. The
Islandora Metadata Interest Group’s default mapping suggests using dcterms:extent and using a blank node with a literal as
an RDF value. This group is against using blank nodes when at all possible because they make it more difficult for the
user to consume content. The Samvera mapping uses rdau:P60550, which is less than ideal because rdau does not support
content negotiation. This means that the URI provided for the desired property does not allow a user to directly request
RDF. No other more suitable properties could be found for extent values. Given this predicament, the working group
decided to use rdau:P60550 because it is dereferenceable, which a blank node is not. Still, the inability to retrieve
RDF directly will limit users wishing to interact with our data in this way.
XPath¶
physicalDescription/extent
Decision¶
Example record - knoxgardens:125
<extent>3 1/4 x 5 inches</extent>
@prefix rdau: <http://rdaregistry.info/Elements/u/> .
<https://example.org/objects/1>
rdau:P60550 "3 1/4 x 5 inches" .
extent - @unit¶
Use Case¶
The Great Smoky Mountains Colloquy collection is the only collection that includes @unit on extent. The
collection consists of 34 total records. This is another case where increased granularity was possible through MODS, but
it has not been found to be helpful in sharing UT’s metadata more effectively. The established practice is to share the
unit along with the measurement in a single string.
Justification¶
It is important for the user to know what the unit of measurement is for a value within the extent field. It is also
important for us to share this information consistently. In order to retain the needed information while also conforming
the metadata from this collection with the rest of our records, we propose that the @unit value is added to the extent
string during migration. This would involve simply taking the existing value in extent and then adding ‘ pages’ to the
string. Note that all of the resources within the Colloquy collection have more than one page, so the plural form of the
word will always be accurate. See the Justification section of extent above for more explanation of rdau:P60550.
XPath¶
physicalDescription/extent[@unit="pages"]
Decision¶
<extent unit="pages">4</extent>
@prefix rdau: <http://rdaregistry.info/Elements/u/> .
<https://example.org/objects/1>
rdau:P60550 "4 pages" .
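The string concatenation described above might look like the sketch below. The helper name `extent_literal` is hypothetical. The guide only documents @unit="pages", so appending any other unit value the same way is an assumption.

```python
import xml.etree.ElementTree as ET


def extent_literal(extent_el):
    """Return the rdau:P60550 literal, appending @unit when present."""
    value = (extent_el.text or "").strip()
    unit = (extent_el.get("unit") or "").strip()
    # Assumption: the unit is appended verbatim after a single space.
    return f"{value} {unit}" if unit else value


print(extent_literal(ET.fromstring('<extent unit="pages">4</extent>')))  # 4 pages
print(extent_literal(ET.fromstring("<extent>3 1/4 x 5 inches</extent>")))  # 3 1/4 x 5 inches
```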
form - No URI¶
Use Case¶
At the time of analysis, there were 10,853 records that contained a form term without an associated @valueURI.
Presently form values are displayed in facets and within the “Click for details” section (regardless of whether
they follow an authority or not).
Justification¶
Form values are important access points that provide more specific information than is provided in higher-level elements
like typeOfResource. Through individually assessing the values, it was determined that all of these values come from the
Art and Architecture Thesaurus (AAT), but without additional remediation the relationship of these values to the controlled
vocabulary is not actionable. In the coming months, work will be done to add the appropriate valueURIs to these records,
but we want to make sure that this work is not a blocker to migration. In order to leverage the capabilities of Linked
Data, we plan to remediate as many of these records as possible while choosing a mapping that allows flexibility in the
value type. Any values that are not remediated to include URIs before migration can be addressed via SPARQL queries
afterwards.
XPath¶
physicalDescription/form
Decision¶
We will use edm:hasType instead of dcterms:format in order to accommodate form values without a URI. We need to move all
of the form values over, so using edm:hasType will make sure that we bring every form term regardless of whether it is
defined as a URI or a literal.
Here’s an example record - gamble:1
<form>cartoons (humorous images)</form>
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
<https://example.org/objects/1>
edm:hasType "cartoons (humorous images)" .
form - Has URI¶
Use Case¶
The majority of UT’s form values include a valueURI from the Art and Architecture Thesaurus (AAT). form
values are not currently displayed in DPLA’s interface, but DPLA’s MAP 5
lists preferred form subtype values that will eventually be implemented. Work has been done to align as many of our form
terms as possible with this preferred list.
Justification¶
form values are important access points that provide more specific information than is provided in higher-level elements
like typeOfResource.
XPath¶
physicalDescription/form[@valueURI]
Decision¶
Here’s an example record - ruskin:108
<form authority="aat" valueURI="http://vocab.getty.edu/aat/300046300">photographs</form>
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
<https://example.org/objects/1>
edm:hasType <http://vocab.getty.edu/aat/300046300> .
form - @type=”material”¶
Use Case¶
The Archivision collection has a special type attribute so that the list of materials used to create specific buildings
can be faceted. The material types are consistently listed in the same order within the string to make this possible.
Justification¶
In order to attempt to streamline this data to better align with UT’s existing records, all existing terms were compared
with similar terms from the Art and Architecture Thesaurus. The hope was to split the string field on commas and find
controlled terms for each individual value so that these could simply be presented in physicalDescription/form
without the need for a unique type attribute. Analysis showed that a number of values included very specific descriptions
of the material type in parentheses following the broader term. For instance, ‘marble (white Carrara and green Prato marble).’
This specificity made it impossible to use the AAT without losing some of the information present in the original records.
Treating these values as part of the abstract will ensure that they display prominently, which would not necessarily be the case with
a note value. To make the value read more fluidly, ‘Made of ’ can be added to the front of the string and a closing
period (‘.’) added to the end.
XPath¶
physicalDescription/form[@type="material"]
Decision¶
Example record - archivision:8477
<form type="material">granite, tile (pink Vermont granite, Spanish tile)</form>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:abstract "Made of granite, tile (pink Vermont granite, Spanish tile)." .
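The three form cases (no URI, has URI, and @type="material") can be handled by one small dispatcher. The sketch below is illustrative only: `map_form` is a hypothetical name, the XML is un-namespaced, and the tuple layout of the return value is an arbitrary choice for the example.

```python
import xml.etree.ElementTree as ET


def map_form(form_el):
    """Return (predicate, kind, value) for a physicalDescription/form element."""
    text = (form_el.text or "").strip()
    if form_el.get("type") == "material":
        # Material lists become part of the abstract, wrapped as a sentence.
        return ("dcterms:abstract", "literal", f"Made of {text}.")
    uri = (form_el.get("valueURI") or "").strip()
    if uri:
        return ("edm:hasType", "uri", uri)
    return ("edm:hasType", "literal", text)


examples = [
    "<form>cartoons (humorous images)</form>",
    '<form authority="aat" valueURI="http://vocab.getty.edu/aat/300046300">photographs</form>',
    '<form type="material">granite, tile (pink Vermont granite, Spanish tile)</form>',
]
for xml in examples:
    print(map_form(ET.fromstring(xml)))
```

Checking @type first keeps material strings out of edm:hasType even if such an element also carried a @valueURI.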
internetMediaType¶
Use Case¶
A total of 14,725 records have an internetMediaType while this element is not present in 36,602 records. It is used to indicate
the MIME type of the access file for the digitized resource. It is displayed in the “Click for Details” section.
Justification¶
This information within the descriptive metadata should not be migrated as it will be captured automatically during file characterization in the new system. In addition, many of the current values carried over from the existing metadata are inaccurate and therefore should not be shared.
XPath¶
physicalDescription/internetMediaType
Decision¶
Do not migrate.
<internetMediaType>audio/wav</internetMediaType>
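Elements that this guide decides not to migrate (issuance, digitalOrigin, and internetMediaType) could be removed in a pre-processing pass before any mapping runs. The sketch below is one possible approach, not a prescribed step: `strip_dropped` and `DROPPED_PATHS` are hypothetical names, and the record is assumed to be un-namespaced with the listed elements exactly one level below the root's children.

```python
import xml.etree.ElementTree as ET

# Element paths this guide says are not migrated.
DROPPED_PATHS = (
    "originInfo/issuance",
    "physicalDescription/digitalOrigin",
    "physicalDescription/internetMediaType",
)


def strip_dropped(mods_root):
    """Remove non-migrated elements in place and return how many were removed."""
    removed = 0
    for path in DROPPED_PATHS:
        parent_path, _, tag = path.rpartition("/")
        for parent in mods_root.findall(parent_path):
            # Copy the match list so removal does not disturb iteration.
            for child in list(parent.findall(tag)):
                parent.remove(child)
                removed += 1
    return removed


sample = """
<mods>
  <originInfo><issuance>serial</issuance></originInfo>
  <physicalDescription>
    <digitalOrigin>born digital</digitalOrigin>
    <internetMediaType>audio/wav</internetMediaType>
    <extent>4</extent>
  </physicalDescription>
</mods>
""".strip()

root = ET.fromstring(sample)
print(strip_dropped(root))
```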
note¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| bf:IntendedAudience | Literal or URI | Use for information that identifies the specific audience or intellectual level for which the content of the resource is considered appropriate. |
| dce:subject | Literal or URI | Use for name, topical subjects, and uncontrolled keywords. Use of a URI from a controlled subject vocabulary is preferred over a literal value. |
| opaque:sheetmusic_instrumentation | Literal or URI | Use for sheet music, a listing of the performing forces called for by a particular piece of sheet music, including both voices and external instruments. |
| opaque:sheetmusic_firstLine | Literal or URI | Use for sheet music, entering a direct transcription of the first line of lyrics appearing in the song. |
| skos:note | Literal | Use for the note value. |
note - Just a note¶
Use Case¶
note values contain a great variety of information in an unstructured string form. Currently they are displayed
in the brief results in Islandora as well as within the “Click for Details” section. Unlike abstract, note
values often share supplemental information rather than a summary of the resource’s aboutness. Information shared includes
donor information, transcriptions of written content, contact information, and suggested citation formats.
Justification¶
Because of their unstructured nature, usually a note is just a note. It is not essential that all different
types of notes are distinct from one another. UT’s MODS currently contains more granularity than is essential to retain,
as is apparent from the variety of @type values present in the XPath section below. While these different types of
notes have unique XPaths, nothing is currently being done beyond the XML to make these distinctions apparent to users.
Therefore unique properties do not need to be identified for each type of note.
The Samvera community attempts to keep some of the granularity of MODS by prepending the text value of the attribute
to the text node when one exists. UT has decided to follow this general approach. When @type does not exist, simply take
the text node.
In BIBFRAME, there was no attempt to convert the 562 MARC field. For this reason, “handwritten” documents are just regular notes.
XPath¶
When the XPath has a specific attribute and value, prepend the value to the text node.
note OR
note[@type="handwritten"] OR
note[@displayLabel="Attribution"] OR
note[@displayLabel="use and reproduction"] OR
note[@displayLabel="Local Rights"]
Decision¶
<note>
A_0:51:21 / B_0:59:44
</note>
<note>
(Original, for: Mrs. Dirksen, Compliments: Tony Janak)
</note>
<note>
No issues.
</note>
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
<https://example.org/objects/1>
skos:note "A_0:51:21 / B_0:59:44", "(Original, for: Mrs. Dirksen, Compliments: Tony Janak)", "No issues." .
Example record showing prepending - egypt:109
<note displayLabel="Local Rights">Permission granted for reproduction for use in research and teaching, provided proper attribution of source.
Credit line should read: [description of item, including photographic number], 'Courtesy of McClung Museum of Natural History and Culture, The
University of Tennessee.' For all other uses consult https://mcclungmuseum.utk.edu/research/image-services/rights-reproductions/ or call 865-974-2144.
</note>
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
<https://example.org/objects/1>
skos:note "Local Rights: Permission granted for reproduction for use in research and teaching, provided proper attribution of source. Credit line should read: [description of item, including photographic number], 'Courtesy of McClung Museum of Natural History and Culture, The University of Tennessee.' For all other uses consult https://mcclungmuseum.utk.edu/research/image-services/rights-reproductions/ or call 865-974-2144." .
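The prepending rule can be sketched as follows. `note_literal` is a hypothetical helper. The ": " separator and the collapsing of line breaks are inferred from the egypt:109 output above, and preferring @displayLabel over @type when both are present is an assumption.

```python
import xml.etree.ElementTree as ET


def note_literal(note_el):
    """Build the skos:note literal, prefixing @displayLabel or @type if present."""
    # Collapse line breaks and runs of spaces into single spaces.
    text = " ".join((note_el.text or "").split())
    # Assumption: @displayLabel wins when both attributes are present.
    label = note_el.get("displayLabel") or note_el.get("type")
    return f"{label}: {text}" if label else text


plain = ET.fromstring("<note>\nNo issues.\n</note>")
labelled = ET.fromstring('<note displayLabel="Local Rights">Permission granted\nfor reproduction.</note>')
print(note_literal(plain))
print(note_literal(labelled))
```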
note - Instrumentation¶
Use Case¶
@type="Instrumentation" is used in the Van Vactor Music collection as a listing of the performing forces called for by
a particular piece of music. While only used for a single collection at this point, the intention is to use it for any future
records for music resources involving more than simply voice and piano. Documentation was created to share what UT considers
“score order”, as there is some variation on the order in which instruments should be listed. Having established what
UT considers “score order”, it is possible to use note[@type="Instrumentation"] as a facet in addition to showing
the string value in the “Click for Details” section.
Justification¶
Because of the desire to be able to facet on instrumentation, a separate property is needed to distinguish it from other note values. We reviewed several bibliographic and music ontologies including the Music Ontology, the Internet of Music Thingz, and MusicBrainz, but none seemed to have a predicate to represent this idea. We did find that Opaque Namespace by Oregon Digital has a matching predicate. The Samvera community not only uses this ontology but has occasionally proposed new predicates to be created within it.
XPath¶
note[@type="Instrumentation"]
Decision¶
Example record - vanvactor:15773
<note type="instrumentation">
For soprano, mezzo-soprano, contralto, 2 flutes, 2 oboes, 2 clarinets, 2 bassoons, 2 horns, 2 trumpets, timpani, 2 violins, viola, cello, and double bass.
</note>
@prefix opaque: <http://opaquenamespace.org/ns/> .
<https://example.org/objects/1>
opaque:sheetmusic_instrumentation "For soprano, mezzo-soprano, contralto, 2 flutes, 2 oboes, 2 clarinets, 2 bassoons, 2 horns, 2 trumpets, timpani, 2 violins, viola, cello, and double bass." .
note - First Line¶
Use Case¶
When a note has a @type = "First line" or @type = "first line", it is not a general note. Instead, this element is
a direct transcription of the first line of lyrics appearing in a song.
Justification¶
We reviewed several bibliographic and music ontologies including the Music Ontology, the Internet of Music Thingz, and MusicBrainz, but none seemed to have a predicate to represent this idea. We did find that Opaque Namespace by Oregon Digital has a matching predicate. The Samvera community not only uses this ontology but has occasionally proposed new predicates to be created within it.
XPath¶
note[@type="First line"] OR
note[@type="first line"]
Decision¶
Example record - vanvactor:15773
<note type="First line">
Ojitos de pena carita de luna, lloraba la niña sin causa ninguna.
</note>
@prefix opaque: <http://opaquenamespace.org/ns/> .
<https://example.org/objects/1>
opaque:sheetmusic_firstLine "Ojitos de pena carita de luna, lloraba la niña sin causa ninguna." .
note - Target audience¶
Use Case¶
A note whose @displayLabel has the value “Grade level” refers to the target audience of the resource. This XPath
is present solely within the Arrowmont Curriculum documents, but could be used more broadly for other resources with an
educational focus.
Justification¶
The MARC 521 field should be mapped to the BIBFRAME intended audience field. The field is defined as information that identifies the specific audience or intellectual level for which the content of the resource is considered appropriate.
XPath¶
note[@displayLabel="Grade level"]
Decision¶
Example record from arrowmont:9
<note displayLabel="Grade level">
Second Grade
</note>
@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
<https://example.org/objects/1>
bf:intendedAudience "Second Grade" .
note - DPN Deposits and Other Things to Ignore¶
Use Case¶
We have several notes that we do not need to migrate.
Justification¶
The data here is no longer important.
XPath¶
note[@displayLabel="DPN"] OR
note[string()=""] OR
note[@displayLabel="Intermediate provider"] OR
note[@displayLabel="Intermediate Provider"] OR
note[@displayLabel="Transcribed from Original Collection"] OR
note[@displayLabel="Project Part"]
Decision¶
Example record from heilman:1000
<note displayLabel="dpn">
This object was added to the Digital Preservation Network in November 2016.
</note>
Do not migrate!
subject¶
| Properties | Value Type | Usage Notes |
|---|---|---|
| bf:geographicCoverage | Literal | Use for uncontrolled geographic place names. |
| dcterms:spatial | URI | Use for controlled geographic place names. |
| dcterms:subject | URI | Use for topic and name subjects. |
| dcterms:temporal | Literal | |
| iim:keyword | Literal | Use for topic and name subjects without a URI. |
| wgs:lat_long | Literal | |
None type¶
Use Case¶
Several subject elements contain unintentional null values. There are five within Tennessee Documentary History. Additional null
subjects include vpmoore:133 and adams:76. Most of roth seems to have null subject/name/namePart values.
It appears we might have inserted some blank nodes using the Islandora form entry. As there is no information, these
“values” are not used and have no true use case.
Justification¶
These nodes contain no information.
XPath¶
subject/topic[string() = ''] OR
subject/geographic[string() = ''] OR
subject/name/namePart[string() = '']
Decision¶
Do not migrate.
Here’s an example of a null topic value - tdh:366.
<subject>
<topic/>
</subject>
Here’s an example of a null geographic value - vpmoore:133.
<subject>
<geographic/>
</subject>
Here’s an example of a null namePart value - roth:1587.
<subject>
<name authority="" valueURI="">
<namePart/>
</name>
</subject>
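The null checks above can be sketched in Python with the standard library. This is only an illustration under stated assumptions: the function name `non_empty_values` is hypothetical, the snippet is parsed without the MODS namespace (as in the examples in this guide), and whitespace-only text is treated as empty, which is slightly broader than `string() = ''`.

```python
import xml.etree.ElementTree as ET


def non_empty_values(subject, paths=("topic", "geographic", "name/namePart")):
    """Yield (path, text) for each element under `subject` that carries text.

    Elements matching the null XPaths (no text at all) are skipped, so a
    subject made up only of null values yields nothing and is not migrated.
    """
    for path in paths:
        for element in subject.findall(path):
            # Assumption: whitespace-only content counts as empty.
            text = "".join(element.itertext()).strip()
            if text:
                yield path, text


record = ET.fromstring(
    "<mods>"
    "<subject><topic/></subject>"
    "<subject><geographic/></subject>"
    '<subject><name authority="" valueURI=""><namePart/></name></subject>'
    "<subject><topic>Pollution</topic></subject>"
    "</mods>"
)
kept = [pair for subject in record.findall("subject") for pair in non_empty_values(subject)]
```

Only the last subject in the sample record survives; the three null subjects contribute no values.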
Topical and name subjects with URIs¶
Use Case¶
Remediated collections include subject values with URIs.
Justification¶
In migration, subjects with name and topic values will be treated in the same way. We have decided that the previous
distinction between name and topic values as subjects is not essential - only the presence of all the values in the
metadata is important.
XPath¶
Note that there is inconsistency in where the valueURI attribute is placed.
subject[@valueURI]/topic OR
subject/topic[@valueURI] OR
subject[@valueURI]/name/namePart OR
subject/name[@valueURI]/namePart
Decision¶
When a valueURI is present for topic or name subject, it will be the value used in migration. Examples showing each
of the distinct XPaths are given below:
acwiley:280 as an example of subject[@valueURI]/topic
<subject authority="lcsh" valueURI="http://id.loc.gov/authorities/subjects/sh85147554">
<topic>Women in art</topic>
</subject>
<subject authority="lcsh" valueURI="http://id.loc.gov/authorities/subjects/sh85147447">
<topic>Women artists</topic>
</subject>
<subject authority="tgm" valueURI="http://id.loc.gov/vocabulary/graphicMaterials/tgm008085">
<topic>Portraits</topic>
</subject>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:subject <http://id.loc.gov/authorities/subjects/sh85147554> ;
dcterms:subject <http://id.loc.gov/authorities/subjects/sh85147447> ;
dcterms:subject <http://id.loc.gov/vocabulary/graphicMaterials/tgm008085> .
cdf:5384 as an example of subject/topic[@valueURI]
<subject>
<topic valueURI="http://id.loc.gov/authorities/subjects/sh85023396">Child welfare</topic>
</subject>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:subject <http://id.loc.gov/authorities/subjects/sh85023396> .
wwiioh:2451 as an example of subject[@valueURI]/name/namePart.
<subject authority="naf" valueURI="http://id.loc.gov/authorities/names/n85185770">
<name>
<namePart>United States. Army. Medical Corps</namePart>
</name>
</subject>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:subject <http://id.loc.gov/authorities/names/n85185770> .
helser:24792 as an example of subject/name[@valueURI]/namePart.
<subject>
<name authority="naf" valueURI="http://id.loc.gov/authorities/names/n87116131">
<namePart>Atkinson, George Francis, 1854-1918</namePart>
</name>
</subject>
<subject>
<name authority="naf" valueURI="http://id.loc.gov/authorities/names/n88144876">
<namePart>Arthur, Joseph Charles, 1850-1942</namePart>
</name>
</subject>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:subject <http://id.loc.gov/authorities/names/n88144876> ;
dcterms:subject <http://id.loc.gov/authorities/names/n87116131> .
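Because the valueURI attribute can sit on either the subject or its child, a transform has to check both places. The following is a minimal sketch, not a definitive implementation: `subject_uris` is a hypothetical helper, the input is assumed to be un-namespaced MODS like the snippets above, and preferring the child's attribute over the parent's is an assumption.

```python
import xml.etree.ElementTree as ET


def subject_uris(mods):
    """Collect valueURI values for topic and name subjects, in document order."""
    uris = []
    for subject in mods.findall("subject"):
        for child in subject:
            if child.tag not in ("topic", "name"):
                continue
            # Assumption: prefer the child's attribute, fall back to the parent.
            # An empty valueURI="" is falsy and is therefore ignored.
            uri = child.get("valueURI") or subject.get("valueURI")
            if uri and uri not in uris:
                uris.append(uri)
    return uris


record = ET.fromstring("""<mods>
<subject authority="lcsh" valueURI="http://id.loc.gov/authorities/subjects/sh85147554">
  <topic>Women in art</topic>
</subject>
<subject>
  <topic valueURI="http://id.loc.gov/authorities/subjects/sh85023396">Child welfare</topic>
</subject>
<subject>
  <name authority="naf" valueURI="http://id.loc.gov/authorities/names/n87116131">
    <namePart>Atkinson, George Francis, 1854-1918</namePart>
  </name>
</subject>
<subject><topic>Knight</topic></subject>
</mods>""")
uris = subject_uris(record)
```

Each URI collected this way would become the object of a dcterms:subject statement; the string-only subject ("Knight") is left for the keyword mapping.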
Name and topical subjects without URIs¶
Use Case¶
UT will need to treat any of these subjects that are not able to be reconciled as string values. For the postcard collection,
the use of dots (Database of the Smokies) as the authority makes it impossible to include a URI presently. Other collections
with string values are the Charlie Daniel Cartoon Collection, Ed Gamble Cartoon Collection, Football Programs, Insurance Company of
North America Records, American Civil War Collection, Ramsey Family Papers, Tennessee Documentary History,
and Volunteer Voices.
The Volunteer Voices collection includes subjects with three different displayLabel values - “Volunteer Voices Curriculum Topics”,
“Tennessee Social Studies K-12 Eras in American History”, and “Broad Topics”. These subjects are currently given separate
facets in Islandora’s metadata display. Discovery of the collection via two of these subject categories is also featured
on the Tennessee State Library and Archives website (“Broad Topics”
and “Tennessee Social Studies K-12 Eras in American History”). While these subjects have previously been distinguished from
other subjects by their distinct XPath, having so many different types of subjects was found to be unnecessary
going forward. “Broad Topics” and “Curriculum Topics” will be folded in with all other subjects. For links to external websites,
like TSLA’s, we can use the string values to supply a link without needing to place them in a separate property. Note that
subjects associated with “Tennessee Social Studies K-12 Eras in American History” are dealt with
separately below.
Justification¶
subject values are important access points for users and therefore need to be migrated. While URIs would be ideal from a technical
standpoint, strings still support discovery.
XPath¶
mods/subject[not(@valueURI)]/topic[not(@valueURI)] OR
mods/subject[not(@valueURI)]/name[not(@valueURI)]/namePart[not(@valueURI)]
Decision¶
String values for topic or name subjects will be migrated when a valueURI is not present.
Here’s an example record where only string values are available for topical subjects - gamble:123.
<subject>
<topic>Environmentalism</topic>
</subject>
<subject>
<topic>Factory and trade waste--Environmental aspects</topic>
</subject>
<subject>
<topic>Pollution</topic>
</subject>
<subject>
<topic>Knight</topic>
</subject>
@prefix iim: <https://w3id.org/idsa/core/> .
<https://example.org/objects/1> iim:keyword "Environmentalism" ;
iim:keyword "Factory and trade waste--Environmental aspects" ;
iim:keyword "Pollution" ;
iim:keyword "Knight" .
Here’s an example where only a string value is available for a name - gamble:144.
<subject>
<name>
<namePart>Xerox Corporation</namePart>
</name>
</subject>
@prefix iim: <https://w3id.org/idsa/core/> .
<https://example.org/objects/1> iim:keyword "Xerox Corporation" .
Here’s an example from Volunteer Voices of a “Broad Topics” subject - volvoices:4058.
<subject displayLabel="Broad Topics">
<topic>Frontier Settlement and Migration</topic>
</subject>
@prefix iim: <https://w3id.org/idsa/core/> .
<https://example.org/objects/1> iim:keyword "Frontier Settlement and Migration" .
Here’s an example of @displayLabel=”Volunteer Voices Curriculum Topics” - volvoices:2141.
<subject displayLabel="Volunteer Voices Curriculum Topics">
<topic>Civil Rights movement in Tennessee</topic>
</subject>
@prefix iim: <https://w3id.org/idsa/core/> .
<https://example.org/objects/1> iim:keyword "Civil Rights movement in Tennessee" .
Temporal subjects¶
Use Case¶
subject/temporal values describe a time period using text or a date (EDTF). None of our existing subject/temporal
values include URIs. These values are prominent in Volunteer Voices and the Pi Beta Phi to Arrowmont collections. While not from established controlled
vocabularies like LCSH, subject/temporal values are present in facets as the strings are often constructed consistently.
Justification¶
subject/temporal values provide important access points. While not associated with a URI, the values are often from controlled
vocabularies created as part of a grant project. Because they are associated with grants and cross-institutional projects,
retaining these values is particularly important.
XPath¶
mods/subject/temporal
Decision¶
subject/temporal values without the displayLabel attribute will be directly mapped as strings to dcterms:temporal. schema:temporalCoverage
was considered because of how flexible it is, but ultimately it was decided that we can disregard the recommendation in dcterms:temporal to enter values appropriate for
the class dcterms:PeriodOfTime (that have both start and end dates). We are ignoring http://purl.org/dc/dcam/rangeIncludes in this
case as it is only a suggestion.
Example of temporal subject - arrow:268.
<subject>
<temporal>The Birth of Arrowmont, Gatlinburg, Tennessee, 1965-1979</temporal>
</subject>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:temporal "The Birth of Arrowmont, Gatlinburg, Tennessee, 1965-1979" .
In addition to these textual values, UT does have subject/temporal values that share numeric dates in EDTF format. When a single
date is shared, these values should be dropped as they only duplicate information already found in originInfo. These are primarily from
the Volunteer Voices collection.
Here’s an example record - volvoices:2945.
<subject>
<temporal>1970-09-30</temporal>
</subject>
<originInfo>
<dateCreated>1970-09-30</dateCreated>
<dateCreated encoding="edtf" keyDate="yes">1970-09-30</dateCreated>
</originInfo>
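One way to apply this rule during a transform is sketched below. The helper name `temporal_values_to_keep` is hypothetical, and two assumptions go beyond the text above: only plain calendar dates (YYYY, YYYY-MM, YYYY-MM-DD) are recognized rather than full EDTF, and a date is dropped only when the same string also appears in originInfo/dateCreated.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: plain calendar dates only; full EDTF allows many more forms.
SINGLE_DATE = re.compile(r"\d{4}(-\d{2}){0,2}")


def temporal_values_to_keep(mods):
    """Return subject/temporal strings, minus single dates duplicated in originInfo."""
    created = {(el.text or "").strip() for el in mods.findall("originInfo/dateCreated")}
    kept = []
    for element in mods.findall("subject/temporal"):
        value = (element.text or "").strip()
        if not value:
            continue
        if SINGLE_DATE.fullmatch(value) and value in created:
            continue  # duplicates the creation date, so it is dropped
        kept.append(value)
    return kept


record = ET.fromstring("""<mods>
<subject><temporal>1970-09-30</temporal></subject>
<subject><temporal>Era 9 - Postwar United States (1945-1970's)</temporal></subject>
<originInfo>
  <dateCreated>1970-09-30</dateCreated>
  <dateCreated encoding="edtf" keyDate="yes">1970-09-30</dateCreated>
</originInfo>
</mods>""")
kept = temporal_values_to_keep(record)
```

Textual values pass through untouched and would be mapped to dcterms:temporal as strings.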
Temporal subjects from Volunteer Voices (K-12 Eras) with string and XPath inconsistencies¶
Use Case¶
While two of the subject categories associated with the Volunteer Voices collection can be folded into dcterms:subject
directly (“Broad Topics” and “Volunteer Voices Curriculum Topics”), special attention needs to be given to subjects associated
with “Tennessee Social Studies K-12 Eras in American History”. There are instances in which a value associated with one
of these topics is used, but the displayLabel has been left off and the value has been incorrectly categorized as a subject/geographic.
Justification¶
It is important to treat these values as a separate category to ensure that the text value is not split across separate
categories (i.e., dcterms:temporal and dcterms:subject). In addition, some standardization of the label needs to be
done for all the records associated with a given concept to be colocated. As mentioned earlier, subject/temporal values
will be directly mapped as strings to dcterms:temporal. schema:temporalCoverage was considered because of how flexible it is,
but ultimately it was decided that we can disregard the recommendation in dcterms:temporal to enter values appropriate
for the class PeriodOfTime (that have both start and end dates). We are ignoring http://purl.org/dc/dcam/rangeIncludes in this
case as it is only a suggestion.
XPath¶
subject/geographic[string()="Contemporary United States (1968-present)."] OR
subject/geographic[string()="Postwar United States (1945-1970)."] OR
subject/geographic[string()="The Great Depression and World War II (1929-1945)."] OR
subject/geographic[string()="The Emergence of Modern America (1890-1930)."] OR
subject/geographic[string()="The Development of the Industrial United States (1870-1900)."] OR
subject/geographic[string()="Expansion and Reform (1801-1861)."] OR
subject/geographic[string()="Revolution and the New Nation (1754-1820)."] OR
subject/geographic[string()="Colonization and Settlement (1585-1763)."]
Decision¶
<subject>
<geographic>Expansion and Reform (1801-1861).</geographic>
</subject>
The subject/geographic value above actually matches one of the values listed in the “Tennessee Social Studies K-12 Eras
in American History”. While it is placed in a geographic subject here in the XML, it should be in a temporal subject (as
the date range following the text suggests). One value is placed in subject/topic.
We will want to remediate before migration, match on and transform these values during migration, or deal with them after migration. The string values
also don’t exactly match the string values present in topic[@displayLabel="Tennessee Social Studies K-12 Eras in American History"].
The era prefixes (“Era 2 - ”, “Era 3 - ”, etc.) need to be added and the trailing periods removed for these to match, and in two cases
the date range itself is written differently. Below is a table of all of the values that need to be edited along with their appropriate match.
| Incorrect Value | Established Era Term |
|---|---|
| Contemporary United States (1968-present). | Era 10 - Contemporary United States (1968 to the present) |
| Postwar United States (1945-1970). | Era 9 - Postwar United States (1945-1970’s) |
| The Great Depression and World War II (1929-1945). | Era 8 - The Great Depression and World War II (1929-1945) |
| The Emergence of Modern America (1890-1930). | Era 7 - The Emergence of Modern America (1890-1930) |
| The Development of the Industrial United States (1870-1900). | Era 6 - The Development of the Industrial United States (1870-1900) |
| Expansion and Reform (1801-1861). | Era 4 - Expansion and Reform (1801-1861) |
| Revolution and the New Nation (1754-1820). | Era 3 - Revolution and the New Nation (1754-1820) |
| Colonization and Settlement (1585-1763). | Era 2 - Colonization and Settlement (1585-1763) |
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:temporal "Era 4 - Expansion and Reform (1801-1861)" .
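If the values are transformed during migration rather than remediated beforehand, the match can be a plain lookup. The sketch below assumes exact string matching; `ERA_TERMS` and `normalize_era` are hypothetical names, the apostrophe in the Era 9 term follows the XML example in this section, and the spacing after "Era 3 -" is normalized.

```python
# Keys are the miscategorized subject/geographic strings; values are the
# established era terms they should become in dcterms:temporal.
ERA_TERMS = {
    "Contemporary United States (1968-present).":
        "Era 10 - Contemporary United States (1968 to the present)",
    "Postwar United States (1945-1970).":
        "Era 9 - Postwar United States (1945-1970's)",
    "The Great Depression and World War II (1929-1945).":
        "Era 8 - The Great Depression and World War II (1929-1945)",
    "The Emergence of Modern America (1890-1930).":
        "Era 7 - The Emergence of Modern America (1890-1930)",
    "The Development of the Industrial United States (1870-1900).":
        "Era 6 - The Development of the Industrial United States (1870-1900)",
    "Expansion and Reform (1801-1861).":
        "Era 4 - Expansion and Reform (1801-1861)",
    "Revolution and the New Nation (1754-1820).":
        "Era 3 - Revolution and the New Nation (1754-1820)",
    "Colonization and Settlement (1585-1763).":
        "Era 2 - Colonization and Settlement (1585-1763)",
}


def normalize_era(value):
    """Return the established era term, or None for an ordinary geographic value."""
    return ERA_TERMS.get(value.strip())
```

A lookup is used instead of string manipulation because two of the terms differ by more than the prefix and trailing period. A return value of None signals that the string should continue through the normal geographic mapping.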
Example of @displayLabel=”Tennessee Social Studies K-12 Eras in American History” - volvoices:1833.
<subject displayLabel="Tennessee Social Studies K-12 Eras in American History">
<temporal>Era 9 - Postwar United States (1945-1970's)</temporal>
</subject>
These will simply be treated as other subject/temporal values are. Note that we only have strings for subject/temporal values.
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:temporal "Era 9 - Postwar United States (1945-1970's)" .
Geographic subjects¶
Use Case¶
UT has subject/geographic values both with and without URIs. As with other elements, the placement of the URIs is not consistent.
URIs will be used when present, but strings can be used when there is no URI.
Justification¶
subject/geographic values warrant a separate property from both subject/temporal and subject/topic so that they can be displayed
separately on the interface.
XPath¶
subject[@valueURI]/geographic OR
subject/geographic[@valueURI]
As noted previously, there are a handful of string values in geographic elements within volvoices that need to be
treated differently than other geographic values. To exclude them, a geographic value must satisfy all of the following
conditions (they are joined with AND because any value differs from at least one of these strings):
subject/geographic[not(string()="Contemporary United States (1968-present).")] AND
subject/geographic[not(string()="Postwar United States (1945-1970).")] AND
subject/geographic[not(string()="The Great Depression and World War II (1929-1945).")] AND
subject/geographic[not(string()="The Emergence of Modern America (1890-1930).")] AND
subject/geographic[not(string()="The Development of the Industrial United States (1870-1900).")] AND
subject/geographic[not(string()="Expansion and Reform (1801-1861).")] AND
subject/geographic[not(string()="Revolution and the New Nation (1754-1820).")] AND
subject/geographic[not(string()="Colonization and Settlement (1585-1763).")]
Decision¶
Here’s an example where the URI is present on the subject - webster:1127.
<subject authority="geonames" valueURI="http://sws.geonames.org/4050810">
<geographic>The Sawteeth</geographic>
<cartographics>
<coordinates>35.64342, -83.36237</coordinates>
</cartographics>
</subject>
<subject authority="geonames" valueURI="http://sws.geonames.org/4609260">
<geographic>Brushy Mountain</geographic>
<cartographics>
<coordinates>35.67787, -83.43016</coordinates>
</cartographics>
</subject>
<subject authority="lcsh" valueURI="http://id.loc.gov/authorities/subjects/sh85057008">
<geographic>Great Smoky Mountains (N.C. and Tenn.)</geographic>
</subject>
Here’s an example where the URI is present on the geographic element - roth:2165.
<subject>
<geographic authority="geonames" valueURI="http://sws.geonames.org/4178924/about.rdf">Yulee Sugar Mill Ruins Historic State Park</geographic>
</subject>
Regardless of URI placement, we will map the values the same.
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:spatial <http://sws.geonames.org/4050810> ;
dcterms:spatial <http://sws.geonames.org/4609260> ;
dcterms:spatial <http://id.loc.gov/authorities/subjects/sh85057008> .
If only strings are present, as with volvoices:14173,
then the string value will be kept but in another property (bf:geographicCoverage).
<subject>
<geographic>Covington (Tenn.)</geographic>
</subject>
@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
<https://example.org/objects/1> bf:geographicCoverage "Covington (Tenn.)" .
Coordinates¶
Use Case¶
There are a total of 702 unique coordinate values in UT’s collections. Many are associated with geonames terms, but there are 8 coordinates associated with Library of Congress terms. These terms are “Great Smoky Mountains National Park (N.C. And Tenn.)”, “Knoxville (Tenn.)”, “Sevier County (Tenn.)”, “Dickson County (Tenn.)”, “Hardin County (Tenn.)”, “Bluff City (Tenn.)”, and “Saint Andrews (Tenn.)”. In addition, there are 120 geographic names that are not associated with an authority through the use of a URI, but they contain coordinates. The following lists some: “Abrams Creek”, “Anthony Creek (Tenn.)”, “Arcadia Dam (Okla.)”, “Arch Rock”, “Arizona”, “Arkansas”, “Becky Cable House (Tenn.)”, “Boston (Mass.)”, “Bote Mountain Trail (Tenn.)”, “Bristol (Tenn.)”, “Cades Cove Campground (Tenn.)”, “Cades Cove Loop Road (Tenn.)”, “Cades Cove Picnic Area (Tenn.)”, “Calderwood Dam (Tenn.)”, “California”, “Chattanooga (Tenn.)”, “Cherokee Orchard (Tenn.)”, “Chestnut Flats”, “Chilhowee (Extinct city)”, “Chimney Tops”, “Chimney Tops (Tenn.)”, “Chimney Tops Foot Bridge (Tenn.)”, “Chimney Tops Trail”, “Clingmans Dome Road”, “Davenport Gap (Tenn.)”, “Deals Gap (Tenn.)”, “Dry Sluice Gap (Tenn.)”, “Dry Valley (Tenn.)”, “Elijah Oliver Place (Tenn.)”, “Fighting Creek Gap (Tenn.)”, “Florida”, “Fontana Dam (N.C.)”, “Foothills Parkway”, “Forge Creek”, “Forney Ridge Parking Lot (N.C.)”, “Fort George Site”, “Fort Manuel Site”, “Fowler (Kan.)”, “Gatlinburg (Tenn.)”, “Greenbrier Pinnacle (Tenn.)”, “Gregory Bald (Tenn.)”, “Guyot, Mount (Tenn.)”, “Harrison, Mount (Tenn.)”, “Headrick Chapel (Tenn.)”, and many more.
Justification¶
Having coordinates to leverage supports mapping and digital humanities projects. Coordinates increase the number of
ways in which our data can be used.
XPath¶
subject/cartographics/coordinates
Decision¶
Here’s an example record - webster:1005.
<subject authority="geonames" valueURI="https://sws.geonames.org/4630912">
<geographic>House Mountain</geographic>
<cartographics>
<coordinates>36.11175, -83.76657</coordinates>
</cartographics>
</subject>
All that is needed in this case is to bring over the URI, as the coordinates can be retrieved through it.
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:spatial <https://sws.geonames.org/4630912> .
Given the extent of coordinates that cannot be retrieved using a URI (120), a separate solution is needed to preserve these values.
Here’s an example record - derris:610.
<subject>
<geographic>Becky Cable House (Tenn.)</geographic>
<cartographics>
<coordinates>35.58546, -83.84444</coordinates>
</cartographics>
</subject>
@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
@prefix wgs: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
<https://example.org/objects/1> bf:geographicCoverage "Becky Cable House (Tenn.)" ;
wgs:lat_long "35.58546, -83.84444" .
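Since these coordinate strings are carried over as literals, it may be worth checking them before migration. The following is a small validation sketch, assuming every value is a "latitude, longitude" pair in decimal degrees as in the examples above; `parse_coordinates` is a hypothetical helper and is not part of the mapping itself.

```python
def parse_coordinates(text):
    """Split a 'lat, long' string into two floats and check their ranges.

    Raises ValueError when the string is not two comma-separated numbers
    or when a number falls outside the valid latitude/longitude range.
    """
    lat_text, long_text = (part.strip() for part in text.split(","))
    latitude, longitude = float(lat_text), float(long_text)
    if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
        raise ValueError(f"coordinates out of range: {text!r}")
    return latitude, longitude


becky_cable_house = parse_coordinates("35.58546, -83.84444")
```

Values that fail the check could be flagged for review instead of being written to wgs:lat_long.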
genre¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| dcterms:type | URI/String Literal | Use for MODS genre values of ‘cartographic’ or ‘notated music.’ Also use when genre[@authority=’dct’]= ‘text’, ‘image’, ‘still image’. |
| dcterms:subject | URI/String Literal | Use for MODS genre values with an authority of ‘aat’ or ‘lcsh’. |
| edm:hasType | URI/String Literal | Use for MODS genre values without attributes that do not equal ‘cartographic’ OR ‘notated music.’ Also use when genre[@authority=”lcgft”]. |
genre: values that map to dcterms:type¶
Use Case¶
genre, without any attributes, has been used as a catch-all descriptive element that may or may not hold values from a controlled vocabulary, and that may or may not provide appropriate descriptive information about the resource. genre[@authority='dct'] has three distinct values: “text”, “still image”, and “image”, that broadly indicate the type of the resource being described. This category consists of typeOfResource values that are present in genre due to the use of the LoC Dublin Core to MODS transform. In many remediated collections, these values have already been moved to typeOfResource, but there are still many that remain in genre that should be addressed for consistency’s sake during migration.
Justification¶
The justification for keeping genre values that map to dcterms:type is the same as the justification for keeping typeOfResource values generally.
Values within typeOfResource are used for initial faceting in search for both UT’s local Digital Collections website
and for DPLA’s interface. As DPLA doesn’t display physicalDescription/form values, it is important to share this
less granular indication of the resource type.
For values outside of the following table, we selected the edm:hasType property as it aligns well with the possible overlap between genre and physicalDescription/form. To help prevent duplicating string literals and URIs, the following table suggests a mapping for a limited subset of the union of values in genre[not(@*)] and genre[@authority='dct'].
| (//genre[not(@*)] \| //genre[@authority=’dct’]) | RDF Predicate | URI | dcterms text value |
|---|---|---|---|
| cartographic | dcterms:type | Cartographic | |
| image | dcterms:type | Image | |
| notated music | dcterms:type | Notated music | |
| still image | dcterms:type | Still image | |
| text | dcterms:type | Text | |
XPaths¶
genre[not(@*)][string() = 'cartographic'] OR
genre[not(@*)][string() = 'notated music'] OR
genre[@authority = 'dct'][string() = 'image'] OR
genre[@authority = 'dct'][string() = 'still image'] OR
genre[@authority = 'dct'][string() = 'text']
Alternately, these XPaths can be notated as:
genre[(not(@*) and string() = ('cartographic', 'notated music')) or (@authority = 'dct' and string() = ('text', 'image', 'still image'))]
Decision¶
The dcterms:type property has been selected.
Example record - volvoices:11551
<genre>notated music</genre>
<genre>sheet music</genre>
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> edm:hasType "sheet music" ;
dcterms:type <http://id.loc.gov/vocabulary/resourceTypes/not> .
Example record - volvoices:11262
<genre>notated music</genre>
<genre authority="dct">still image</genre>
<genre>sheet music</genre>
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> edm:hasType "sheet music" ;
dcterms:type <http://id.loc.gov/vocabulary/resourceTypes/not> ;
dcterms:type <http://id.loc.gov/vocabulary/resourceTypes/img> .
genre values that map to edm:hasType¶
Use Case¶
genre[not(@*)] has been used as a catch-all descriptive element that may or may not hold values from a controlled vocabulary, and that may or may not provide appropriate descriptive information about the resource. Unlike the previous category, values within genre[not(@*)] generally contain more specific terms related to the physical characteristics of a resource and closely mirror MODS physicalDescription/form values.
Justification¶
The justification for keeping these values is similar to that expressed for physicalDescription/form values that do not
have a @valueURI. This category contains terms that should be in physicalDescription/form if more time for
remediation had been possible.
The values in this XPath fall outside of the table presented in the preceding section (“genre values that map to dcterms:type”).
XPath¶
genre[not(@*) and not(string() = ('cartographic','notated music'))]
Decision¶
Use the edm:hasType property for these values.
Example record - volvoices:3827
<genre>Hogsheads</genre>
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
<https://example.org/objects/1> edm:hasType "Hogsheads" .
genre[not(text())]¶
Use Case¶
Empty genre elements should not be migrated.
Justification¶
There is no pertinent information to migrate.
XPath¶
genre[not(text())]
Decision¶
Do not migrate.
<genre valueURI=""/>
<genre authority="lcgft" authorityURI="http://id.loc.gov/authorities/genreForms"/>
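Taken together, the genre rules in this section amount to a small routing decision per element. The sketch below is one possible reading of the summary table and XPaths, not a definitive implementation: `genre_predicate` is a hypothetical name, matching is assumed to be case-sensitive, and any attribute combination the table does not cover raises an error rather than being guessed at.

```python
import xml.etree.ElementTree as ET

TYPE_VALUES_NO_ATTRIBUTES = {"cartographic", "notated music"}
TYPE_VALUES_DCT = {"text", "image", "still image"}


def genre_predicate(genre):
    """Return the predicate for a genre element, or None if it is empty."""
    value = "".join(genre.itertext()).strip()
    if not value:
        return None  # empty genre elements are not migrated
    authority = genre.get("authority")
    if not genre.attrib and value in TYPE_VALUES_NO_ATTRIBUTES:
        return "dcterms:type"
    if authority == "dct" and value in TYPE_VALUES_DCT:
        return "dcterms:type"
    if authority in ("aat", "lcsh"):
        return "dcterms:subject"
    if not genre.attrib or authority == "lcgft":
        return "edm:hasType"
    # Assumption: anything else is outside the table and needs human review.
    raise ValueError(f"no mapping for genre {value!r} with attributes {genre.attrib}")


record = ET.fromstring(
    "<mods>"
    "<genre>notated music</genre>"
    '<genre authority="dct">still image</genre>'
    "<genre>sheet music</genre>"
    '<genre valueURI=""/>'
    "</mods>"
)
predicates = [genre_predicate(genre) for genre in record.findall("genre")]
```

The function only chooses the predicate; turning a dcterms:type value into its URI would still require the lookup table from the dcterms:type section.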
language¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| dcterms:language | URI | The language of the resource. Preference is to use a value from a controlled vocabulary, such as ISO 639-2. |
item has one language¶
Use Case¶
Single instance of languageTerm where item language is known. Many of our resources will have one instance of a
language element with a single subelement of languageTerm. The type attribute for languageTerm may be either
“text” or “code”.
Justification¶
Both Samvera and Islandora handle this case similarly, directly mapping the URI. Islandora does offer an alternative, but it requires minting additional objects. We will take the cleanest possible route of mapping directly to the controlled vocabulary, ISO 639-2, and avoid minting new objects.
XPath¶
language/languageTerm[@type="text"] OR
language/languageTerm[@type="code"]
Decision¶
Language in text example record - tatum:188 :
<language>
<languageTerm authority="iso639-2b" type="text">English</languageTerm>
</language>
Language in code example record - ekcd:9:
<language>
<languageTerm authority="iso639-2b" type="code">eng</languageTerm>
</language>
Turtle would map the same in both cases.
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:language <http://id.loc.gov/vocabulary/iso639-2/eng> .
“No linguistic content” cases can be found across some of our resources. In these cases, the languageTerm element either has a type attribute of “code” and a value of “zxx”,
or a type attribute of “text” and a value of “No linguistic content”. Justifications from the single language case above also apply here. These are handled just like other languages in the ISO 639-2 Collection of Bibliographic Codes. In this case, the “zxx” code denotes a declared absence of linguistic information.
No linguistic content example record - tdh:911:
<language>
<languageTerm authority="iso639-2b" type="text">No linguistic content</languageTerm>
</language>
Zxx example record - heilman:1009:
<language>
<languageTerm type="code" authority="iso639-2b">zxx</languageTerm>
</language>
Turtle would map the same in both cases.
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:language <http://id.loc.gov/vocabulary/iso639-2/zxx> .
item has multiple languages¶
Use Case¶
Multiple instances of a languageTerm present. In very few cases (13 total), multiple languages can be found for an item.
In all cases, languages are assigned a known authority, with the type attribute’s value as “text” or “code”.
Justification¶
Similar to items with one language, URIs are directly mapped in the Samvera recommendations. Islandora does not have
recommendations for this use case. We could separate languages onto new lines with a repeated predicate. However,
as a style choice and to simplify the mapped Turtle, multiple languages in our items will be delimited by a comma.
Justifications from the single language case also apply here.
XPath¶
language/languageTerm[@type="text"] OR
language/languageTerm[@type="code"]
Decision¶
<language>
<languageTerm authority="iso639-2b" type="text">French</languageTerm>
</language>
<language>
<languageTerm authority="iso639-2b" type="text">Italian</languageTerm>
</language>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1>
dcterms:language <http://id.loc.gov/vocabulary/iso639-2/fre> , <http://id.loc.gov/vocabulary/iso639-2/ita> .
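Both the text and code forms of languageTerm have to resolve to the same ISO 639-2 URI, which can be sketched as follows. The names `language_uris` and `TEXT_TO_CODE` are hypothetical, and the text-to-code table deliberately contains only the languages that appear in this guide's examples; a real transform would need the complete ISO 639-2 list.

```python
import xml.etree.ElementTree as ET

ISO639_2_BASE = "http://id.loc.gov/vocabulary/iso639-2/"

# Assumption: illustrative subset drawn from the examples in this guide.
TEXT_TO_CODE = {
    "English": "eng",
    "French": "fre",
    "Italian": "ita",
    "No linguistic content": "zxx",
}


def language_uris(mods):
    """Return one ISO 639-2 URI per distinct language, in document order."""
    uris = []
    for term in mods.findall("language/languageTerm"):
        value = (term.text or "").strip()
        code = value if term.get("type") == "code" else TEXT_TO_CODE.get(value)
        if not code:
            raise ValueError(f"no ISO 639-2 code known for {value!r}")
        uri = ISO639_2_BASE + code
        if uri not in uris:
            uris.append(uri)
    return uris


record = ET.fromstring("""<mods>
<language><languageTerm authority="iso639-2b" type="text">French</languageTerm></language>
<language><languageTerm authority="iso639-2b" type="text">Italian</languageTerm></language>
</mods>""")
uris = language_uris(record)
```

The resulting list corresponds to the comma-separated objects of a single dcterms:language statement.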
typeOfResource¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| dcterms:type | URI | Use with a type from a controlled vocabulary (such as the LoC Resource Types Scheme or DCMI Type Vocabulary). |
typeOfResource with no attributes¶
Use case¶
Most records currently have a typeOfResource value with no attributes. Depending on the item being described, it is possible
for there to be multiple typeOfResource values in a single record. The Islandora Metadata Interest Group has carefully
created a mapping to translate MODS typeOfResource values to dcterms resource types. A selection of the mapping is
included below that addresses all of the values UT has within its metadata. Note that the final row, collection=”yes”
is addressed in a subsequent category.
| MODS typeOfResource | RDF Predicate | RDF Value | dcterms text value |
|---|---|---|---|
| text | dcterms:type | Text | |
| cartographic | dcterms:type | Cartographic | |
| notated music | dcterms:type | Notated music | |
| sound recording-nonmusical | dcterms:type | Audio non-musical | |
| sound recording | dcterms:type | Audio | |
| still image | dcterms:type | Still image | |
| moving image | dcterms:type | Moving image | |
| three dimensional object | dcterms:type | Artifact | |
| collection=”yes” | dcterms:type | Collection | |
Justification¶
Values within typeOfResource are used for initial faceting in search for both UT’s local Digital Collections website
and for DPLA’s interface. As DPLA doesn’t display physicalDescription/form values, it is important to share this
less granular indication of the resource type.
XPath¶
typeOfResource
Decision¶
Here’s an example record - vanvactor:1.
<typeOfResource collection="yes">notated music</typeOfResource>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:type <http://id.loc.gov/vocabulary/resourceTypes/not> .
typeOfResource with @collection=”yes”¶
Use case¶
In MODS, an attribute can be used on typeOfResource to indicate that the record refers to an entire collection rather
than an individual resource. This is useful because it makes it possible to distinguish between object and collection
records in the catalog so that patrons understand more quickly how much content is associated with the record. The
Islandora Metadata Interest Group has come up with the solution of using the dcterms resource type of “Collection.” In
this situation we will need multiple triples to preserve the information currently present - one for indicating the record is
for a collection and one (or more) for indicating prevalent resource type(s) in the collection. In MODS typeOfResource is
a repeatable field. Note that we will need to make sure that we do not repeat the collection resource type in cases
where there are multiple typeOfResource[@collection="yes"] instances.
| collection=”yes” | dcterms:type | Collection |
Justification¶
We need to be able to distinguish between an item and collection resource, so retaining this information is necessary.
XPath¶
typeOfResource[@collection="yes"]
Decision¶
Here’s a complex example that includes two typeOfResource values - gsmrc:smhc.
<typeOfResource collection="yes">text</typeOfResource>
<typeOfResource collection="yes">still image</typeOfResource>
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:type <http://id.loc.gov/vocabulary/resourceTypes/col> ;
dcterms:type <http://id.loc.gov/vocabulary/resourceTypes/txt> ;
dcterms:type <http://id.loc.gov/vocabulary/resourceTypes/img> .
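The rule about not repeating the collection resource type can be sketched in Python. This is a minimal illustration under stated assumptions, not the migration code: the function name `type_uris` and the lookup table are hypothetical, the table holds only the value-to-URI pairs that appear in this section's examples (the remaining rows of the mapping would need their URIs confirmed), and the MODS namespace is omitted to match the snippets in this document.

```python
import xml.etree.ElementTree as ET

LOC = "http://id.loc.gov/vocabulary/resourceTypes/"

# Only the pairs shown in this section's examples; extend after confirming URIs.
TYPE_URIS = {
    "text": LOC + "txt",
    "still image": LOC + "img",
    "notated music": LOC + "not",
}
COLLECTION_URI = LOC + "col"


def type_uris(mods_xml):
    """Return dcterms:type URIs in document order, emitting the collection type once."""
    root = ET.fromstring(mods_xml)
    uris = []
    for el in root.iter("typeOfResource"):
        # Add the collection type the first time collection="yes" is seen.
        if el.get("collection") == "yes" and COLLECTION_URI not in uris:
            uris.append(COLLECTION_URI)
        value = (el.text or "").strip()
        if value in TYPE_URIS and TYPE_URIS[value] not in uris:
            uris.append(TYPE_URIS[value])
    return uris


record = """<mods>
  <typeOfResource collection="yes">text</typeOfResource>
  <typeOfResource collection="yes">still image</typeOfResource>
</mods>"""

for uri in type_uris(record):
    print(uri)
```

Keeping the result as an ordered list without duplicates means two typeOfResource[@collection="yes"] elements still yield a single collection triple.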
Missing typeOfResource value¶
Use case¶
Currently 9,993 records are missing a typeOfResource value. The affected collections include Volunteer Voices (not entire
collection), Roth, the Howard Baker Speeches and Remarks, Great Smoky Mountains Colloquy, and the Great Smoky Mountains Postcard Collection. We can consider if we would like to apply a blanket value to a collection at the time of migration. For monolithic collections like Roth and Baker, this would be easy to achieve (roth = "still image" and baker = "text" in MODS). For collections with varied formats, like Volunteer Voices, this will not be possible.
Justification¶
Given that the Digital Collections home page currently uses typeOfResource to initially limit searches, it would be
beneficial for this value to be more consistently present. It would also assist with discovery in DPLA.
XPath¶
not(typeOfResource)
Decision¶
During or post migration we will plan to add typeOfResource on a collection basis if possible. See the chart below for decisions.
| collection PID | dcterms:type |
|---|---|
| colloquy | |
| hbs | |
| pcard00 | |
| roth | |
| volvoices | cannot assign blanket value |
Here’s an example record with no typeOfResource value - roth:100.
@prefix dcterms: <http://purl.org/dc/terms/> .
<https://example.org/objects/1> dcterms:type <http://id.loc.gov/vocabulary/resourceTypes/img> .
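One way to apply a blanket value at migration time is a lookup keyed on the PID namespace. The sketch below is hypothetical: `DEFAULT_TYPES` and `default_type` are illustrative names, and only roth is filled in because it is the only default confirmed by the example above. Other collections would be added once their values are decided.

```python
# Hypothetical blanket defaults, keyed by PID namespace (the part before the colon).
# Only roth is listed, because it is the only default shown in this section.
DEFAULT_TYPES = {
    "roth": "http://id.loc.gov/vocabulary/resourceTypes/img",
}


def default_type(pid, existing_types):
    """Return the record's own dcterms:type URIs, or the collection default if it has none."""
    if existing_types:
        return list(existing_types)
    namespace = pid.split(":", 1)[0]
    fallback = DEFAULT_TYPES.get(namespace)
    # Collections without a blanket value (such as volvoices) stay empty.
    return [fallback] if fallback else []


print(default_type("roth:100", []))
print(default_type("volvoices:2136", []))
```

A record that already carries a typeOfResource value is returned unchanged, so the default only fills gaps.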
classification¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| classSchemes:lcc | Literal | Use for values from Library of Congress Classification. |
Use case¶
Some of our resources have already been formally cataloged and have a classification number. When these are available,
they are included in the MODS metadata. Serials like the Alumnus and many of the Athletics media guides are good examples.
Some collections, like the University of Tennessee Commencements collection, include full shelfLocators in the classification
field (e.g. LD5297 .U55 2013). These should be edited before migration.
Justification¶
This information is helpful to include as it provides information about where the physical item is shelved (though this
is not a complete shelfLocator) and the broad subject the materials relate to.
XPath¶
classification[@authority="lcc"] OR
classification
Decision¶
Example record without authority - tenngirl:977
<classification>LD5296 .W6</classification>
Example record with authority - agrtfhs:2275
<classification authority="lcc">S1 .T43</classification>
@prefix classSchemes: <http://id.loc.gov/vocabulary/classSchemes/> .
<https://example.org/objects/1> classSchemes:lcc "S1 .T43" .
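The mapping above can be sketched as a small Python function that writes the Turtle statement. This is an illustration only: `classification_turtle` is a hypothetical name, the MODS namespace is omitted to match the snippets here, and classification elements without an authority attribute are treated the same as those with authority="lcc", which is an assumption based on the XPath covering both forms.

```python
import xml.etree.ElementTree as ET

PREFIX = "@prefix classSchemes: <http://id.loc.gov/vocabulary/classSchemes/> ."


def classification_turtle(subject_uri, mods_xml):
    """Serialize each non-empty classification value as a classSchemes:lcc literal."""
    root = ET.fromstring(mods_xml)
    lines = [PREFIX]
    for el in root.iter("classification"):
        value = (el.text or "").strip()
        if not value:
            continue
        # Escape backslashes and double quotes so the Turtle literal stays valid.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append('<%s> classSchemes:lcc "%s" .' % (subject_uri, escaped))
    return "\n".join(lines)


sample = '<mods><classification authority="lcc">S1 .T43</classification></mods>'
print(classification_turtle("https://example.org/objects/1", sample))
```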
part¶
Use Case¶
The MODS part element is infrequently used to describe a portion of a larger resource. In UT’s metadata, part is used
in two collections - Great Smoky Mountains Colloquy and Sanborn Fire Insurance Map Collection.
Justification¶
Ultimately it was decided that this information is not important to keep because it is already present in the title field
in both instances. With the Sanborn maps there is a difference between how the part is named - Sheet versus District-Ward,
but it was not felt strongly that any additional remediation needed to be done.
XPath¶
part
Decision¶
Drop all values in part.
<titleInfo>
<title>Knoxville -- 1917</title>
<partName>Sheet 99</partName>
</titleInfo>
<part>
<detail>
<title>District-Ward 99</title>
</detail>
</part>
location¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| relators:rps | Literal | Use for |
| skos:note | Literal | Use to note |
| dbo:collection | Literal | Use to note |
physicalLocation - minus UT variation¶
Use Case¶
Across our collections, there is a great deal of variation in how the name of the holding repository is represented. Particularly, there is a mix of URI values and strings without URIs. At the time of analysis, 76 out of the 99 distinct values in this field were strings. If possible, it would be beneficial to streamline string values for UT Libraries during remediation via the migration spreadsheets. A separate use case is described below.
Justification¶
Given the prominence of string values, the time remediation would require, and the limited use of the URI beyond having controlled string values, we will use string values. This way a separate property for URI values in the MAP is not needed. Translating these strings to URIs would require significant effort, and the value added may be trivial at this point.
XPath¶
location/physicalLocation[not(text()="University of Tennessee Knoxville. Libraries")
and not(@displayLabel="Collection")
and not(@displayLabel="Address")
and not(@displayLabel="City")
and not(@displayLabel="Detailed Location")
and not(@displayLabel="State")]
Decision¶
We will use string values for physicalLocation.
<location>
<physicalLocation>Blount County Public Library</physicalLocation>
<holdingExternal>
<holding xsi:schemaLocation="info:ofi/fmt:xml:xsd:iso20775 http://www.loc.gov/standards/iso20775/N130_ISOholdings_v6_1.xsd">
<physicalAddress>
<text>City: Maryville</text>
<text>County: Blount County</text>
<text>State: Tennessee</text>
</physicalAddress>
</holding>
</holdingExternal>
</location>
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
<https://example.org/objects/1>
relators:rps "Blount County Public Library" .
physicalLocation - UT variation¶
Use Case¶
In many of our own collections, we use strings to describe physicalLocation. There are 665 instances (primarily across the TEI migration collections) in which we use the main library
heading rather than that of Special Collections as the location. That string is as follows:
“University of Tennessee Knoxville. Libraries”
Justification¶
To create better consistency and cleanliness going forward, we will isolate all instances of these strings and update them to the controlled value for UT Libraries Special Collections.
XPath¶
location/physicalLocation[text()="University of Tennessee Knoxville. Libraries"]
Decision¶
We will map variations of “The University of Tennessee Libraries, Knoxville” to the string “University of Tennessee, Knoxville. Special Collections.”
Example record - civilwar:1438
<location>
<physicalLocation>University of Tennessee Knoxville. Libraries</physicalLocation>
</location>
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
<https://example.org/objects/1>
relators:rps "University of Tennessee, Knoxville. Special Collections" .
physicalLocation with shelfLocator (UT)¶
Use Case¶
Many of our collection items have shelfLocator information, which records where a physical copy
of the resource is shelved. This information may not currently be accurate, and it can also be found via Special Collections’ finding aids.
Justification¶
Because our MODS records may not be accurate and this information is located elsewhere, and perhaps more accurately, we will drop this information when shelfLocator is used in conjunction with our repositories.
XPath¶
location[physicalLocation[text()[contains(., "University of Tennessee")]]]/shelfLocator
Decision¶
We will drop shelfLocator data when present for UT Knoxville records.
<location>
<physicalLocation valueURI="http://id.loc.gov/authorities/names/no2014027633">University of Tennessee, Knoxville. Special Collections</physicalLocation>
<shelfLocator>Box 5, Folder 8</shelfLocator>
</location>
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
<https://example.org/objects/1>
relators:rps "University of Tennessee, Knoxville. Special Collections" .
physicalLocation with shelfLocator (non-UT)¶
Use Case¶
Instances where non-UT held items have shelfLocator information.
Justification¶
While we do not know if this shelfLocator information is accurate, we will opt to retain it going forward as a string and map it to skos:note. Samvera does note some possible future availability of opaque:locationShelfLocator; however, this predicate does not exist yet.
XPath¶
location[physicalLocation[text()[not(contains(., "University of Tennessee"))]]]/shelfLocator
location[physicalLocation[not(contains(.,'University of Tennessee'))] and holdingSimple/copyInformation/shelfLocator]
Decision¶
We will retain shelfLocator data when present for non-UT records, and transcribe this to a skos:note.
Example record - volvoices:2136
<location>
<physicalLocation>Cleveland State Community College</physicalLocation>
<holdingSimple>
<copyInformation>
<shelfLocator>Photograph Collection 2, People</shelfLocator>
</copyInformation>
</holdingSimple>
<holdingExternal>
<holding xsi:schemaLocation="info:ofi/fmt:xml:xsd:iso20775 http://www.loc.gov/standards/iso20775/N130_ISOholdings_v6_1.xsd">
<physicalAddress>
<text>City: Cleveland</text>
<text>County: Bradley County</text>
<text>State: Tennessee</text>
</physicalAddress>
</holding>
</holdingExternal>
</location>
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
<https://example.org/objects/1>
relators:rps "Cleveland State Community College" ;
skos:note "Shelf locator: Photograph Collection 2, People" .
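The keep-or-drop decision for shelfLocator can be sketched as follows. This is a hedged illustration: `shelf_locator_notes` is a hypothetical name, the MODS namespace is omitted, and a location is treated as UT-held when any of its physicalLocation values contains "University of Tennessee", mirroring the XPath expressions above.

```python
import xml.etree.ElementTree as ET


def shelf_locator_notes(location_xml):
    """Return skos:note literals for shelfLocator values, or [] when they are dropped."""
    root = ET.fromstring(location_xml)
    repositories = [(el.text or "") for el in root.iter("physicalLocation")]
    # Drop when there is no repository to test, or when the item is UT-held.
    if not repositories or any("University of Tennessee" in name for name in repositories):
        return []
    # iter() finds shelfLocator both directly under location and under holdingSimple.
    return ["Shelf locator: " + el.text.strip()
            for el in root.iter("shelfLocator") if el.text and el.text.strip()]


non_ut = """<location>
  <physicalLocation>Cleveland State Community College</physicalLocation>
  <holdingSimple><copyInformation>
    <shelfLocator>Photograph Collection 2, People</shelfLocator>
  </copyInformation></holdingSimple>
</location>"""

ut = """<location>
  <physicalLocation>University of Tennessee, Knoxville. Special Collections</physicalLocation>
  <shelfLocator>Box 5, Folder 8</shelfLocator>
</location>"""

print(shelf_locator_notes(non_ut))
print(shelf_locator_notes(ut))
```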
physicalLocation with holdingExternal¶
Use Case¶
Some instances in our collections contain nested subelements for holdingExternal and the further nested physicalAddress information.
Justification¶
To keep our metadata as simple as possible from a technical standpoint we will drop all information for holdingExternal. This type of information has little additive value when physicalLocation is already referenced.
XPath¶
location/holdingExternal
Decision¶
We will drop all information for holdingExternal.
Example record from volvoices:2199
<location>
<physicalLocation>University of Memphis. Special Collections</physicalLocation>
<holdingSimple>
<copyInformation>
<shelfLocator>Manuscript Number 5</shelfLocator>
</copyInformation>
</holdingSimple>
<holdingExternal>
<holding xsi:schemaLocation="info:ofi/fmt:xml:xsd:iso20775 http://www.loc.gov/standards/iso20775/N130_ISOholdings_v6_1.xsd">
<physicalAddress>
<text>City: Memphis</text>
<text>County: Shelby County</text>
<text>State: Tennessee</text>
</physicalAddress>
</holding>
</holdingExternal>
</location>
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
<https://example.org/objects/1>
relators:rps "University of Memphis. Special Collections" ;
skos:note "Shelf locator: Manuscript Number 5" .
physicalLocation with @displayLabel=”Address”¶
Use Case¶
Some items with the physicalLocation of Pi Beta Phi Fraternity also have a physicalLocation subelement with the displayLabel attribute value of “Address”.
Justification¶
Similar to holdingExternal, we will opt to drop this information to maintain the simplicity of our data from a technical standpoint.
XPath¶
location/physicalLocation[@displayLabel="Address"]
Decision¶
Drop this.
Example record from volvoices:2199
<location>
<physicalLocation>Pi Beta Phi Fraternity</physicalLocation>
<physicalLocation displayLabel="Address">1154 Town and Country Commons Drive, Town and Country, Missouri 63017</physicalLocation>
<shelfLocator>Box 36, Folder 14</shelfLocator>
</location>
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
<https://example.org/objects/1>
relators:rps "Pi Beta Phi Fraternity" ;
skos:note "Shelf locator: Box 36, Folder 14" .
physicalLocation with @displayLabel=”Collection”¶
Use Case¶
In some collections for Arrowmont, items have a physicalLocation subelement with the displayLabel attribute value of “Collection” and text containing “Archives Collection”. These records also have extra physicalLocation subelements with displayLabel attribute values of “Detailed Location”, “City” and “State”.
Justification¶
Because these records do not already have a dbo:collection predicate, we will transcribe the string literal to
dbo:collection for location/physicalLocation[@displayLabel="Collection"]. Apart from the repository name, no other
data here needs to be retained, so it will be dropped.
XPath¶
location/physicalLocation[@displayLabel="Collection" and text()[contains(.,"Archives Collection")]]
location/physicalLocation[@displayLabel="Repository"]
location/physicalLocation[@displayLabel="Detailed Location"]
location/physicalLocation[@displayLabel="City"]
location/physicalLocation[@displayLabel="State"]
Decision¶
We will keep the string for the physicalLocation instance with displayLabel="Collection" and transcribe
this to a literal for dbo:collection.
Similar to when physicalLocation has no displayLabel attribute, physicalLocation with a
displayLabel attribute value of “Repository” is retained as relators:rps.
All other physicalLocation (“Detailed Location”, “City”, “State”) data is dropped.
Example record from volvoices:2199
<location>
<physicalLocation displayLabel="Collection">Archives Collection</physicalLocation>
<physicalLocation displayLabel="Repository">Arrowmont School of Arts and Crafts</physicalLocation>
<physicalLocation displayLabel="Detailed Location"/>
<physicalLocation displayLabel="City">Gatlinburg</physicalLocation>
<physicalLocation displayLabel="State">Tennessee</physicalLocation>
</location>
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
<https://example.org/objects/1>
relators:rps "Arrowmont School of Arts and Crafts" ;
dbo:collection "Archives Collection" .
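The displayLabel decisions in this section amount to a small routing table, sketched below in Python. The names `PREDICATE_BY_LABEL` and `route_physical_locations` are hypothetical, the MODS namespace is omitted, and the sketch does not cover the separate rewrite of the UT main library string described earlier.

```python
import xml.etree.ElementTree as ET

# displayLabel value -> predicate; None covers physicalLocation with no displayLabel.
# Labels not listed here ("Address", "Detailed Location", "City", "State") are dropped.
PREDICATE_BY_LABEL = {
    None: "relators:rps",
    "Repository": "relators:rps",
    "Collection": "dbo:collection",
}


def route_physical_locations(location_xml):
    """Return (predicate, literal) pairs for the physicalLocation values that are kept."""
    root = ET.fromstring(location_xml)
    pairs = []
    for el in root.iter("physicalLocation"):
        text = (el.text or "").strip()
        predicate = PREDICATE_BY_LABEL.get(el.get("displayLabel"))
        if predicate and text:
            pairs.append((predicate, text))
    return pairs


arrowmont = """<location>
  <physicalLocation displayLabel="Collection">Archives Collection</physicalLocation>
  <physicalLocation displayLabel="Repository">Arrowmont School of Arts and Crafts</physicalLocation>
  <physicalLocation displayLabel="Detailed Location"/>
  <physicalLocation displayLabel="City">Gatlinburg</physicalLocation>
  <physicalLocation displayLabel="State">Tennessee</physicalLocation>
</location>"""

for predicate, literal in route_physical_locations(arrowmont):
    print(predicate, literal)
```

Using an explicit table keeps the dropped labels visible as the absence of an entry, so a new displayLabel value is dropped by default until someone decides otherwise.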
physicalLocation within volvoices used for provider information¶
Use Case¶
There is one collection in which location information needs to be used to correct inaccuracies for other metadata fields. The Volunteer Voices collection lists the University of Tennessee as the recordContentSource for all records. While UT may have created the metadata records, our mapping with DPLA makes it so that we are noted as the source of these records rather than the actual contributing institution. There are instances within volvoices in which the institution in physicalLocation and the one listed in recordContentSource are the same. This action doesn’t need to be taken for those records.
Justification¶
Providing contributing institutions with the proper credit is important for inter-institutional projects.
XPath¶
location/physicalLocation[. != recordInfo/recordContentSource]
Note: an easier way to resolve this particular XPath expression might be to start at the document node, mods; e.g. mods[location/physicalLocation != recordInfo/recordContentSource].
Decision¶
The value will be mapped to both relators:rps and edm:dataProvider. edm:dataProvider is being used because the value is an “organisation who contributes data indirectly to an aggregation service” (that is, to UT first and then to DPLA).
Example record - volvoices:2737
<recordInfo>
<recordContentSource>University of Tennessee, Knoxville. Special Collections</recordContentSource>
</recordInfo>
<location>
<physicalLocation>Chattanooga-Hamilton County Bicentennial Library</physicalLocation>
</location>
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
<https://example.org/objects/1>
edm:dataProvider "Chattanooga-Hamilton County Bicentennial Library" ;
relators:rps "Chattanooga-Hamilton County Bicentennial Library" .
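The comparison between physicalLocation and recordContentSource can be sketched in Python, starting from the mods element as the note above suggests. This is a minimal sketch under assumptions: `contributing_institution_triples` is a hypothetical name, the MODS namespace is omitted, and the comparison is an exact string match after trimming whitespace.

```python
import xml.etree.ElementTree as ET


def contributing_institution_triples(mods_xml):
    """Return edm:dataProvider and relators:rps pairs when the holder differs from the source."""
    root = ET.fromstring(mods_xml)
    source = (root.findtext("recordInfo/recordContentSource") or "").strip()
    triples = []
    for el in root.findall("location/physicalLocation"):
        name = (el.text or "").strip()
        # Records where both values match need no special handling here.
        if name and name != source:
            triples.append(("edm:dataProvider", name))
            triples.append(("relators:rps", name))
    return triples


record = """<mods>
  <recordInfo>
    <recordContentSource>University of Tennessee, Knoxville. Special Collections</recordContentSource>
  </recordInfo>
  <location>
    <physicalLocation>Chattanooga-Hamilton County Bicentennial Library</physicalLocation>
  </location>
</mods>"""

for predicate, literal in contributing_institution_triples(record):
    print(predicate, literal)
```

When the two values are the same the function returns nothing, on the assumption that the general physicalLocation mapping still supplies relators:rps for those records.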
url¶
Use Case¶
Some of our metadata for volvoices may contain self-referential URL locations. The data contained in these directly reference objects in our current Islandora 7 build and the object’s datastreams.
Justification¶
This is self-referential and has no value in a new system.
XPath¶
location/url
Decision¶
Do not migrate.
Example record - volvoices:9999
<location>
<url access="object in context" usage="primary display">https://digital.lib.utk.edu/collections/islandora/object/volvoices%3A9999</url>
<url access="preview">https://digital.lib.utk.edu/collections/islandora/object/volvoices%3A9999/datastream/TN/view</url>
</location>
recordInfo¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| edm:dataProvider | URI or Literal | Use the name of the organization who contributes data indirectly to an aggregation service. Note that we have decided to only use literals even though the property allows URIs. |
| edm:provider | URI or Literal | Use the name of the organization (typically UT) who delivers data directly to an aggregation service. Note that we have decided to only use literals even though the property allows URIs. |
recordIdentifier¶
Use Case¶
Unremediated records often contain identifiers for the record. These take a couple of different forms. The Heilman and Volunteer Voices collections contain this element. In Volunteer Voices the identifier is simply the adminDB value with ‘record’ prepended (e.g. volvoices:2352).
Justification¶
As the basic root of the recordIdentifier value is already present in the identifier element in all cases and the
recordIdentifier value is never used on its own, there is no reason to retain these values.
XPath¶
recordInfo/recordIdentifier
Decision¶
All recordIdentifier values should be dropped, so no RDF example is included below.
Here’s an example record - heilman:1001.
<identifier type="local">Pseudolarix_0858</identifier>
<recordInfo>
<recordIdentifier>record_Pseudolarix_0858</recordIdentifier>
</recordInfo>
languageOfCataloging¶
Use Case¶
All of the recently migrated SCOUT to TEI collections (e.g. American Civil War Collection, Tennessee Documentary History, etc.)
as well as some of UT’s less recent collections (e.g. Sanborn, mpabaker, etc.) contain the element languageOfCataloging.
In total, it is found in approximately 6,000 records. Note that in all cases the language is English, but this information
is represented as both a code (“eng”) and a text value (“English”).
Justification¶
Currently languageOfCataloging is not publicly displayed anywhere outside of the MODS XML. The values of this element
do have the potential to be used if UT has materials that might warrant cataloging in another language, but currently
this is not the case. An example of a project that includes two records, one catalogued in Spanish and one in English, is
UNC’s New Roots / Nuevas Raíces. While UT may want to pursue a project like this in the
future, presently it seems unlikely that it will. More importantly, if such a project became a priority, it would not be
difficult to distinguish via code UT’s existing English records from records in another language. If we did want to create
a project like this, information on the language of cataloging could be added across the repository with minimal effort.
XPath¶
recordInfo/languageOfCataloging
Decision¶
Since we are not currently utilizing these values in any way, these should be dropped in the mapping.
Example record - sanborn:1002.
<recordInfo>
<languageOfCataloging>
<languageTerm authority="iso639-2b" type="code">eng</languageTerm>
</languageOfCataloging>
</recordInfo>
recordOrigin¶
Use Case¶
The recordOrigin element includes information about what methods or transformations were used to prepare a record. There
are six different distinct values in UT’s metadata.
Justification¶
Because the existing values all relate to MODS XML, the string values present will no longer be applicable in an RDF-based
platform. Discussion indicated that there might be some use for the general property of recordOrigin if a link to this
mapping document or some other relevant resource was shared in place of the existing values. This administrative information
could also be shared on the Digital Collections website or elsewhere rather than the record. As a convincing argument
was not made that this information is essential, it was decided to drop these values.
XPath¶
recordInfo/recordOrigin
Decision¶
Do not migrate.
Example record - cDanielCartoon:1178
<recordInfo>
<recordOrigin>Created and edited in general conformance to MODS Guidelines (Version 3.5).</recordOrigin>
</recordInfo>
recordChangeDate¶
Use Case¶
This element is used sparingly in UT’s metadata records. Currently there are five distinct values, all indicating that the last change to the record was made in 2015, which simply isn’t sharing accurate information.
Justification¶
Keeping this information would not be useful, as it does not allow someone viewing the record to see when it was actually last updated; the information shared is inaccurate. In addition, in a system like Islandora it is easy for an internal staff member to view when the metadata datastream has been updated without tracking this in the record. This element can be dropped.
XPath¶
recordInfo/recordChangeDate
Decision¶
Do not migrate.
Example record - volvoices:3435.
<recordInfo>
<recordChangeDate encoding="edtf">2015-03-23</recordChangeDate>
<recordChangeDate encoding="edtf">2015-03-31</recordChangeDate>
<recordChangeDate encoding="edtf">2015-04-01</recordChangeDate>
</recordInfo>
recordCreationDate¶
Use Case¶
A total of 167 values are present for recordCreationDate. This value shows when the record was originally created. All
but one of these values precede 2010. All of the recently migrated TEI SCOUT records (2,386) have a value of
“2020-04-23-04:00”. This is the only value not presented in EDTF format. Otherwise all of the values appear to come from
Volunteer Voices.
Justification¶
Unlike recordChangeDate, all of the values within recordCreationDate are at least accurate. Currently this information
is not used or displayed for users. Given this and the fact that this element is present in a very small percentage of
UT records, it does not seem useful to keep this information. Again, a repository system should have a way to track
when a metadata datastream for a particular digital object was created. Therefore keeping this information adds unnecessary
complexity.
XPath¶
recordInfo/recordCreationDate
Decision¶
Do not migrate.
Example record - volvoices:1857.
<recordInfo>
<recordCreationDate encoding="edtf">2007-10-26</recordCreationDate>
</recordInfo>
recordContentSource - University of Tennessee, Knoxville as value¶
Use Case¶
The recordContentSource element is one of the most essential elements within recordInfo, as we currently use it to
communicate the provider in DPLA. Because DPLA cannot handle URIs, the decision has been made to only deliver strings.
We do not feel strongly that the added functionality provided by using a URI for this field warrants the effort needed
to process URIs into strings for delivery to DPLA. We recognize that this goes against our general philosophy to use URIs
when possible.
To better understand UT’s use of this element some background information is helpful. At UT the information we share in
this element is not consistent with the definition of recordContentSource - “The code or name of the entity (e.g. an
organization and/or database) that either created or modified the original record.” While we work with other partners,
like the Children’s Defense Fund and the McClung Museum, we are still technically the creators of the records in these
situations. Despite this, we typically list these institutions as the record creator because we set up recordContentSource
as the element that DPLA should map to for content provider. In actuality, when the content provider is not UT, this
information should be communicated in physicalLocation and our DPLA mapping should be updated. Despite these semantic
issues, UT has consistently put this information in an incorrect element, so the mapping is not affected.
Justification¶
A content provider is required in DPLA. This value also provides UT with the opportunity to attribute collections to the institution that provided them, which is important for maintaining respectful relationships. Because of DPLA’s limitations, we will provide this information as a string.
XPath¶
recordInfo/recordContentSource
Decision¶
Because UT acts as a service hub for DPLA and it delivers data directly to this aggregator, it can be considered an edm:provider. This is defined as “The name or identifier of the organization who delivers data directly to an aggregation service (e.g. Europeana).”
When UT physically holds the material and created the record, the metadata resembles this example record - acwiley:284.
<recordInfo>
<recordContentSource valueURI="http://id.loc.gov/authorities/names/n87808088">University of Tennessee, Knoxville. Libraries</recordContentSource>
</recordInfo>
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
<https://example.org/objects/1> edm:provider "University of Tennessee, Knoxville. Libraries" .
recordContentSource - not University of Tennessee, Knoxville as value¶
Use Case¶
When a resource comes from a non-UT institution, its name is typically placed in recordContentSource. An exception to
this is Volunteer Voices, which only includes the contributing institution in location/physicalLocation. See
location for more information.
Justification¶
A content provider is required in DPLA. Sharing the names of institutional partners within the Digital Library of Tennessee is also a great way to recognize the contributions of these libraries. Because of DPLA’s limitations, we will provide this information as a string.
XPath¶
recordInfo/recordContentSource
Decision¶
When the institution listed as providing the information is not UT, edm:dataProvider should be used instead of
edm:provider. edm:dataProvider is defined as “The name or identifier of the organisation who contributes data indirectly
to an aggregation service.”
Example record - cdf:70. It is also coupled with an “Intermediate Provider” note, as shown below. McClung’s Egypt collection is also treated similarly.
<recordInfo>
<recordContentSource valueURI="http://id.loc.gov/authorities/names/no2017113530">Langston Hughes Library (Children's Defense Fund Haley Farm)</recordContentSource>
</recordInfo>
<note displayLabel="Intermediate Provider">University of Tennessee, Knoxville. Libraries</note>
<location>
<physicalLocation valueURI="http://id.loc.gov/authorities/names/no2017113530">Langston Hughes Library (Children's Defense Fund Haley Farm)</physicalLocation>
</location>
For the purposes of DPLA, we only need the recordContentSource value and not also the physicalLocation value. Because
these institutions are not directly contributing to DPLA, they are listed as an edm:dataProvider instead of an
edm:provider.
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
<https://example.org/objects/1> edm:dataProvider "Langston Hughes Library (Children's Defense Fund Haley Farm)" .
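The choice between edm:provider and edm:dataProvider can be sketched as a single test on the recordContentSource string. The sketch is hypothetical: `provider_predicate` is an illustrative name, and recognising UT by the prefix "University of Tennessee, Knoxville" is an assumption drawn from the example values in this section rather than a confirmed rule.

```python
# Assumption: UT values are recognised by this prefix, as in the examples in this section.
UT_PREFIX = "University of Tennessee, Knoxville"


def provider_predicate(record_content_source):
    """Pick the predicate for a recordContentSource string."""
    name = record_content_source.strip()
    if name.startswith(UT_PREFIX):
        return "edm:provider"      # UT delivers data directly to DPLA
    return "edm:dataProvider"      # partner institutions contribute through UT


print(provider_predicate("University of Tennessee, Knoxville. Libraries"))
print(provider_predicate("Langston Hughes Library (Children's Defense Fund Haley Farm)"))
```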
accessCondition¶
| Predicate | Value Type | Usage Notes |
|---|---|---|
| edm:rights | URI | Use for rights URIs from RightsStatements or Creative Commons. |
| skos:note | Literal | Use for accessConditions with @type=”restriction on access”. |
accessCondition - Rights Statements and Creative Commons Licenses¶
Use Case¶
This category is defined by the presence of either one of the twelve standardized rights statements from https://rightsstatements.org or one of the CC licenses. These values are used to provide users with standard and clear information on the copyright status of an item and how or if it can be reused. These values are currently displayed in a facet and are recommended for sharing with DPLA.
All Creative Commons licenses should be valid and follow a pattern that results in valid XML against the CreativeCommons REST Endpoint. For this to happen, one of these two patterns must be followed:
http://creativecommons.org/licenses/*/*/
http://creativecommons.org/publicdomain/mark/*/
This means:
Use http instead of https as the protocol (for content negotiation and validity).
Do not end the license URI in /rdf. While this is dereferenceable and content negotiable, it causes problems for developers by forcing them to strip away the rdf string for easy license lookup.
Justification¶
DPLA maps both CC licenses and Rights Statements to edm:rights. So does Samvera.
Creative Commons licenses should be content negotiable against the CreativeCommons REST Endpoint for easy lookup by developers.
XPath¶
accessCondition[@xlink:href]
Decision¶
Example record for Rights Statements
<accessCondition type="use and reproduction"
xlink:href="http://rightsstatements.org/vocab/CNE/1.0/">
Copyright Not Evaluated
</accessCondition>
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
<https://example.org/objects/1> edm:rights <http://rightsstatements.org/vocab/CNE/1.0/> .
Example record for Creative Commons licenses
<accessCondition type="use and reproduction"
xlink:href="https://creativecommons.org/licenses/by-nc-nd/3.0/">
Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0)
</accessCondition>
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
<https://example.org/objects/1> edm:rights <http://creativecommons.org/licenses/by-nc-nd/3.0/> .
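The Creative Commons URI rules described in the use case can be sketched as a normalization step. This is an illustration under assumptions: `normalize_cc_uri` is a hypothetical name, and the regular expressions are one reading of the two patterns, with each * taken to mean a single path segment.

```python
import re

# One reading of the two allowed patterns, with * as a single path segment.
CC_PATTERNS = (
    re.compile(r"^http://creativecommons\.org/licenses/[^/]+/[^/]+/$"),
    re.compile(r"^http://creativecommons\.org/publicdomain/mark/[^/]+/$"),
)


def normalize_cc_uri(uri):
    """Rewrite a Creative Commons URI to the http form without a trailing /rdf."""
    uri = uri.strip()
    if uri.startswith("https://"):
        uri = "http://" + uri[len("https://"):]
    if uri.endswith("/rdf"):
        uri = uri[: -len("rdf")]  # keep the slash before "rdf"
    if not uri.endswith("/"):
        uri += "/"
    if not any(pattern.match(uri) for pattern in CC_PATTERNS):
        raise ValueError("not a recognised Creative Commons URI: " + uri)
    return uri


print(normalize_cc_uri("https://creativecommons.org/licenses/by-nc-nd/3.0/rdf"))
```

Raising an error for anything that does not match either pattern keeps Rights Statements URIs and malformed values out of this code path, so they can be handled separately.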
accessCondition - Restrictions on Access¶
Use case¶
The Howard Baker Audiovisual Collection includes 46 items that are “In Copyright” and therefore have restricted access to avoid any potential copyright conflicts. Only on campus access is provided to the actual recordings, though the metadata records are accessible from anywhere. Having the metadata accessible makes it so that anyone can discover these materials and decide for themselves if it is worth a trip into campus. Some of the recordings had some deterioration and were therefore digitized as a preservation measure. Having digitized copies also made providing on site access easier. In order to make sure that users are aware of the on campus only restriction, a note needed to be added to the metadata. When off campus users visit the metadata, this note makes it clear why they cannot access the recording.
Justification¶
As the value present in the current accessCondition node is not associated with a controlled vocabulary and simply needs to
be displayed to the user within the record, there is no reason to connect it with other accessCondition values. A note is
sufficient for this use case.
XPath¶
accessCondition[@type="restriction on access"]
Decision¶
<accessCondition type="restriction on access">
This item can only be accessed on the University of Tennessee (Knoxville) campus
</accessCondition>
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
<https://example.org/objects/1> skos:note 'This item can only be accessed on the University of Tennessee (Knoxville) campus' .
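The two accessCondition cases can be sketched together in Python. This is a hedged illustration: `access_condition_triples` is a hypothetical name, the MODS namespace is omitted, the sample record is invented for the example, and no Creative Commons normalization is applied here.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes namespaced attributes in {namespace}name form.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def access_condition_triples(mods_xml):
    """Map accessCondition elements to edm:rights URIs or skos:note literals."""
    root = ET.fromstring(mods_xml)
    triples = []
    for el in root.iter("accessCondition"):
        href = el.get(XLINK_HREF)
        text = " ".join((el.text or "").split())  # collapse line breaks and indentation
        if href:
            triples.append(("edm:rights", href))
        elif el.get("type") == "restriction on access" and text:
            triples.append(("skos:note", text))
    return triples


record = """<mods xmlns:xlink="http://www.w3.org/1999/xlink">
  <accessCondition type="use and reproduction"
      xlink:href="http://rightsstatements.org/vocab/InC/1.0/">In Copyright</accessCondition>
  <accessCondition type="restriction on access">
    This item can only be accessed on the University of Tennessee (Knoxville) campus
  </accessCondition>
</mods>"""

for predicate, value in access_condition_triples(record):
    print(predicate, value)
```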