CIDOC-CRM in RDF Application Profile

Guidelines on how to use CIDOC-CRM in RDF for interoperability

Author: Jakob Voß

Affiliation: Verbundzentrale des GBV (VZG)

Published: January 27, 2025

1 Introduction

The CIDOC Conceptual Reference Model (CRM) is a conceptual data model used in the cultural heritage domain. The Resource Description Framework (RDF) is a graph-based data format. Both CRM and RDF have been created independently for integration of information. RDF is a good fit to express CRM in data and it has been used to do so. The expression of CRM in RDF is not trivial though, so some guidelines are needed. This document provides an application profile to best use CRM in RDF for integration with other RDF data. The primary use case is data integration into the Knowledge Graph of NFDI4Objects.

Note

This document is still in an early draft. Contributions and feedback are very welcome! The document sources are managed in a git repository at https://github.com/nfdi4objects/crm-rdf-ap.

Expressing CRM and RDF

CRM defines abstract types of entities (CRM classes) such as events, measurements, places, and actors, with relationship types (CRM properties) to connect instances of these entity types. RDF and its most common extensions define how to identify entities (resources), entity types (RDF classes), and relationship types (RDF properties) with URIs, and how to express values as Unicode strings (RDF literals), optionally with a language tag or a data type to encode values such as numbers and dates. RDF is used with ontologies that define RDF classes, properties, and constraints. CRM looks like an ontology, or as if it could be mapped directly to an RDF ontology, but this is not the case. CRM is agnostic to data formats: CRM classes are not RDF classes and CRM has no concept of data types and values, so any expression of CRM in RDF involves design choices. The same information modeled with CRM can be expressed in different forms of RDF, so such data cannot be integrated seamlessly.

Some recommendations exist to express CRM in RDF (Doerr, Light, and Hiebel 2020) and to combine it with other ontologies (e.g. Nys, Ruymbeke, and Billen 2018).

2 Primitive values

E59 Primitive Value and its subclasses are not expressed as RDF classes. Instead

  • instances of E62 String are expressed as RDF literals with optional language tag, and

  • instances of E60 Number are expressed as RDF literals with numeric data type such as xsd:integer
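As a minimal sketch of the two rules above, the following Turtle attaches a language-tagged string and a number directly as literals. The resource <ExampleObject> and both values are hypothetical placeholders, and P3 has note, P43 has dimension, and P90 has value are only one possible choice of CRM properties for illustration.

```turtle
@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# <ExampleObject> and its values are hypothetical
<ExampleObject>
  crm:P3_has_note "an example note"@en ;        # E62 String as literal with language tag
  crm:P43_has_dimension [ a crm:E54_Dimension ;
    crm:P90_has_value "42"^^xsd:integer         # E60 Number as literal with numeric data type
  ] .
```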

The CRM classes E61 Time Primitive, E94 Space Primitive, and E95 Spacetime Primitive are subclasses of both E59 Primitive Value and E41 Appellation, so the latter can be used when a mapping to established RDF data types is not applicable.

Temporal data

Temporal values (instances of E61 Time Primitive and E52 Time-Span in CRM) SHOULD be expressed by RDF literals with one of the data types from XML Schema (XSD) Data Types or from Extended Date/Time Format (EDTF) listed in Table 1.

Table 1: Temporal data types

| Datatype | Description | Example |
|---|---|---|
| xsd:date | Year, month, day, and optional time zone | 2010-12-17 |
| xsd:time | Time with optional time zone | 13:20:00-05:00 |
| xsd:dateTime | Full date, time, and optional time zone | 1912-04-15T05:18:00 |
| xsd:dateTimeStamp | Full date, time, and mandatory time zone | 1912-04-15T05:18:00Z |
| xsd:gYear | Year and optional time zone | 2010 |
| xsd:gYearMonth | Year, month, and optional time zone | 2010-12 |
| xsd:gMonth | Month and optional time zone | --12 |
| xsd:gMonthDay | Month, day, and optional time zone | --12-17 |
| xsd:gDay | Day and optional time zone | ---17 |
| edtf:EDTF | Complex temporal value in EDTF syntax | 2024~ |
| edtf:EDTF-level0 | EDTF limited to level 0 features | 2010-12 |
| edtf:EDTF-level1 | EDTF limited to level 0 and 1 features | 2010-12? |
| edtf:EDTF-level2 | EDTF with all features up to level 2 | 15XX-?12 |

Listing 1: An event with date/time given with different data types
@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix edtf: <http://id.loc.gov/datatypes/edtf/> .
@prefix unit: <http://qudt.org/vocab/unit/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<TitanicSinking> a crm:E81_Transformation ;
  crm:P124_transformed <RMSTitanic> ;
  crm:P123_resulted_in <TitanicWreck> ;
  crm:P4_has_time-span
    "1912-04-15"^^xsd:date ,                     # date only
    "1912-04-15T05:18:00Z"^^xsd:dateTimeStamp ,  # precise date and time with time zone
    "1912-04-15?"^^edtf:EDTF .                   # uncertain date with EDTF

More complex temporal values MAY be expressed as instances of time:TemporalEntity or its subclasses from Time Ontology, as discussed by Nys, Ruymbeke, and Billen (2018) and described in EDTF in RDF. RDF literals are preferred because it is easier to derive a time:TemporalEntity from a literal than the other way round.

Listing 2: Extended temporal value expressed with Time Ontology
<WW1> a time:Interval ; # instead of E52_Time-Span
  edtfo:hasEDTFDateTimeDescription "1914/1918" ;
  time:hasBeginning [ time:inXSDgYear "1914"^^xsd:gYear ] ;
  time:hasEnd [ time:inXSDgYear "1918"^^xsd:gYear ] .

Tip

Support for simple temporal data with XSD data types is not part of the SPARQL 1.2 specification (see proposal SEP-0002) but is included in most SPARQL processors, so typed date values, in contrast to plain strings, can be compared and used in calculations directly.

Note

Doerr, Light, and Hiebel (2020) recommended additional properties for temporal intervals (E52 Time-Span), such as P81a_end_of_the_begin and P82b_end_of_the_end.

The use of these properties may lead to false assumptions of precision, and it introduces an isolated solution to a problem that is also addressed outside of CIDOC. For these reasons the additional properties should be avoided in favour of EDTF and/or Time Ontology.

Temporal CRM properties SHOULD be expressed with corresponding properties from Time Ontology:

| CRM class or property | in RDF |
|---|---|
| E52 Time-Span | Literal or time:Interval |
| E61 Time Primitive | time:TemporalEntity |
| P79 beginning is qualified by | = ? |
| P80 end is qualified by | = ? |
| P81 ongoing throughout | ? |
| P82 at some time within | ? |
| P86 falls within | time:intervalIn |
| P160 has temporal projection | ? |
| P164 is temporally specified by | ? |
| P170 defines time | ? |
| P183 ends before the start of | time:before |
| P183i starts after the end of | time:after |
| P173 starts before or with the end of | ? |
| P174 starts before the end of | ? |
| P175 starts before or with the start of | ? |
| P176 starts before the start of | ? |
| P182 ends before or with the start of | ? |
| P184 ends before or with the end of | ? |
| P185 ends before the end of | ? |
| P191 had duration | ? |
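
The mappings that are already filled in above could be applied as in the following sketch. The resources <WW1>, <WW2>, and <BattleOfVerdun> are hypothetical relative URIs chosen for illustration, and the intervals are assumed to be described elsewhere (for instance with literals or Time Ontology as shown earlier).

```turtle
@prefix time: <http://www.w3.org/2006/time#> .

# hypothetical resources, for illustration only
<WW1> a time:Interval ;
  time:before <WW2> .          # instead of P183 ends before the start of

<WW2> a time:Interval ;
  time:after <WW1> .           # instead of P183i starts after the end of

<BattleOfVerdun> a time:Interval ;
  time:intervalIn <WW1> .      # instead of P86 falls within
```

Since time:before and time:after are inverse properties in Time Ontology, stating one direction is usually enough; both are shown here only to mirror the table.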

Spatial data

Instances of E94 Space Primitive should be expressed using the GeoSPARQL Ontology as instances of geo:Geometry, which is compatible with various geographic data formats (WKT, GeoJSON, GML…) [1]. The CRM property P168 place is defined by should be expressed with the RDF property geo:hasGeometry. The CRM properties P171 at some place within and P172 contains can be used as RDF properties to link places (E53 Place) to outer and inner geometries, but geo:hasBoundingBox and geo:hasCentroid should be preferred, if applicable.

The preferred serialization of spatial coordinates is WKT because this allows for spatial queries with GeoSPARQL. GeoJSON can be derived automatically for display in web applications. For simple WKT POINT coordinates in WGS 84 coordinate system, data providers MAY use the Basic Geo (WGS84 lat/long) Vocabulary in addition.

Listing 3: A place with geographic coordinates
@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix wgs84: <http://www.w3.org/2003/01/geo/wgs84_pos#> .

<TitanicWreckLocation> a crm:E53_Place ;
  crm:P89_falls_within <AtlanticOcean> ;
  geo:hasGeometry [
    a geo:Geometry ;
    # longitude, latitude, and elevation in metres
    geo:asWKT "POINT Z (-49.946944 41.7325 -3803)"^^geo:wktLiteral ;
    geo:asGeoJSON '{"type": "Point","coordinates": [-49.946944,41.7325,-3803]}'^^geo:geoJSONLiteral ;
    wgs84:long "-49.946944" ; wgs84:lat "41.7325" ; wgs84:alt "-3803"
  ] .

GeoSPARQL properties geo:hasMetricSpatialResolution and/or geo:hasSpatialAccuracy can be used to indicate level of detail.
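
A possible use of these properties, together with geo:hasCentroid mentioned above, is sketched below. It assumes GeoSPARQL 1.1, where geo:hasMetricSpatialResolution is expected to take a value in metres; the resolution of 100 is a hypothetical placeholder, not a documented property of the coordinates.

```turtle
@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<TitanicWreckLocation> a crm:E53_Place ;
  geo:hasCentroid [
    a geo:Geometry ;
    geo:asWKT "POINT (-49.946944 41.7325)"^^geo:wktLiteral ;
    # hypothetical value: coordinates assumed to be resolved to about 100 metres
    geo:hasMetricSpatialResolution "100"^^xsd:double
  ] .
```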

Geotemporal data

The CRM class E95 Spacetime Primitive and its corresponding property P169i spacetime volume is defined by MUST NOT be used in RDF. Their purpose in CRM is to define the time and place of an abstract E92 Spacetime Volume or one of its subclasses. In RDF this can be done by combination of P4 has time-span or P160 has temporal projection for time (see Temporal data) and P161 has spatial projection or geo:hasGeometry for place (see Spatial data):

Listing 4: A spacetime primitive in time and place
@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<AssassinationOfArchdukeFranzFerdinand>
  crm:P4_has_time-span "1914-06-28"^^xsd:date ;
  geo:hasGeometry [
    # 43°51'28.5"N 18°25'43.9"E
    geo:asWKT "POINT (18.4283426 43.8576859)"^^geo:wktLiteral
  ] .

Applications MAY assume:

crm:E92_Spacetime_Volume rdfs:subClassOf
  geo:SpatialObject ,
  time:TemporalEntity .

Note

The CRMgeo extension of CRM combines CRM and GeoSPARQL in a similar but more complex way (Hiebel 2016).

3 Authority files and types

CRM class E32 Authority Document and CRM property P71 lists MUST NOT be used in RDF. Use the corresponding SKOS class skos:ConceptScheme and SKOS property skos:inScheme instead.

CRM also defines class E55 Type with properties P127 has broader term and P127i has narrower term. The class MUST NOT be used in RDF. Instead it can be mapped to one of:

  • skos:Concept and skos:broader/skos:narrower, or to
  • individual RDF classes, connected with rdfs:subClassOf, or
  • a more specific subclass or more generic superclass of E55 Type, such as E56 Language and E28 Conceptual Object.
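
The first of these options, combined with the SKOS replacement for authority documents, might look like the following sketch. All resources (<Vocabulary>, <Ship>, <OceanLiner>, <RMSTitanic>) are hypothetical relative URIs; in practice the concepts would come from an established terminology.

```turtle
@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# hypothetical vocabulary, for illustration only
<Vocabulary> a skos:ConceptScheme .   # instead of E32 Authority Document

<Ship> a skos:Concept ;               # instead of an instance of E55 Type
  skos:prefLabel "ship"@en ;
  skos:inScheme <Vocabulary> .        # instead of P71 lists

<OceanLiner> a skos:Concept ;
  skos:prefLabel "ocean liner"@en ;
  skos:broader <Ship> ;               # instead of P127 has broader term
  skos:inScheme <Vocabulary> .

<RMSTitanic> crm:P2_has_type <OceanLiner> .
```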

Applications MAY define skos:ConceptScheme as subclass of E31 Document, skos:Concept as subclass of E28 Conceptual Object, and skos:inScheme as subproperty of P67 refers to.
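
Written out as RDF, these optional assumptions could take the following form. This is a sketch of one way an application might state them, using E28 as the class number for Conceptual Object; it is not part of the official CRM or SKOS definitions.

```turtle
@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# optional alignment statements an application may add
skos:ConceptScheme rdfs:subClassOf    crm:E31_Document .
skos:Concept       rdfs:subClassOf    crm:E28_Conceptual_Object .
skos:inScheme      rdfs:subPropertyOf crm:P67_refers_to .
```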

4 CRM Classes to use with caution

E58 Measurement Unit

Defining new instances of E58 Measurement Unit should be avoided. Units should either be taken from an established vocabulary of units such as QUDT or be expressed as part of an RDF literal with the UCUM datatype [2].

@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix unit: <http://qudt.org/vocab/unit/> .
@prefix cdt: <https://w3id.org/cdt/> .

<TitanicSinking>
  crm:P191_had_duration [ a crm:E54_Dimension ;
    crm:P90_has_value 160 ; crm:P91_has_unit unit:MIN ;   # value and QUDT unit
    rdf:value "160 min"^^cdt:ucum                         # same duration as UCUM string
  ] .

E41 Appellation

E41 Appellation and its subclasses (E35 Title and E42 Identifier) should be avoided (see above for the additional subclasses E61 Time Primitive, E94 Space Primitive, and E95 Spacetime Primitive), unless a name cannot uniquely be identified with a sequence of Unicode characters and an optional language tag:

<RMSTitanic>
  crm:P102_has_title "RMS Titanic"@en ;
  crm:P1_is_identified_by [
    rdf:value "MGY" ;
    crm:P2_has_type <http://www.wikidata.org/entity/Q353659> # call sign
  ] .

If there are multiple names with one preferred name per language and optional aliases, use skos:prefLabel and skos:altLabel:

<RMSTitanic>
  skos:prefLabel "RMS Titanic"@en ;
  skos:altLabel "Titanic"@en, "Royal Mail Steamship Titanic"@en .

The RDF property skos:prefLabel should not be confused with P48 has preferred identifier, which is to be used for identifiers only.

If information about the act of naming is required, use E13 Attribute Assignment for simple appellations or E15 Identifier Assignment for identifiers.

If an E42 Identifier is a URI meant to identify an RDF resource, do not use plain strings but resource URIs in RDF. If a resource happens to have multiple equivalent URIs, choose a preferred URI and use owl:sameAs to record aliases:

<RMSTitanic> a crm:E18_Physical_Thing ;
  owl:sameAs
    <http://www.wikidata.org/entity/Q3018259> ,
    <http://kbpedia.org/kko/rc/RMS-Titanic-TheShip> .

instead of

<RMSTitanic> a crm:E18_Physical_Thing ;
  crm:P1_is_identified_by
    [ a crm:E42_Identifier ;
      crm:P190_has_symbolic_content "http://www.wikidata.org/entity/Q3018259" ] ,
    [ a crm:E42_Identifier ;
      crm:P190_has_symbolic_content "http://kbpedia.org/kko/rc/RMS-Titanic-TheShip" ] .

5 Deprecated CRM classes

CRM is constantly evolving, so some CRM classes have been renamed or replaced. Outdated classes and properties MUST be supported nevertheless to integrate data that has already been published.

See “Versions of the CIDOC-CRM” (2025) for a list of CRM versions.

6 Bibliographic References

The encoding of bibliographic data is out of the scope of CRM. LRMoo (formerly known as FRBRoo) extends CRM to express the IFLA Library Reference Model (LRM) for bibliographic information managed by libraries (Aalberg 2024). The model is based on four levels of description called WEMI (Work, Expression, Manifestation, Item) instead of one class, so it is not practical for simple bibliographic references (citation data). As long as bibliographic entities are not the core object of investigation, it is enough to express publications as instances of E31 Document and to express details with established RDF ontologies for citation data. The preferred choice is the Bibliographic Ontology (BIBO). Additional ontologies exist for more details, for instance the Citation Typing Ontology (CiTO) for citations between publications.

Data providers MUST NOT create their own classes and properties to model bibliographic references with CRM but use BIBO. Applications MAY use the following CRM classes and statements to link BIBO and its corresponding ontologies with CRM:

dcterms:Agent   rdfs:subClassOf     crm:E77_Persistent_Item .
foaf:Agent      rdfs:subClassOf     crm:E77_Persistent_Item .
crm:E39_Actor   rdfs:subClassOf     dcterms:Agent .
crm:E39_Actor   rdfs:subClassOf     foaf:Agent .
foaf:Person     owl:equivalentClass crm:E74_Person .
event:Event     owl:equivalentClass crm:E5_Event .
bibo:Document   rdfs:subClassOf     crm:E31_Document .
bibo:Collection rdfs:subClassOf     crm:E31_Document .

BIBO refers to individual classes and properties from other ontologies (FOAF, DCTERMS, PRISM…). Data providers MUST use these classes and properties in bibliographic references but they MAY include additional RDF statements with corresponding classes and properties from CRM.

Entities (authors, publishers…) SHOULD be referenced by established URIs (DOI, ORCID, ROR…) as shown in the following example of a proceedings article:

Listing 5: Example of a full bibliographic reference
@prefix bibo: <http://purl.org/ontology/bibo/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://doi.org/10.1145/3543873.3585579> a bibo:Article ;
  dct:title "Wikidata: The Making Of" ;
  dct:creator
    <https://orcid.org/0000-0002-9593-2294> ,
    <https://orcid.org/0000-0002-3939-2115> ,
    <https://orcid.org/0000-0002-9172-2601> ;
  bibo:authorList (
    <https://orcid.org/0000-0002-9593-2294>
    <https://orcid.org/0000-0002-3939-2115>
    <https://orcid.org/0000-0002-9172-2601>
  ) ;
  bibo:pages "615–624" ;
  dct:isPartOf <https://doi.org/10.1145/3543873> .

<https://doi.org/10.1145/3543873> a bibo:Proceedings ;
  dct:date "2023-03-30"^^xsd:date ;
  dct:publisher <https://ror.org/03wsadn68> ;
  bibo:isbn13 "978-1-4503-9419-2" ;
  dct:title "Companion Proceedings of the ACM Web Conference 2023" ;
  dct:isPartOf <https://dblp.org/streams/conf/www> .

<https://ror.org/03wsadn68> a foaf:Organization ;
  foaf:name "Association for Computing Machinery" .

<https://dblp.org/streams/conf/www> a bibo:Series ;
  dct:title "WWW '23 Companion" .

<https://orcid.org/0000-0002-9593-2294> a foaf:Person ;
  foaf:familyName "Vrandečić" ; foaf:givenName "Denny" .
<https://orcid.org/0000-0002-3939-2115> a foaf:Person ;
  foaf:familyName "Pintscher" ; foaf:givenName "Lydia" .
<https://orcid.org/0000-0002-9172-2601> a foaf:Person ;
  foaf:familyName "Krötzsch" ; foaf:givenName "Markus" .

If structured data is not available, bibliographic references can also be expressed with blank nodes having a plain string value rdfs:label:

Listing 6
_:123 a bibo:Document ;
  rdfs:label "D. Vrandečić, L. Pintscher, and M. Krötzsch. 2023. Wikidata ..." .

Tip

The citation management software Zotero can import a large number of formats and export BIBO RDF.

To link bibliographic references to other CRM entities use P70 documents.
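
For instance, the article from Listing 5 could be linked to the entity it describes as sketched below. The resource <WikidataProject> is a hypothetical relative URI standing in for whatever CRM entity the publication documents.

```turtle
@prefix bibo: <http://purl.org/ontology/bibo/> .
@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .

<https://doi.org/10.1145/3543873.3585579> a bibo:Article ;
  crm:P70_documents <WikidataProject> .   # hypothetical CRM entity described by the article
```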

7 Differences to the official encoding of CRM in RDF

An official encoding of CRM in RDF has been published since version 7.1.2, managed at https://gitlab.isl.ics.forth.gr/cidoc-crm/cidoc_crm_rdf/ (Doerr, Light, and Hiebel 2020). The encoding of CRM in RDF described in this document differs by the introduction of SKOS (see Authority files and types).

In addition the use of non-standard temporal properties such as P81a_end_of_the_begin and P82b_end_of_the_end will likely be discouraged in favour of Time Ontology and EDTF.

Rationales: integration with terminologies and simplification of queries.

8 References

Aalberg, Riva, Trond. 2024. “LRMoo: Object-Oriented Definition and Mapping from the IFLA Library Reference Model.” IFLA. https://repository.ifla.org/handle/20.500.14598/3677.
Doerr, Martin, Richard Light, and Gerald Hiebel. 2020. “Implementing the CIDOC Conceptual Reference Model in RDF.” https://cidoc-crm.org/sites/default/files/Implementing%20the%20CIDOC%20Conceptual%20Reference%20Model%20in%20RDF.pdf.
Nys, Gilles-Antoine, Muriel van Ruymbeke, and Roland Billen. 2018. “Spatio-Temporal Reasoning in CIDOC CRM: An Hybrid Ontology with GeoSPARQL and OWL-Time.” In 2nd Workshop on Computing Techniques for Spatio-Temporal Data in Archaeology and Cultural Heritage, edited by A. Belussi, R. Billen, P. Hallot, and S. Migliorini. Vol. 2230. CEUR. https://ceur-ws.org/Vol-2230/paper_04.pdf.
“Versions of the CIDOC-CRM.” 2025. ICOM. https://cidoc-crm.org/versions-of-the-cidoc-crm.

Footnotes

  1. See also CRM Geo draft at http://www.cidoc-crm.org/extensions/crmgeo/, defining superclasses of geo:Geometry.

  2. See cdt:ucum and QUDT.