Commons:Structured data/GLAM/CIDOC CRM


The table below contains a preliminary mapping of the Wikidata properties proposed for Wikimedia Commons (as of September 2018) to the CIDOC CRM (CIDOC Conceptual Reference Model) ontology, which is used in the museum and cultural heritage sector.

The mapping was drafted by George Bruseker; comments and feedback for the Commons community can be found in the last column of the table (they were formulated by George Bruseker and edited by Sandra).

This mapping is also discussed in a panel at the CIDOC 2018 conference.

| Property / Aliases | Currently exists on Wikidata as | Data type | Qualifiers | CRM (D1 Digital Image) | Comments / feedback |
| --- | --- | --- | --- | --- | --- |
| Depicts / Manifestation of / Digital representation of | depicts (P180) | Item | See below | -> P165 incorporates -> E38 Image -> P65 shows -> E1 CRM Entity | It is extremely important to make a crisp distinction between the description of the digital object qua digital object and the various information objects that it encodes/carries/incorporates. If this is not done properly, there will be a lot of confusion and mistakes in the metadata, and the Commons community will run into problems in the future! |
| Depicted place | | Item | | -> P165 incorporates -> E38 Image -> P65 shows -> E53 Place | |
| Depicted event | | Item | | -> P165 incorporates -> E38 Image -> P65 shows -> E5 Event | |
| Depicted date / depicted period | date depicted (P2913) | Time | | -> P165 incorporates -> E38 Image -> P65 shows -> E5 Event -> P4 has time-span -> E52 Time-Span | Is this a correct metadata construct? Do images depict a time at all? Probably not. They could have been taken at a time or they could depict an event that has a time, but they don't depict time. How would they? |
| Type of media file | No, to be created | Item | | -> P2 has type -> E55 Type | This is somewhat tricky, as one might want to create a limited subset of the main types of these things and then use them as actual classes instead of type assignments. |
| Mime type | | Item? | | -> L11i was output of -> D7 Digital Machine Event -> P32 used general technique -> E55 Type | |
| File format | file format (P2701) | Item | | -> L11i was output of -> D7 Digital Machine Event -> P32 used general technique -> E55 Type | Not sure that file format and mime type are usefully distinguished; see https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types |
| Date of file creation / Scan date | inception (P571) | Time | | -> L11i was output of -> D7 Digital Machine Event -> P4 has time-span -> E52 Time-Span -> P81 ongoing throughout -> xsd:date | |
| Date of upload to Wikimedia Commons | inception (P571) with qualifier | Time | | -> L14i was transferred by -> D12 Data Transfer Event -> P4 has time-span -> E52 Time-Span -> P81 ongoing throughout -> xsd:date | The 'inception' property sounds semantically wrong. Uploading something to Wikimedia does not create it in any usual sense of the concept. |
| Source website | No, to be created | Item | | | |
| Source URL | Probably to be created. There is reference URL (P854), but that is for references specifically | URL | | | |
| Institution's media file identifier | No, to be created (probably 1 identifier property per institution) | Identifier | | -> P140i was attributed by -> E13 Attribute Assignment -> P141 assigned -> E42 Identifier | |
| Created with support by | | Item | | -> L11i was output of -> D7 Digital Machine Event -> P17 was motivated by -> E7 Activity -> P2 has type -> E55 Type 'Funding' | |
| Wikimedia campaign/project (e.g. Wiki Loves Monuments) | | Item | | -> L11i was output of -> D7 Digital Machine Event -> P17 was motivated by -> E7 Activity -> P2 has type -> E55 Type 'Wikimedia Project' | |
| File size / Data size | data size (P3575) | String | | -> P43 has dimension -> E54 Dimension -> P2 has type -> E55 Type 'File Size'; -> P43 has dimension -> E54 Dimension -> P91 has unit -> E55 Type 'Bytes'; -> P43 has dimension -> E54 Dimension -> P90 has value -> xsd:integer | |
| Creator / Photographer | creator (P170) | Item or 'User' | Roles: object has role (P3831), e.g. Photographer | -> P165 incorporates -> E38 Image -> P94i was created by -> E65 Creation -> P14 carried out by -> E39 Actor; -> L11i was output of -> D7 Digital Machine Event -> P14 carried out by -> E39 Actor | Here the problem of distinguishing the file from the thing encoded becomes obvious. Creators of files, photographers and artists all have very different relations to a digital file, depending on where they were in the creation process. So this can only be modelled and mapped by distinct paths (and by distinct properties in Wikidata, if one wants to keep things clean). |
| Uploader | | 'User' | | -> L14i was transferred by -> D12 Data Transfer Event -> P14 carried out by -> E39 Actor | |
| License reviewer | No, to be created | 'User' | | | |
| Other version(s) | No, to be created | Commons file | | | |
| Derived from file / Extracted from file | | Commons file | | -> L22 was derivative of -> D3 Formal Derivation -> L21 used as derivation source -> D1 Digital Object | |
| Part of series/whole | | Item | | -> P106i forms part of -> D1 Digital Object | The new data model of the Parthenos project provides ways in which this could be done more explicitly, but this is the basic relation. |
| Series ordinal | series ordinal (P1545) | String (number) | | | Would need mapping. RDF is not so good at ordinals; one would have to either make it an identifier for this context or create an artificial property-on-a-property relation. |
| Follows | follows (P155) | Commons file | | | Again, this would have to be a property on a property. |
| Followed by | followed by (P156) | Commons file | | | |
| Link to this file in external service (e.g. IIIF, map warper) | | URL or Identifier | Qualifier indicating the specific external website the URL refers to | -> PP8i is dataset hosted by -> PE15 Data E-Service -> PP28 has access point -> PE29 Access Point | See the Parthenos Entity Model here as well: this is data management beyond the scope of CRMdig as it is currently configured. |
| Language | | Item | Applies to part | -> P106 is composed of -> E33 Linguistic Object -> P72 has language -> E56 Language | Here again, precisely distinguishing between the digital object, an image and a text is highly important; these must not be merged. |
| Quality assessment | No, to be created | Item | | | |
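
The paths in the CRM column can be read as chains of alternating properties and classes that start from the digital object representing the Commons file. The following is a minimal sketch, not part of the mapping itself, of how such a path could be expanded into subject-predicate-object triples. The function name `expand_path`, the placeholder node labels (`_:file`, `_:n1`, ...) and the use of `rdf:type` statements for the classes are illustrative assumptions.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def expand_path(start: str, path: str) -> List[Triple]:
    """Expand an arrow-notation path into triples.

    `path` alternates property and class labels, for example
    '-> P165 incorporates -> E38 Image -> P65 shows -> E1 CRM Entity'.
    Every class in the path gets its own placeholder node (hypothetical
    labels '_:n1', '_:n2', ...), so the file and the image it carries
    remain separate resources.
    """
    # Split on the arrow and drop empty segments (e.g. from a leading arrow).
    steps = [s.strip() for s in path.split("->") if s.strip()]
    if len(steps) % 2 != 0:
        raise ValueError("path must alternate property and class labels")

    triples: List[Triple] = []
    subject = start
    for i in range(0, len(steps), 2):
        prop, cls = steps[i], steps[i + 1]
        node = f"_:n{i // 2 + 1}"
        triples.append((subject, prop, node))    # link to the next node
        triples.append((node, "rdf:type", cls))  # state the node's class
        subject = node
    return triples


# The 'Depicts' row: the file incorporates an image, and the image
# (not the file) shows the depicted entity.
depicts = expand_path(
    "_:file",
    "-> P165 incorporates -> E38 Image -> P65 shows -> E1 CRM Entity",
)
for triple in depicts:
    print(triple)
```

In this sketch the subject of `P65 shows` is the intermediate image node rather than the file node, which mirrors the distinction between the digital object and the information objects it incorporates that the comments column insists on.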