Commons:Structured data/Modeling
This page contains an overview of how to model information (metadata) about files on Wikimedia Commons in Structured Data.
The basics: structured data for every Commons file
The following structured data is relevant for every file on Wikimedia Commons. It roughly corresponds to the information stored in the Information template, a general-purpose infobox template used to describe files in wikitext.
Structured data to add | Brief instructions | In-depth instructions about the data model |
---|---|---|
File caption(s) (multilingual) | A (short) textual description of the file, in at least one language. Plain text; no Wiki markup or hyperlinks. | Data modeling guidelines: File captions |
Date | Usually the date when the file was created, expressed with an inception (P571) statement. | Data modeling guidelines: Date |
Source of the file | Information about where the file was taken from. Is it the uploader's own work, was it uploaded from an external website,...? Typically using a source of file (P7482) statement. | Data modeling guidelines: Source of the file |
Creator | Who created the file? Typically described with a creator (P170) statement. | Data modeling guidelines: Creator of the file |
Copyright status and license | Is the file still under copyright, or is it public domain? If still under copyright, which license(s) applies/apply? Using copyright status (P6216) and copyright license (P275). | Data modeling guidelines: Copyright and licenses |
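The basics in the table above can be read as a checklist. The following is a minimal Python sketch of such a check, not an existing Commons tool: the function name `missing_basics` and its input shape (a caption dictionary plus a set of property IDs) are assumptions made for illustration. Only the property IDs come from the table.

```python
# Hypothetical checklist helper; only the property IDs come from the table above.
REQUIRED_PROPERTIES = {
    "P571": "Date (P571)",
    "P7482": "Source of the file (P7482)",
    "P170": "Creator (P170)",
    "P6216": "Copyright status (P6216)",
}
# copyright license (P275) only applies to files still under copyright,
# so this sketch does not treat it as always required.


def missing_basics(captions, used_properties):
    """Return labels of the basic items that a file description still lacks.

    captions: dict mapping language code -> caption text (assumed shape).
    used_properties: set of property IDs that have at least one statement.
    """
    missing = []
    # At least one non-empty caption, in any language, is expected.
    if not any(text.strip() for text in captions.values()):
        missing.append("File caption")
    for pid, label in REQUIRED_PROPERTIES.items():
        if pid not in used_properties:
            missing.append(label)
    return missing
```

For example, `missing_basics({"en": "A motorcycle"}, {"P571", "P170"})` would report the source and copyright status as still missing.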
If the above structured data is added to a file, the file's wikitext description can be simplified as follows:
File (click to explore how it is described) | Wikitext | Main structured data |
---|---|---|
(file preview not shown) | == {{int:filedesc}} == {{Information}} == {{int:license-header}} == {{self|cc-by-sa-4.0}} [[Category:Energica Ego]] | (structured data not shown) |
An overview of further structured data properties that are in active use can be found at Commons:Structured data/Properties table.
The specifics: case examples of common Commons files
Own work upload directly to Commons
To describe a simple {{Own}} work upload, uploaded directly by the author or {{Self}}-licensed by the uploader:
- File caption: one or more short description(s) of the file + language
- Date: inception (P571), see Commons:Structured data/Modeling/Date
- Source of the file: source of file (P7482) → original creation by uploader (Q66458942)
- Creator of the file: creator (P170) → "some value" to indicate the creator doesn't have a Wikidata item. Qualified with:
  - object of statement has role (P3831) → photographer (Q33231) to indicate we're talking about the photographer here, if it is a picture and not a video or audio file
  - author name string (P2093) → "<some name>" to indicate what name should be shown. Usually a username or a real name
  - Wikimedia username (P4174) → "<some username>" to indicate the contributing user
  - URL (P2699) → "https://commons.wikimedia.org/wiki/User:<some username>" to link back to the user page of the contributing user, if it exists
- Copyright and licenses: copyright license (P275) and copyright status (P6216), see Commons:Structured data/Modeling/Copyright
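As an illustration of how the creator statement and its qualifiers fit together, here is a minimal Python sketch. The `Statement` class is a simplified stand-in invented for this example, not the Wikibase JSON format, and the function name, its parameters, and the `"somevalue"` marker string are likewise assumptions. The property and item IDs are the ones listed above.

```python
from dataclasses import dataclass, field
from urllib.parse import quote


@dataclass
class Statement:
    """Simplified stand-in for a structured data statement (hypothetical shape)."""
    prop: str
    value: str
    qualifiers: dict = field(default_factory=dict)


SOME_VALUE = "somevalue"  # assumed marker for the "some value" case described above


def own_work_creator(username, display_name=None, is_photo=True, has_user_page=True):
    """Build a creator (P170) statement for an own-work upload."""
    qualifiers = {}
    if is_photo:
        # object of statement has role -> photographer; skipped for video or audio
        qualifiers["P3831"] = "Q33231"
    # author name string: the name to display, falling back to the username
    qualifiers["P2093"] = display_name or username
    # Wikimedia username of the contributing user
    qualifiers["P4174"] = username
    if has_user_page:
        # Assumption: spaces become underscores and other characters are
        # percent-encoded in the user page URL.
        page = quote(username.replace(" ", "_"))
        qualifiers["P2699"] = f"https://commons.wikimedia.org/wiki/User:{page}"
    return Statement("P170", SOME_VALUE, qualifiers)
```

Calling `own_work_creator("Example User")` yields a P170 statement with all four qualifiers; passing `is_photo=False` leaves out the photographer role.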
Upload from a platform like Panoramio, Geograph or Flickr
To describe a file uploaded directly from a platform (preferably by a tool or bot, for consistency):
- File caption: one or more short description(s) of the file + language
- Date: inception (P571), see Commons:Structured data/Modeling/Date
- Source of the file: source of file (P7482) → file available on the internet (Q74228490) to indicate the source
  - operator (P137) → e.g. Panoramio (Q239516) to indicate the platform
  - described at URL (P973) → "<some url>" to indicate the location
- Creator of the file: creator (P170) → "some value" to indicate the creator doesn't have a Wikidata item. Qualified with:
  - object of statement has role (P3831) → photographer (Q33231) to indicate we're talking about the photographer here, if it is a picture and not a video or audio file
  - Flickr user ID (P3267) → "<flickr/... user number>" to indicate the Flickr user identifier (number), if applicable
  - author name string (P2093) → "<some name>" to indicate what name should be shown. Usually a username or a real name on the platform
  - URL (P2699) → "<some url>" to indicate the URL of the page where the file is located
- Copyright and licenses: copyright license (P275) and copyright status (P6216), see Commons:Structured data/Modeling/Copyright
For Flickr uploads please also see Commons:Flickypedia/Data Modeling
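Because such uploads are preferably made by a tool or bot, the source statement lends itself to being generated by a small helper. The sketch below is one possible shape, under stated assumptions: the function name `platform_source`, the returned dictionary layout, and the validation rules are hypothetical and do not reflect the Wikibase JSON format. The property and item IDs are taken from the list above.

```python
import re

# Item IDs look like "Q" followed by a positive integer.
QID_PATTERN = re.compile(r"Q[1-9]\d*")

SOURCE_OF_FILE = "P7482"
FILE_AVAILABLE_ON_THE_INTERNET = "Q74228490"
OPERATOR = "P137"
DESCRIBED_AT_URL = "P973"


def platform_source(operator_qid, page_url):
    """Build a source-of-file statement for a file taken from a platform.

    operator_qid: Wikidata item of the platform, e.g. "Q239516" for Panoramio.
    page_url: URL where the file is described on the platform.
    """
    if not QID_PATTERN.fullmatch(operator_qid):
        raise ValueError(f"not a valid item ID: {operator_qid!r}")
    if not page_url.startswith(("http://", "https://")):
        raise ValueError(f"not an http(s) URL: {page_url!r}")
    return {
        "property": SOURCE_OF_FILE,
        "value": FILE_AVAILABLE_ON_THE_INTERNET,
        "qualifiers": {OPERATOR: operator_qid, DESCRIBED_AT_URL: page_url},
    }
```

Validating the inputs before building the statement is a design choice for this sketch: a bot that writes many statements benefits from failing early on a malformed item ID or URL.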
Pronunciation
- Copyright and licenses: copyright license (P275) and copyright status (P6216), see Commons:Structured data/Modeling/Copyright
- Type: instance of (P31) → pronunciation file (Q108167708)
- Language: language of work or name (P407) → e.g. French (Q150)
- Transcription: audio transcription (P9533) → "<verbatim>" to describe what is pronounced
- Recording date: recording date (P10135)
- Who recorded it: recordist (P10893)
- Who pronounced it: spoken by (P10894)
- IDs: e.g. Lingua Libre ID (P10369) → "<id>" to describe the source identifier if applicable
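The pronunciation model above can be sketched as a list of property and value pairs. This is a minimal illustration only: the function name, the choice to treat language and transcription as required, and the use of plain strings for every value are assumptions made for this example. Copyright statements are left out because they follow the separate copyright guidelines.

```python
def pronunciation_statements(language_qid, transcription, recording_date=None,
                             recordist=None, speaker=None, lingua_libre_id=None):
    """Return (property ID, value) pairs describing a pronunciation file."""
    statements = [
        ("P31", "Q108167708"),     # instance of -> pronunciation file
        ("P407", language_qid),    # language of work or name
        ("P9533", transcription),  # audio transcription: what is pronounced
    ]
    optional = [
        ("P10135", recording_date),   # recording date
        ("P10893", recordist),        # who recorded it
        ("P10894", speaker),          # who pronounced it
        ("P10369", lingua_libre_id),  # Lingua Libre ID, if applicable
    ]
    # Only include optional statements for which a value was supplied.
    statements.extend((pid, value) for pid, value in optional if value is not None)
    return statements
```

For a French recording, `pronunciation_statements("Q150", "bonjour")` returns only the three core pairs; any optional argument that is supplied adds one more pair.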
How to model more specific types of files
- Visual artworks - work has a Wikidata item Working
- Works without a Wikidata item Working
- Maps Working
- Illustrations Working
- Conference talks Working
How to model specific types of metadata
Here, we look at specific types of metadata for a file:
- Depiction and Digital representation of and Main subject Working
- Date Working
- Author and Creator Working
- Source Working
- Copyright and Licensing Working
- Metadata Working
- Location Working
- Participants and Sponsors Working
- Quality and Maintenance Working
- Significant Event Working
- Image captured with Working
GLAM
In some cases, large-scale content contributions mainly originating from Galleries, Libraries, Archives, and Museums (GLAM) use more specific data models.
It is highly recommended that all file metadata also complies with the general, basic data modeling recommendations listed above. This will make sure that all data on Wikimedia Commons can be uniformly searched and queried across the entire platform.
Content-specific properties may be added, such as:
- The Metropolitan Museum of Art: The Met object ID (P3634) → "<description>"
- iNaturalist: iNaturalist observation ID (P5683) → "<id>"
- Digital Public Library of America → Please see: Commons:Digital Public Library of America/Modeling
- Biodiversity Heritage Library → Please see: Commons:Biodiversity Heritage Library/Modeling
Bots
Some bots automatically populate structured data (SDC) based on metadata in Commons templates.
- BotMultichill adds properties for various IDs.
- BotMultichillT populates date, coordinates, camera, source, copyright and author.
- SchlurcherBot adds various types of SDC to files per the modeling described here.
- JarektBot adds Wikimedia VRTS ticket number (P6305) and digital representation of (P6243) using QuickStatements.
- AliciaFagervingWMSE-bot uploads creator (P170), inception (P571), coordinates of the point of view (P1259), depicts (P180), and participant in (P1344) to Wiki Loves Monuments files from Sweden, Israel, and Poland.
- METbot adds The Met object ID (P3634) and collection (P195) to the Metropolitan Museum of Art files.
- GeographBot uploads different SDC information to Geograph Britain and Ireland files.
- DPLA bot added structured data claims to DPLA items.
- XRayBot adds/updates coordinates of depicted place (P9149) (and others) - but only to XRay's own photographs.
- NikkiBot adds structured data to Lingua Libre uploads.
- FlickypediaBackfillrBot adds structured data to Flickr files.
- Emijrpbot adds structured data for various camera properties from Exif data and depicts statements based on Wikidata/Wikipedia pages.
General remarks
- What should be the general order of statements in the structured data statements tab? The community can indicate and change this order at MediaWiki:Wikibase-SortedProperties. (See the equivalent page on Wikidata)
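To make the idea of a community-defined order concrete, here is a minimal Python sketch of sorting properties by a preferred-order list. It does not read or parse MediaWiki:Wikibase-SortedProperties. The function name and the rule that unlisted properties go last in their original order are assumptions for illustration, and the order used in the example call is hypothetical.

```python
def order_properties(property_ids, preferred_order):
    """Sort property IDs by their position in a preferred-order list.

    Properties missing from preferred_order are placed last and keep their
    original relative order (assumed behaviour for this sketch).
    """
    rank = {pid: index for index, pid in enumerate(preferred_order)}
    # sorted() is stable, so unlisted properties stay in input order.
    return sorted(property_ids, key=lambda pid: rank.get(pid, len(rank)))


# Hypothetical preferred order, not the actual on-wiki list.
ordered = order_properties(["P275", "P170", "P9999", "P571"], ["P571", "P170", "P275"])
```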