Commons:Structured data/Development


Development steps are listed per component of the Structured Commons project and software, and are shown in reverse chronological order (most recent on top).

Past releases

Early 2020

  • Geo-coordinates
  • Date
  • Strings
  • Quantity
  • URL
  • External identifiers
  • Constraints

2019

General: Timeline and roadmap

Roadmap for development on Structured Data on Wikimedia Commons in 2017-2019. Version of October 31, 2017.

A timeline for development on Structured Commons can be found in this roadmap document (version of October 31, 2017), which will be revised as development plans change.

The roadmap is the best estimate of when things might happen based on the information we have now. As we get more information, the estimates will change.

As a general rule, the timeline should be fairly accurate for things happening in the next 3-6 months, and much less accurate for things more than 6 months in the future.

The team is working on updating this document with user-facing milestones, such as the expected date the first feature will be deployed.

Current and future development

This section contains a high-level overview of some features being developed. For more detailed information, including project reports, quarterly goals, and technical requirements, please visit the team page on mediawiki.org.

Technology

MediaInfo extension

MediaInfo is a new entity type for Wikibase that can handle structured metadata for multimedia files. This technology is mostly implemented through the WikibaseMediaInfo extension to MediaWiki.

The extension hooks into a file description page and adds a link to a MediaInfo page storing supplemental metadata about the file. This may, for example, include the author, detailed license information, and the concepts that a picture actually depicts.
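
To make the idea of a MediaInfo entity more concrete, the following minimal sketch builds a simplified, MediaInfo-like structure in Python. It is an illustration under stated assumptions, not the extension's actual serialization: the function name `make_mediainfo`, the page ID 12345, and the captions are hypothetical, and the real JSON produced by Wikibase nests statement values more deeply. The identifiers P180 ("depicts") and Q146 ("house cat") are real Wikidata IDs, used here only as sample values.

```python
import json


def make_mediainfo(page_id, captions, depicts):
    """Build a simplified MediaInfo-like entity as a plain dict.

    page_id  -- numeric ID of the file page (hypothetical value in the example)
    captions -- mapping of language code to caption text
    depicts  -- list of Wikidata item IDs that the file depicts
    """
    return {
        "type": "mediainfo",
        "id": f"M{page_id}",
        # Captions are stored per language.
        "labels": {
            lang: {"language": lang, "value": text}
            for lang, text in captions.items()
        },
        # Statements are grouped by property ID; P180 is "depicts" on Wikidata.
        # Assumption: the value is flattened here to a bare item ID.
        "statements": {
            "P180": [{"property": "P180", "value": item_id} for item_id in depicts]
        },
    }


entity = make_mediainfo(
    page_id=12345,
    captions={"en": "A cat on a sofa", "nl": "Een kat op een bank"},
    depicts=["Q146"],  # Q146 is "house cat" on Wikidata
)
print(json.dumps(entity, indent=2, ensure_ascii=False))
```

The sketch keeps captions and statements in one object to show that the supplemental metadata lives in a single entity attached to the file, separate from the wikitext of the file description page.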

Further information: Extension:WikibaseMediaInfo

  • October-December 2017: The multimedia team at the Wikimedia Foundation is gaining expertise in Wikibase, and unblocking further development for Structured Commons, by completing the MediaInfo extension for Wikibase.

Federation

In a technical sense, a federated database system is a management system in which multiple autonomous databases work together as a single, so-called federated, database (see https://en.wikipedia.org/wiki/Federated_database_system). Wikibase Federation is implemented for Structured Data on Wikimedia Commons: it makes it possible to use entities (Items and Properties) defined on one Wikibase repository (here, Wikidata) on another Wikibase repository (here, Wikimedia Commons).
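
The federation idea can be sketched with two in-memory dictionaries standing in for the two repositories. This is a toy model, not how Wikibase implements federation: the names `WIKIDATA`, `COMMONS`, and `describe`, as well as the entity ID M12345, are hypothetical. It only shows the core point that the local repository stores IDs, while the definitions of those IDs live in the remote repository.

```python
# Hypothetical stand-in for the repository that defines Items and Properties.
WIKIDATA = {
    "P180": {"type": "property", "label": "depicts"},
    "Q146": {"type": "item", "label": "house cat"},
}

# Hypothetical stand-in for the repository that stores statements about files.
COMMONS = {
    "M12345": {"statements": [("P180", "Q146")]},
}


def describe(media_id, local_repo, remote_repo):
    """Return readable statements for a local entity by resolving remote IDs."""
    readable = []
    for property_id, item_id in local_repo[media_id]["statements"]:
        # The local repository holds only IDs; labels come from the remote one.
        readable.append(
            (remote_repo[property_id]["label"], remote_repo[item_id]["label"])
        )
    return readable


print(describe("M12345", COMMONS, WIKIDATA))
```

Because the local entry never copies the remote labels, a change to an Item or Property in the defining repository is picked up the next time the statement is resolved.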

Computer-aided tagging

The computer-aided tagging tool is a feature developed by the Structured Data on Commons team to assist community members in identifying and labeling depicts statements for Commons files. It is implemented via the MachineVision extension to MediaWiki. See Commons:Structured data/Computer-aided tagging for more information.

Multi-Content Revisions

Multi-content revisions are an important building block for structured data on Wikimedia Commons (and on other Wikimedia projects). They are groundwork that makes information in MediaWiki wikis technically more straightforward to organize. It will become possible to split current wikitext pages into separate documents (slots) with different functionality (such as infoboxes, categories, or template documentation); these slots can then be integrated into one page, sharing page-level functionality and a single history. For structured data on Wikimedia Commons specifically, multi-content revisions make it possible to store a structured data entity (an Item, a Property, or a MediaInfo entity) and wikitext in the same page. Structured Commons is a major use case for multi-content revisions.
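
The slot concept can be illustrated with a small Python model of a page whose revisions each hold several named slots. This is a sketch under assumptions, not MediaWiki's storage layer: the classes `Page` and `Revision`, the `save` method, and the slot names `main` and `mediainfo` as used here are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    """One entry in a page's history, holding one content blob per slot."""
    rev_id: int
    slots: dict


@dataclass
class Page:
    title: str
    history: list = field(default_factory=list)

    def save(self, **changed_slots):
        """Create a new revision; slots that are not passed carry over unchanged."""
        previous = self.history[-1].slots if self.history else {}
        revision = Revision(
            rev_id=len(self.history) + 1,
            slots={**previous, **changed_slots},
        )
        self.history.append(revision)
        return revision


page = Page("File:Example.jpg")
# First edit touches only the wikitext slot.
page.save(main="== Summary ==\n{{Information}}")
# Second edit touches only the structured data slot; the wikitext carries over.
page.save(mediainfo='{"labels": {"en": "An example image"}}')

latest = page.history[-1]
print(latest.rev_id, sorted(latest.slots))
```

Both edits land in the same history list, which mirrors the point made above: wikitext and structured data are different kinds of content, yet they share one page and one revision history.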

First feature to release: multilingual captions

  • July-September (?) 2018: the first feature of Structured Commons will be released. This will be multilingual, translatable file captions.

Metrics

How will we measure the effectiveness of new functionalities on Wikimedia Commons? To do this, we need to establish relevant, measurable criteria and a (2017) baseline against which future results can be compared.

  • October-December 2017: metrics and a metrics baseline for Commons are defined.


Research

Research of Commons use by community members

  • Upcoming: interviews with Commons contributors (phab:T175185)
  • December 2016: Qualitative design research of heavy Commons users by Jan Dittrich (WMDE).

Past development

Earlier research and development that took place in 2014 is documented at Commons:Structured data/Archive/2014/Development.

Metadata cleanup drive (2014)

In 2014 and early 2015, a large metadata cleanup campaign took place across MediaWiki wikis, in order to prepare as many files as possible for conversion to machine-readable data.