Commons:Structured data/Development
Development steps are listed per component of the Structured Commons project and software, and are shown in reverse chronological order (most recent on top).
Past releases
Early 2020
- Finish final parts of Lua support (T237107 & T236691)
- Complex top-level statement support, such as:
- Geo-coordinates
- Date
- Strings
- Quantity
- URL
- External identifiers
- Constraints
2019
- 11 December - Computer-aided tagging via Special:SuggestedTags
- November - depicts search in the search bar
- 18 October - Lua support
- End of August - support for Special:Campaigns
- 31 July - support for most other statements
- 20 June - qualifiers for depicts
- 7 May - depicts in the UploadWizard
- 23 April - depicts on file pages
- 10 January - multilingual file captions
General: Timeline and roadmap
A timeline for development on Structured Commons can be found in this roadmap document (version October 31, 2017), which will be updated as development plans are updated.
The roadmap is the best estimate of when things might happen based on the information we have now. As we get more information, the estimates will change.
As a general rule, the timeline should be fairly accurate for things happening in the next 3-6 months, and much less accurate for things more than 6 months in the future.
The team is working on updating this document with user-facing milestones, such as the expected date on which the first feature will be deployed.
Current and future development
This section contains a high-level overview of some features being developed. For more detailed information, including project reports, quarterly goals, and technical requirements, please visit the team page on mediawiki.org.
Technology
MediaInfo extension
MediaInfo is a new entity type for Wikibase that can handle structured metadata for multimedia files. This technology is mostly implemented through the WikibaseMediaInfo extension to MediaWiki.
The extension hooks into a file description page and adds a link to a MediaInfo page storing supplemental metadata about the file. This may, for example, include the author, detailed license information, and the concepts that a picture actually depicts.
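As a rough illustration of how a file page relates to its MediaInfo entity: on Wikimedia Commons the entity ID is, as far as we know, the file page's numeric page ID prefixed with "M", and the entity can be requested through the Wikibase `wbgetentities` API module. The sketch below only builds the request URL and reads depicts (P180) values from a hand-written sample shaped like such a response. The page ID, the sample data, and the helper names are illustrative assumptions and do not come from this page.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def mediainfo_id(page_id):
    """Return the MediaInfo entity ID for a file page (assumed: 'M' + page ID)."""
    return f"M{page_id}"


def entity_request_url(page_id):
    """Build a wbgetentities request URL for the file's MediaInfo entity."""
    params = {
        "action": "wbgetentities",
        "ids": mediainfo_id(page_id),
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def depicted_items(response, entity_id):
    """Collect item IDs from depicts (P180) statements in a response-shaped dict."""
    entity = (response.get("entities") or {}).get(entity_id) or {}
    # An entity without statements may carry an empty list instead of a dict.
    statements = entity.get("statements") or {}
    items = []
    for statement in statements.get("P180", []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            items.append(snak["datavalue"]["value"]["id"])
    return items


# Hand-written sample shaped like a wbgetentities response (not real data).
SAMPLE_RESPONSE = {
    "entities": {
        "M12345": {
            "type": "mediainfo",
            "id": "M12345",
            "labels": {"en": {"language": "en", "value": "A cat on a sofa"}},
            "statements": {
                "P180": [
                    {
                        "mainsnak": {
                            "snaktype": "value",
                            "property": "P180",
                            "datavalue": {
                                "type": "wikibase-entityid",
                                "value": {"entity-type": "item", "id": "Q146"},
                            },
                        },
                        "type": "statement",
                        "rank": "normal",
                    }
                ]
            },
        }
    }
}

print(entity_request_url(12345))
print(depicted_items(SAMPLE_RESPONSE, "M12345"))
```

The sketch never contacts the network; fetching the printed URL with any HTTP client should return JSON in roughly the shape of the sample, provided the page ID belongs to an existing file.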
Screenshot of a test page for a MediaInfo entity. See an example on the test site federated-commons.wmflabs.org.
Further information: Extension:WikibaseMediaInfo
- October-December 2017: The multimedia team at the Wikimedia Foundation is gaining expertise in Wikibase, and unblocking further development for Structured Commons, by completing the MediaInfo extension for Wikibase.
Federation
In a technical sense, a federated database system is a management system in which multiple autonomous databases work together as a single, so-called federated, database (see https://en.wikipedia.org/wiki/Federated_database_system). Wikibase Federation is implemented for Structured Data on Wikimedia Commons: it makes it possible to use entities (Items and Properties) defined on one Wikibase repository (i.e., Wikidata) on another Wikibase repository (i.e., Wikimedia Commons).
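To make the idea concrete, the sketch below resolves the entity IDs that appear in a Commons statement to the repository that defines them. It assumes the usual ID prefixes (Q for Items and P for Properties on Wikidata, M for MediaInfo entities on Commons) and the concept URI patterns shown in the code. The function name and the example IDs are hypothetical.

```python
# Assumed mapping from entity ID prefix to the repository that defines it.
REPOSITORIES = {
    "Q": ("Wikidata", "http://www.wikidata.org/entity/"),
    "P": ("Wikidata", "http://www.wikidata.org/entity/"),
    "M": ("Wikimedia Commons", "https://commons.wikimedia.org/entity/"),
}


def resolve(entity_id):
    """Return (repository name, concept URI) for an entity ID such as 'Q146'."""
    prefix, number = entity_id[:1], entity_id[1:]
    if prefix not in REPOSITORIES or not (number.isascii() and number.isdigit()):
        raise ValueError(f"Unrecognised entity ID: {entity_id!r}")
    repository, base_uri = REPOSITORIES[prefix]
    return repository, base_uri + entity_id


# A statement stored on Commons: the subject is a local MediaInfo entity,
# while the property and the value are defined on Wikidata.
statement = {"subject": "M12345", "property": "P180", "value": "Q146"}

for role, entity_id in statement.items():
    repository, uri = resolve(entity_id)
    print(f"{role}: {entity_id} is defined on {repository} ({uri})")
```

Only the subject lives on Commons in this example; the property and its value are looked up in the other repository, which is the point of federation.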
This graphic shows roughly how federation works (red arrows), and which information will live on Wikimedia Commons or will be retrieved (via federation) from Wikidata.
- July 2017: Test Federation Debuts: announcement that Wikidata's engineers have developed a test “federation,” which allows users to test a very basic preview version of Structured Data on Commons. The test lets Wikidata’s items and properties describe media files on Commons.
- Announcement on Commons: https://commons.wikimedia.org/wiki/Commons_talk:Structured_data/Archive_2017#New_step_towards_structured_data_for_Commons_is_now_available:_federation
- Announcement on the Commons-l mailing list: https://www.mail-archive.com/commons-l@lists.wikimedia.org/msg03559.html.
- January 2017: Building of Test Federation Begins: Wikimedia Deutschland begins the work to develop a test “federation,” which would allow users to test a basic version of Structured Data on Commons. The federation will let Wikidata’s items and properties describe media files on Commons. See https://phabricator.wikimedia.org/T156114
Computer-aided tagging
The computer-aided tagging tool is a feature developed by the Structured Data on Commons team to assist community members in identifying and labeling depicts statements for Commons files. It is implemented via the MachineVision extension to MediaWiki. See Commons:Structured data/Computer-aided tagging for more information.
Multi-Content Revisions
So-called multi-content revisions form an important building block for structured data on Wikimedia Commons (and on other Wikimedia projects). Multi-content revisions are groundwork that makes information in MediaWiki wikis technically more straightforward to organize. Current wikitext pages can be split into separate documents (slots) with different functionality (such as infoboxes, categories, template documentation); these slots are then integrated into one page, sharing page-level functionality and a single history. Specifically for structured data on Wikimedia Commons, multi-content revisions make it possible to store a structured data entity (an item, a property, a MediaInfo entity) and wikitext in the same page. Structured Commons is a major use case for multi-content revisions.
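A minimal sketch of the slot idea, assuming a file page revision that carries wikitext in a `main` slot and the MediaInfo entity as JSON in a `mediainfo` slot. The class names, field names, and content model strings are illustrative and do not mirror MediaWiki's internal PHP classes; the example content is made up.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Slot:
    role: str           # assumed role names: "main", "mediainfo"
    content_model: str  # assumed model names: "wikitext", "wikibase-mediainfo"
    content: str


@dataclass
class Revision:
    rev_id: int
    slots: dict = field(default_factory=dict)

    def with_slot(self, slot, new_rev_id):
        """Return a new revision with one slot replaced; other slots are inherited."""
        updated = dict(self.slots)
        updated[slot.role] = slot
        return Revision(new_rev_id, updated)


# Revision 1 holds only wikitext.
rev1 = Revision(1, {"main": Slot("main", "wikitext", "== {{int:filedesc}} ==")})

# Revision 2 adds structured data without touching the wikitext slot.
caption = {"labels": {"en": {"language": "en", "value": "A cat on a sofa"}}}
rev2 = rev1.with_slot(
    Slot("mediainfo", "wikibase-mediainfo", json.dumps(caption)), new_rev_id=2
)

print(sorted(rev2.slots))
print(rev2.slots["main"] is rev1.slots["main"])
```

Both revisions belong to the same page and the same history; editing the structured data yields a new revision that reuses the unchanged wikitext slot.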
Daniel Kinzler explains Multi-Content Revisions at Wikimedia Developer Summit 2017
Presentation explaining Multi-Content Revisions
- Autumn 2017: Getting Multi-Content Revisions sufficiently ready, so that the Multimedia and Search Platform teams can start using it to test and prototype things.
- August 2017: Tasks related to Multi-Content Revisions were cleaned up and streamlined in phab:T174022
- January 2017: At the 2017 Wikimedia Developer Summit, engineers from the Wikimedia Foundation and Wikimedia Deutschland discussed the implementation of Multi-Content Revisions, one of the engineering cornerstones of the first phase of Structured Data on Commons. https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2017/Multi-Content-Revisions
- August 2016: Developers from Wikimedia Deutschland initiated a Request for Comments about Multi-Content Revisions https://www.mediawiki.org/wiki/Requests_for_comment/Multi-Content_Revisions
First feature to release: multilingual captions
- July-September (?) 2018: the first feature of Structured Commons will be released: multilingual, translatable file captions.
Metrics
How will we measure the effectiveness of new functionalities on Wikimedia Commons? To do this, we need to establish relevant, measurable criteria and a (2017) baseline against which we can compare in the future.
- October-December 2017: metrics and a metrics baseline for Commons are defined.
Research
Research of Commons use by community members
- Upcoming: interviews with Commons contributors (phab:T175185)
- December 2016: Qualitative design research of heavy Commons users by Jan Dittrich (WMDE).
Presentation with insights about the way in which heavy Commons users typically work
- October 2017: GLAM Users Invited to Take Survey: Jonathan Morgan adds an announcement to the main Structured Data on Commons page, asking GLAM users to take a 15-minute survey on how they upload images to Commons. See https://commons.wikimedia.org/wiki/Commons:Structured_data and https://wikimedia.qualtrics.com/jfe/form/SV_7WDA2RZvPDuaV7f.
- July 5, 2017: Design Interviews Begin with GLAM Institutions: Senior Design Researcher Jonathan Morgan begins researching GLAM institutions’ batch upload workflows, and conducting interviews with Commons contributors at museums, universities, and other institutions. The interviews will take place over the next four months. See https://meta.wikimedia.org/wiki/Research:Supporting_Commons_contribution_by_GLAM_institutions
Past development
Earlier research and development that took place in 2014 is documented at Commons:Structured data/Archive/2014/Development.
Metadata cleanup drive (2014)
In 2014 and early 2015, a large metadata cleanup campaign took place across MediaWiki wikis, in order to prepare as many files as possible for conversion to machine-readable data.
Screenshot of metadata cleanup dashboard for en.wikivoyage
Evolution of metadata cleanup on English Wikivoyage