Commons:Bots/Requests/BotAdventures
Operator: Ghouston
Bot's tasks for which permission is being sought: Adds files to a category in Category:Photographs by camera manufacturer based on Exif data. The intention is not to process millions of files, but to populate categories that don't have many entries.
Automatic or manually assisted: automatic
Edit type (e.g. Continuous, daily, one time run): Intermittent. Will probably select files from particular users or categories.
Maximum edit rate (e.g. edits per minute): 30-60, with Maxlag=5
Bot flag requested: (Y/N): Y
Programming language(s): Go, using go-mwclient.
--ghouston (talk) 11:59, 15 January 2015 (UTC)
Discussion
- I prefer to postpone this task until Wikidata-like functionality comes to Commons, just to avoid unnecessary category-to-property migrations. --EugeneZelenko (talk) 15:17, 15 January 2015 (UTC)
- The proposal at Commons:Wikidata? This seems like a great idea, but it doesn't yet cover replacing categories. That would require a new user interface to allow selecting images by property and intersecting matches with the category system, and it may take a long time before it happens. --ghouston (talk) 23:38, 15 January 2015 (UTC)
- EugeneZelenko, do you know if there is a public discussion going on anywhere with ideas for Wikidata-like functionality on Commons? If there is, I would be interested to read it. --MichaelMaggs (talk) 08:19, 25 January 2015 (UTC)
- I linked the wrong page somehow. It looks like Commons:Structured_data is the current version of the project. The timescale is "next months and years", but if taken to its logical conclusion a large number of categories could be replaced by properties (anything with a date or file format for example). --ghouston (talk) 22:46, 25 January 2015 (UTC)
- Well, the logical conclusion would be to replace all categories by properties, but presumably it will start with the easier cases. --ghouston (talk) 04:37, 26 January 2015 (UTC)
- Support I don't see the harm in outrunning wikidata. It would be easy to remove them later with Cat-a-lot anyway. --99of9 (talk) 04:59, 5 February 2015 (UTC)
- Where it may be useful is in the development of a translation table. It would be possible to copy the two Exif fields (camera manufacturer and model) to properties, but it would probably be more useful to translate them to a single property that represents the camera model uniquely. There may be cases where the model values aren't unique among manufacturers: File:Big_crane_(3402145416).jpg, for example, has just "CX1". There are also cases where the model strings are just ugly, such as "<Digimax S500 / Kenox S500 / Digimax Cyber 530>" in File:Abejorro_02.jpg. One idea would be to store a translation table in a wiki page somewhere, mapping each manufacturer/model combination to a single string ("Ricoh CX1" and "Samsung S500" in these cases). Initially this bot would use the table, but it would later be available for a bot that sets properties. --ghouston (talk) 07:18, 14 February 2015 (UTC)
- I've made a test run on some of my own files. The structured data project is on hold for now, so it doesn't seem like it will be taking over this task any time soon. --ghouston (talk) 04:45, 8 March 2015 (UTC)
- and a few more because I just discovered list=logevents. --ghouston (talk) 05:27, 8 March 2015 (UTC)
- Support Wikidata will probably do this work better eventually, but in the meantime such categories may be useful. --Pere prlpz (talk) 20:16, 10 April 2015 (UTC)
- @Ghouston: As you claim to not process millions of files, how are the files selected that are eligible for processing? --Krd 18:25, 5 June 2015 (UTC)
- I've only run it manually on selected user uploads or categories. It could be done more systematically, perhaps using API:Allimages, which would be a good way to find recent camera models. Editing millions of files can be avoided by not adding to categories that already have more than, say, 500 members. --ghouston (talk) 00:57, 6 June 2015 (UTC)
- Perhaps 100 in a category would be enough. The biggest proliferation of camera models now occurs in mobile phones, where manufacturers seem to spam new models constantly. --ghouston (talk) 01:03, 6 June 2015 (UTC)
- Just for understanding: In which way is a category helpful that consists of semi-randomly selected images, a small percentage out of those that would actually fit into the category? --Krd 18:49, 6 June 2015 (UTC)
- It gives a sample of images taken in real-world conditions from the particular camera. It would be more useful if all images were categorised, but the potential Wikidata / structured-data change seems to have made this undesirable. --ghouston (talk) 00:38, 7 June 2015 (UTC)
- Sounds reasonable to me. Support --Krd 06:56, 7 June 2015 (UTC)
- It sounds reasonable to me, too (Support), but I still don't see the advantage of not categorising all images. --Pere prlpz (talk) 22:42, 7 June 2015 (UTC)
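The translation table idea discussed above (mapping an Exif manufacturer/model pair to a single camera name) could be sketched in Go roughly as follows. This is only an illustration, not the bot's actual code: the type `exifKey`, the functions `key` and `canonicalName`, and the Exif manufacturer strings "RICOH" and "Samsung Techwin" are all assumptions made for the example. Only the model strings and the target names "Ricoh CX1" and "Samsung S500" come from the discussion. In practice the table would be loaded from a wiki page rather than hard-coded.

```go
package main

import (
	"fmt"
	"strings"
)

// exifKey is a normalised (manufacturer, model) pair taken from Exif data.
// The type and its field names are hypothetical.
type exifKey struct {
	Maker string
	Model string
}

// key normalises the raw Exif strings so that differences in case or
// surrounding whitespace in the manufacturer field do not cause a miss.
func key(maker, model string) exifKey {
	return exifKey{
		Maker: strings.ToLower(strings.TrimSpace(maker)),
		Model: strings.TrimSpace(model),
	}
}

// cameraNames is an illustrative translation table. The manufacturer
// strings are assumed values, not taken from the files mentioned above.
var cameraNames = map[exifKey]string{
	key("RICOH", "CX1"): "Ricoh CX1",
	key("Samsung Techwin", "<Digimax S500 / Kenox S500 / Digimax Cyber 530>"): "Samsung S500",
}

// canonicalName returns the single camera name for a manufacturer/model
// pair, and false if the pair is not in the table.
func canonicalName(maker, model string) (string, bool) {
	name, ok := cameraNames[key(maker, model)]
	return name, ok
}

func main() {
	if name, ok := canonicalName(" ricoh ", "CX1"); ok {
		fmt.Println(name)
	}
	if _, ok := canonicalName("Unknown", "CX1"); !ok {
		fmt.Println("no mapping; file would be skipped")
	}
}
```

Keying on both fields means a bare model such as "CX1" is only matched together with its manufacturer, which addresses the concern about model values that are not unique across manufacturers.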
As there are no objections, I'm going to approve this request. Bot flag granted. --Krd 18:47, 14 June 2015 (UTC)
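For reference, the selection rule described in the discussion (do not add files to a category that already has enough members, with 100 suggested as a limit) might look something like the Go sketch below. The names `candidate` and `selectEdits` are hypothetical, as are the category titles. The `sizes` map is assumed to have been filled elsewhere with current member counts, for example from the MediaWiki API. No go-mwclient calls are shown.

```go
package main

import "fmt"

// candidate pairs a file with the camera category it would be added to.
// The type is hypothetical.
type candidate struct {
	File     string
	Category string
}

// selectEdits returns the candidates that may be edited without pushing
// any category past limit members. sizes holds the current member count
// per category; categories missing from sizes are treated as empty.
func selectEdits(cands []candidate, sizes map[string]int, limit int) []candidate {
	// Copy the counts so the caller's map is left unchanged.
	counts := make(map[string]int, len(sizes))
	for cat, n := range sizes {
		counts[cat] = n
	}
	var out []candidate
	for _, c := range cands {
		if counts[c.Category] >= limit {
			continue // category is already well populated
		}
		counts[c.Category]++ // account for the edit about to be made
		out = append(out, c)
	}
	return out
}

func main() {
	sizes := map[string]int{
		"Category:Taken with Ricoh CX1":    99,
		"Category:Taken with Samsung S500": 100,
	}
	cands := []candidate{
		{File: "File:A.jpg", Category: "Category:Taken with Ricoh CX1"},
		{File: "File:B.jpg", Category: "Category:Taken with Ricoh CX1"},
		{File: "File:C.jpg", Category: "Category:Taken with Samsung S500"},
		{File: "File:D.jpg", Category: "Category:Taken with Example X"},
	}
	for _, c := range selectEdits(cands, sizes, 100) {
		fmt.Println(c.File, "->", c.Category)
	}
}
```

Counting planned edits within the run keeps a single batch from overshooting the limit when many candidates share one category.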