Commons:Bots/Requests/Gzen92Bot-6


Operator: Gzen92 (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: File upload from gallica.bnf.fr

Automatic or manually assisted: Automatic.

Edit type (e.g. Continuous, daily, one time run): Occasionally.

Maximum edit rate (e.g. edits per minute): About 20.

Bot flag requested: (Y/N): N.

Programming language(s): PHP

After Commons:Bots/Requests/Gzen92Bot-4, I continued to upload files from Gallica (Category:Bibliothèque nationale de France) without requesting bot authorization, using the API parameter "any fayes" (royalty-free documents; see API). The infobox is formatted the same way.
But there is a problem with categories (see blocking Commons:Village pump#Obtuse bot created categories).
When should a category be created: at 2, 5, or 10 files from the same id?
How should the category be named: after the file name, or only the id?
Since I cannot answer these questions automatically (there are a few million files available at the BnF), I will simply leave the files in Category:Images from Gallica and Category:Files from Gallica needing categories (images). See 10 files uploaded.
Gzen92 (talk) 17:20, 1 November 2024 (UTC)

Discussion
Some topical categorization should be determined for every file uploaded, in addition to merely adding a source category.
Before any new file is uploaded, a cleanup plan for the thousands of categories created earlier (Commons:Village pump#Obtuse bot created categories) needs to be found and implemented (by the uploader or somebody else).
 ∞∞ Enhancing999 (talk) 08:11, 2 November 2024 (UTC)
No problem with doing this, if that is the goal. Gzen92 (talk) 10:12, 2 November 2024 (UTC)
Please outline your cleanup plan for the existing categories.
 ∞∞ Enhancing999 (talk) 11:38, 2 November 2024 (UTC)
I don't think this is the place to discuss this, but it is easy to move the files out of the subcategories of Category:Files from Gallica needing categories (images) and put them directly in Category:Images from Gallica and Category:Files from Gallica needing categories (images).
The categories can then be deleted easily (I don't have the bot permission to do that).
But I still think it is good to group the files, even if the category names could be improved.
Gzen92 (talk) 22:00, 2 November 2024 (UTC)
As a bot operator, I'd expect you to fix problems you created before potentially creating more of them. If you can't present a plan for your past uploads, I don't think this request should be approved. Commons is not a place to dump uncategorized files.
 ∞∞ Enhancing999 (talk) 10:43, 3 November 2024 (UTC)
Topical categorisation is not required, see Commons:Guide_to_batch_uploading#Categories. Also, before a "cleanup plan" has to be presented, clear arguments have to be presented as to why those existing categories are a problem. ~TheImaCow (talk) 22:04, 2 November 2024 (UTC)
Which part are you referring to? Uploaders are required to add categories. Merely adding a user category isn't sufficient.
 ∞∞ Enhancing999 (talk) 10:43, 3 November 2024 (UTC)
Section "Putting it into action" -> "every file you upload should have: a tracking/source category, Your files can have: topic categories". A source category + a {{Check categories}} (substituted with the "category needed" category here) is sufficient. ~TheImaCow (talk) 13:57, 3 November 2024 (UTC)
If we agree that both categories are enough, should we create categories to group files or not? For example this category includes a whole book but the name is long. Gzen92 (talk) 21:44, 3 November 2024 (UTC)
Does anyone have an opinion? With or without category creation? I would like to continue the job! Gzen92 (talk) 07:10, 7 November 2024 (UTC)

I don't understand what exactly is requested here. Please say again. --Krd 14:47, 7 November 2024 (UTC)

I upload files and create a category as soon as there are two files in the same Gallica folder. This creates a lot of categories: Category:Files from Gallica needing categories (images) holds 65,000 files and 7,200 categories, which together contain 139,000 files (19 per category on average).
Following the remarks at Commons:Village pump#Obtuse bot created categories, I could create a category only when a folder has at least 10 files. Gzen92 (talk) 07:43, 8 November 2024 (UTC)
I still cannot follow. Can you give an example? Krd 08:12, 8 November 2024 (UTC)
I create a lot of categories that hold few files (for example Category:Hotel de Roquelaure - Juillet 1906 - photographie - Atget - btv1b10516512m), so there are a lot of subcategories in Category:Files from Gallica needing categories (images). I see two possible solutions and do not know which is better: a minimum number of files before a category is created (10 files?), or no subcategories at all in Category:Files from Gallica needing categories (images). Gzen92 (talk) 08:22, 8 November 2024 (UTC)
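To make the two options concrete, here is a minimal sketch of the threshold rule. It is written in Python for illustration only (the bot itself is in PHP) and is not the bot's actual code. The function name `plan_categories`, the input shape (pairs of file title and Gallica id), and the group category name pattern are all hypothetical; in particular, how a group category should be named is still an open question in this thread.

```python
from collections import defaultdict

# The two source/maintenance categories named in this request.
SOURCE_CATEGORIES = [
    "Images from Gallica",
    "Files from Gallica needing categories (images)",
]


def plan_categories(files, threshold=10):
    """Return a dict mapping each file title to the categories it would get.

    files: iterable of (file_title, gallica_id) pairs (hypothetical input shape).
    threshold: minimum number of files sharing one Gallica id before a
        group category is used; a very large value means "never group".
    """
    # Group file titles by the Gallica id they come from.
    by_id = defaultdict(list)
    for title, gallica_id in files:
        by_id[gallica_id].append(title)

    plan = {}
    for gallica_id, titles in by_id.items():
        if len(titles) >= threshold:
            # Assumption: one group category per id; the name is a placeholder.
            categories = [f"Gallica document {gallica_id}"]
        else:
            # Too few files: leave them directly in the source categories.
            categories = list(SOURCE_CATEGORIES)
        for title in titles:
            plan[title] = categories
    return plan
```

With `threshold=2` this reproduces the current behaviour described above, `threshold=10` is the proposed compromise, and a threshold larger than any folder corresponds to the "no subcategories at all" option.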
I think a bot request is not the right place to decide how to proceed. Perhaps this is better discussed at COM:Village pump. Once there is a decision, we can discuss the best way to clean up, if required. (If I'm mistaken, please advise.) Krd 08:59, 8 November 2024 (UTC)

I am not sure if this is the right forum, but I am wondering how exactly the copyright status of the images is ascertained. For instance, this image has been uploaded as CC0 despite the linked source page containing:

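Note(s) : Toute reproduction doit faire l'objet d'une autorisation préalable de(s) auteur(s), de ses (leur) ayants-droits ou de la société qui les représente [Any reproduction requires prior authorization from the author(s), their rights holders, or the society that represents them]

Reproduction : Numérisation effectuée par l'auteur d'une sélection de photographies argentiques [Digitization carried out by the author of a selection of film photographs]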

which to me sounds as if the BnF does not own the rights to the image in the first place. Could you elaborate? Felix QW (talk) 13:20, 11 November 2024 (UTC)

Hello, the Gallica API has a "public domain" parameter. But you are right: some HTML pages indicate "specific conditions of use". I know how to exclude them in the future. Gzen92 (talk) 16:29, 11 November 2024 (UTC)
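One way such an exclusion could work is sketched below, in Python for illustration only (the bot itself is in PHP); this is not the bot's actual check. The sketch assumes the text of the source page is already available as a string. The function names and the marker list are hypothetical: the first marker is the wording of the note quoted above, while the second is an assumed French rendering of "specific conditions of use" that would need to be verified against real Gallica pages.

```python
import re

# Phrases that should block an upload even if the API flags the item as free.
RESTRICTION_MARKERS = (
    # Wording of the note quoted in this discussion.
    "toute reproduction doit faire l'objet d'une autorisation préalable",
    # Assumed wording for "specific conditions of use"; verify before relying on it.
    "conditions spécifiques d'utilisation",
)


def normalize(text):
    """Lower-case, unify apostrophes, and collapse whitespace for matching."""
    text = text.replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).lower()


def is_safe_to_upload(api_says_public_domain, page_text):
    """Return True only if the API flag is set and no restriction marker appears."""
    if not api_says_public_domain:
        return False
    haystack = normalize(page_text)
    return not any(marker in haystack for marker in RESTRICTION_MARKERS)
```

The design choice here is to treat the page text as a veto over the API flag: a file is skipped whenever either signal suggests a restriction, which errs on the side of not uploading.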