Commons:Village pump/Technical/Archive/2024/07

Below Commons:Copyright rules by subject matter#Toys there is a heading that isn't working, namely ==Trademarks and logos==. Can someone fix it? I wasn't able to. Jonteemil (talk) 15:30, 2 July 2024 (UTC)

✓ Done --Geohakkeri (talk) 15:42, 2 July 2024 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. Thanks. --Enhancing999 (talk) 10:19, 3 July 2024 (UTC)

I don't know why this picture looks like it was damaged. If someone knowledgeable about this can fix it, that would be for the best.--125.230.78.179 10:57, 3 July 2024 (UTC)

Let’s ping User:Sreejithk2000 as an involved admin. --Geohakkeri (talk) 11:13, 3 July 2024 (UTC)
I don't know either, but I reverted it to the old version. Yann (talk) 11:13, 3 July 2024 (UTC)
From the file history, I see that the file was overwritten at one point and I moved the overwritten file to File:HarukiKuramochi20230723-1.jpeg. I don't exactly remember how this image got corrupted though. --Sreejith K (talk) 14:09, 3 July 2024 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. --Enhancing999 (talk) 08:50, 4 July 2024 (UTC)

SVG rendering issue (Côte d'Ivoire location map.svg)

On File:Côte d'Ivoire location map.svg the preview is missing. I can download the original file and it looks fine. When I use the link of a png preview of the file, I get "Error: 500, Internal Server Error". I am not sure this is a server side bug/issue, or if something is wrong with this particular SVG file. Wimmel (talk) 17:33, 9 July 2024 (UTC)

Now I notice a similar issue on File:Lesotho location map.svg, but with this file I also get a local error when I download the original file. So it seems something is wrong with the SVG. --Wimmel (talk) 17:37, 9 July 2024 (UTC)
File:Côte d'Ivoire location map.svg did not specify its dimensions (width/height or viewBox). I added a viewBox and it displays. Glrx (talk) 23:51, 9 July 2024 (UTC)
File:Lesotho location map.svg will not display in my browser because it does not define the sodipodi namespace. Glrx (talk) 23:53, 9 July 2024 (UTC)
added sodipodi namespace and it displays. Glrx (talk) 23:58, 9 July 2024 (UTC)
Thank you! --Wimmel (talk) 16:57, 10 July 2024 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. --Wimmel (talk) 16:57, 10 July 2024 (UTC)
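The first fix in this thread (a root element with neither width/height nor a viewBox) can be checked before uploading. The following is a minimal sketch, not the check that the Commons renderer performs; the function name has_dimensions is made up for illustration, and it only looks at the root svg element.

```python
import xml.etree.ElementTree as ET

def has_dimensions(svg_text: str) -> bool:
    """Return True if the root element declares a viewBox, or both width and height.

    Hypothetical helper: it mirrors the description in the thread
    ("width/height or viewBox"), not any renderer's actual logic.
    """
    root = ET.fromstring(svg_text)
    attrs = root.attrib
    return "viewBox" in attrs or ("width" in attrs and "height" in attrs)

# A root element without any size information, and the same file with a viewBox added.
without_size = '<svg xmlns="http://www.w3.org/2000/svg"><rect width="5" height="5"/></svg>'
with_viewbox = '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10"><rect width="5" height="5"/></svg>'

print(has_dimensions(without_size))
print(has_dimensions(with_viewbox))
```

A file that fails to parse at all (as with the missing sodipodi namespace) raises xml.etree.ElementTree.ParseError here instead of returning a value.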

Quarry / SQL optimization - using DB indexes

I'm trying to optimize a simple SQL query for pages on commonswiki (table page) which start with a certain string (e.g. "SELECT * FROM page WHERE page_title LIKE "Stolperstein_Euskirchen%" ORDER by page_title;"). The SQL Optimizer on Toolforge tells me that this query would use filesort instead of indexes, which seems to be a problem ("Query plan 1.1 is using filesort or a temporary table. This is usually an indication of an inefficient query. If you find your query is slow, try taking advantage of available indexes to avoid filesort."). The DB schema documentation tells me that there should be an index defined (key: page_name_title) on the page_title and page_namespace columns. The MySQL docs tell me that I could use USE INDEX (page_name_title) to enforce using that index. If I add that clause to my query (SELECT * FROM page USE INDEX (page_name_title) WHERE page_title LIKE "Stolperstein_Euskirchen%" ORDER by page_title;), the SQL optimizer complains about a "Query error: Key 'page_name_title' doesn't exist in table 'page'". Obviously I'm doing something wrong, but what's my mistake? Fl.schmitt (talk) 09:32, 10 July 2024 (UTC)

The difference is that the DB schema documentation describes the production sites, whereas the Toolforge replicas that you query through Quarry do not have the same indexes. Also, as far as I know, queries on the replicas run against filtered views, and I am not sure what kind of indexes are practically available there. However, I do not know whether anyone on Commons can answer this. A better place to ask is mw:Talk:Quarry or Phabricator, to see whether Ladsgroup or somebody else from WMF tech can explain how to make the query fast given what the database replicas do internally. The question is interesting and worth asking. --Zache (talk) 17:00, 10 July 2024 (UTC)
@Zache: Thank you for your helpful explanation! I've started a new topic on mw:Talk:Quarry. Fl.schmitt (talk) 18:00, 10 July 2024 (UTC)
The solution was easy: adding a condition on the page_namespace column solved the issue (e.g. SELECT * FROM page WHERE page_title LIKE "Stolperstein_Euskirchen%" AND page_namespace = 6 ORDER by page_title;) - the query runs fast as hell now :-). Fl.schmitt (talk) 18:36, 10 July 2024 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. --Fl.schmitt (talk) 18:36, 10 July 2024 (UTC)
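Why the extra page_namespace condition helps can be illustrated with a small, self-contained sketch. It uses an in-memory SQLite database as a stand-in, not the MariaDB replicas behind Quarry, so planner details differ. The assumptions are labeled in the comments: a composite index with page_namespace as its leading column, a hypothetical page_len column, invented sample rows, and a range predicate that is equivalent to the prefix match.

```python
import sqlite3

def explain(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN, joined into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the page table; page_len is only there as an extra column.
conn.execute(
    "CREATE TABLE page ("
    " page_id INTEGER PRIMARY KEY,"
    " page_namespace INTEGER NOT NULL,"
    " page_title TEXT NOT NULL,"
    " page_len INTEGER NOT NULL)"
)
# Assumption: the index leads with page_namespace, then page_title.
conn.execute("CREATE UNIQUE INDEX page_name_title ON page (page_namespace, page_title)")
conn.executemany(
    "INSERT INTO page (page_namespace, page_title, page_len) VALUES (?, ?, ?)",
    [(6, "Stolperstein_Euskirchen_1.jpg", 10),   # invented sample rows
     (6, "Stolperstein_Euskirchen_2.jpg", 20),
     (6, "Unrelated.jpg", 30),
     (14, "Stolperstein_Euskirchen", 40)],
)

prefix = "Stolperstein_Euskirchen"
# Exclusive upper bound for all titles that start with the prefix.
upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)

title_only = ("SELECT * FROM page WHERE page_title >= ? AND page_title < ? "
              "ORDER BY page_title")
with_namespace = ("SELECT * FROM page WHERE page_namespace = ? "
                  "AND page_title >= ? AND page_title < ? ORDER BY page_title")

plan_title_only = explain(conn, title_only, (prefix, upper))
plan_with_namespace = explain(conn, with_namespace, (6, prefix, upper))
file_titles = [row[2] for row in conn.execute(with_namespace, (6, prefix, upper))]

print(plan_title_only)       # no index search possible on the title alone
print(plan_with_namespace)   # index search on (page_namespace, page_title)
```

The general point is that a composite index can only be searched when its leading column is constrained. A condition on page_title alone skips that leading column, so the engine falls back to scanning and sorting.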

Cleaning up list of followed pages / How to limit an extended watchlist?

Hi, is there a way to do a mass cleanup of the list of followed pages? My list is so long that it frequently refuses to load (it times out). I want to keep a much smaller collection of files on the list. How can I mass remove selections of files, like the complete contents of some categories and their subcategories? What's the best practice to do this? Or even only follow what happens to my personal uploads. Peli (talk) 12:20, 2 July 2024 (UTC)

Have you reviewed Special:Preferences#mw-prefsection-watchlist, so that you don't keep adding much more to your list? RZuo (talk) 13:59, 2 July 2024 (UTC)
@Pelikana: You can get the raw contents of your watchlist from Special:EditWatchlist/raw, and you can then edit it either in your browser or in the text editor of your choice. I can't immediately think of a way to winnow it by category, but it looks like it's sorted by when items were added so it might be that the items you want to remove are conveniently close together. --bjh21 (talk) 23:17, 2 July 2024 (UTC)
Thanks both, Now I have limited the new additions to my list by following the first suggestion. But for suggestion #2: any time I try to edit my watchlist I receive some kind of fatal error like this "Wikimedia\Rdbms\DBQueryDisconnectedError". So I never was able to actually edit something of it. Peli (talk) 13:58, 7 July 2024 (UTC)
I got the raw list now and worked on it and want to know some details. Can it ignore lines with typos or does that give an error? And how to get the cleaned up list back in place, since the old list is so long that it times out in the replacing process with cut and paste. Peli (talk) 22:43, 9 July 2024 (UTC)
SOLVED. Clearing the list with the red button and then pasting the shortened one seems to have worked well. The manual cleanup was very tedious and time-consuming though, even in Notepad++. But I have now found the options to add and remove pages with buttons on the watchlist itself, great. Thanks. Peli (talk) 16:41, 14 July 2024 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. Peli (talk) 10:54, 25 July 2024 (UTC)
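The manual trimming described above could also be scripted on the text copied from Special:EditWatchlist/raw. This is a rough sketch under the assumption that the raw list holds one page title per line; the function name filter_watchlist and the sample titles are invented for illustration.

```python
def filter_watchlist(raw: str, drop_prefixes=(), drop_titles=()) -> str:
    """Return the raw watchlist text without blank lines and unwanted titles.

    drop_prefixes: titles starting with any of these strings are removed.
    drop_titles:   exact titles to remove.
    """
    drop_titles = set(drop_titles)
    prefixes = tuple(drop_prefixes)
    kept = []
    for line in raw.splitlines():
        title = line.strip()
        if not title:
            continue  # skip empty lines
        if title in drop_titles or title.startswith(prefixes):
            continue  # unwanted entry
        kept.append(title)
    return "\n".join(kept)

# Invented sample input, one title per line.
raw_list = "File:Example A.jpg\n\nFile:Example B.jpg\nCategory:Old project\nCommons:Village pump\n"
cleaned = filter_watchlist(raw_list, drop_prefixes=["Category:"], drop_titles=["File:Example B.jpg"])
print(cleaned)
```

The result can then be pasted back into the raw editor after clearing the old list, as described in the thread.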

Tech News: 2024-27

MediaWiki message delivery 23:56, 1 July 2024 (UTC)

SVG rendering on election maps

I just uploaded a series of new maps for Icelandic parliamentary elections. I am seeing that despite the files being very similar, there are some inconsistencies with rendering of certain text elements. The circles should have abbreviations of the district names, these only appear in the 2021 map. In front of the party names there are boxes with the letters used to identify the parties, these sometimes don't show up. I have no idea why this happens. The font used is DejaVu sans which should work fine with Wikimedia. Bjarki S (talk) 09:41, 4 July 2024 (UTC)

I have identified the problem. For whatever reason, Inkscape decided to leave the coordinates (x and y) of the missing elements as 0 in the affected tspan elements. I'm fixing this manually in the XML editor. Bjarki S (talk) 10:16, 4 July 2024 (UTC)
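The tspan elements left at the origin could be located with a short script rather than by eye. This is only a sketch of that idea, assuming the affected elements carry both x="0" and y="0"; the helper name tspans_at_origin and the sample SVG are invented.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

def _is_zero(value):
    """True if the attribute is a single number equal to 0 (lists or units are ignored)."""
    try:
        return value is not None and float(value) == 0.0
    except ValueError:
        return False

def tspans_at_origin(svg_text):
    """Return (id, text) pairs for tspan elements whose x and y are both 0."""
    root = ET.fromstring(svg_text)
    hits = []
    for tspan in root.iter(SVG_NS + "tspan"):
        if _is_zero(tspan.get("x")) and _is_zero(tspan.get("y")):
            hits.append((tspan.get("id"), (tspan.text or "").strip()))
    return hits

# Invented sample: the first label was left at 0,0 and the second is positioned correctly.
sample = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<text x="10" y="20">'
    '<tspan id="t1" x="0" y="0">RN</tspan>'
    '<tspan id="t2" x="10" y="20">RS</tspan>'
    '</text></svg>'
)
print(tspans_at_origin(sample))
```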

Occupation "greek-catholic priest" instead of "politician" in Wikidata Infobox

Is it just me or is the Wikidata Infobox at Category:Iriana saying that Iriana's occupation is "greek-catholic priest" instead of "politician"? I checked the Wikidata entry on her and on "politician" and it says correctly "politician". Where is the "greek-catholic priest" coming from? (note: I'm accessing the page via mobile browser. I've checked mobile view and desktop view on mobile browser but the infobox display is the same.) Nakonana (talk) 20:12, 4 July 2024 (UTC)

Just checked infoboxes of other politicians on Commons and they all list "greek-catholic priest" as occupation instead of "politician". Nakonana (talk) 20:14, 4 July 2024 (UTC)
Maybe mention it at Template talk:Wikidata Infobox. Seems to come from [8]. Enhancing999 (talk) 08:12, 5 July 2024 (UTC)
As this is already reverted purging the page to clean the cache should solve this. GPSLeo (talk) 08:32, 5 July 2024 (UTC)
Would you kindly do so? Enhancing999 (talk) 08:33, 5 July 2024 (UTC)
A manual purge of Category:Iriana does not seem to do the trick. Nakonana (talk) 15:56, 5 July 2024 (UTC)
Up to 147559 category pages are concerned: [9], but it seems to be better now. Enhancing999 (talk) 05:58, 6 July 2024 (UTC)
Looks like it got fixed now. Nakonana (talk) 11:42, 6 July 2024 (UTC)
This section was archived on a request by: —⁠andrybak (talk) 09:53, 4 August 2024 (UTC)

Automatic categorization of subtitles needs to be renamed

If a video (e.g. File:1952. Аленький цветочек.webm) has Slovene subtitles (e.g. TimedText:1952._Аленький_цветочек.webm.sl.srt), then it is categorized in Category:Files with closed captioning in Slovenian, but the main category (and English Wikipedia article, for what it's worth) are called "Slovene", not "Slovenian", cf. Category:Slovene language. —Justin (koavf)TCM 05:50, 8 July 2024 (UTC)

✓ Done Special:Diff/894319705 --Geohakkeri (talk) 06:21, 8 July 2024 (UTC)
hvala. —Justin (koavf)TCM 16:22, 8 July 2024 (UTC)
This section was archived on a request by: --—⁠andrybak (talk) 09:10, 4 August 2024 (UTC)

Query to find dates of DR items

Hi, the first list of files in the DR Commons:Deletion requests/Professional wrestling magazines has copyright issues depending on the date; if published after ~23 October 1987 then they are likely to be deleted. Many of the files only have year in the filename (and some are missing year in the filename), but most seem to have a specific date on the file itself - e.g. File:Ric Flair, circa Spring 1987 (cropped).jpg has 1987 in filename but a date of 1 March 1987 on the file. Is it possible for someone to run a query or join to bring the date from each file into the DR, so that the closer can identify which fall before 23 October 1987 and are eligible for deletion (depending on the DR decision)? Consigned (talk) 13:06, 17 July 2024 (UTC)

✓ Done, thank you Geohakkeri! Consigned (talk) 16:52, 17 July 2024 (UTC)
This section was archived on a request by: --—⁠andrybak (talk) 09:10, 4 August 2024 (UTC)

Characters Not Entering Properly

I have been having a strange error where certain characters will not enter properly when editing Wikimedia Commons. For example, typing two left brackets ("[[") converts both of them into a "ʽ". The same happens in reverse for two right brackets. However, it only happens when typing them sequentially. In other words, if I type one, then move the caret to the left and type the second one they remain brackets. Similarly, copy-pasting them from somewhere else also doesn't cause any issues. In another case, typing an asterisk ("*") results in what is apparently a diaeresis (It won't reproduce ). This only happens on Wikimedia Commons and not Wikipedia or any computer program. However, it does occur on both my userpage and this post. Any idea what is causing this and how to fix it? –Noha307 (talk) 17:33, 22 July 2024 (UTC)

@Noha307: Focus the Commons search bar. If you see a little keyboard icon as depicted, click it and select “Disable input tools.” --Geohakkeri (talk) 19:13, 22 July 2024 (UTC)
Hey, that fixed it! Thank you! Noha307 (talk) 21:53, 22 July 2024 (UTC)
This section was archived on a request by: —⁠andrybak (talk) 09:53, 4 August 2024 (UTC)

It looks like something went wrong in Nasrumikailkabira. This is technically a gallery page, but should be the talk page of User:Nasrumikailkabira. Can that please be fixed? JopkeB (talk) 05:31, 31 July 2024 (UTC)

✓ Done fixed. Also move protected the user talk page to autopatrollers to prevent this from happening again and gave warning. —Matrix(!) {user - talk? - uselesscontributions} 05:36, 31 July 2024 (UTC)
This section was archived on a request by: —⁠andrybak (talk) 09:53, 4 August 2024 (UTC)

It appears that the most recent version of this file (which, according to the talk page is a 4K restored version of the film) was not uploaded properly and cannot be played: "No compatible source was found for this media." Can someone please fix this? Johnj1995 (talk) 03:16, 4 July 2024 (UTC)

@Johnj1995: Hmm, the raw webm file seems to work fine, but it won't play in the Media player. I would suggest filing a Phabricator ticket about it. You may need to revert it to the previous version for now. Nosferattus (talk) 22:32, 7 July 2024 (UTC)
@Nosferattus: Per the uploader's comment on a featured media nomination for another film that cannot be played, the error is related to this Phabricator ticket: https://phabricator.wikimedia.org/T357215 Johnj1995 (talk) 03:27, 8 July 2024 (UTC)

Annotations not showing

It seems I was able to add image notes in the past here but now I am unable to see them or the add note button - File:Coleman_Bangalore_entomologists.jpg - any way to turn on the annotation button which shows up on other images? Shyamal L. 11:10, 6 July 2024 (UTC)

Hi Shyamal, I am not sure if your problem is related but Fix the Image Annotator may be relevant. Commander Keane (talk) 05:59, 8 July 2024 (UTC)
@Commander Keane: Added my support. Jeez, never knew we could be that helpless in the open source world. Shyamal L. 06:02, 8 July 2024 (UTC)
@Shyamal: I think voting has closed for that RfC. I support a technical needs survey that is always open to suggestions and voting on Commons though. Commander Keane (talk) 06:09, 8 July 2024 (UTC)

Tech News: 2024-28

MediaWiki message delivery 21:28, 8 July 2024 (UTC)

Help needed from admins speaking javascript

I am working on a backlog of {{Edit request}}s. I can handle most file, template and Lua requests but I do not speak javascript. Can an admin help with requests at Category:Commons_protected_edit_requests_for_interface_administrators? Jarekt (talk) 17:23, 9 July 2024 (UTC)

A gadget to mute audio of a video with one click

Is there any gadget/tool/proposal for such a button on pages for videos that have audio?

I think many files in Category:Videos featuring unidentified music need their audio muted and one example case of a video that (as far as I can see) needs to be muted is File:Beijing to Shanghai by train timelapse.webm.

It would be very cumbersome if one first needs to download a large video, modify it somehow (which most users can't readily, don't bother doing, or would take them long), and then reupload as a new version before tagging the page with {{Overwritten revdel}} which probably even most active users don't know about (and adding Category:Videos without audio).

Instead, it should be just a click that makes the server run some ffmpeg command to remove the audio or similar. I don't know if this has been proposed somewhere if it doesn't yet exist. Prototyperspective (talk) 22:14, 12 July 2024 (UTC)

you imported the example video.
if you were not sure that the music is free, then you should have imported only the video using v2c! RZuo (talk) 05:43, 13 July 2024 (UTC)
Yes, I noticed it only afterwards and this made me wonder about such a button; your comment is not helpful. Prototyperspective (talk) 10:22, 13 July 2024 (UTC)
But why do you trust that the copyright statement at the source is correct for the video but not for the audio? GPSLeo (talk) 12:26, 13 July 2024 (UTC)
Because it was self-recorded by the youtuber who set this license? Also not helpful and offtopic. Prototyperspective (talk) 12:28, 13 July 2024 (UTC)
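The server-side step proposed at the top of this thread ("some ffmpeg command to remove the audio") could look roughly like the sketch below. It only builds the argument list; nothing here reflects an existing Commons gadget, and the function name and file names are placeholders. The flags used are standard ffmpeg options: -an drops all audio streams and -c:v copy keeps the video stream without re-encoding.

```python
def mute_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg argument list that copies the video stream and drops the audio."""
    return [
        "ffmpeg",
        "-i", src,       # input file
        "-c:v", "copy",  # copy the video stream as-is (no re-encoding)
        "-an",           # disable audio in the output
        dst,
    ]

# Placeholder file names for illustration.
cmd = mute_command("input.webm", "input_muted.webm")
print(" ".join(cmd))
```

To actually run it locally, the list can be passed to subprocess.run(cmd, check=True), provided ffmpeg is installed.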

New technical problem with generation of SVG preview images

The preview images of File:MitigationOptions costs potentials IPCCAR6WGIII rotated-de.svg are broken. They used to be rendered and shown correctly. Since the graphic hasn't changed since March 2023, it appears something with the SVG renderer is broken. Does anyone know what happened? --DeWikiMan (talk) 14:56, 14 July 2024 (UTC)

Possibly the use of fill:currentColor and stroke:currentColor. WMF supports SVG 1.1. File claims to be SVG 1.1 (which uses a subset of CSS 2), but currentColor is from CSS 3. The value is supposed to select the current value of the color property. GNUPlot is not emitting SVG 1.1. The WMF renderer was changed (April 2024?) to a version that is a few years behind the latest release. Maybe a more recent version of librsvg (the WMF renderer) supports the property. Glrx (talk) 01:40, 15 July 2024 (UTC)
Thank you Glrx for the suggestion.
I substituted all occurrences of "currentColor" with "black". The SVG 1.1 validator basically says that it is correct now (except for the RDF metadata and Inkscape elements, see validator.nu). I also tried saving it as "plain SVG" from Inkscape. I uploaded both to Commons. Neither helped.
I ran rsvg-convert (version 2.52.5) on it and it gave a "rendering error: InvalidMatrix", whatever that means...
Do you have any further suggestions? I'd really appreciate it.
--DeWikiMan (talk) 17:58, 15 July 2024 (UTC)
Creating "optimized SVG" from Inkscape did the trick. I don't know exactly why. I believe the problem could be related to this librsvg problem [10]. Probably one of the transform matrices was not invertible. In such a case, the librsvg version now used on Commons possibly no longer ignores the transform but fails and stops rendering.
--DeWikiMan (talk) 19:16, 15 July 2024 (UTC)
@DeWikiMan: Looks like you found the answer 30 minutes later. Glrx (talk) 19:21, 15 July 2024 (UTC)
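The suspicion voiced above, a transform matrix that is not invertible, can be tested directly. In SVG, matrix(a b c d e f) is invertible only if a*d - b*c is non-zero. The sketch below scans a file's text for matrix() transforms with a zero determinant. It is a rough check under stated assumptions: it ignores scale(), rotate() and nested transforms, and the helper name singular_matrices is invented.

```python
import re

MATRIX_RE = re.compile(r"matrix\(([^)]*)\)")

def singular_matrices(svg_text, eps=1e-12):
    """Return every matrix(...) whose 2x2 linear part has a determinant of (nearly) zero."""
    found = []
    for match in MATRIX_RE.finditer(svg_text):
        tokens = [t for t in re.split(r"[\s,]+", match.group(1).strip()) if t]
        try:
            numbers = [float(t) for t in tokens]
        except ValueError:
            continue  # not a plain list of numbers
        if len(numbers) != 6:
            continue  # malformed matrix, not handled here
        a, b, c, d, _e, _f = numbers
        if abs(a * d - b * c) < eps:
            found.append(match.group(0))
    return found

# Invented sample with one regular and two degenerate transforms.
sample = (
    '<g transform="matrix(1,0,0,1,5,5)"/>'
    '<g transform="matrix(0 0 0 0 10 10)"/>'
    '<g transform="matrix(2 4 1 2 0 0)"/>'
)
print(singular_matrices(sample))
```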

Tech News: 2024-29

MediaWiki message delivery 01:28, 16 July 2024 (UTC)

Hosting of free fonts in Commons

Since the following RfC has technical aspects, I thought it would be a good idea to crosspost a link to Commons:Requests for comment/Hosting of free fonts in Commons at the technical village pump. Pardon me if you find this unsuitable or think it is already visible enough. Thanks 😊 −Ebrahimtalk 12:48, 18 July 2024 (UTC)

Good idea, thank you for bringing this up :) --PantheraLeo1359531 😺 (talk) 07:25, 19 July 2024 (UTC)

SVG abruptly not displaying

At some point within the last month File:LGBTCannabis_white.svg stopped displaying, and it's unclear to me why as this file was uploaded in 2020 and hasn't recently been changed other than being added to an additional category. Clicking 'Original file' gives a 'XML Parsing Error: prefix not bound to a namespace' error, while clicking any of the resolution PNG previews gives a 429 error. This error doesn't seem to be something on my end as I asked someone else on a different computer from a different internet connection to take a look and they confirmed that it's broken for them as well. Apologies if this is a known issue that's being worked on or something a-la graphs extension - I don't frequent Commons. Waxworker (talk) 05:08, 19 July 2024 (UTC)

@Waxworker: File:LGBTCannabis_white.svg is not a valid XML file. The Commons SVG renderer librsvg was recently upgraded from 2.44 to 2.50, which uses a stricter XML parser, resulting in this error. The fix here is to add an xmlns:sodipodi namespace declaration or just remove the sodipodi:nodetypes attribute. Other files affected by this:
Dexxor (talk) 09:49, 19 July 2024 (UTC)
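Files with the problem described here (a prefix such as sodipodi used without a matching xmlns declaration) can be spotted with a simple text scan. The following is a heuristic sketch, not the parser librsvg uses; the regular expressions can misfire on unusual attribute values, and the function names are invented. The namespace URI in the example is the one Inkscape normally writes for sodipodi.

```python
import re
import xml.etree.ElementTree as ET

NAME = r"[A-Za-z_][\w.-]*"

def unbound_prefixes(xml_text):
    """Return prefixes used in element or attribute names that have no xmlns:prefix declaration."""
    declared = set(re.findall(r"xmlns:(" + NAME + r")\s*=", xml_text))
    used = set(re.findall(r"</?(" + NAME + r"):", xml_text))                    # element names
    used |= set(re.findall(r"\s(" + NAME + r"):" + NAME + r"\s*=", xml_text))   # attribute names
    return used - declared - {"xml", "xmlns"}

def parses_as_xml(xml_text):
    """True if Python's namespace-aware XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

broken = '<svg xmlns="http://www.w3.org/2000/svg"><path sodipodi:nodetypes="cc" d="M0 0"/></svg>'
fixed = broken.replace(
    "<svg ",
    '<svg xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" ',
)

print(unbound_prefixes(broken), parses_as_xml(broken))
print(unbound_prefixes(fixed), parses_as_xml(fixed))
```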

Tech News: 2024-30

MediaWiki message delivery 00:01, 23 July 2024 (UTC)

Problems with PDF Preview

Hello, I noticed that since a few hours ago the PDF preview function has been buggy. I've been uploading slides for some time today and noticed that I could see neither the thumbnail nor the preview on file pages. I checked with my friends, one who uses the same network as I do and another who works in a different location, and also with my phone on different networks. All of them reported the same problem. Is this a known bug? Or is it a problem with my files? Thank you Hisyam Athaya (WMID) (talk) 09:25, 18 July 2024 (UTC)

It seems to be a known problem. @Sannita (WMF): is there a plan to fix it? Enhancing999 (talk) 17:48, 22 July 2024 (UTC)
The preview worked again, I asked other people who worked a lot with Commons and they confirmed this is a known problem. Hisyam Athaya (WMID) (talk) 02:19, 23 July 2024 (UTC)
It still needs to be fixed. Enhancing999 (talk) 07:18, 23 July 2024 (UTC)
AFAIK there are several tickets on Phabricator on the topic, so it is a known bug. I don't know which team has it, though, and I'm afraid the priority is not high on this. I'll try to investigate this. Sannita (WMF) (talk) 15:34, 23 July 2024 (UTC)

DelReqHandler broken for April requests?

It seems that something broke the DelReqHandler tool on Commons:Deletion requests/2024/04, the usual links for closing requests don't appear there, any idea how to fix? Gestumblindi (talk) 11:18, 14 July 2024 (UTC)

DelReqHandler links appear for requests from April 18 and newer, but not for older April requests. I suppose something around April 18 went wrong? Gestumblindi (talk) 18:59, 15 July 2024 (UTC)
Issue still persists. Gestumblindi (talk) 09:19, 24 July 2024 (UTC)

PD template error: author "I, John Doe"

Hi, I found an irritating error in an early PD template (2007-2008) and assume there are more than 23K instances of its faulty use. check

The template creates a set of three lines that adds an "I" to the author's name in two of them. Obviously derived from "I, John Doe, the copyright holder", it gives the author's name as "I, John Doe". I am not exactly sure where this error lives (I can see that it is on the pages now, inside the PD|self template). I find it very irritating and somewhat disrespectful towards the creators to misspell their names this way by adding a random "I". Does anyone have a good approach to fix these instances and check the template? Thanks, I saw this in the Dutch translation template. Peli (talk) 23:08, 23 July 2024 (UTC)

This is a good candidate for Commons:Bots/Work requests. I went ahead and made a request to fix this issue at here. —CalendulaAsteraceae (talkcontribs) 06:53, 24 July 2024 (UTC)
Thanks, great move. But I'd like to add that the 'list' is just a kind of educated guess, created by a certain search key; I was not able to check the text in all or in a significant number of the real pages. The test was only confirmed by looking at a very small number of pages on the first page of the results. Peli (talk) 07:13, 24 July 2024 (UTC)
That's why I asked for a specific find-and-replace in the bot request. It might miss some pages, but it won't have false positives, and should get a lot of the problematic pages, making it easier to do manual review of the remaining ones. —CalendulaAsteraceae (talkcontribs) 20:59, 24 July 2024 (UTC)

VRT process

I was reading through the VRT process and I am confused. "Before taking permission we have to upload the media", isn't it? And if the author denies or does not reply, what should be done?
–– KEmel49talk,Uploads 18:48, 24 July 2024 (UTC)

I usually ask if the author would grant permission before, then upload the media, and ask the author to send the permission to the respective WMC email address --PantheraLeo1359531 😺 (talk) 08:30, 25 July 2024 (UTC)

COM:Cameroon

Does anyone know why the level-2 section headings "==Not protected==" and "==Public domain and folklore: not free==" aren't being properly displayed in COM:Cameroon#General rules? -- Marchjuly (talk) 09:28, 27 July 2024 (UTC)

I had the same problem: [15] fixed it. Not sure what it actually is though. Enhancing999 (talk) 10:55, 27 July 2024 (UTC)

Notice of licencing template redirection

Hello, per Template talk:GPLv3, GPLv3 will soon be redirected to GPLv3 only. It currently has no transclusions and has been deprecated for two weeks. Considering the potential legal implications, I want to proceed with caution. Are there any tools that still have the GPLv3 template hardcoded inside? —Matrix(!) {user - talk? - uselesscontributions} 06:55, 28 July 2024 (UTC)

Skip people in search results

Any idea how to filter search results for photos that are not of persons? Is it currently possible or what would need to be added to make it possible? Enhancing999 (talk) 11:30, 27 July 2024 (UTC)

Yes: append -deepcategory:"People" or a similar category more specific to your search like "People climbing" (concrete example). It doesn't work with the two examples and with any other categories that don't just have a few subcats. The way to change that is phab code issue: phab:T369808 Prototyperspective (talk) 11:52, 27 July 2024 (UTC)

Thanks, but that assumes that the images already have a people category. Also, I doubt deepcat will ever be changed to include all subcategories of Category:People.

This is similar for other files from that Flickrstream. Enhancing999 (talk) 12:30, 27 July 2024 (UTC)

Yes, it doesn't work another way and they should be in that cat. Are you asking about machine vision filters? That would be more than difficult to add. It is not about that particular cat but how well that search operator works and it doesn't scan the whole cat tree anew, it uses some cached data or could do so if it currently does scan things anew for every search request. This is what the categories are for and the user should not be required to do categorization first, that's another issue. I don't know what the point of your question is, how do you think this could be possible if not as described or similar (such as excluding terms commonly in the file descriptions of images of people)? Prototyperspective (talk) 13:35, 27 July 2024 (UTC)
The point is to find images of buildings and cityscape included in these searches/Flickrstreams (and skip all politicians, I'm not interested in).
There was some AI done on images that added suggestions to every image. One could just skip all those images where the suggestion is people/faces or similar. Enhancing999 (talk) 13:42, 27 July 2024 (UTC)
Basically it is that images with suggestions for containing people (I thought people not faces), need to be located in the People category. For example with a subcategory "Images likely depicting people to check". Prototyperspective (talk) 14:04, 27 July 2024 (UTC)
BTW, your phab ticket seems to be repurposed to add the missing error message on MediaSearch, not to make deepcategory:"People" possible. Enhancing999 (talk) 14:10, 27 July 2024 (UTC)
No, that's a misunderstanding then: it's not about showing the error message also in MediaSearch but getting it deepcat to work reliably always (except for newly-created categories). Also to add to my prior point a subcat like "People by activity" may be more appropriate and the "Images likely depicting people to check" doesn't need to be in the People cat, one could just add a second deepcat search operator phrase. Other than that I don't think there's a feasible way given that not even any other image search engines have such features and WMC is unlikely to be able to be the first to offer machine vision supported image search. Prototyperspective (talk) 16:44, 27 July 2024 (UTC)
My oldish phone can do some of it, so Commons should be able to offer it as well. Search engines for the general public tend to have some other constraints: like always provide the same results and not output anything problematic.
Commons was almost there a while back .. so it shouldn't be too complicated to make it work.
BTW let's be optimistic about your ticket, but in any case, I don't think it would solve my use case. Enhancing999 (talk) 22:56, 27 July 2024 (UTC)
Ok good point, still that is not a public Web or Website search engine. I don't think it would be very useful but maybe I'm wrong or it would be simple to add. I guess you could check if there is a readily available open source package for this that could be used and check if there is a related phab ticket and if not propose it somewhere. Also keep in mind that WMC has far more files than your phone (however maybe that only means the initial scan takes a bit longer). I think generally it suffices to just change the search terms so either some things are excluded or it's more specific to what you're searching for such as searching "animals climbing" instead of just "climbing" or going to the subcategories about e.g. "buildings". Prototyperspective (talk) 10:45, 28 July 2024 (UTC)
Why wouldn't it be useful to be able to search images by what is actually visible?
Your other suggestions implies that the file description text includes that information or was already categorized, but File:Comemoração da Independência do Brasil (48700486098).jpg is somewhat representative in that not being the case. The entire point of the search is to find images and add more detailed categories. Enhancing999 (talk) 10:31, 29 July 2024 (UTC)
I didn't say that, I wrote "very useful" that means that it's about the magnitude/degree.
The other things were just alternatives that don't necessarily always work or work for all files depending on what you intend to do which you didn't specify and can vary.
Adding/integrating machine vision would be useful; see this. Prototyperspective (talk) 11:00, 29 July 2024 (UTC)
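The -deepcategory exclusion suggested earlier in this thread can be put into a search link programmatically. This is a small sketch that only assembles the URL; the function name commons_search_url is invented, and whether the exclusion returns results depends on the category size limits discussed above.

```python
from urllib.parse import parse_qs, urlencode, urlparse

def commons_search_url(terms, exclude_category=None):
    """Build a Commons search URL, optionally excluding a category tree via -deepcategory."""
    query = terms
    if exclude_category:
        query += f' -deepcategory:"{exclude_category}"'
    return "https://commons.wikimedia.org/w/index.php?" + urlencode({"search": query})

url = commons_search_url("climbing", exclude_category="People climbing")
# Decode the query string again to show the search phrase that was sent.
decoded = parse_qs(urlparse(url).query)["search"][0]
print(url)
print(decoded)
```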

Tech News: 2024-31

MediaWiki message delivery 23:07, 29 July 2024 (UTC)

Harvest coord from metadata

Somehow the coordinates of File:Ccmhj.jpg, taken with an iPhone 14 Pro, were not detected by Commons. A bot to check the metadata and fill the coordinates into SDC would be nice. RZuo (talk) 08:57, 6 July 2024 (UTC)

I added it at Commons:Cross-wiki_upload#Other_known_issues_with_cross-wiki_upload. If we can identify which images should be checked, maybe a request at Commons:Bots/Work_requests can be added. Enhancing999 (talk) 09:24, 1 August 2024 (UTC)
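One step such a bot would need is converting the GPS values stored in image metadata into decimal coordinates. EXIF stores latitude and longitude as degrees, minutes and seconds plus a hemisphere reference (N/S, E/W). The sketch below shows only that conversion; how the values are read from the file and written to structured data is left out, and the function name and sample values are invented.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds with a hemisphere reference to decimal degrees.

    ref is "N", "S", "E" or "W"; southern and western values become negative.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value

# Invented sample values.
latitude = dms_to_decimal(52, 30, 0, "N")
longitude = dms_to_decimal(13, 15, 0, "W")
print(latitude, longitude)
```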

User category

Kind regards. I recently created my own user category (Category:Files by User:NoonIcarus). Will the category be populated automatically? Or can the process be automated? Many thanks in advance NoonIcarus (talk) 23:54, 26 July 2024 (UTC)

@NoonIcarus no it's not populated automatically. you need to add the category to your files in some way.
for example, you can add it directly to the file pages.
or, you can try my method. i use a customised "Author's name" for my uploads through uploadwizard, which is a template that includes the category, then all new uploads will transclude the template and hence the category. RZuo (talk) 10:34, 1 August 2024 (UTC)

MediaWiki internal error

Accidentally set the license tag to {{|cc-by-sa-4.0-sikander}} instead of {{cc-by-sa-4.0-sikander}} on File:LCBO strike - Market street - 20240713C.jpg and got this error:
MediaWiki internal error.
Original exception: [a752adf9-f969-4f5e-b251-829dc2d1186e] 2024-07-14 20:57:43: Fatal exception of type "Wikimedia\Rdbms\DBUnexpectedError"
Exception caught inside exception handler.
Set $wgShowExceptionDetails = true; at the bottom of LocalSettings.php to show detailed debugging information.

Should I report this somewhere other than here? Regards // sikander { talk } 🦖 21:03, 14 July 2024 (UTC)

@PantheraLeo1359531: No, not happening now. Got that error a few times when updating the files but after a few minutes it started working fine. // sikander { talk } 🦖 16:54, 15 July 2024 (UTC)
Good, I assume it was only a short temporary error ;) --PantheraLeo1359531 😺 (talk) 18:18, 15 July 2024 (UTC)
This is part of a series of recent outages. See phab:T370491, phab:T370304. To check whether an error you're seeing is widespread, visit https://www.wikimediastatus.net/. —⁠andrybak (talk) 09:17, 4 August 2024 (UTC)

The link on this template showing the copyright notice does not work; perhaps it is outdated. I mean the link in the sentence "The text of permission is available here." The current link on "here" is https://mosreg.ru/about/, which is not accessible. The correct link would be https://mk.mosreg.ru/o-sayte . Could a template expert please correct the link? I am not aware of all the template technicalities. Regards, Ellywa (talk) 08:44, 19 July 2024 (UTC)

Ellywa, make sure you're supplying a parameter |subdomain= (it is an alias for first positional (unnamed) parameter). For example, File:«Свободные знания для Википедии» Спецноминация Премии Губернатора Медиана.jpg uses {{Mosreg.ru|guip}}, which links to https://guip.mosreg.ru/o-sayte. If you need to attribute to https://mk.mosreg.ru/o-sayte, use {{Mosreg.ru|mk}}. —⁠andrybak (talk) 09:27, 4 August 2024 (UTC)
@Andrybak, Thanks, I am not using the template, just noted an error. Is this parameter use listed on the template documentation? Ellywa (talk) 09:37, 4 August 2024 (UTC)
Added documentation for the parameter: Special:Diff/906372326. —⁠andrybak (talk) 09:50, 4 August 2024 (UTC)
@Andrybak thanks a lot! Ellywa (talk) 11:19, 4 August 2024 (UTC)

Orphaned talk pages after format conversion

Consider the following pages:

File:Blue dot 7px.gif (Deleted and redirected after file format conversion) File talk:Blue dot 7px.gif (Orphaned talk page)
File:Blue dot 7px.png (Live page) File talk:Blue dot 7px.png (Non-existent until just now)

I suspect many more orphaned talk pages exist like this one. Until just now, there was no way for an editor looking at File:Blue dot 7px.png to see that there were relevant discussions at File talk:Blue dot 7px.gif. I fixed the problem by moving File talk:Blue dot 7px.gif to File talk:Blue dot 7px.png. I propose that all such pages left over from file format conversions similarly be moved to match the name of the page in the new format. This seems like a bot task, something like this:

For all pages in the File talk namespace:

  1. Skip if the page is a redirect
  2. Skip if {{SUBJECTPAGENAME}} is not a redirect.
  3. Skip if the {{PAGENAME}} doesn't end in ".gif"
  4. Skip if {{SUBJECTPAGENAME}} doesn't redirect to SUBJECTPAGENAME.sub(/\.gif$/, ".png")
  5. Log and skip if a page already exists named PAGENAME.sub(/\.gif$/, ".png")
  6. Move to PAGENAME.sub(/\.gif$/, ".png")

An analogous procedure could be followed for other file format conversions.
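
The six steps above can be sketched as a pure decision function, as a rough illustration only. This is not an existing bot: the function name, its parameters and the return values are all made up for the example, and the wiki lookups (is the page a redirect, where does it point, which titles exist) are assumed to be done by the caller.

```python
import re

# The dot is escaped so that only a literal ".gif" suffix matches.
GIF_SUFFIX = re.compile(r"\.gif$")


def plan_talk_page_move(talk_title, talk_is_redirect,
                        subject_redirect_target, existing_titles):
    """Decide what to do with one page in the File talk namespace.

    talk_title: e.g. "File talk:Blue dot 7px.gif"
    talk_is_redirect: True if the talk page itself is a redirect
    subject_redirect_target: title the file page redirects to,
        or None if the file page is not a redirect
    existing_titles: set of page titles that already exist

    Returns ("move", new_title), ("log", new_title) or ("skip", None).
    """
    if talk_is_redirect:                          # step 1
        return ("skip", None)
    if subject_redirect_target is None:           # step 2
        return ("skip", None)
    if not GIF_SUFFIX.search(talk_title):         # step 3
        return ("skip", None)
    subject_title = talk_title.replace("File talk:", "File:", 1)
    if subject_redirect_target != GIF_SUFFIX.sub(".png", subject_title):
        return ("skip", None)                     # step 4
    new_talk_title = GIF_SUFFIX.sub(".png", talk_title)
    if new_talk_title in existing_titles:         # step 5
        return ("log", new_talk_title)
    return ("move", new_talk_title)               # step 6
```

Keeping the decision separate from the actual page moves would make it possible to dry-run the logic over a list of titles before anything is changed on the wiki.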

Questions:

  1. Is there consensus for these moves? Should a talk page persist through file format conversions and associated renaming? To me, the answer is clearly yes, as I regard these as essentially versions of the same file even if the two files co-existed on Commons at one point. Any discussion on the old file format is highly likely to be relevant to the converted file.
  2. Would someone volunteer to write a script to perform this task?

Daask (talk) 16:08, 29 July 2024 (UTC)

Hi, We should not have redirects from one file extension to another. This is a source for problems. Yann (talk) 20:47, 29 July 2024 (UTC)
@Yann: isn't there "not" missing in your sentence? Enhancing999 (talk) 12:52, 1 August 2024 (UTC)
Yes, thanks. Yann (talk) 13:13, 1 August 2024 (UTC)
@Yann: I'm confused by your comments. If you are suggesting that the redirect page at File:Blue dot 7px.gif should be deleted, then that doesn't seem to address my concern or solve any problems. Further, it would make it much more difficult to fix the problem I was concerned with, as it would make my proposed script no longer work. Daask (talk) 17:18, 3 August 2024 (UTC)
I don't think the redirect at File:Blue dot 7px.gif is good practice either. Maybe it had been when it was created. Enhancing999 (talk) 14:16, 4 August 2024 (UTC)

OCR to auto-categorize maps / charts by year shown

Is there any gadget/tool for optical character recognition (OCR) of files on Wikimedia Commons?

If there is no such thing it would be really great if somebody could give it a try, it could be very useful.


I'd like to categorize Our World in Data maps by the year of the data into Category:Maps of the world by year as well as OWID charts by the latest data point into Category:Charts by year of latest data.

This is useful for many reasons such as making things in the image explicit as metadata, making things queryable (for example combining cats using petscan), statistics, search (see the search box), better enabling people to find the latest version for some data, better WMC search engine results, and (probably most importantly) updating outdated/old datagraphics that are in use (GLAMorgan can be used for that).

The issue is that there are a great many OWID files (which should now all be in the OWID category), and there may be far more once people upload "image stacks" for the OWID Gadget, if that is how more interactive OWID data ends up being displayed (which I oppose as suboptimal).

One could go through the former manually, which has the added advantage that many of these files are missing one or more other categories. The second set, however, has far too many items to handle manually. More OWID datagraphics also keep getting uploaded, and this isn't only about OWID datagraphics (there are other cats one could scan).

See also my related comment here that is about machine vision on WMC more generally or automated species identification: …open letter…#Image recognition software for categorisers.

In my example use case, an OCR Commons tool could, for example, read all numbers in a file (files of the petscan results) via OCR and then (if it found one or a plausible one) set the category for the latest year that is ≤ current year. Prototyperspective (talk) 11:43, 19 July 2024 (UTC)

For Category:Images by text that could be helpful too. Ideally one could choose
  • a word, group of words, or category tree
  • define a maximum number of words or characters that should be on an image (for example, fewer than 5 words). This is to avoid doing OCR on lengthy texts.
Then confirm suggestions made by OCR. Enhancing999 (talk) 12:21, 19 July 2024 (UTC)
SVG file to OCR
I do not know about gadgets.
There is an OCR tool.
See https://ocr.wmcloud.org/ for direct interface and API documentation.
It will work with PNG files but not SVG files (which can be converted to PNG and then OCR'd).
One can get the URL for a PNG rendering of an SVG file. Here's a conversion that is 887 pixels wide
Here's a Polish OCR run on that PNG:
So the Polish text is (converting Unicode code points to Unicode)
  • Typ ściągający
  • Typ naciskający
  • Typ obustronny
But why OCR an SVG file? The PetScan query shows SVG files that have text elements.
With JavaScript, read the SVG file with the Fetch API, grab the text elements with getElementsByTagNameNS(nsSVG, "text"), ask for the .textContent of each text element, and then search that string for the years or terms you want.
I do not know about the rest of the task.
Glrx (talk) 14:57, 19 July 2024 (UTC)
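
The same idea (read the text elements straight from the SVG rather than running OCR on it) can be sketched in Python instead of JavaScript. This is only an illustration: the function names and the sample SVG are made up, and fetching the file from Commons is left out.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def svg_text_strings(svg_source):
    """Return the text content of every <text> element in an SVG document."""
    root = ET.fromstring(svg_source)
    strings = []
    for element in root.iter("{%s}text" % SVG_NS):
        # itertext() also collects text inside nested <tspan> children.
        text = "".join(element.itertext()).strip()
        if text:
            strings.append(text)
    return strings


def years_in(strings):
    """Return all standalone four-digit numbers found in the strings."""
    found = []
    for s in strings:
        found.extend(int(m) for m in re.findall(r"\b(\d{4})\b", s))
    return found


# Hypothetical sample document, not a real Commons file.
sample = """<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 50">
  <text x="5" y="15">Death rate from smoking, <tspan>1996</tspan></text>
  <text x="5" y="40">Typ obustronny</text>
</svg>"""

print(svg_text_strings(sample))
print(years_in(svg_text_strings(sample)))
```

Text that has been converted to paths in the SVG would not be found this way, so OCR on the PNG rendering may still be needed for some files.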
Wow great so around 70% of this already exists! Thanks a lot for this info. Now it basically only needs a way to make it scan files in petscan results.
SVG files always have a PNG file linked beneath them so they don't need to be converted again.
However, SVG files already have the text as plain text in them, so rather than OCRing them it would be better if the text contained in them was read somehow. However, that (which you also described in your bottom paragraph) is not needed here:
I tested it like so with a PNG render underneath File:Death-rate-smoking,1996.svg and it worked very well.
If there was a tool where one can e.g. enter a petscan ID and it makes these requests the other thing needed would be
  1. the small code that checks for the latest plausible year-number (and either in the first few lines / title or not in the same line as Data source)
  2. a bot that adds the categories to the files accordingly.
Is there a developer here who is interested in building these three missing parts assuming they don't also exist already? Prototyperspective (talk) 15:37, 19 July 2024 (UTC)
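
The first missing part (picking the latest plausible year from the OCR output) could look roughly like the sketch below. It is only an illustration of the rule described above: the function name, the lower bound of 1800 and the sample text are assumptions, not part of any existing tool.

```python
import re
from datetime import date


def latest_plausible_year(ocr_text, earliest=1800, current_year=None):
    """Pick the latest year in OCR output that is <= the current year.

    Lines mentioning "Data source" are ignored, following the suggestion
    above. Returns None if no candidate is found. The default lower bound
    is an arbitrary assumption for illustration.
    """
    if current_year is None:
        current_year = date.today().year
    candidates = []
    for line in ocr_text.splitlines():
        if "data source" in line.lower():
            continue
        # Match exactly four digits that are not part of a longer number.
        for match in re.findall(r"(?<!\d)(\d{4})(?!\d)", line):
            year = int(match)
            if earliest <= year <= current_year:
                candidates.append(year)
    return max(candidates) if candidates else None


# Made-up OCR output in the style of an OWID chart.
sample_text = (
    "Death rate from smoking, 1996\n"
    "Rate per 100,000 people\n"
    "Data source: IHME (2019)"
)
print(latest_plausible_year(sample_text, current_year=2024))
```

A stricter variant could restrict the search to the first few lines (the title), as also suggested above; that choice would need testing against real OCR results.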
https://ocr.wmcloud.org/ interesting tool. Quite surprising what OCR on photos actually gives. I tried:
Both found "rue des lauriers", but the first also found a motto and the second found part of a sticker from a key service on the pole ;)
Maybe OCR could be added automatically on upload and stored somehow to be searchable. Possibly, as structured data so it's editable. Enhancing999 (talk) 10:49, 22 July 2024 (UTC)
About SVG: ideally the text would be rendered on the file description page separately. Maybe that's something that can be added through Lua directly on Template:Information. Enhancing999 (talk) 17:46, 22 July 2024 (UTC)
I added a request for that at Template_talk:Information#Output_SVG_text. Enhancing999 (talk) 10:18, 29 July 2024 (UTC)
Prototyperspective, you stated "I'd like to categorize Our World in Data maps by the year of the data into Category:Maps of the world by year". I think that is a great idea for the digital maps of the 21st century and I have done this a lot (manually) for hundreds of OWiD maps. However, I'd like to prevent you from going overboard once you have finished with the OWiD maps: Please do not categorize old maps by year, including old maps of the world. Reprints, republications, entry errors and the natural delay between surveying and publishing the final maps mean that almost all older maps (before ~1990s) should preferably be organized/categorized by decade as the finest granularity. All my best wishes for the OWiD project, --Enyavar (talk) 12:28, 7 August 2024 (UTC)

The XML in the uploaded file could not be parsed

Hello! I wanted to create a map. I got a free base layer in PNG, opened Inkscape and imported the PNG file. After that I added several lines and symbols and saved the result as SVG. When I try to upload the result to Commons, I see "The XML in the uploaded file could not be parsed". One hypothesis is that the problem is the embedded PNG layer, but, as I remember, there are SVG files on Commons that contain raster layers. The file size is 12 MB. Microsoft Edge opens the file normally. What causes the upload error? It is possible to download the file to check it. Perhaps there is some web service that can repair the structure of the document if it is broken? But I'm not sure that the file is broken: it is simple (a raster layer, a few lines and symbols) and is not huge. Dinamik (talk) 09:44, 27 July 2024 (UTC)

We do not allow uploads of SVGs with images inside of them. It is often misused and it creates potential security problems because our file scanners do not work on those embedded images. —TheDJ (talk • contribs) 08:07, 28 July 2024 (UTC)
Has this limitation always existed on Commons? I believe that, for example, the first versions of this file have an embedded base layer. Dinamik (talk) 09:56, 28 July 2024 (UTC)
Probably not, see Category:Fake SVG. Enhancing999 (talk) 10:16, 28 July 2024 (UTC)
Commons has always allowed files to have embedded bitmaps, but those bitmaps must use the data: scheme. Files with external URLs are now blocked from uploading. Furthermore, the Commons rasterizer will not fetch external URLs, so such a base layer would no longer display. All the versions of the St. Petersburg map display, so there would not be an external URL. Glrx (talk) 22:57, 28 July 2024 (UTC)
The file is over 10 MB. At one point, SVG uploads were limited to 10 MB, but I do not believe that is still the case.
The file is mostly an embedded PNG. Following that, there are some path and flowRoot elements. The path elements should be OK, but the flowRoot is not supported. It was described in an SVG 1.2 draft, but that draft was not accepted. The element does not exist in the SVG 2.0 spec.
WMF supports SVG 1.1. Even if you could upload the file, it would not display as you would expect.
I do not see a reason for the XML error. W3's validator finds 67 errors, but they only involve normal Inkscape, sodipodi, and RDF extensions or the bogus flowRoot elements.
Glrx (talk) 23:15, 28 July 2024 (UTC)
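
Two of the points raised above (embedded bitmaps must use the data: scheme, and flowRoot is not supported) can be checked locally before uploading. The sketch below is only an illustration with made-up names and a made-up sample; it is not the check that Commons itself runs.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"


def svg_upload_warnings(svg_source):
    """List features of an SVG that were flagged as problematic above."""
    root = ET.fromstring(svg_source)
    warnings = []
    for image in root.iter("{%s}image" % SVG_NS):
        # Inkscape writes xlink:href; newer files may use a plain href.
        href = image.get("{%s}href" % XLINK_NS) or image.get("href") or ""
        if not href.startswith("data:"):
            warnings.append("image with non-data: href: " + href[:60])
    flow_roots = sum(1 for _ in root.iter("{%s}flowRoot" % SVG_NS))
    if flow_roots:
        warnings.append("%d flowRoot element(s)" % flow_roots)
    return warnings


# Hypothetical sample, not the file discussed in this thread.
sample = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink" viewBox="0 0 10 10">
  <image xlink:href="data:image/png;base64,AAAA" width="10" height="10"/>
  <image xlink:href="https://example.org/base.png" width="10" height="10"/>
  <flowRoot><flowPara>label</flowPara></flowRoot>
</svg>"""

for warning in svg_upload_warnings(sample):
    print(warning)
```

Note that Python's parser does not share the size limits of the parser used on upload, so a file passing this check can still be rejected for other reasons.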
Running rsvg-convert (latest version, 2.58) on that SVG gives an error without the --unlimited option, which is described as "The XML parser has some guards designed to mitigate large CPU or memory consumption in the face of malicious documents. It may also refuse to resolve data: URIs used to embed image data in SVG documents." Dexxor (talk) 07:17, 29 July 2024 (UTC)
Yes, I think the most likely answer is that MediaWiki is not setting LIBXML_PARSEHUGE, so libxml limits the maximum size of text nodes and attributes to 10 (decimal) megabytes. As embedded images are stored as base64 data: URLs, this would limit the maximum size to 10 MB after base64 encoding (in practice about 6.98 MiB raw size). As far as I know we fully allow embedded images in SVG if they are under that limit; however, they are usually not a good idea. If it was important to Commons that these types of files be uploaded, we might be able to add the flag, but I'd prefer to keep the flag off if it isn't really needed. Bawolff (talk) 21:36, 18 August 2024 (UTC)
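
The relation between the 10 MB limit on the encoded text and the raw image size can be estimated with a little arithmetic. This is a rough sketch only: the function and its parameters are made up, the short "data:image/png;base64," prefix is ignored, and whether a given editor wraps the base64 text into lines is an assumption that would need checking.

```python
def max_raw_bytes(encoded_limit=10_000_000, line_length=None, newline_length=1):
    """Estimate the largest raw payload whose base64 text fits the limit.

    encoded_limit: maximum number of characters in the attribute
    line_length: base64 characters per line if the text is wrapped,
        or None for unwrapped output (wrapping is an assumption here)
    newline_length: characters used by each line break
    """
    if line_length:
        # Each wrapped line costs line_length characters plus a line break.
        usable = encoded_limit * line_length // (line_length + newline_length)
    else:
        usable = encoded_limit
    # Base64 turns every 3 raw bytes into 4 characters.
    return usable // 4 * 3


unwrapped = max_raw_bytes()
wrapped = max_raw_bytes(line_length=76)
print(unwrapped, round(unwrapped / 2**20, 2))
print(wrapped, round(wrapped / 2**20, 2))
```

Without wrapping the ceiling works out to 7,500,000 raw bytes; line breaks or other overhead lower it, so the exact figure depends on how the file was written. Either way, a 12 MB file that is mostly one embedded PNG is well over the limit.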