Commons:Requests for comment/improving search
- The following discussion is archived. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Various ideas were raised and discussed. Rd232 (talk) 12:49, 21 March 2013 (UTC)[reply]
An editor had requested comment from other editors for this discussion. The discussion is now closed; please do not modify it.
Recently the issue of how to improve Commons' search (Special:Search) has been raised. This RFC is to further discussion of how to improve search. Rd232 (talk) 18:35, 28 February 2012 (UTC).[reply]
- Some Bugzilla requests: Commons:Bugs#Search
- Links to working and proposed search enhancements are listed at Help:Searching.
- The search engines on the Commons and Wikipedia generally work the same. See en:Help:Searching.
Some prior discussions:
- Commons:Village pump/Archive/2012/02#Increasing rankings in a Commons search?
- Commons:Village pump/Archive/2012/02#Will the community support a simple rename of some graphically sexual images so that "toothbrush" does not have to show a woman masturbating?
- en:User talk:Jimbo Wales/Archive 98#Fox News complains about porn on Commons again...
Some related Bugzilla bugs:
- Bugzilla: 8738 - Improve media (image) search display
- Bugzilla: 13370 - Search images by metadata
- Bugzilla: 21061 - Add uploaded file text and metadata from files to fulltext search set
- Bugzilla: 30595 - Inconsistent search results with language links
- Bugzilla: 35974 - Preference or default setting to open search results and suggestions in new tab
Contents
- 1 Introduction
- 2 Safesearch
- 3 Effective search
- 4 Proposals
- 4.1 Allow a form of incategory: which covers subcategories as well
- 4.2 Ogg tags
- 4.3 Incategory: search from a Category page
- 4.4 Incategory: search from a Category page - bug in "Search categories" action field
- 4.5 Support tags in search
- 4.6 Prioritise category classifications in search rankings
- 4.7 Disambiguation or clustering
- 4.8 Adaptive learning
- 4.9 Include category interwiki links in the search database
- 4.10 Category union and intersection
- 4.11 Display date and time of the last update of the search database
- 4.12 Report properly when search don't work
- 4.13 Advanced search
- 4.14 Add intitle: and prefix: search options to Special:Search
- 4.15 Search tagging and prioritization
- 4.16 A little bit of intelligence
- 4.17 Open search results in a new tab
- 4.18 Search link in sidebar
- 4.19 Advanced search link in sidebar
- 4.20 A category-only search form or category adder on all pages
- 4.21 Open search dropdown links in a new tab
- 4.22 Category search at top of Special:Search
- 4.23 Searches should go to search results list and not to a page
- 4.24 Artificial Intelligence
- 4.25 Cat-a-lot should work from search result lists too
- 4.26 Use the CLDR extension to create multi-language search
- 4.27 Ctrl-click to open search in new tab
- 4.28 Add Google site search option
- 4.29 Ever thought about using social tagging approaches on Commons?
Safesearch
- Note: This has been moved to Commons:Requests for comment/Safesearch
Search (Special:Search) can and should be made more effective. Results should more closely match what users are looking for, and more control given to users over display and filtering of results. How can we achieve this? Rd232 (talk) 18:35, 28 February 2012 (UTC)[reply]
Allow a form of incategory: which covers subcategories as well
It's currently possible to search incategory:X or -incategory:X, where X is a category name, to include or exclude results categorised in Category:X. However, this does not cover subcategories. Covering subcategories would probably be quite difficult technically, but it would be useful. Rd232 (talk) 00:26, 29 February 2012 (UTC)[reply]
- This could be a separate keyword, say incategorysub:. Rd232 (talk) 11:45, 29 February 2012 (UTC)[reply]
- Or simply have a level of how deep you want to go: incategory:X defaulting to incategory:0:X, while incategory:5:X would mean that we want to traverse 5 levels deep, with the possibility of course to have incategorysub:X meaning incategory:-1:X. VolodyA! V Anarhist Beta_M (converse) 12:23, 29 February 2012 (UTC)[reply]
- Or that. The real technical problem comes with unlimited traversal, I think, because sometimes you end up with the category tree looping back on itself, so you need to cover that possibility. Maybe that's not that big a deal to solve, I'm not sure, but it seems like a headache. Rd232 (talk) 13:15, 29 February 2012 (UTC)[reply]
- Wikipedia already detects template loops. I think the improved keyword should be substantially shorter - more like "cat:" or even "c:". And it would be best to have this automatically applied to search results in the sense that the ones matching it should come up before the ones that don't match it, unless you use some other (hypothetical) thing like "t:" to specify a search of the title or "a:" for search of the annotation text (or "e:" for EXIF data...). Wnt (talk) 15:46, 29 February 2012 (UTC)[reply]
Submitted as Bugzilla: 35402 - Provide a form of incategory: which covers subcategories as well. Rd232 (talk) 09:13, 22 March 2012 (UTC)[reply]
I'm very glad that this Bug 35402 - Provide a form of incategory: which covers subcategories as well was filed, well done! Presently the incategory: search doesn't work for me. I have tried to understand how it works, but no success (for example the classic cucumber -incategory:"Human sexuality" or cucumber -incategory:"Behavior"). And there is still the Bug 21102 - don't propose to create the page when using prefix:, intitle: or +incategory:. Frustrating. BTW, some examples for the imprecision to be expected with incategory search were given by Thryduulf on Commons_talk:WikiProject_Erotica/image_level_demo:
- Category: Sex -> Reproductive system -> Reproductive system of Gastropoda -> File:Biomphalaria_tenagophila_reproductive_system.png
- Category: Sex -> Chastity -> Virgin Mary -> File:Church_of_Our_Lady_of_Health_at_dusk.jpg
- Category: Sex -> Sex museums -> File:China_Sex_Museum_Campus1.jpg
Anyway, I really hope the bug reports lead to some improvements! --Atlasowa (talk) 19:39, 24 March 2012 (UTC)[reply]
This has long irked me. It seems that the lack of this sort of function leads to over-categorizing of images. I understand some of the issues with fixing this, but perhaps there would be some solution to the loop problem (i.e., where a subcategory itself links back around to the higher level). For example, let's say there is a category called "Buildings in Boston." That category might have subcategories for "Houses in Boston," "Churches in Boston," "Schools in Boston," and "Hospitals in Boston." In this case, we know that every single entry in one of those subcategories MUST also fit within the higher category. That is, there is no possible image which shows a "House in Boston" that doesn't also show a "Building in Boston." That is not always true. For example, "Frank Lloyd Wright Buildings" might include some images which have the category "Buildings in Boston," but not all of them would. I know this is a difficult suggestion, but if that sort of automatic relationship could be marked linking a category and sub-category, at least those specific sub-categories could be included in the search of higher level categories.--ProfReader (talk) 15:11, 13 August 2023 (UTC)[reply]
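Editorial illustration, not part of the original discussion: the depth-limited form suggested above (incategory:0:X for the category itself, incategory:5:X for five levels, incategory:-1:X for unlimited) together with the loop problem can be sketched as a breadth-first walk with a visited set. This is a minimal sketch under stated assumptions; the `SUBCATS` graph, the "Historic houses in Boston" category and the `expand_category` function are hypothetical and do not describe how MediaWiki stores or traverses categories.

```python
from collections import deque

# Hypothetical toy category graph: category -> direct subcategories.
# The last entry deliberately links back to the top to simulate a loop.
SUBCATS = {
    "Buildings in Boston": ["Houses in Boston", "Churches in Boston"],
    "Houses in Boston": ["Historic houses in Boston"],
    "Historic houses in Boston": ["Buildings in Boston"],
}


def expand_category(root, depth, subcats=SUBCATS):
    """Return root plus its subcategories down to `depth` levels.

    depth=0 means the category itself only; depth=-1 means unlimited.
    The `seen` set guarantees termination even if the graph has loops.
    """
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        category, level = queue.popleft()
        if depth != -1 and level >= depth:
            continue  # depth limit reached, do not descend further
        for child in subcats.get(category, []):
            if child not in seen:
                seen.add(child)
                queue.append((child, level + 1))
    return seen
```

A search backend could then treat the returned set as a list of plain incategory: filters combined with OR. Whether that would perform acceptably on very large category trees is an open question that this sketch does not address.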
Ogg tags
Related to the ability to search by EXIF metadata and GIF comments is the ability to search by the tags in the Ogg container (and Matroska, should it be allowed eventually). As i've pointed out somewhere before, getting these is quite simple; there are many libs which can do the job. I already make sure that all the Ogg files that i upload here have as much useful content in the tags as possible. VolodyA! V Anarhist Beta_M (converse) 05:55, 29 February 2012 (UTC)[reply]
- Agreed. Equivalents to EXIF for other media should be searchable as well. I'll check whether there's a bug for that. Rd232 (talk) 11:43, 29 February 2012 (UTC)[reply]
- Bugzilla: 34793 - Search media by metadata. Rd232 (talk) 12:13, 29 February 2012 (UTC)[reply]
- Some standards (for example en:IPTC Information Interchange Model and supported by Adobe tools) use EXIF data to encapsulate further information. Examples in File:Van Eyck, Lam Gods B STB 187.jpg, File:Alchemillae_herba_135006.jpg and File:ID53053-CLT-0015-01-Mons, Hôtel de Ville-PM 53799.jpg. My guess is that a solution that extracts this info at upload time and puts it in an exif template data field of the file page would be far more efficient as standard search tools would remain unchanged and remain performant. An old example where it has been done: File:Neschwitz Altes Schloss 01.jpg --Foroa (talk) 07:32, 1 March 2012 (UTC)[reply]
- That's a rather elegant solution! It could also be done retrospectively by bot. Rd232 (talk) 08:22, 2 March 2012 (UTC)[reply]
- Related problem with IPTC data: Commons:Bots/Work_requests#Filmitadka_EXIF and Commons:Village_pump#Can_we_surpress_.22Image_title.22_field_Filmitadka_metadata.3F. --Foroa (talk) 16:19, 13 April 2012 (UTC)[reply]
- Note we actually extract vorbis/theora "comments" (And have done so for years), we just don't do anything with those tags. Bawolff (talk)
Incategory: search from a Category page
Wouldn't it be useful to have a gadget which allows users to search within a category, right from the category page? It should be easy enough to construct the relevant query using incategory: and send the user to Special:Search - and so not too difficult a thing to do for one of our Javascript experts. This would be convenient for users who know incategory:, and even more useful for the many users who don't know it. Rd232 (talk) 16:20, 29 February 2012 (UTC)[reply]
- We already have the gadget "Search not in category". Easily adjustable. However, note that incategory does only account for direct categories (no subcats) and does not work for categories added via templates (AFAIK). That makes it less useful (since many categories are heavily sub-categorized). Cheers --Saibo (Δ) 17:01, 29 February 2012 (UTC)[reply]
- It does not always work correctly, but this is difficult to validate (problems with hyphens in the search terms). --Foroa (talk) 19:10, 29 February 2012 (UTC)[reply]
- Yes - the subcats are an issue (see first proposal above), though it's still useful without, especially on larger categories. I think categories added via templates is not such a problem, since topic categories shouldn't be added via templates, and I'm thinking of end users here. As for the existing gadget - I don't know how that works, because it doesn't do anything for me. :( But I'm guessing it's a link in the toolbox, and I was imagining something more prominent than that - like an additional box at the top right, or maybe an additional icon in the existing search box that does the search. Rd232 (talk) 17:23, 29 February 2012 (UTC)[reply]
- It adds a tab. Not really the right place, indeed - should better be in the toolbox - especially for the "not in cat" version. The search "in cat" would be a kind of different mode to view the current category which those tabs are for. --Saibo (Δ) 21:06, 29 February 2012 (UTC)[reply]
- Ah, found it. In Vector additional "tab" links aren't visible without going to the dropdown menu, and I forgot to look there. Rd232 (talk) 21:27, 29 February 2012 (UTC)[reply]
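Editorial illustration, not part of the original discussion: constructing the query for such a gadget amounts to wrapping the category name in an incategory: clause and passing it to Special:Search. The sketch below shows that step in Python rather than gadget JavaScript; the function name `incategory_search_url` and its parameters are hypothetical, and the URL layout is assumed to follow the usual `index.php?title=Special:Search&search=...` pattern.

```python
from urllib.parse import urlencode


def incategory_search_url(category, terms="", exclude=False,
                          base="https://commons.wikimedia.org/w/index.php"):
    """Build a Special:Search URL restricted to (or excluding) one category."""
    # Accept both "Category:Foo" and "Foo".
    if category.startswith("Category:"):
        category = category[len("Category:"):]
    # Quote the name so that multi-word categories stay one clause.
    clause = ("-" if exclude else "") + 'incategory:"%s"' % category
    query = (terms + " " + clause).strip()
    return base + "?" + urlencode({"title": "Special:Search", "search": query})
```

As noted in the thread, a query built this way only covers files placed directly in the category, not files in its subcategories.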
Incategory: search from a Category page - bug in "Search categories" action field
An example: When here, hitting the "Search categories" action field generates rubbish, as can be seen here. --Foroa (talk) 07:38, 1 March 2012 (UTC)[reply]
Support tags in search
Support additional filter tags, file types and dates (or date ranges) in the search. Possible tags include, for example: QI/FP/VI, B&W, video, sound, image sources, user names, adult/noadult, 3D, license type/nolicense, and so on. --Foroa (talk) 19:10, 29 February 2012 (UTC)[reply]
- Not only filtering search (and categories, come to that!) by such things, but also sorting... is a distant dream of mine :) Rd232 (talk) 20:31, 29 February 2012 (UTC)[reply]
- Better not to mix the two if you want a chance to succeed. The filtering and sorting might be done in a second round, possibly by an isolated function that starts with the search results and does selective display, so no complication in the search algorithm. --Foroa (talk) 11:03, 1 March 2012 (UTC)[reply]
- Possibly. I was thinking though that developers doing a reworking of the system to allow filtering by things like filetype and upload date would probably want to consider sorting at the same time, rather than do it separately. I think the design issues would be related. Rd232 (talk) 08:19, 2 March 2012 (UTC)[reply]
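Editorial illustration, not part of the original discussion: the "second round" idea, where an isolated function takes the finished search results and only filters and sorts them, might look like the sketch below. The `Result` record, its fields and the `refine` function are hypothetical; the real search backend exposes no such structure as far as this page shows.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Result:
    """Hypothetical search hit with the attributes proposed for filtering."""
    title: str
    filetype: str
    uploaded: date
    tags: frozenset = frozenset()


def refine(results, filetype=None, required_tags=(), since=None, sort_key=None):
    """Filter and optionally sort results without touching the search itself."""
    kept = [
        r for r in results
        if (filetype is None or r.filetype == filetype)
        and set(required_tags) <= r.tags
        and (since is None or r.uploaded >= since)
    ]
    if sort_key is not None:
        kept.sort(key=sort_key)
    return kept
```

Keeping this step separate means the ranking algorithm stays unchanged, which is the point Foroa makes; the trade-off raised in reply is that filtering after the fact can only work on the results the search already returned.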
Prioritise category classifications in search rankings
Currently category classifications are used in search results, but they do not appear to be highly prioritised; title and description of files seem to rank more highly. It might help the quality of results if category classifications were prioritised, especially where the match to the search term is exact or near exact. For example, this would mean that a search for "cucumber" would prioritise files that are in Category:Cucumbers, as against files that aren't and just happen to mention a cucumber in the description. Category:Cucumbers members would also rank more highly than members of Category:Long category name of something relating to cucumbers. Rd232 (talk) 16:52, 29 February 2012 (UTC)[reply]
- Personally, I am happy with the priorities as they are; better documentation would be useful, and would make it possible to specify potential improvements better. I think that you have to provide more precise specifications before it can be turned into something that can be developed. --Foroa (talk) 11:09, 1 March 2012 (UTC)[reply]
- More precise specification might be possible if there was better documentation... But I think the vague point is still understandable for a developer who knows how the system actually works, and the decision about what exactly to do with the vague point probably depends on knowing the system fairly well. Rd232 (talk) 11:37, 1 March 2012 (UTC)[reply]
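Editorial illustration, not part of the original discussion: one way to make the proposal more concrete is a scoring function in which an exact or near-exact category match outweighs a mention in the title or description, and a partial match is discounted by the length of the category name. The weights, the naive plural check and the `score` and `rank` functions are assumptions chosen for illustration only; they say nothing about how the existing search engine actually ranks results.

```python
def score(file, term):
    """Score one file for a search term; higher is better.

    `file` is a dict with "title", "description" and "categories" keys.
    All weights are illustrative assumptions.
    """
    term = term.lower()
    total = 0.0
    if term in file["title"].lower():
        total += 2.0
    if term in file["description"].lower():
        total += 1.0
    for category in file["categories"]:
        name = category.lower()
        if name in (term, term + "s"):  # exact match or naive plural
            total += 5.0
        elif term in name:
            # Partial match: the longer the category name, the weaker the signal.
            total += 5.0 * len(term) / len(name)
    return total


def rank(files, term):
    """Return files ordered by descending score."""
    return sorted(files, key=lambda f: score(f, term), reverse=True)
```

With these weights a file in Category:Cucumbers outranks both a file that only mentions a cucumber in its description and a file in a long category name that merely contains the word.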
Disambiguation or clustering
Find a way so that the search can disambiguate different meanings. For example, searching "cancer" might list the meanings, and choosing one would narrow the search focus using incategory (this would work rather better if incategory: included subcategories). Rd232 (talk) 16:52, 29 February 2012 (UTC)[reply]
- A better approach would be to cluster the results within a search. The technology can be borrowed from something like YaCy. VolodyA! V Anarhist Beta_M (converse) 03:13, 1 March 2012 (UTC)[reply]
- Could be, but that may be further away from the current technology. en:YaCy is interesting. Rd232 (talk) 10:21, 1 March 2012 (UTC)[reply]
Adaptive learning
More radical proposal: develop a way that the index can learn from users browsing search results. Give each search result item a "this is not relevant to this search" box, and use that feedback to change future results. Rd232 (talk) 16:52, 29 February 2012 (UTC)[reply]
Moreover one could learn from link tracking:
- Which links of the search results are clicked, in which order, the time gap between the clicks, and how long the user remains on the result page compared to other result pages. -- RE rillke questions? 13:33, 30 June 2012 (UTC)[reply]
- That is a useful way. But we would not need to have a "don't show me" box. All we would need is to keep track of the images that were found useful (clicked) for the search term. That will result in a ranking for each term, using the reader/visitor as the worker without them knowing. But this would also imply some randomness (standard variance, not totally random). Otherwise images not in the top 20 will have a low chance to rise to the top. A typical problem for ranking systems. That's why some music charts don't allow a song to stay longer than x weeks in the list. --/人◕ ‿‿ ◕人\ 署名の宣言 14:03, 30 June 2012 (UTC)[reply]
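Editorial illustration, not part of the original discussion: click-based ranking with a small random component, so that items outside the top positions still get a chance to rise, can be sketched as follows. The `ClickRanker` class, its Gaussian noise model and the `noise` parameter are assumptions for illustration; the thread only asks for "some randomness" and does not specify a distribution.

```python
import random
from collections import defaultdict


class ClickRanker:
    """Rank results per search term by click count plus random noise."""

    def __init__(self, noise=1.0, seed=None):
        self.clicks = defaultdict(int)   # (term, title) -> number of clicks
        self.noise = noise               # standard deviation of the jitter
        self.rng = random.Random(seed)

    def record_click(self, term, title):
        """Register that a user opened `title` after searching for `term`."""
        self.clicks[(term, title)] += 1

    def rank(self, term, titles):
        """Order titles by clicks for this term, jittered so new items can surface."""
        def key(title):
            return self.clicks.get((term, title), 0) + self.rng.gauss(0, self.noise)
        return sorted(titles, key=key, reverse=True)
```

Setting `noise=0` gives a purely click-ordered list, which shows the problem described above: anything never shown is never clicked and so never moves up.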
Include category interwiki links in the search database
Currently, the interwikis included in the categories are not integrated in the search database. This would be a very powerful translation tool if it worked. Examples: Search for fiets should return Category:Bicycles which it doesn't, search for trapauto returns Category:Quadricycles as it is in the text. --Foroa (talk) 18:17, 29 February 2012 (UTC)[reply]
- Definitely a great idea, in fact interwiki should be considered more relevant than a text on the page, because it is also a title of that page, just in a different language. VolodyA! V Anarhist Beta_M (converse) 18:35, 29 February 2012 (UTC)[reply]
- Agreed. Particularly for Commons (with the multilingualism) this would be really useful. This is related to Bugzilla:30595 (Inconsistent search results with language links), but I think should probably be a separate bug (I don't think there's an existing one). Rd232 (talk) 19:07, 29 February 2012 (UTC)[reply]
- Well, there's actually Bugzilla:24658 (Search pages by interwikis), which is related but not the same (it's asking for redirects from interwiki titles to the page). Rd232 (talk) 19:14, 29 February 2012 (UTC)[reply]
- Not sure that I understand the bug report. Note that the interwiki's in the main name space are found with the search (at least here and on en:wiki), so I guess that it is rather a configuration option when building the search database. --Foroa (talk) 07:42, 1 March 2012 (UTC)[reply]
- Well that bug wanted a change so that if a page (like Bicycle) has nl:fiets as an interwiki, then typing fiets into the search would take you to that page (unless there's a separate fiets page). Of course that assumes interwiki links are unique, which they mostly are, but it would need to provide for collisions (multiple pages with nl:fiets interwikis). As for search: are you sure interwiki links are currently searched on Commons? I can't find it documented anywhere, and haven't been able to prove it by searching. Rd232 (talk) 10:16, 1 March 2012 (UTC)[reply]
- 1. Don't mix things. First: we have to find via the interwikis. I have tested that here and on en:wiki with fiets: it returned only the article en:Bicycle that has only the IW:Fiets. I seem to remember that this is documented somewhere but can't find it again.
- 2. In a second round, we can try to exploit it better, but I doubt that it is as simple as that. My feeling is that there are many thousands of words that have a completely different meaning in different languages; there are already many thousands of disambiguated terms in English; just try to imagine what options one would have with words like fr:sol and De:Ban. --Foroa (talk) 12:46, 1 March 2012 (UTC)[reply]
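Editorial illustration, not part of the original discussion: indexing category interwiki titles, including the collision case where the same foreign word points at more than one page, can be sketched with a simple inverted index. The `pages` data and the function names are hypothetical, and the "sol" entries are only an illustrative collision in the spirit of the fr:sol example above.

```python
from collections import defaultdict


def build_interwiki_index(pages):
    """Map each lower-cased interwiki title to the set of local pages using it.

    `pages` maps a local page title to a list of (language, foreign title) pairs.
    """
    index = defaultdict(set)
    for title, links in pages.items():
        for _language, foreign_title in links:
            index[foreign_title.lower()].add(title)
    return index


def lookup(index, term):
    """Return every local page whose interwiki titles match `term`."""
    return sorted(index.get(term.lower(), ()))
```

A lookup that returns several pages would need a disambiguation step rather than a silent redirect, which is the concern raised about words that mean different things in different languages.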
Category union and intersection
This proposal requires major change to the way categories in MediaWiki work, but i think that it is a good thing to bring this up, because it's not only about searching with the search function, it's also about a somewhat "browsy" type of search, when a person slowly approaches the desired media. An idea is to introduce a way to create an intersection, a union, or an exclusion of another category in the category view.
This is very relevant to the idea of the search, as it talks about ability to categorise an image. Unfortunately this feeds into the hands of [...] who try to introduce safe search idea, because we won't need to create categories like Category:Photographs of non-kosher mammals standing on the hind legs with the visible genitalia made in Germany with a digital camera during Rammadon at night (god, i hope this link will be red when i click save). VolodyA! V Anarhist Beta_M (converse) 18:03, 29 February 2012 (UTC)[reply]
- I agree. Somehow, in this area, I imagine something a bit like HotCat to add and remove categories from a list of categories being intersected or combined. Would be brilliant. Rd232 (talk) 19:03, 29 February 2012 (UTC)[reply]
- Having software support for "automatic" category intersection, in particular, would go a long way toward allowing categories to actually be used as "tags", as Brion Vibber once said somewhere (Bugzilla?). I believe he said, in response to the idea of image tagging: "Categories are tags". If that comment were taken "literally", that would mean every image here at Commons that "featured" the color blue (for example) would go into Category:Blue. That's obviously not how we do things now: instead, we mark that category as one "requiring frequent diffusion" — which really just means: "too many people tag images with this category" (!) — and then we proceed to create a profusion of subcategories slicing and dicing the concept of "blueness" in different ways, and require people to choose which of those sub- (or sub-sub-, etc.) categories the image belongs in. And we do the same thing with almost every other common characteristic you can come up with. If we had better tools for category searching and browsing, we could:
- actually let people use categories the way they (seem to) want to: as "tags"
- cut down on a lot of near-duplicate category tree structures (e.g., "Women in blue shirts", "Men in blue shirts", "Women in blue socks", etc.)
- enable "tag blocking" for anyone who wants it (ignoring the potential pitfalls, this approach would, I think, make it technically easier to implement)
- possibly have "automatic" category-browsing suggestions akin to search-result suggestions in the manner of: "items tagged with this category are most often also tagged with these others"
- I'm not saying using categories as tags solves all our problems, or doesn't create new ones, but it's definitely something to consider... - dcljr (talk) 11:02, 30 March 2012 (UTC)[reply]
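Editorial illustration, not part of the original discussion: if categories are treated as flat tags, intersection, union and exclusion become plain set operations, and the "most often also tagged with" suggestion becomes an overlap count. The sketch below assumes a hypothetical in-memory mapping from category name to member files; the function names are invented for illustration.

```python
from collections import Counter


def intersection(members, *categories):
    """Files that are in every one of the given categories."""
    sets = [members[c] for c in categories]
    return set.intersection(*sets) if sets else set()


def union(members, *categories):
    """Files that are in at least one of the given categories."""
    return set().union(*(members[c] for c in categories))


def excluding(members, category, excluded):
    """Files in `category` that are not in `excluded`."""
    return members[category] - members[excluded]


def related(members, category):
    """Other categories ordered by how many files they share with `category`."""
    counts = Counter()
    for other, files in members.items():
        overlap = len(files & members[category])
        if other != category and overlap:
            counts[other] = overlap
    return counts.most_common()
```

With operations like these, a combination such as "Women in blue shirts" could be computed from three broad categories on demand instead of being maintained as its own category.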
Display date and time of the last update of the search database
The display of the date and time of the last update of the search database in the search screens might avoid many help requests and retries. --Foroa (talk) 18:59, 29 February 2012 (UTC)[reply]
- That seems like a good idea - and probably not too hard to do. Rd232 (talk) 19:16, 29 February 2012 (UTC)[reply]
- Found an existing bug: Bugzilla: 6090 - Add time of last search index update to Special:Search. Rd232 (talk) 09:25, 22 March 2012 (UTC)[reply]
Report properly when search don't work
It happens to me 10 to 20 times per year that search doesn't give any results, whatever one tries. This situation seems to last from 3 minutes to many more; I encountered up to 15 minutes, but I did not do exhaustive tests. While I can live with it, average users will lose confidence in it. I suspect that this happens during a search database rebuild or move. While this might be acceptable for smaller databases (and short times), I suppose that with 20 million items in the database, this black-out time becomes more significant and not acceptable.
- As a minimum, search should warn when no search database is available (and display it in the status, see comment above).
- By preference, it should switch databases rapidly instead of disappearing for many minutes during an update. --Foroa (talk) 06:39, 15 March 2012 (UTC)[reply]
- That definitely shouldn't happen. This needs to be a bug on bugzilla; I'll try and do that soonish if no-one beats me to it. Rd232 (talk) 10:40, 15 March 2012 (UTC)[reply]
- I'd rather say up to ten times per month, blackout durations of 25 minutes have been observed. --Foroa (talk) 06:43, 4 April 2012 (UTC)[reply]
submitted: Bugzilla: 35691 - Occasional temporary problem: no search results for any search terms. Rd232 (talk) 08:03, 4 April 2012 (UTC)[reply]
- Yes, many people will leave, at least for awhile, when they can not find what they want. See also this section farther down:
- #Add Google site search option --Timeshifter (talk) 19:54, 10 October 2012 (UTC)[reply]
- The problem still occurs regularly, but in general a second try a few seconds later works (instead of minutes in the past). So it is even more difficult to prove that it fails. --Foroa (talk) 08:46, 26 October 2012 (UTC)[reply]
Advanced search
How about a Javascript gadget (or MediaWiki improvement) to allow easy construction of advanced search queries like this? Rd232 (talk) 01:35, 1 March 2012 (UTC)[reply]
- Would be great because the search syntax is hardly documented. --Foroa (talk) 07:44, 1 March 2012 (UTC)[reply]
Add intitle: and prefix: search options to Special:Search
It would be nice if there were a checkbox for intitle: searches on the Special:Search page. Many people do not know that one can search only the titles. I was just looking for the word "search" in a few selected namespaces on the Commons and came up with hundreds of unwanted results because it was pulling up results from the text too.
Also, intitle: searching is not activated on all MediaWiki installations, and so someone may think from past experience at Wikia at various times that intitle: searches no longer work on MediaWiki in general.
What good are features that people do not know about? There should be checkboxes for all the available options such as
- intitle:
- prefix:
For more info on these features go to w:Help:Searching and scroll down to the section on search engine features. --Timeshifter (talk) 21:59, 26 April 2012 (UTC)[reply]
Search tagging and prioritization
I think that the optimal system would work like this:
- Allow tagging of pages to indicate that a specific, tangible thing is present or absent. I described this as "{{incl|bell}}" - telling the search that there's an actual, real bell included in the picture. Above was suggested a "{{lowsearch|bell}}" to mark that there isn't a real bell in the picture. A symmetrical combination of these templates might be instructive. The way I tried to implement this originally was simply putting the text in a "display:none" style tag, plus an extra keyword "incl-bell" that you could search for. Problem is, I don't know that actually helps the search results in the current version (there's some huge lag in updating them) and there's no way to use it to downgrade pages. If implemented by devs, a "magicword" might be involved that allows pages not just to be pulled up or down, but actually marks them in a set containing a keyword.
- Allow searchers to specify priorities for the terms they search at the search bar. For example, I picture a system of prefixes:
- "++c:bell" - yields only images in Category:Bell or subcategories of Category:Bell
- "++a:bell" - yields only images with "bell" in the text of the file annotation (including title)
- "++e:bell" - yields only images with "bell" in EXIF metadata
- "++t:bell" - yields only images with "bell" in the title
- "++k:bell" - yields only images tagged with "bell" keyword
- "++ac:bell", etc. would work like Boolean AND; specify OR with "++a|c:bell" or something.
- "+c:bell" - modifies another search to put all the results within the "++c:bell" subset first.
- "++a:bell +tkc:bell" would show only hits with bell in the file annotation, putting those with the keyword marked first, then those with bell in the title, then those in the category or subcategories thereof, then any others. Should be the default for typing "bell" at the search box. Unless you can customize this in your preferences... ;)
- "-c:bell" or "--c:bell" and so forth might be allowed for extra customization to lower certain search results when the user desires.
- The scheme I suggest here is perhaps somewhat hazardously close to being abused as a rating system, i.e. you could have a default "--caetk:penis --caetk:vagina" tacked onto your searches; perhaps you could even have a user setting to tag an arbitrary string of this type to your search results by default. The key distinction here is not software but policy, or rather lack of policy - I'm not suggesting anything approaching forcing people to tag vague things like "OK for under 18", but I would permit people to mark that a file contains a penis if it contains a penis, as part of a general strategy of marking images with their contents to assist in searching, rather than to classify them by vague value judgments about the contents. And I don't want this as some censor-oriented thing but as a general tool to help people find and avoid files containing any object that exists.
Wnt (talk) 06:07, 1 March 2012 (UTC)[reply]
- Something like this can of course be useful; except that it won't really protect us from the wrath of would-be censors, because images of sexual use of toothbrushes actually do contain a toothbrush, and many nudity/non-halal/atheist images/videos actually do contain a physical object that is obscure. Imagine for example how difficult it would be to find an image of socks, when every image actually containing socks comes up, rather than only those with that word in the title or description. VolodyA! V Anarhist Beta_M (converse) 09:24, 1 March 2012 (UTC)[reply]
- I appreciate you've put some thought into this, but it seems rather like reinventing the en:Semantic MediaWiki wheel. And all that search syntax is very demanding on the average user. Simple search results need to be better: more like "monarch" - here are the results filtered for the primary meaning of 'monarch'; did you mean 'monarch butterfly'? Or display all results? Rd232 (talk) 09:40, 1 March 2012 (UTC)[reply]
A little bit of intelligence
Another approach would be to use some intelligence in the search algorithm. I noticed one detail common to any "dangerous" search result: the ambiguity of terms. The current algorithm searches for keywords inside the description. This is a good starting point for finding anything that contains the keyword, but the results will be mixed up, leading to so-called "unexpected results". A tiny extension to the search algorithm could make it much better. I propose this:
- The search works as usual and grabs all results by keyword.
- It looks at the categories of the results. If it finds multiple images from different parts of the category tree it will split the results in groups, labeling them after the lowest parent category.
- If more than one group of images (group1, group2, ..., groupX, no group) was found, where "no group" is the default, then it won't display the thumbnails. Instead it will display the groups, which gives the user a good sense of what to expect.
- If the user then clicks on a group entry, a search will be done. It will only display images contained inside this parent category.
If I think of the "cucumber" or "toothbrush" example, then it is very unlikely that they will be grouped together with the "unexpected results", giving the user the choice to refine what he is looking for. This would be a general improvement of the search, since a search for "monarch" would offer the option to search for the "butterfly" or "monarchy" related images. At the same time it would minimize the likelihood of unexpected results, while treating everything the same way. Yes, it can be done, and the effort wouldn't be very large or computationally expensive. -- /人◕ ‿‿ ◕人\ 苦情処理係 09:53, 1 March 2012 (UTC)[reply]
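The four steps above can be sketched roughly as follows. This is only an illustration under strong assumptions, not how Special:Search works: each result is given exactly one category path (root first), whereas real Commons files sit in several categories and the category graph is not a strict tree. The names `commonPrefix`, `groupResults`, `searchView` and the `path`/`title` fields are hypothetical.

```javascript
// Hypothetical data shape: { title: string, path: string[] } with the root category first.

// Longest run of categories shared by every path, starting at the root.
function commonPrefix(paths) {
  if (paths.length === 0) return [];
  let prefix = paths[0];
  for (const path of paths.slice(1)) {
    let i = 0;
    while (i < prefix.length && i < path.length && prefix[i] === path[i]) i++;
    prefix = prefix.slice(0, i);
  }
  return prefix;
}

// Step 2: split results by the part of the tree they come from,
// labelling each group after the lowest category shared by its members.
function groupResults(results) {
  const shared = commonPrefix(results.map((r) => r.path));
  const branches = new Map();
  for (const r of results) {
    // The first category below the shared prefix decides the branch.
    const key = r.path.length > shared.length ? r.path[shared.length] : "(no group)";
    if (!branches.has(key)) branches.set(key, []);
    branches.get(key).push(r);
  }
  const groups = [];
  for (const [key, members] of branches) {
    const prefix = commonPrefix(members.map((m) => m.path));
    const label = key === "(no group)" ? key : prefix[prefix.length - 1];
    groups.push({ label, files: members.map((m) => m.title) });
  }
  return groups;
}

// Steps 3 and 4: several groups -> show group names; one group -> show thumbnails.
function searchView(results) {
  const groups = groupResults(results);
  return groups.length > 1
    ? { kind: "groups", labels: groups.map((g) => g.label) }
    : { kind: "thumbnails", files: results.map((r) => r.title) };
}
```

With results spread over a butterfly branch and a monarchy branch, `searchView` would return the two group labels; with results from a single branch it would return the thumbnails directly.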
- It is almost the same thing that i talked about when i mentioned clustering. Except that it won't help with the censorship, because you need to show all the results and the clusters at the same time to minimise the number of clicks when no disambiguation is needed. Look at the way the Clusty search engine did it (and still does, but now they are censored, and thus aren't very useful). You searched, and they presented the results and then said in the sidebar that the results fall into one of these groups, listing the size of each group. YaCy has a similar strategy. VolodyA! V Anarhist Beta_M (converse) 10:01, 1 March 2012 (UTC)[reply]
How wouldn't this help with censorship? It would still display everything. It would just group the results and collapse them (no matter what). Yes, it would result in at least one additional click. But I guess that is still better than if a user searching for "monarch" sees 199 thumbnails of butterflies and one picture of monarchy. He would just scroll and scroll through the results, which wouldn't be helpful. If it worked as proposed, this user would need far fewer clicks to find what he wants. From my experience such searches are far more efficient, especially for large collections of hierarchical data (images+categories).
When no disambiguation is needed, it would directly display all of the results. That means that well-defined searches would lead to direct results, while rough search terms would provide directional aid to the user. That would make categories actually useful. -- /人◕ ‿‿ ◕人\ 苦情処理係 10:16, 1 March 2012 (UTC)[reply]
- Simple. It won't help censorship because 1 click is a lot. From the perspective of usability, when we strive for 3 clicks max, one extra click is quite often what separates a useful tool from one that exists only to close an issue in the bug tracker. So what i'm saying is: make it display everything and display clusters at the side. This way a person still has all the benefits of clusters, but without the downsides of extra complexity. VolodyA! V Anarhist Beta_M (converse) 11:36, 1 March 2012 (UTC)[reply]
- I just tried to be nice and to explain a solution that wouldn't need additional tagging or prejudicial judgment while still giving the critics the results they would expect. I'm not trying to introduce censorship in any way. I'm strongly opposed to it. But seriously: what difference would one more click make if it gives you the result you would expect? The main issue with the current search function is the bad results. Simply said: it sucks. Not because a penis is shown, but in general. This leads to many more clicks than one additional click. Helping users by exposing the category system to their view, without them even having to know the exact name of the category, would be much more beneficial than having them enter various combinations of search terms to get roughly what they want.
- I'm willing to make compromises at any time, but I won't compromise on value judgment. -- /人◕ ‿‿ ◕人\ 苦情処理係 12:02, 1 March 2012 (UTC)[reply]
- You misunderstood what i'm saying. It's little to no extra work to continue to display all the results while at the same time showing clusters at the side. And i don't agree that current search completely sucks; i think it needs to be improved, but i normally can find most of what i'm looking for. VolodyA! V Anarhist Beta_M (converse) 12:06, 1 March 2012 (UTC)[reply]
- I did understand you very well. But I made this compromise (the sacrifice of one click) in response to the criticism that the search would display unexpected results without giving the user a chance to interact. But now I have your criticism as well. So I will propose another change in detail:
- Use the technique described above and the variation user Beta_M proposed, while giving the user the option to switch between the modes:
- Mode 1: Display thumbnails as usual, but add additional guidance at the side, based upon the category tree.
- Mode 2: Only display thumbnails if only one group was found. If multiple groups are found, display the names of the groups instead to refine the search and then display the thumbnails.
- Mode 3 (optional): Like Mode 2, but with random preview thumbnails for the groups. (groups won't be collapsed)
- Do you think that this would be good compromise to make? -- /人◕ ‿‿ ◕人\ 苦情処理係 12:25, 1 March 2012 (UTC)[reply]
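As a rough illustration of how the three modes differ, here is a minimal sketch. It assumes the results have already been clustered into `{ label, files }` groups; `renderSearch`, its return shape and the `previewSize` parameter are hypothetical. For simplicity the Mode 3 preview takes the first few files of each group instead of a random sample.

```javascript
// Hypothetical input: groups = [{ label: string, files: string[] }, ...], already clustered.
function renderSearch(mode, groups, previewSize = 4) {
  if (![1, 2, 3].includes(mode)) {
    throw new RangeError("mode must be 1, 2 or 3");
  }
  const files = groups.flatMap((g) => g.files);
  const labels = groups.map((g) => g.label);

  // Mode 1: thumbnails as usual, with category guidance at the side.
  if (mode === 1) {
    return { thumbnails: files, sidebar: labels, groups: [] };
  }
  // Modes 2 and 3 fall back to plain thumbnails when only one group was found.
  if (groups.length <= 1) {
    return { thumbnails: files, sidebar: [], groups: [] };
  }
  // Mode 2 lists group names only; Mode 3 adds a few preview thumbnails per group.
  const take = mode === 3 ? previewSize : 0;
  return {
    thumbnails: [],
    sidebar: [],
    groups: groups.map((g) => ({ label: g.label, preview: g.files.slice(0, take) })),
  };
}
```

Whichever mode is chosen, the same clustered groups are used; only the presentation changes, which is what would make a user-switchable setting cheap to offer.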
- I honestly can't see all the implications right now, but let's talk about how feasible it is. How would you define different groups? Let's say a person searches for "cones"; logic suggests that we want to have at least two clusters, "Conifer cones" and "Geometric shapes". But Category:Conifer cones is inside Category:Cones just like Category:Cone (geometry). From the perspective of the software there's no difference between the two cones, because you suggest looking for the lowest common denominator. In addition to that, the lowest common denominator between Monarch butterfly and a state ruler in a monarchy may become Category:Photographs, which is completely useless. So what we need is something slightly more intelligent than just common category judgement. But here i'm a bit stumped, as i don't know how to generate those clusters. VolodyA! V Anarhist Beta_M (converse) 14:12, 1 March 2012 (UTC)[reply]
- But actually having thought about it for a few minutes... Mode three would actually empower the user more than the current search or just showing all the results at first, so if implemented i think it'd be a good step forward. However, i disagree with the "random" part. Perhaps it should be "top 4" or something, and of course there should be still a way to show unclustered "all clusters" search results. VolodyA! V Anarhist Beta_M (converse) 14:21, 1 March 2012 (UTC)[reply]
- You create clusters by distance criteria, and you will have to find the right balance. One criterion would be the number of hits in the same category: if a category contains more than n images matching the keyword, that favors creating a group from it. Another criterion might be the distance between categories: the "further away" the results are, the likelier it is that they don't belong to the same group. In the end you will need some opposing criterion that says "you have to group them into at most 10 groups". Properly balanced, which is routine work, it would deliver very useful results.
- About randomness: I meant it in a way that images which have the same ranking for the search term should/might be randomized in the preview of the category (see mode 2 illustration), to not favor one result over the other. But this is entirely optional and might be discussed later on. -- /人◕ ‿‿ ◕人\ 苦情処理係 15:26, 1 March 2012 (UTC)[reply]
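The three criteria named above (hits per category, distance between categories, a cap on the number of groups) could be combined along these lines. This is a sketch under assumptions, not a tested ranking scheme: every hit carries a single category path, distance is counted as steps through the category tree, and the thresholds `minHits` and `maxGroups` as well as all function names are hypothetical.

```javascript
// Number of categories two paths share, counted from the root.
function sharedDepth(a, b) {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}

// Tree distance: steps up from one category to the common ancestor and down to the other.
function treeDistance(a, b) {
  const d = sharedDepth(a, b);
  return (a.length - d) + (b.length - d);
}

function clusterByCategory(hits, { minHits = 2, maxGroups = 10 } = {}) {
  // Criterion 1: a category with at least minHits matching files becomes a group.
  const byPath = new Map();
  for (const hit of hits) {
    const key = JSON.stringify(hit.path);
    if (!byPath.has(key)) byPath.set(key, { path: hit.path, files: [] });
    byPath.get(key).files.push(hit.title);
  }
  const groups = [];
  const ungrouped = [];
  for (const g of byPath.values()) {
    if (g.files.length >= minHits) groups.push(g);
    else ungrouped.push(...g.files);
  }
  // Criteria 2 and 3: while there are too many groups, merge the two closest ones.
  while (groups.length > Math.max(1, maxGroups)) {
    let best = [0, 1];
    let bestDist = Infinity;
    for (let i = 0; i < groups.length; i++) {
      for (let j = i + 1; j < groups.length; j++) {
        const d = treeDistance(groups[i].path, groups[j].path);
        if (d < bestDist) {
          bestDist = d;
          best = [i, j];
        }
      }
    }
    const [i, j] = best;
    const a = groups[i];
    const b = groups[j];
    // The merged group is named after the lowest category both paths share.
    const merged = {
      path: a.path.slice(0, sharedDepth(a.path, b.path)),
      files: [...a.files, ...b.files],
    };
    groups.splice(j, 1); // j > i, so remove the later index first
    groups.splice(i, 1, merged);
  }
  return { groups, ungrouped };
}
```

Note that a merge can collapse two unrelated branches into a very general parent (an empty path here), which is the "Category:Photographs" concern raised earlier in this thread; a real implementation would need an extra rule to stop merging before that happens.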
- This sort of proposal - around disambiguating search terms, which I asked for suggestions for above - is very interesting. As I've tried to argue all along, a really effective search would also be a "safe search" because it would show users what they want and expect. And this proposal actually seems dangerously feasible. Maybe we could try and get some developer input on the feasibility of this type of thing? Rd232 (talk) 07:29, 2 March 2012 (UTC)[reply]
- This is a good idea, Niabot. I would support pursuing this just for its general benefits, quite apart from the adult media issue, which this would clean up almost as a welcome side effect. I recall using the Vivísimo clustered search engine years back ... using a clustered search would be a considerable usability improvement for Commons.
- To address some of the comments above, I guess we could show one or two images for each of the clusters that contain a significant percentage of the search results. So if someone were to search for "toothbrush masturbation" or "cucumber sex", then a sexual example image would show. But if someone were to search simply for "toothbrush" or "cucumber", then an example image for toothbrush masturbation or sexual use of cucumbers would only show if the relevant category represented a significant percentage of all our toothbrush or cucumber images.
- Niabot, do you feel like adding a corresponding section to http://meta.wikimedia.org/wiki/Controversial_content/Brainstorming ? --JN466 16:45, 3 March 2012 (UTC)[reply]
- A great proposal and great mockups, as others have noted. We could even talk to some clustering-search experts for help; some of them created Wikipedia-friendly search engines in years past. [I recall one which offered clustered search for all topics, and had a Wikipedia-specific section as one of the tabs on the search engine itself.] --SJ+ 01:51, 15 March 2012 (UTC)[reply]
- This is a great proposal. This solution becomes more useful as our category system evolves. If there is only one cluster in the first 50 results, they can be all shown. The software could also work out which cluster is most desired and give preference to that cluster. John Vandenberg (chat) 09:11, 4 March 2012 (UTC)[reply]
- "most desired" in which way? Number of searches? -- /人◕ ‿‿ ◕人\ 苦情処理係 09:38, 4 March 2012 (UTC)[reply]
- Either 1) number of clicks, or 2) number of page views. A search results in a set of clusters, and users select/click which cluster they want to look at. If more people ask to see the cluster 'prince albert the piercing', we should put that one first with a textual description that isn't intentionally offensive. Each cluster contains many images, and these images are viewed directly on Commons, via Wikipedia, or InstantCommons. If the clusters are weighted by the pageviews of all their members combined, then pageviews is a measure of importance or relevance/currency. John Vandenberg (chat) 10:24, 4 March 2012 (UTC)[reply]
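Option 2, weighting each cluster by the combined page views of its members, might look like this. It is a minimal sketch: `rankClustersByPageviews` is a hypothetical name, and the page-view figures are assumed to be supplied from elsewhere as a `Map` from file title to view count.

```javascript
// clusters: [{ label: string, files: string[] }]; pageviews: Map<string, number>.
function rankClustersByPageviews(clusters, pageviews) {
  return clusters
    .map((cluster) => ({
      label: cluster.label,
      // Files with no recorded views count as zero.
      views: cluster.files.reduce((sum, file) => sum + (pageviews.get(file) ?? 0), 0),
    }))
    .sort((a, b) => b.views - a.views); // most-viewed cluster first
}
```

Because the weight is a plain sum, one heavily viewed file is enough to lift its whole cluster to the top.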
- I'm not really sure that this additional feature would be in the interest of the project. You might say that this would support the mainstream by giving it what it wants to see. That is always a good idea if you want to reach out to the masses (popular culture) and entertain them. But Commons and Wikipedia have a somewhat different purpose. The projects aren't meant for pure entertainment. They are meant to show diversity and to represent all knowledge. That's why I'm not really sure this would be a good addition, since it would favor the development of so-called monocultures.
- Let's say we were a service for music downloads. Then this kind of rating would favor the big labels, while listing the independent labels at the end of the list. Because 99% of the readers/visitors/listeners/viewers are lazy, they would enter the first cluster, ranking it even higher. That's when a ranking algorithm starts to manipulate its own ranking. That's a typical problem that you don't want to have in a fair ranking.
- As stated at the beginning, I'm not against this thought, but I think it would need a closer look, because of the explained issue. You could also see it as a parallel line to what the ALA meant with rating systems. -- /人◕ ‿‿ ◕人\ 苦情処理係 11:38, 4 March 2012 (UTC)[reply]
- Agree with Niabot that page views aren't an ideal metric, especially if a nice-to-have aspect of implementation would be that we are trying to reduce the prominence of adult media files displayed for innocuous searches like "toothbrush". Anything based on page views is likely to have the opposite effect:
- When ranked by pageviews or clicks, almost all the top Commons content pages are adult media files.
- The most-viewed category is Category:Shaved genitalia (female), followed by Category:Vulva and Category:Female genitalia.
- The masturbating-with-a-toothbrush image is viewed more than 1,000 times a day, compared to roughly 1 view a day or less than one view a day for actual images of toothbrushes.
- Its popularity is not due to the fact that it is our best image of a toothbrush (it isn't), or that the image is included in a subcategory of Category:Toothbrushes, the term the user searches for. It is due to the fact that it is primarily an image of masturbation displaying female genitalia: it is included in Category:Shaved genitalia (female), which, as mentioned above, is the most popular category in all of Commons, and it is also part of Category:Female masturbation, the 10th most popular of all Commons categories.
- The same thing applies to the cucumber images: their viewing figures will far outstrip viewing figures for any images just showing cucumbers, but these high viewing figures will not be because of people who have browsed to these images via the cucumber search term, or the cucumber category tree, but because of people interested in sexual media, where the presence of a cucumber is merely incidental.
- More generally speaking, page views aren't everything; if we were after maximising page views, we'd have a w:page 3 girl on the main page. --JN466 15:15, 4 March 2012 (UTC)[reply]
- I have to say, this comment makes me think that maybe we don't have so much of a problem in the first place. If people are actually looking for masturbation with a toothbrush 1000 times more often than an actual toothbrush, then delivering that result for "toothbrush" might just get people what they're looking for more often. The "principle of least astonishment", if one believes in it, should dictate that if our horny little audience is really hunting for porn most of the time, it would be astonishing not to serve it up to them. Wnt (talk) 22:34, 4 March 2012 (UTC)[reply]
- The point I was trying to make is that those 1,000 daily page views don't come from people who are searching for an image of a toothbrush. They're from the quarter million people who look at Category:Shaved genitalia (female) and Category:Female masturbation every month, where this image is contained ... The other point is, regardless of how educational it is to look at other people's genitalia, and at images of other people having sex, would a free porn site meet the definition of a tax-exempt educational site? If YouPorn, say, proposed a business model whereby they were funded by donations, would they qualify for tax exemption and 501(c)(3) status? Probably not. And would Wikimedia donors be happy to see their money spent on providing the public with a free porn service? Probably neither. --JN466 00:06, 5 March 2012 (UTC)[reply]
- You will have to consider that Wikipedia and Commons are highly ranked in search engines. We all know the famous search terms that these search engines have in their top 10 every year. This leads to the high traffic inside these categories, since people click blindly on the first links. Whether they are actually happy with the result is another question. Answering it would need a little study of how many people search for "pussy/..." at Google & Co, click on the link to Commons, then click on the thumbnails in the category to view the files at a larger size, and repeat the same procedure again soon. I think that we don't need such a study and already know that we don't have so much traffic because of our content. If I think a bit more about it, then we should spam these categories with the worst low-quality porn we can get. If the audience is unhappy with our collection, it would quickly turn away and learn not to repeat the same procedure, no matter how high the rank at Google & Co. is. ;-)
- To conclude: There is no black and white, and there is always more than one reason to be considered. Giving or considering only one single reason isn't a good way to make an argument, and it will reinforce the typical yin-yang battle between the three apes, without any substantial result. -- /人◕ ‿‿ ◕人\ 苦情処理係 00:50, 5 March 2012 (UTC)[reply]
Bugzilla:35701 - Clustering for image searches. Rd232 (talk) 17:32, 4 April 2012 (UTC)[reply]
- Stopgap idea: please see Commons:Village_pump/Proposals#Make_search_default_to_searching_category_namespace. Rd232 (talk) 11:34, 22 April 2012 (UTC)[reply]
Open search results in a new tab
There is custom JS that will do this. See this discussion on Wikia (it uses MediaWiki):
It would be nice if it were an option in preferences, or even better if it were the default setting on the Commons and Wikipedia. --Timeshifter (talk) 21:47, 20 March 2012 (UTC)[reply]
- We can start by making it a gadget; I'm not sure everyone will want that as default behaviour though (and I'm not convinced about opening every new search at Special:Search in a new tab). We'll need someone more competent than me to do the gadget (unless my code at User:Rd232/common.js not working is just a purge issue). Rd232 (talk) 10:09, 21 March 2012 (UTC)[reply]
- A possible default would be to right-click the search button, and open results in a new tab. The gadget/preference could offer options for left clicks to open results in a new tab. It could offer various options such as only opening new search-result tabs for search forms found at the top right of pages. --Timeshifter (talk) 07:00, 22 March 2012 (UTC)[reply]
Done as a user script - see MediaWiki talk:Search-results-new-tab.js. If we want to make it a gadget, we should propose that somewhere more visible (COM:VPR maybe). Rd232 (talk) 13:46, 22 March 2012 (UTC)[reply]
- Thanks! It is working for me. Can you create a second JS import page for getting all search results in new tabs? Including searches done from Special:Search? It has been done at Wikia. See thread. I understand that some people will not want that. That is why a different JS import page would be needed. I sometimes do multiple detailed custom searches, and I like to see all the results in different tabs. It helps me find more of the files and categories I need. For example, I will search for files with various searches. Then I will search for categories with various searches. Then I will put files in various categories. I x-out search result pages I don't need. This saves me time, since I don't have to open new blank Special:Search pages, and then copy and paste the search terms, reset all the check boxes, etc. I only have to make minor adjustments if all search results open in new tabs from Special:Search.
- I like that you are doing this via imports. That way it can be adjusted as necessary (hopefully by somebody) in the future as the need arises, and as the MediaWiki software changes.
- Have you tested it? I thought (when I tested it) it also covered searches done from Special:Search. Rd232 (talk) 14:53, 23 March 2012 (UTC)[reply]
- Yes. It only opens new tabs when search is done from the top-right search form. Special:Search opens in the same tab. --Timeshifter (talk) 14:57, 23 March 2012 (UTC)[reply]
- Aha. Well it is a different HTML ID (...obviously...) so I've added that - should work now (did in my test just now, at least). I've added it to the existing script as I think it's logical for the two behaviours to go together, and we can always create an additional script for just the searchform later if people want it. Rd232 (talk) 15:08, 23 March 2012 (UTC)[reply]
- Yes, it is working now. I agree about the second page only if there is demand. Links under the Special:Search form (Content pages, Multimedia, Help and Project pages, Everything, Advanced) open in the same tab. But that is OK since they are links. I can always right-click them to get new tabs. --Timeshifter (talk) 15:19, 23 March 2012 (UTC)[reply]
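For readers wondering what such a script does in principle, the following is a minimal sketch and not the contents of MediaWiki:Search-results-new-tab.js. It relies only on the standard HTML `target="_blank"` form attribute. The form IDs are assumptions (the thread says only that the sidebar form and the Special:Search form have different HTML IDs), and the function name is hypothetical.

```javascript
// Assumed IDs: "searchform" for the top-right search box, "search" for the Special:Search form.
const ASSUMED_SEARCH_FORM_IDS = ["searchform", "search"];

// Marks each search form that exists on the page so its results open in a new tab.
// Returns the IDs that were actually found and changed.
function openSearchFormsInNewTab(doc, formIds = ASSUMED_SEARCH_FORM_IDS) {
  const changed = [];
  for (const id of formIds) {
    const form = doc.getElementById(id);
    if (form) {
      form.setAttribute("target", "_blank");
      changed.push(id);
    }
  }
  return changed;
}

// In a browser, apply it to the current page; elsewhere (e.g. a test runner) do nothing.
if (typeof document !== "undefined") {
  openSearchFormsInNewTab(document);
}
```

Passing a shorter ID list, for example only the top-right form, would give the narrower behaviour mentioned above without needing a second script.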
I noticed a bug when using this script (on Google Chrome 18.0.1025.151):
- Type "test" in the search box
- Press ENTER (this will open the page in a new tab)
- While the new tab is loading, press CTRL+SHIFT+TAB to return to the previous page. Unexpectedly this will open one more tab with the same search results instead of just going back to the page we were reading.
Helder 17:35, 11 April 2012 (UTC)
- For those who are following this the bug discussion has moved to here: MediaWiki talk:Search-results-new-tab.js --Timeshifter (talk) 13:27, 13 April 2012 (UTC)[reply]
Search link in sidebar
I like to be able to go to Special:Search directly, and do custom searches. Currently that is impossible from a Commons page. One has to use a bookmark, or cover up the current page by clicking on the search icon at the end of the search bar.
See previous section about opening search results in a new tab. If that were enabled it would solve this problem too. I could click on the search icon and Special:Search would open in a new tab. I would still like a search link in the sidebar though. Most readers will not think to click the search icon to get to advanced search. --Timeshifter (talk) 22:50, 20 March 2012 (UTC)[reply]
- I would still like a search link in the sidebar though. Most readers will not think to click the search icon to get to advanced search. - I don't understand that. Advanced search options are also available from the search results page. Rd232 (talk) 10:13, 21 March 2012 (UTC)[reply]
- I want a direct link in the sidebar that would allow me to get to a new tab with the Special:Search page. Call the link "Advanced search" or something. Clicking the search icon at the end of the search form at the top right of the page currently takes one to Special:Search, but not to a new tab. Right-clicking the search icon does not help. If a direct link to advanced search was in the sidebar, then I could right-click it to get to a new tab. --Timeshifter (talk) 07:04, 22 March 2012 (UTC)[reply]
Done - with the script in the section above, you can click the magnifying glass to get Special:Search in a new tab. Rd232 (talk) 13:49, 22 March 2012 (UTC)[reply]
- Great! Thanks. --Timeshifter (talk) 14:38, 23 March 2012 (UTC)[reply]
Advanced search link in sidebar
Most readers do not know much about how to find stuff on the Commons. They try the search form, but soon see that it covers up the page they are looking at. Many people stop using the search form after that, and use Google Toolbar instead.
Google Toolbar does not have a separate category search though. Special:Search does. Google Toolbar does not allow one to pick and choose namespaces to search. Special:Search does. Also, Google no longer makes its toolbar for Firefox. So putting an "advanced search" (Special:Search) link in the sidebar would solve many problems. --Timeshifter (talk) 05:37, 23 April 2012 (UTC)[reply]
Bugzilla wiki for Wikimedia has advanced search link in sidebar
See https://bugzilla.wikimedia.org - It has an advanced search link in the sidebar. Is there any reason this cannot be done on the Commons?
Or can a gadget allow one to place this link in the sidebar? --Timeshifter (talk) 10:57, 7 July 2012 (UTC)[reply]
A category-only search form or category adder on all pages
I don't know why this was not done long ago. Putting stuff in categories is the number one thing to do on the Commons. The drop-down suggestions that show up when one starts typing in the search form are invaluable. We need a way to search only for categories from the search form on every page, and a way to copy category names from the dropdown menu. It needs to be hard-wired by default, and not just an add-on like HotCat.
Wikia has something that might be adapted, too. The Wikia category adder is default installed, and easy to use. Between HotCat and the Wikia extension, something even better could be created. --Timeshifter (talk) 02:16, 21 March 2012 (UTC)[reply]
- (i) being able to limit search to category names in the search form would be useful, yes. (ii) HotCat could be activated by default. However I'm not sure this is a good idea; brand new users with no familiarity with the category system may use it to miscategorise, because it makes it so easy to add categories. And once (mis)categorised, files no longer show up in "uncategorised" tracking categories. (iii) can you explain the difference between Wikia category adder and HotCat (and/or provide a link)? Rd232 (talk) 09:55, 21 March 2012 (UTC)[reply]
- http://help.wikia.com/wiki/Help:CategorySelect - see the bottom of that page, and click "Add category". Start typing, and suggestions will show up. It is a monumental time saver. It is much better than HotCat in my opinion. Much simpler, and more intuitive. I agree that it might be better to leave HotCat as an option. But CategorySelect could be installed by default. Nothing happens until one clicks the "Save" button, and people will know whether the category relates or not. So there will not be much totally incorrect categorization. HotCat does more, but it has many flaws, mainly that it is incomprehensible at first. It needs an option for exposed text labels. --Timeshifter (talk) 07:01, 22 March 2012 (UTC)[reply]
- Well, having gone to the effort of signing up for a Wikia account so I can test it, I'm disappointed to find that CategorySelect is basically just a prettier version of HotCat (using icons and nice design). The only functionality difference I can see is that it allows you to choose a category sortkey from a popup window, without using the | pipe syntax. That's friendlier (though I'm not sure if the pipe syntax works in addition - it's more efficient). Rd232 (talk) 09:34, 22 March 2012 (UTC)[reply]
- HotCat baffles me almost every time I try to use it. I have to figure it out again almost each time I use it. It is definitely not intuitive. I use CategorySelect frequently, and it has always been easy from the very beginning. People uploading images to the Commons oftentimes only upload one image, or maybe a few images. So they are complete newbies. Many things about Wikia are a total pain, but CategorySelect is one of the good things they have created. I believe all their added MediaWiki code is under CC-BY-SA. --Timeshifter (talk) 18:50, 23 March 2012 (UTC)[reply]
Open search dropdown links in a new tab
When I click on a link in the dropdown menu that shows up while entering search terms, the page opens in the same tab. I would like the page to open in a new tab.
If this can not be focused only on those dropdown links, then I would like all links on the Commons to open up in new tabs for me. --Timeshifter (talk) 18:04, 24 March 2012 (UTC)[reply]
- Those are the search suggestions. Using the script I made based on your suggestion above, in Firefox, clicking a search suggestion opens the page in a new tab. Does it not do that for you? Rd232 (talk) 20:44, 24 March 2012 (UTC)[reply]
- Yes. I must have remembered incorrectly, or been trying on Wikia or Wikipedia. For others reading this see:
- Wikia: Admin Forum:Open search results in new tab. JS or CSS?
- Wikipedia: en:Special:Search
- Commons: Special:Search
- And here again is the JS import on the Commons that fixes this problem:
- MediaWiki talk:Search-results-new-tab.js
- MediaWiki:Search-results-new-tab.js - Can a note be added to the top of this page pointing to the talk page for correct implementation?
- Will this JS import work on Wikipedia too? What code would I use to import from the Wikia JS page to Wikipedia? --Timeshifter (talk) 15:52, 25 March 2012 (UTC)[reply]
- The JS will work on Wikipedia too. You can ask a WP admin to copy the script to the equivalent location, or copy it to your userspace (either to a subpage, or directly into your common.js). You could also use the Commons script directly in your WP common.js using
importScriptURI ('https://commons.wikimedia.org/wiki/MediaWiki:Search-results-new-tab.js');
Not sure what you mean by a note at the top of the page. I'm finding it quite useful, though, so I think I will propose it become a gadget. Rd232 (talk) 16:18, 25 March 2012 (UTC)[reply]
- Great! I find it very useful. See the note at the top of this page:
- en:User:Yair rand/interwikiwatchlist.js
- I tried importing your JS to Wikipedia: en:Special:MyPage/common.js (en:User:Timeshifter/common.js)
- So far it is not working. Maybe it needs some time to take effect. I did a hard refresh and a purge. See:
- http://help.wikia.com/wiki/Help:JavaScript_and_CSS_Cheatsheet#Caching_Issues --Timeshifter (talk) 19:55, 25 March 2012 (UTC)[reply]
- Maybe. If it still doesn't work, copy the script itself to Wikipedia, so you're not using importScriptURI. Rd232 (talk) 22:09, 25 March 2012 (UTC)[reply]
- It still does not work hours later using the import method. It does work when the JS is directly pasted into my JS page: en:User:Timeshifter/common.js --Timeshifter (talk) 00:42, 26 March 2012 (UTC)[reply]
- A way has been found to import the JS from the Commons to Wikipedia. See discussion here:
- MediaWiki talk:Search-results-new-tab.js --Timeshifter (talk) 20:50, 5 May 2012 (UTC)[reply]
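The fix actually adopted is recorded on the linked talk page and is not reproduced here. As a general sketch only: a `/wiki/Page_name` address returns a rendered HTML page rather than script source, which is one plausible reason a plain import of that address does nothing. A widely used pattern is to request the page's raw source with `action=raw&ctype=text/javascript`. The helper name `rawScriptUrl` and its parameters are made up for illustration.

```javascript
// Build a URL that returns a wiki page's raw source served as JavaScript.
// `host` and `title` are illustrative parameters, not part of any MediaWiki API.
function rawScriptUrl(host, title) {
  return 'https://' + host + '/w/index.php?title=' +
    encodeURIComponent(title) +
    '&action=raw&ctype=text/javascript';
}

var url = rawScriptUrl('commons.wikimedia.org', 'MediaWiki:Search-results-new-tab.js');

// In a user's common.js the URL would be handed to the loader.
// The guard lets the snippet also run outside a wiki page.
if (typeof mw !== 'undefined' && mw.loader) {
  mw.loader.load(url);
}
```

Pasting the script directly into common.js, as described above, avoids the cross-wiki request altogether.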
Category search at top of Special:Search
Special:Search does not have category search in the line of items at the top. Category search is the number one thing people need.
This line of items:
- Help • Search categories • Show other tools
only shows up after a search is made. It should be up before a search is made too. --Timeshifter (talk) 05:41, 23 April 2012 (UTC)[reply]
Searches should go to search results list and not to a page
Wikia has just implemented this, and most people like the change. See:
Many people are irritated when search takes them to a page, when what they really wanted was a list of search results. And since there is no easy way to get to the Special:Search page most casual readers give up their searching.
Most casual readers do not know that Special:Search exists since it is not linked anywhere on Commons pages. The only way they know of it is if they happen to see the advanced search options at the bottom of a search results page. Even then they may not know that Special:Search exists as a separate page. And they still do not know how to get to it so that they can bookmark it. But why should people have to bookmark Special:Search? It should be linked from the sidebar of every page. --Timeshifter (talk) 22:12, 26 April 2012 (UTC)[reply]
Artificial Intelligence
There have been many developments in image search, particularly in recent years. Aside from the proposals here, perhaps this should be considered as well. You can read my Wikimania abstract or report for more info. -- とある白い猫 ちぃ? 01:55, 7 June 2012 (UTC)
Cat-a-lot should work from search result lists too
I could categorize far more images if Cat-a-lot worked from search result lists, too. I often see many images in search result lists and know exactly where they should be categorized. If I could use Cat-a-lot to do so I would do it. Currently, I usually don't have the time to open each image page individually to change the category. If I could select a bunch of images out of a search result list, and then put them in a category via Cat-a-lot, then it would be fast and easy.
Of course, Cat-a-lot would have to work so that it would only add the category to those images that aren't already in the category. --Timeshifter (talk) 14:14, 22 June 2012 (UTC)[reply]
- I had the impression this is already the case. -- RE rillke questions? 13:01, 30 June 2012 (UTC)[reply]
- I don't think so, though I am not sure. I see this: "Resolve double categories". It is listed under "Open bugs & features" at Help:Gadget-Cat-a-lot. I am going to add a link pointing here. --Timeshifter (talk) 07:31, 1 July 2012 (UTC)[reply]
- Why don't you simply test it? After opening Cat-a-lot, click on the description text which gets a green background and Cat-a-lot does not add duplicate categories (at least if the old categories are spelled correctly). -- RE rillke questions? 09:52, 2 July 2012 (UTC)[reply]
- OK. I tested it, and it does not add duplicate category names in the wikitext. I was having a hard time wrapping my mind around a way to set up a test. But I figured it out. I tested it by adding the same new category to 2 images in an existing category. I did not remove their original existing category. Then I used cat-a-lot to move the 2 images back to the original existing category. It did so fine without duplicating the wikitext for the original existing category.
- I removed the "Resolve double categories" bug from Help:Gadget-Cat-a-lot. Someone else had added that bug originally, but did not leave their signature. --Timeshifter (talk) 10:36, 3 July 2012 (UTC)[reply]
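The behaviour confirmed in this thread (a category is added only when the page does not already carry it) can be sketched as follows. This is not Cat-a-lot's actual code; the function names are made up, and the normalization assumes a wiki such as Commons where underscores equal spaces and the first letter of a title is case-insensitive. A misspelled existing category would not be detected, which matches the caveat given above.

```javascript
// Normalize a category name: underscores count as spaces and the first
// letter is capitalized (assumed title rules, as on Commons).
function normalizeCategory(name) {
  var cleaned = name.replace(/_/g, ' ').replace(/\s+/g, ' ').trim();
  return cleaned.charAt(0).toUpperCase() + cleaned.slice(1);
}

// List the categories already present in a page's wikitext,
// ignoring any sort key after the pipe.
function existingCategories(wikitext) {
  var pattern = /\[\[\s*Category\s*:\s*([^\]|]+)(?:\|[^\]]*)?\]\]/gi;
  var found = [];
  var match;
  while ((match = pattern.exec(wikitext)) !== null) {
    found.push(normalizeCategory(match[1]));
  }
  return found;
}

// Append the category only when it is not already there.
function addCategory(wikitext, category) {
  var target = normalizeCategory(category);
  if (existingCategories(wikitext).indexOf(target) !== -1) {
    return wikitext;
  }
  return wikitext + '\n[[Category:' + target + ']]';
}
```

Run over a batch of pages selected from a search result list, a check of this kind leaves pages that already have the category untouched.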
Use the CLDR extension to create multi-language search
The CLDR extension has the names of languages, countries, continents, currencies and such in many languages. Since English is the most common language used on Commons, the idea is that if a user searches for a country, continent or currency in another language, they would also get search results for the equivalent English word.
This idea could also be expanded to search for the same word in multiple languages at the same time, as long as the word is in the CLDR database.
In order for this to work, some coding changes to the Commons search would be needed. First, the search would need to recognize the words in the CLDR database. If there is a match, it would also search for the same word in another language, or in several languages. --Snaevar (talk) 12:53, 1 July 2012 (UTC)[reply]
- hm, interesting; something like this multi-lingual searching has been mentioned before, but CLDR as a way to support it is helpful. We could certainly file a bug for it, but search development is unfortunately very neglected. mw:Extension:CLDR is already installed by the way, which is a point in favour of the idea. Rd232 (talk) 13:33, 1 July 2012 (UTC)[reply]
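The lookup step described in this proposal can be sketched as a simple query expansion. The table below is a tiny hand-written sample standing in for CLDR data, and `expandQuery` is a hypothetical helper; how the real extension exposes its names to the search backend is not covered here.

```javascript
// Hypothetical sample of localized country names keyed by language code.
// Real data would come from CLDR; these few entries are for illustration only.
var countryNames = {
  en: { DE: 'Germany', IS: 'Iceland' },
  de: { DE: 'Deutschland', IS: 'Island' },
  fr: { DE: 'Allemagne', IS: 'Islande' }
};

// Return the search term plus its equivalents in the other languages.
function expandQuery(term, names) {
  var wanted = term.toLowerCase();
  var results = [term];
  Object.keys(names).forEach(function (lang) {
    Object.keys(names[lang]).forEach(function (code) {
      if (names[lang][code].toLowerCase() !== wanted) { return; }
      // Match found: collect the same territory code in every language.
      Object.keys(names).forEach(function (other) {
        var candidate = names[other][code];
        if (candidate && results.indexOf(candidate) === -1) {
          results.push(candidate);
        }
      });
    });
  });
  return results;
}
```

A search for "Deutschland" would then also be run for "Germany" and "Allemagne", while a term that is not in the table is passed through unchanged.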
Ctrl-click to open search in new tab
I found a better method to open search buttons in new tabs. See:
Add the following JS to Special:MyPage/common.js
/* Ctrl-click to open search buttons in new tab */
$('#searchform, #search, #powersearch, #searchbox, .search-types, #search-types').bind('keyup keydown mousedown', function (e) {
    $(this).attr('target', e.ctrlKey ? '_blank' : '');
});
Ctrl-click now works for opening both search buttons and search links in new tabs. See Special:Search
In Special:Search the search options initially found below the search form are in the form of links. The search form itself uses a submit button. As does the search form at the top right of every page. And the search forms for searching archives.
In Firefox and Internet Explorer one can ctrl-click to open links in new tabs. One can shift-click to open links in new windows. Since this is how most people have things set up for their browser, this Commons gadget conforms with this.
Ctrl-click also works for opening archive search in new tabs. Try the archive search here:
- Template:Village pump archives
- Commons:Village pump/Proposals/Header
- Commons:Village pump/Header --Timeshifter (talk) 20:11, 11 July 2012 (UTC)[reply]
Add Google site search option
In English Wikipedia preferences there is this gadget: "Add a selector to the Wikipedia search page allowing the use of external search engines."
When this gadget is enabled it allows one to choose from a dropdown menu of various search engines at en:Special:Search.
It could easily be expanded to also do site searches:
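As a rough sketch of what a site search amounts to (not the gadget's code): the query is prefixed with the `site:` operator and passed to the external engine's search URL. The helper name `siteSearchUrl` is made up for illustration.

```javascript
// Build a Google search URL restricted to one site via the `site:` operator.
// `siteSearchUrl` is an illustrative helper, not part of any gadget.
function siteSearchUrl(query, site) {
  return 'https://www.google.com/search?q=' +
    encodeURIComponent('site:' + site + ' ' + query);
}

var example = siteSearchUrl('lighthouse', 'commons.wikimedia.org');
```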
This would be especially useful during times when Wikimedia's search slows down or stops. See bugzilla:35691 and this discussion section higher up:
Ever thought about using social tagging approaches on Commons?
Social tagging (folksonomy) might provide a supplement to existing categories that enhances searchability of images. See for example the final report of the steve.museum project. Social tagging can be combined with serious games. Is anybody aware of initiatives or experiments related to social tagging and Wikimedia Commons? --Beat Estermann (talk) 13:06, 6 December 2012 (UTC)[reply]
- Well, to me categorisation is a form of social tagging − we may lack the fun part in it though :þ. Anyway, the only discussion I remembered on that subject took place on Commons-l. Jean-Fred (talk) 13:30, 6 December 2012 (UTC)[reply]
- The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.