User:LennardHofmann/GSoC 2022/Report 3
In the last two weeks, I cleaned up the Lua code of {{Wikidata Infobox}} and added some features requested on its talk page. As promised in the last report, I released the new infobox for community testing; see here for the announcement and changelog. I also tested the infobox on all pages that link to the sandbox module, and it seems to perform well: most category pages take roughly 2 seconds to load.
Wikidata performance
You might wonder why the new infobox performs the "expensive" call mw.wikibase.getEntity('Q42') instead of calling the "non-expensive" function mw.wikibase.getBestStatements whenever needed.
The short answer is that we have to call getEntity in order to put all labels and descriptions into an invisible HTML element so that searching for the page becomes easier. But getBestStatements (and getAllStatements) are actually also pretty slow on their first run. Check this out:
local starttime = os.clock()
mw.wikibase.getBestStatements('Q42', 'P31') -- usually takes 25–45 ms
print(os.clock() - starttime)
So why isn't getBestStatements marked as expensive? Because it's pretty fast when called on a Wikibase entity that has already been loaded:
local item = mw.wikibase.getEntity('Q42') -- usually takes 50–90 ms
mw.wikibase.getBestStatements('Q42', 'P31') -- takes 0.7 ms
Or alternatively:
mw.wikibase.getBestStatements('Q42', 'P18') -- usually takes 25–45 ms
mw.wikibase.getBestStatements('Q42', 'P31') -- takes 0.7 ms
As you can see, using getEntity still comes with a significant performance cost as it needs to convert the whole entity into a Lua table, but if you're calling WikidataIB._getValue over 300 times, using getEntity might save time overall, as it allows you to avoid unnecessary calls with if item.claims[pid].
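A minimal sketch of that guard, not the infobox's actual code: the entity is loaded once and properties without claims are skipped before any further work is done. The helper name collectClaims is hypothetical, and the stub for mw is only an assumption so that the snippet can also be tried outside of Scribunto.

```lua
-- Scribunto provides the global `mw`; the fallback below is only a stand-in
-- (an assumption for illustration) so the sketch runs in plain Lua too.
local mw = mw or {
	wikibase = {
		getEntity = function(qid)
			return { id = qid, claims = { P31 = { { rank = 'normal' } } } }
		end,
	},
}

-- Hypothetical helper: returns the statements of every requested property
-- that the entity actually has, skipping the ones it lacks.
local function collectClaims(qid, pids)
	local item = mw.wikibase.getEntity(qid) -- expensive, but done only once
	local found = {}
	if not item or not item.claims then
		return found
	end
	for _, pid in ipairs(pids) do
		local statements = item.claims[pid]
		if statements then -- the `if item.claims[pid]` guard from the text
			found[pid] = statements
		end
	end
	return found
end

local result = collectClaims('Q42', { 'P31', 'P18', 'P373' })
```

With the stub, only P31 ends up in result; on a real wiki the table would contain whichever of the requested properties the item has.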
Luckily, fetching labels and sitelinks from unloaded entities is much faster than fetching statements, especially if the entity is large. However, if you want to generate a wikilink to a Commons category based on a QID, you often need to fetch the entity's topic's main category (P910), category related to list (P1754), and Commons category (P373) statements (see d:User:Mike Peel/Commons linking for details). This is why generating Commons links from large entities is slow.
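To illustrate why this is costly, here is a rough sketch of such a lookup. The lookup order is an assumption for illustration and is simpler than what d:User:Mike Peel/Commons linking describes; firstValue and commonsCategory are hypothetical helpers, and the stub for mw only exists so the snippet runs outside of Scribunto. Note that each getBestStatements call may hit an unloaded entity.

```lua
-- Scribunto provides the global `mw`; this stub is a stand-in for testing.
local mw = mw or {
	wikibase = {
		getSitelink = function(qid, site)
			if qid == 'Q100' and site == 'commonswiki' then
				return 'Category:Example'
			end
		end,
		getBestStatements = function(qid, pid)
			if qid == 'Q1' and pid == 'P910' then
				return { { mainsnak = { snaktype = 'value',
					datavalue = { value = { id = 'Q100' } } } } }
			end
			return {}
		end,
	},
}

-- Hypothetical helper: value of the first best statement, if it has one.
local function firstValue(qid, pid)
	local statements = mw.wikibase.getBestStatements(qid, pid) or {}
	local snak = statements[1] and statements[1].mainsnak
	if snak and snak.snaktype == 'value' then
		return snak.datavalue.value
	end
end

-- Hypothetical helper: returns a Commons category title for a QID, or nil.
local function commonsCategory(qid)
	-- Cheap: the item's own Commons sitelink.
	local link = mw.wikibase.getSitelink(qid, 'commonswiki')
	if link and link:find('^Category:') then
		return link
	end
	-- Slow on unloaded entities: follow P910 and P1754 to a category item.
	for _, pid in ipairs({ 'P910', 'P1754' }) do
		local target = firstValue(qid, pid)
		if type(target) == 'table' and target.id then
			link = mw.wikibase.getSitelink(target.id, 'commonswiki')
			if link and link:find('^Category:') then
				return link
			end
		end
	end
	-- Last resort: the Commons category (P373) string.
	local name = firstValue(qid, 'P373')
	if type(name) == 'string' then
		return 'Category:' .. name
	end
end

local category = commonsCategory('Q1')
```

In the worst case this fetches three different statements from the same unloaded entity, which is exactly the pattern that gets slow for large items.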
TL;DR: getEntity isn't much slower than a first getBestStatements call on an unloaded entity. Avoid fetching statements from unloaded Wikibase entities when possible, but fetching labels and sitelinks is fine.