File:February 2023 Wikimedia Enterprise API community conversation meeting.webm

From Wikimedia Commons, the free media repository

Original file (WebM audio/video file, VP8/Vorbis, length 1 h 39 min 58 s, 1,218 × 540 pixels, 690 kbps overall, file size: 493.05 MB)

Summary

Description
English: Recording of the Wikimedia Enterprise public "office hours" conversation video call with project staff, focusing on the 2022 Financial report & Product update published earlier that week.

The approximate timecodes of the questions discussed during the meeting were:

  • 1' Introductions
  • 5' Summary of report
  • 10' Financial accounting for free-users
  • 25' Ratio of Free/Paid queries on the API
  • 27' Meaning of 'optimization' in the software update
  • 32' Integration of Wikidata
  • 36' Uptime/availability stats
  • 42' Future revenue prospects
  • 49' Principles of commercial project in a nonprofit movement
  • 56' Effect on corporate donations
  • 62' API development plan
  • 65' Development expectations from customers
  • 70' Community considerations in development process
  • 75' Editor/reader privacy
  • 84' Movement communications in general
  • 95' Where has this been discussed before
  • 99' Concluding remarks

Some links which were referenced during the call: Product roadmap; WMF Financial reports (Form990); live status and incident history; project principles; OpenFuture.eu blogpost; Where has this been discussed before? FAQ.


Transcript [Machine generated and unedited]


0:04
hello good evening or good day my name is Liam Wyatt known as user Wittylama
0:10
this is the February 2023
0:15
office hours community call for the Wikimedia Enterprise project this is in
0:21
the immediate week of the publication of
0:29
the inaugural financial report of this project and also a product update
0:36
for some future software changes that are being introduced and this is the call for anyone in the Wikimedia
0:43
Community who has any questions and wishes to ask them live the video is
0:48
being recorded and will be uploaded to Wikimedia Commons as a result this session follows in the scope of the uh
0:56
safe space policy of the Wikimedia Foundation anyone can obviously have their camera
1:02
off or on or write comments in the chat as they like I will put a link in the chat and also
1:09
in the Wikimedia Commons file the link to the financial report
1:16
and I will try and keep notes of time code notes for when each question is listed so it's easy for people to come
1:22
back afterwards uh introducing the people from the Wikimedia Enterprise team in the room
1:28
perhaps if you could just uh introduce yourselves briefly those of you who who wish to
1:34
starting with Lane sure and hi folks I am Lane Becker I am
1:39
the head of the Wikimedia Enterprise project at the foundation
1:45
um my primary focus is working with many of the folks here on the operational aspects of the business that includes
1:51
sort of figuring fitting it into the sort of the workings of the larger Foundation including financially so
1:57
quite involved alongside Liam with the creation of the financial reporting
2:03
uh uh and then just generally any aspect of business operations or sales
2:10
I'll pass it to Amy hi everybody
2:16
um I'm Amy Muller and I am the manager of customer success and support for Wikimedia Enterprise and that is exactly
2:24
what it sounds like and Ricardo
2:30
yes hi everybody I'm really relatively new Joiner in the team I'm a software
2:36
engineer in the Wikimedia Enterprise team and that's it for me Chuck maybe
2:44
hello uh my name is Chuck and I am product marketing primarily and also
2:49
helping and around product itself and thank you
2:58
hello everyone my name is and I'm uh not operations manager on the
3:06
Enterprise team can I should also point out we have Dennis Bartel here uh who's not directly
3:15
with the Enterprise team but is a Wikimedia Foundation team member responsible for
3:21
foreign yeah I'm working with the movement strategy and governance team and I'm
3:27
here to support and assist in cases yeah and when it comes around with the German
3:32
community maybe to help and clarify um wordings and similar
3:37
that's what I pay for so we do actually in this team uh present here uh speak obviously English
3:45
French German Italian Portuguese possibly other language Dutch
3:52
um so if anyone has questions in those languages they can ask them natively if
3:58
they would like that all being said I passed the floor to uh anyone who has a burning question
4:06
of already in their mind from the financial report
4:11
otherwise I can describe the financial report verbally for those who have not read it yet or
4:17
have not seen it yet but I want to stop talking myself if people who'd come to
4:22
this meeting with a idea of a question in mind already
4:28
foreign
4:38
in that case I will describe what we have done here in this
4:44
report and hopefully that will generate some concerns or questions we do have
4:50
some questions pre-listed from the um mailing list and from the talk page so I
4:58
want to reiterate those uh later but I want to make sure people who are physically here in this call get
5:05
the first opportunity so please raise your hand if you have a question whenever you feel like it otherwise the
5:11
purpose of the financial report was that the Wikimedia Enterprise project is operated by a completely
5:19
owned commercial subsidiary organization of the Wikimedia Foundation a non-profit
5:25
organization both in America and as a commercial activity
5:32
uh it is has its own Financial requirements its own Financial purpose
5:39
to make a profit to give money back to the Wikimedia movement but
5:45
because that is entirely owned by the Wikimedia Foundation a non-profit organization there is no standard
5:53
required form for that financial report there is for the Wikimedia Foundation
5:59
for non-profit organizations that is called the form 990 form 990 which is
6:05
published by the Wikimedia Foundation every year but this thing the Wikimedia Enterprise does not have an equivalent
6:11
so we wanted to make sure we had something published that was clear and
6:17
just for this project not hidden or unclear in the larger
6:24
Wikimedia Foundation report for the needs of the tax department in
6:31
America the IRS and for other legal and financial accountability requirements
6:36
all of this information is in the form 990 but that is not an easy thing to
6:42
read especially if you are not an American accountant uh it doesn't does not actually help to explain anything so
6:49
this report is written as a narrative it describes rather than just lists numbers
6:55
in a clear structure the key messages
7:01
really are that we are approximately covering costs uh some months were
7:09
slightly above the cost of that month some months the revenue was slightly below the cost of that month but really
7:15
within a quite a narrow percentage range so we are at the moment
7:21
equal for Revenue per month and expense per month um this we hope will change in the
7:30
positive next year but we'll uh we can't promise that but that is the expectation
7:38
the uh taxation is a common uh question or
7:44
common request the short answer for the quick for the issue of Taxation is we did not pay any
7:50
tax because we were not yet profitable over the course of the entire year there is a much longer answer for that which
7:56
covers state and federal uh different uh rules like that but the the simple
8:03
answer is you don't pay tax if you're not yet profitable and we are not yet profitable because we have only been
8:09
existing for a year as a commercial activity the
8:15
organization itself the LLC limited liability company is entirely owned by the Wikimedia
8:23
foundation and all of us myself Lane the others in this team here are employees of the Wikimedia Foundation it exists uh
8:31
entirely to be the legal thing that signs contracts with
8:37
commercial organizations and owns the risk of the legal risk of that promise
8:43
this ensures that the Wikimedia Foundation itself cannot be sued and
8:49
lose a billion dollars if we do something wrong that is a legal protection uh it is not
8:57
it makes zero difference from a financial perspective and it makes zero difference from a reporting
9:05
transparency perspective which state of America that LLC is registered
9:12
in this is a common question it is registered in the state of
9:18
Delaware because that is the state where most of corporate law in America exists
9:25
and therefore that is where contracts are most well understood to their
9:30
meaning and their implications if you need to go to court
9:35
it's not a lawyers like to know in advance what words mean
9:42
that is a summary of the of the financial section I have not touched the product section we can come back to that
9:48
there was a question from the mailing list but I do see a question in the chat
9:54
about profitability uh yeah okay you could you would feel
9:59
free to answer yourself directly okay good so yeah I have a question
10:06
about the prophet of her base
10:11
profitability so now it's all right so yeah because I think if you have none if
10:20
you offer your service for free for more than a trial use or some small
10:27
use I think then it is necessary that that
10:33
you have a kind of compensation for that or from my point of view it would be
10:38
great if then the Wikimedia Foundation would pay for these use cases for example for
10:46
the the things or services that are offered to the internet archive amount
10:52
to the Wikimedia LLC to make it transparent because I have looked at the form 990 and other then there is a
11:01
section about contributions to organizations and there is a long list
11:06
of different organizations and what the amount they receive from the
11:12
Wikimedia Foundation to support the work and I have seen that there's also
11:17
possible to to use different methods to evaluate the amount and so
11:26
from my point of view at least I think it would be possible also to evaluate the
11:32
contribution given to the internet archive for using the Wikimedia
11:39
Enterprise API yeah and I wish that this is published because I think then the
11:47
Wikimedia LLC would be profitable because why at least from my it's a
11:53
speculative thing I can't guarantee that but I think they use it enough as that it
12:01
would be then profitable
12:08
I can I think either lane or Amy as the
12:13
customer service support manager uh might have different angles on this answer themselves I see Lane is thinking
12:22
thinking hard there uh my immediate response to this is it I I think it
12:30
would be technically possible to account in a financial sense for the value of
12:35
the service provided to hex external XYZ external organization
12:43
uh but and that might fall under an accounting
12:49
law or an accounting rule from the finance department the fact that they
12:55
have not accounted for this currently means that is
13:01
not a requirement otherwise they would do it there are various weird accounting
13:08
laws I've never heard of that because I'm not an accountant that we have to follow
13:13
um so the fact that free contracts are not formally counted as
13:21
a cost means that it is not an accounting standard to do it that way it might be
13:28
possible to describe uh in our um
13:34
report the the the kinds of usages but it is
13:41
important to also note that we cannot promise or just announce on
13:48
behalf of anyone else what they are doing with Wikimedia content without
13:54
their permission uh in exactly the same way as it is the
14:00
privacy of a reader that you can read whatever Wikipedia
14:05
article you want and the Wikimedia Foundation is not going to publish your
14:11
data without your permission you can write a blog post if you want we will
14:18
write and this is uh Chuck's uh work write uh blog posts and stories case
14:25
studies about how different customers free or or paid are using their content
14:31
but it's not in the context of a financial accounting regulation but a
14:37
use case uh from the lane can you elaborate yeah
14:44
well I was just going to say I actually don't know I just want to be upfront I
14:50
don't know how our finances will appear like how our part of the look we were
14:57
working on our own Financial transparency report to be specific about our finances but I do not yet know how
15:05
they are going to be accounted for on the form 990 that the foundation puts
15:10
out so it's unclear to me the degree to which it will be sort of separate or it will be Blended in it's certainly
15:17
something that we can try and understand better I think Liam in advance about actually happening but that's just as
15:24
one of the things that your question made me realize is I'm actually on because we tend not to
15:30
we have tended to focus mostly on just sort of making sure that we are clear about where the money is coming from and
15:36
how it is coming in to us we have been because it is out of our purview to say how the money gets spent that's on
15:42
that's for the foundation board we've been much less focused on sort of what happens with the money and how the money
15:48
gets reported after it passes through us um just just something to me just
15:54
something I wanted to be upfront about I should add that uh adding
16:00
it's it's quite possible that this could end up in the form 990 in the way you describe as an in-kind contribution a
16:08
grant uh effectively a grant to an external organization Maybe
16:13
but if we described that as profitability for the
16:21
Wikimedia Enterprise project I believe that would be understood as
16:27
trying to cheat because they are not giving us any money it is free so we are giving them
16:36
something of value but
16:41
trying to describe giving away something to us for free as improving our
16:48
profitability would be maybe that would work in an
16:54
accounting system but I think the community
16:59
would understand that as cheating because we are not actually getting any
17:05
money from it and so it does not help our profitability if anything it is an
17:10
expense because we have to provide something for free well that yeah just
17:15
to I want to give you a chance to respond I'll go but just to to hot pile on to
17:21
that comment I think um that is how we think about it but since I mean when
17:27
we're doing Financial estimation for the project we are actually making the assumption that there's going to be some
17:33
amount of money that we need to set aside to support um free use of the service whether it's
17:41
because it's a um you know a sort of a sister organization like internet archive that
17:47
we want to support or uh as I suspect might be the case in this coming year there's you know potentially other other
17:54
organizations or types of Partnerships where we'll just we'll see value for any number of reasons some tied to
18:01
Enterprise some tied to larger Mission goals I suspect to to um to not charging
18:06
we we just we've been baking that into our cost assumptions in the same way that if you still have a financial
18:12
report you saw that there's a um an overhead a 15% overhead charge
18:18
that we pay back which is pretty significant actually to the foundation in terms of our budget so there's just a
18:24
sort of an assumption that 15% of the money that we bring in just you know isn't counted in the way that we would
18:30
normally count it and uh I just I'm mentioning that because I
18:35
think we just sort of take all of these into account and then look towards profitability as a number that goes
18:40
beyond that and we feel I mean I feel like uh for me personally getting to this
18:47
this sort of Break Even moment that we're currently at as quickly as possible was sort of a very important
18:52
goal which is why we move towards it very quickly um the the next goal that we want to
18:58
move towards and I I can't say how fast we'll move towards it yet but the next goal we want to move towards is actually you know paying paying back the initial
19:05
investment so not just being profitable or uh achieving this so achieving profitability month over month so that
19:11
we know that there's more money coming in than we're spending month over month and then after that what we want to do
19:16
is say okay how quickly can we also pay back the initial investment I don't know how long that's going to
19:23
take us but that is from a goal perspective very much what we'd like to do next and quick quick as quickly as possible
19:31
but it takes it makes assumptions that that is again all with the assumption that there are these additional costs
19:36
such as supporting internet archive just baked into what we're doing
19:44
oh you're muted Liam sorry yes I am um I hope that answers the question in
19:49
sufficient detail hooker but you uh welcome to clarify or add anything
19:54
Dennis in particular if you feel there's something being Lost in Translation
20:00
um please speak up I guess you just um had it fine but
20:06
maybe hope you can let us know more I hope your question has been answered in detail to hold you
20:13
yes it was answer thank you yeah I I yeah this is an important aspect what
20:21
you mentioned with maybe it's it could seem like a way of cheating if gender
20:27
Prophet seems yeah if there is a transfer payment from the
20:33
Wikimedia Foundation to the Wikimedia LLC yeah the time what point I think and so yeah at the moment I don't have a
20:40
problem with it but I think at least I would be interested I think
20:47
it would be great if you find a way to publish
20:53
an amount but also I can't understand I don't think in some place yeah I think that's
21:00
important to pay attention to privacy and also think about what ranking is needed what not but I think it would be
21:06
interesting to get a bit uh uh and
21:12
understanding what part of the Services of are used for free and what
21:20
are and how much is it used through a contract where
21:26
organizations pay to I think that would be interesting if you publish their kind of ratios or
21:33
maybe to understand it a bit more yeah yeah do you mean in a sense of the
21:39
proportion of use which is by paying companies paying
21:46
customers compared to the proportion of use by free
21:52
companies and yes yes this is what I mean
21:59
uh Chuck might be able to get into that but my my guess is
22:04
not yet because we are too young to have any Baseline data
22:15
Chuck did you have anything you could speak to on the question of proportion of free versus paid over time
22:25
um I don't have anything substantial to add to that I don't have that data in front
22:31
of me and it's not something we're currently tracking as far as I know yeah I would say
22:37
uh the proportion here's what I would say the proportion of use for paying customers is much much much much much
22:44
more significant than it is for the free customers um in part that's because how a free
22:51
customer can access how a free customer can access it is
22:56
limited relative to our paying customers who have significantly more access to the data it can you know so for example
23:02
if you are using our snapshot API which is if you're familiar with our service is kind of the equivalent the commercial
23:09
equivalent of the the free Wikimedia dumps API um the free version of the snapshot API
23:15
allows you you can download it as many times as you want but it only updates once every 30 days or so once a month so
23:23
the likelihood that you're going to download that frequently is low because it's just the same data until it
23:29
refreshes a month later it's on a slower Cadence than the free dumps which refresh every two weeks but for our
23:35
paying customers you know they can download that once a day and they can also download hourly diffs uh so that's
23:42
a like a much more significant load and since we measure everything Based on data egress data out
23:49
um you know it's a as you can imagine even just on that one API of significantly more use for our paying
23:55
customers uh and that's you know uh equally true of the for example we have a um our real-time API which is what we
24:03
also call our fire hose API because it's the one that again if you're familiar with our free API Universe it's the one
24:09
that's that's comparable to the event streams AP or event stream API
24:14
um and that it's uh sort of firing all of the edits and all the changes that are happening across our projects
24:20
um that is not accessible to free users but it is accessible to some of our paying customers
24:26
um and so that's again a place where there's significantly more data usage so I I would say we could on the one hand I
24:33
could say we could try and get those numbers and pull them together but if I'm being honest I don't think they'd be very helpful useful partly because we
24:40
haven't had free customers for that long um and partly because I just like anecdotally I can tell you the
24:46
difference is going to be huge I should clarify the um
24:51
there's two categories we don't call them both free
24:56
um but from the general concept of the word there are two categories of free there is the
25:01
trial user the account on the website that anyone can sign up for and get the
25:08
monthly dump that's fine there's no um you don't have to talk to anyone to do it and you can also download the via the
25:16
existing Wikimedia Foundation dumps then there is the sort of special case free
25:22
version so if this is the internet archive for example they get access to
25:27
the same thing that Google has which is
25:33
as much frequency as they want so that is not automatically available
25:40
to for free to anyone in the world because it's a commercial service primarily but
25:45
secondly basically no one has a use case for that
25:51
scale the internet archive is a very rare organization because it has
25:57
a genuine need for that scale I cannot speak to
26:03
how they are using it uh yet or in the future that's their own internal systems
26:10
but there is basically no one who is a
26:16
non-profit organization or a volunteer or an individual even academic
26:21
who has a need for the right now everything
26:28
every change speed and volume commercial organizations like search engines do
26:35
need that do want that and they are paying for it because that's an expensive and difficult service to
26:41
provide and that is where all of the volume of data the volume of
26:48
commercial value comes from so the proportion of free
26:53
users um the the amount of data that is coming from free is very small not because it's
27:01
restricted but because there's no one who has that need for the
27:07
huge amounts of data that uh large commercial organizations require
27:15
since we are speaking uh on the the technical side
27:21
data volumes uh I hope that we can come back to you if you had a follow-up question on this
27:27
uh but for the financial perspective but I wanted to raise a question that was asked on the mailing list uh uh with
27:36
regards to the statement in the uh software update
27:43
saying I'm curious about the implications for optimization that was
27:49
mentioned in the the changes that have occurred over this last year and the word
27:55
optimization we have optimized that was written in the product update
28:03
perhaps Ricardo or chuck could speak to
28:08
what was meant by optimizing
28:14
yeah I can go ahead just a note before I go not that I would try to hide anything
28:19
I just wanted to give a note that I'm three months here so if I miss something I'm happy if you give any questions to
28:27
Liam post for it I'm happy to clarify them in margita just because this is going to be a record I wanted to be as
28:33
much as straight as I can so the terms of optimization is simple just like any software when you build the first
28:39
version We and the identify points that can be improved either points that can
28:45
give us Financial uh gain or optimization or give us performance
28:51
improvements from what we previously had so in that performance in that in that
28:57
regard what we try to do with this new releases is coming out is exactly that is up to optimizing internals of the
29:04
software in order to achieve better performance and therefore use less resources for example and other things
29:12
like optimization of the the code itself and making it clean and more
29:18
more Enterprise let's call it like that I hope that clarifies if somebody has
29:23
any other further questions I'm happy to clarify also
29:28
yeah and um in the product updates that's specifically what I was
29:34
um overshadowing to keep it less technical
29:42
um uh with the the blog post that we'll have on our news
29:47
page um I can get into a little more specifics about that but essentially
29:52
that's a it's like a code audit a restructure
29:57
um based on you know what we've learned in the last couple years of having the software written from a very V1 and a
30:04
beta standpoint to being in production how people are using it and how people would like
30:10
to use it in the future and features that we would like to build um so a lot of the
30:16
restructuring and rewriting of the kind of the architectural pace of that code um is helping with what we're gonna you
30:24
know have in the pipeline to build we needed this to exist to in order to build on top of that if
30:31
still trying to keep it a little non-technical there
30:38
so uh we'll make sure that uh the person who asked that question will get a link
30:43
to this particular answer for the detail uh Martin uh as the the other person in the
30:52
room who hasn't had a chance to ask a question specifically I wanted to particularly call on you if you did want to ask
30:59
anything or you you don't have to but you're here so I want to make sure you had a chance uh okay chat message you're
31:06
fine just listening great uh uh to come back to you
31:13
that did the previous answers or the previous things
31:18
are they clarified sufficiently answered sufficiently for your
31:24
um for your needs or yes you have other questions that you had not asked yet which we want to to go
31:31
to yes I have a question and I've then stroke me so I I asked it in German and
31:40
then he will translate and I want to try it then better yes I am not so good
31:48
in English I saw my name is
32:07
wikidata in the RP services to integrine or from Wiki data updates to the common
32:14
order give this foreign
32:20
foreign yeah and the question is are there any plans
32:25
um and if there are then when um to offer the same services with
32:31
Wikimedia Enterprise for Wikidata especially as far as I understand it's
32:36
part of the bigger thing already but maybe you might clarify yeah you want me to take this one please
32:44
okay um we yes we would very uh it has always been part of the plan to integrate
32:52
wikidata into the Wikimedia Enterprise product in some way the um the qid
32:59
structure already exists in our data set so uh was that was you know done
33:06
uh or we we use that as sort of the key identifier in our data set specifically with an eye towards you know a world in
33:13
which we could integrate Wiki data into our apis um we have been in constant
33:20
communication with the wikidata team uh at Wikimedia Deutschland because uh for
33:26
us to do this we have to ensure that we do it in a way that is uh as because
33:31
they are the team that supports it um it's very important to us obviously and also to them that we we integ that
33:39
if and when ideally when we integrate it that we do it in a way that is respectful of what they have already
33:46
done and is um uh is going to con I would say going to
33:51
contribute meaningfully to their goals um you know uh so a lot of our
33:59
conversations are really around that and working towards alignment on that and that's taken a little while but I feel
34:06
like we've had very good conversations with our team and we're definitely getting closer
34:11
um all of which is to say we don't have a timeline I think everyone on our team would love to see it happen this year in
34:19
calendar year 2023. um we know it's very important to
34:24
uh every customer that we've had or every potential customer that we've talked to
34:31
particularly for example like smaller search engines that maybe don't have the
34:36
um the resources available on their end to do the elaborate parsing work that is sometimes required to use not sometimes
34:43
that is required to use Wikipedia data for example um it's much easier for them for obvious
34:50
reasons as the knowledge graph to get started with wikidata so they would love to see it integrated larger companies
34:57
that we work with such as Google make extensive use of wikidata and would love to see it integrated as well
35:03
um we think it would be valuable for the product I think it could potentially be really cool so there's that due
35:10
um it's just a question of you know what is how and this is very much on my mind which
35:17
is why I'm making this space I want to I just I want to make sure that we find a way to do it that it's like uh
35:24
yeah that is like an agreement between us and Wikimedia Deutschland and the Wikidata team where everybody is clear on
35:30
what uh clear on what's Happening and that everybody's getting getting sort of what they need out of the partnership
35:37
and that's that's what we've been working towards for the last year and I think we're
35:42
I feel good but I would not say 100% certain that we
35:47
will get it done this year and the first version of that integration in 2023.
35:53
um but I'm I'm not at a place yet where we can make any promises on that front because we're still in discussions with
36:00
them about what it will take to make it happen
36:05
I guess I guess what I'm saying is it's there's technical challenges but the things that we're trying to work through
36:11
that are bigger than that are not necessarily technical challenges they're just organizational and structural
36:17
challenges to make sure that it's um done right done in the right way
36:22
carefully and thoughtfully and in everyone's best interests
36:31
yeah thank you foreign I have another short question
36:38
how high was the down time in the last year so how
36:44
how many hours was the Wikimedia Enterprise API not
36:52
available last year
36:57
I feel like the best people to answer that question aren't here but maybe Ricardo knows
37:10
I'm sorry no oh I think the the answer is I don't know
37:16
the number specific number but we have a uh yes I put a link in the in the chat
37:22
to the status page status.enterprise.wikimedia.com that is the
37:28
uptime statistics live which covers the last
37:34
90 days nine zero days which is 100 percent
37:40
we have a contractual obligation for 99.9 uh
37:48
which is different it is an SLA not an SLO so this is a contractual
37:54
requirement as part of a legal and financial responsibility this is different to the
38:04
normal Wikimedia Foundation uptime uh what you might have recently seen that Wiki data or the Wikimedia Foundation is
38:12
talking about SLO a which is a best effort requirement in fact I'm not sure
38:18
if that's the technical description but it's the it is not a legal requirement but a we will do our best to meet this
38:25
standard we in the Enterprise team have a contractual requirement that has
38:31
Financial penalties if we do not meet that requirement this also includes
38:37
Amy's team which is did you answer the phone when I called you fast enough did you answer my email
38:44
asking for a reset password fast enough this is quite different to the way the
38:50
rest of the Wikimedia movement operates and because of that financial and legal requirement is much more expensive to
38:57
operate it requires a level of redundancy that is unnecessary for
39:04
normal humans but because we have the contract that
39:10
requires this we have to have a level of Technical and human redundancy in that
39:16
support which would be basically it would be a waste to use donor donor
39:22
money for that level of requirement
39:27
but if they want to pay for it fine it's also a reason why we are
39:33
hosting at least for the time being while we get from zero to up to speed why the hosting is on AWS is
39:42
on the external systems because the they are much larger than Wikipedia is and
39:49
they can be responsible for building the infrastructure that allows that degree of service requirement
39:57
when we are more stable we have all the customers that we we know
40:02
from this month to next month it's not going to be twice as much service requirement it will be much more stable
40:09
then it is much more um sensible to talk about moving that kind of hosting inside the Wikimedia
40:16
Foundation but it would be an extreme Financial Risk to try and
40:24
put that legal burden on our internal infrastructure
40:30
which is not designed for it it can do it but it should but our the Wikimedia
40:37
foundation's internal infrastructure should not be held responsible to the arbitrary
40:43
and extreme legal and financial threats of the contract
40:49
uh just because that's what an external organization feels like it needs
40:56
the short answer is we are very good but that costs a lot of money to do it and that's why they pay a lot of money
41:02
to do it I hope that answers the question
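[Editorial note: the 99.9 percent figure mentioned in the answer above can be turned into a downtime budget with simple arithmetic. The sketch below is illustrative only. The call does not say over what period the contractual figure is measured, so the 365-day year and the 90-day window (borrowed from the status page window mentioned above) are assumptions, and the function name is hypothetical.]

```python
def allowed_downtime_hours(availability_pct: float, period_days: float) -> float:
    """Maximum downtime, in hours, that still meets an availability target.

    availability_pct: target such as 99.9 (percent).
    period_days: length of the measurement window in days (an assumption here;
    the transcript does not state the contractual measurement period).
    """
    return period_days * 24 * (1 - availability_pct / 100)


# Downtime budget for a 99.9 percent target over two assumed windows.
yearly = allowed_downtime_hours(99.9, 365)
window_90 = allowed_downtime_hours(99.9, 90)

print(f"{yearly:.2f} hours per 365-day year")
print(f"{window_90:.2f} hours per 90-day window")
```

[Under these assumptions a 99.9 percent target leaves roughly 8.76 hours of downtime per year, or about 2.16 hours per 90-day window.]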
41:09
uh Amy wanted to do you want to say that or you're going to read that out to a chat message
41:19
yeah sorry as if the chat doesn't translate through to the recording then sure you can read it out precisely the
41:26
recording doesn't doesn't kick uh so to clarify those uptime and customer support response times the guarantees
41:32
are for the paying customers we do not offer that kind of contractual guarantee to a free customer because they're not
41:39
paying for it basically uh so we can't give them a discount next month because
41:44
it's already free um there was a question in the chat from
41:51
Andrew uh well a hand up in the chat from Andrew would you like to
41:57
speak your your question yeah um so I hello everyone uh sorry about
42:03
joining a little bit late um I was looking through the
42:09
um financial performance portion and I saw that the expenses have grown to
42:17
above revenues and I'm curious uh what you project the
42:24
increase in revenue to be because there's only so many possible organizations that
42:30
could use this type of service so I'm curious how you anticipate your growth to look say over the rest of this year
42:38
or later down the line yeah I can take that Liam if you want
42:44
um we feel so okay so depending on how you look at
42:50
the market that we're serving and the way that I'll just give you sort of my take on it the way that we look at the
42:56
market is there's a you know broadly a very very wide range of uses for
43:01
Wikipedia content Wikipedia data at Large across all of the projects
43:08
um and you know in some of the research that we've done and poked around it's actually kind of amazing how widely it's
43:13
used I mean I guess it's not really it's not surprising but it still continues to amaze me uh that said you know you can't
43:20
start a product and say oh we're going to sell to everybody because then you don't really have much Focus so our
43:25
focus on our initial Market as we describe it you can this is I think fairly clear from looking at the homepage of the website is search
43:32
engines and voice assistants um while it is true that that's not a huge
43:38
Market uh some of the players in it have a lot of money as you know
43:45
um and actually there has been kind of a Resurgence in the last
43:50
year or two uh and folks that are actually going after the search engine Market again that was true even before
43:56
the kind of the recent ChatGPT if you're familiar with what's going on with um OpenAI and the AI space there's
44:03
also a renewed interest in Alternative forms of searching through artificial intelligence and that actually also kind
44:10
of fits into when we look at the customer profile for what we think of as search engines and voice assistants
44:17
uh folks who are looking at uh replacing not necessarily search but searching
44:22
with artificial intelligence very much fit that profile as well so it is actually even compared to when we
44:28
started thinking about this as our market two years ago it is actually a renewed and growing market
44:35
um and one that's fairly well funded so all of it just to say like that's kind
44:41
of why we are focused on this market and I think we'll continue to focus on this market so there's there's other markets
44:47
that have expressed interest uh that I don't think our product is really well suited for at the moment education is
44:54
one there's a lot of interest in um uh with a lot of interest from
45:00
educational companies they want something that's a little easier to make to put into like a question and answer format
45:06
our APIs and our structure currently do not allow that but I can see a future where that would be it would be
45:11
possible to tailor it more towards something that looked like a question and answer format
45:17
um Financial companies have expressed a lot of interest uh separate apart from the search capacity
45:23
for artificial intelligence there's a lot of use of um Wikimedia data for training purposes and while right now
45:30
most of that's being done with the free dumps I think there's potentially Market opportunity there as well although for
45:38
all the reasons you can imagine we want to tread carefully when when thinking about that as a market
45:44
um all of it is to say I actually think the potential from a business perspective is quite High
45:49
uh in each of those markets we have to figure out what they want to buy you know which is you know part of
45:56
the challenge for us always is that uh we're trying to sell something that we also give away for
46:02
free and that's frankly quite hard so you have to think really and unlike most com like if we were a more traditional
46:08
data sales company we would just not give it away for free and everybody would have to pay and make a lot of
46:14
money and go home obviously that's quite counter to the mission of
46:19
um the movement so that's not what we do instead we have to come up with more nuanced ways to figure out how to build
46:25
alternative approaches or alternative services that will have value in these markets
46:30
um I would say I don't I don't think it's a small Market I think it's actually a growing one I would agree that's overall I would agree that the
46:37
search engine and voice assistant market uh isn't huge but we've really only begun to sell into it and I think we have a
46:44
lot of prospects for this year amongst both um some of the large Global search engines that we have not yet sold to and
46:51
also some of the new entrants to the search market that are starting to show a lot of promise and uh and have a lot
46:57
of capital to do it um and that's where a lot of our focus this year is going to be
47:02
uh I don't have projections for this year quite yet in terms so I can't share
47:08
those out and uh any projections that we come up with actually have to be um approved by the board before they're
47:15
actually by the Wikimedia Foundation board before they're actually made public so that will be something that
47:20
will happen in the coming months as part of the annual planning process that's kicking off right now
47:26
um but I can say that I feel very confident and I think the whole team feels very confident that we will be able to get into a place of comfortable
47:32
uh month over month profitability uh in calendar year 2023
47:39
um we have some really good sales prospects in front of us and a lot of a lot of I mean just a lot of interest in
47:45
being able to make use of this product in a more with the commercial guarantees that Amy was talking about before that's
47:51
actually a huge problem for a lot of the larger search engines in particular and so we see a lot of interest we think we
47:58
can do it the fact that we're already at break even is the reason that I feel pretty confident saying and the fact
48:03
that we're both at break even and not planning on expanding the team too substantially Beyond where it is right now suggests to me that we should be
48:10
able to get into profitability again month over month profitability fairly quickly
48:16
with the emphasis on the fairly because I don't know how long exactly it's going to take and don't want to make any promises
48:24
helpful thank you okay yeah I'm always happy to discuss this in more detail so
48:29
any of that needs clarification let me know I did have a bit of a follow-up um and
48:35
you slightly touched on that in that uh Wikipedia and Wikimedia products in
48:41
general have been from the start uh marketed as
48:46
free accessible to everyone everyone has access to all the same information in
48:54
theory eventually uh I mean already more information than anyone can
49:01
ever consume in one lifetime um and so I'm curious how
49:08
you reconcile this notion of selling a um
49:15
of a not-for-profit uh governing entity selling this data to for-profit private
49:25
corporations and what that would look like for an average
49:32
human trying to use a Wikimedia product as it was originally
49:39
described certainly this is a question we obviously deal with at a fundamental and
49:47
ideological level not just at a commercial advertising business level
49:53
but it's this project is part of the Wikimedia Foundation part of the Wikimedia
49:58
movement so it has to operate within not against that those principles
50:05
um you know as a long-standing Wikimedia volunteer myself that's that's why I'm here
50:11
as a wikimedian not not because I'm trying to uh uh run counter to it but the the two
50:20
reasons the two issues is Wikimedia content
50:27
has always whilst the organization is non-profit and non-commercial websites
50:32
don't have ads and you know all that jazz which you you and I know
50:38
um has always been available for anyone to use for any purpose including commercially
50:43
and these organizations to whom this project is selling the API access are
50:52
already using Wikimedia content extensively and making profit from it
51:00
their use of that information costs the movement money and time and effort to
51:07
maintain the service to them it is not free because of the the high
51:13
requirements they place on the infrastructure it's the method methodology we sorry the
51:20
metaphor we use is imagine if a large Factory attached itself to the city
51:28
water supply and then said you need to give me industrial quantities of water
51:35
at the same at high pressure that you that you give also to the
51:42
individual people the individual houses that cost the city a lot of money to
51:47
provide that water even though the water is exactly the same so if the city gives it to those
51:54
companies you at the same price or the free like it
52:00
gives to the rest of the the community that means that that the communities
52:06
uh the service provided to the rest of the community is diminished because they
52:12
have to concentrate on the largest loudest heaviest user of the service
52:18
so we this project for answer one is this project flips that requirement they
52:26
pay us and subsidize the Wikimedia movement provide Revenue diversification
52:32
to the Wikimedia movement for their use of the same information at
52:39
high speed instead of what has happened all the time until this last year which is the
52:44
Wikimedia movement and donor money is subsidizing them and their requirements
52:49
and I think that's unfair um it's the classic tragedy of the
52:55
commons story or here's here's an available field one large farmer comes
53:00
on and and with all his cows eats all the grass uh okay if he wants to have
53:05
that many cows we can have a separate field just for that farmer uh and
53:11
everyone else can have access to the normal field the second reason is
53:16
this is a crucial distinction this project is not selling data it is
53:22
often it is easy to misinterpret sometimes misinterpret deliberately by
53:28
people trying to um find a controversy that does not exist
53:35
but most of the time it is misinterpreting because it's standard that an API or an API company sells the
53:42
content if you are a weather API if you are a traffic API you are selling the
53:49
traffic information or the weather information or the finance information we are not selling the information we
53:56
are selling the pipe the information is the same and it's freely licensed in the
54:02
case of Wikidata it's CC0 so it does not even require attribution by law
54:08
the content is the same we structure it a little differently in terms of the API
54:14
metadata but none of the structure or none of the information in that metadata is new or
54:21
secret it's things like does this page have a sudden
54:28
change in um readers number of readers that's something that wikimedians already use or does this page have a
54:36
sudden number of uh IP editors Anonymous editors that's information that wikimedians already use to make
54:44
decisions about should the article be locked for only administrators or should
54:49
we um does it need temporary protections
54:54
that kind of information has not been available in an API before and we're trying to add that in but it's it's
55:01
already existing data so what we are selling is not the water
55:06
to return to my original metaphor is not the water we're selling the pipe and the
55:13
contractual guarantee that the water will be provided at a certain
55:19
water pressure and that there will be a telephone available for you to answer uh if you
55:25
have a question it's quite different from uh the idea of selling better water
55:31
which we do not do that water is already available for everyone for free
55:36
I hope that those two issues respond to the question of the sort of
55:42
the the ethics of what we're doing Andrew uh that does provide some clarification
55:50
yeah um and then also uh I'm curious uh there have been a number of instances where uh
55:57
especially Google uh to the extent that there's even a Wikipedia page called uh
56:06
uh called Google and Wikipedia um where Google has donated large sums
56:14
of money to the Wikimedia Foundation over the last multiple years and so I'm
56:20
curious uh how many other companies to which you're
56:26
marketing this project already donate to the Wikimedia movement in some capacity
56:34
and do you anticipate those donations would go down somewhat proportionally to
56:41
the amount that you're going to be charging them for API access
56:47
one of the yeah Lane did you want to answer that or oh uh well I'll just say I mean I'll just I would like you to
56:53
answer it actually but I'll just say this is actually a was a topic at the very beginning of the when Enterprise
56:59
was first getting going this was actually uh the topic of whether or not companies would would see you know their
57:06
donations versus sort of commercial payments as um uh you know complementary or in Conflict
57:13
where one would go down and the other would go up and there wasn't a lot of certainty and in fact in those cases
57:19
where the customers have been had donated it's it seems like or or uh or
57:25
considering it it seems like it's very it's very Case by case some of them treat them as very separate because of
57:30
the way their organizations work some of them absolutely see the money moving over from one to the other
57:36
um so it's not it's not it hasn't been some of them haven't ever donated and it's just not
57:41
the way that they're because they're not large enough um or because they're not the kind of organization that donates money no
57:46
matter how large they get uh what what I what I will say though is
57:52
that um you know in this one of the reasons that Enterprise and the idea of
57:57
Revenue diversification around Enterprise with corporations was a priority uh from the financial
58:05
side of things was because it is it is a more consistent and stable guarantee of income you know every year that a
58:12
company donates they have to make the decision to donate every single time and you know our goal is to sign multi-year
58:18
contracts where we're not necessarily locking them into our service but providing them with additional value in
58:24
such a way that renewal of those contracts when the time comes to renew is an obvious thing for them to do that's kind of how we think about our
58:32
business and what we're laser focused on doing and that's just a lot more comforting I think when it
58:38
comes to working with commercial organizations than kind of going hat in hand year after year and hoping that they'll want to make the same big
58:44
donation next year that they did this year particularly in this moment where a lot of them are pulling back from a lot
58:50
of their expenses and even before they let go of 10,000 people on their staff they let go of giving away any of their
58:57
money for philanthropic causes so so yes yes and sometimes yes sometimes
59:03
no sometimes and overall a hard Revenue stream to depend on in any sort of
59:10
consistent or planful way when uh when it's donations
59:16
do you have anything to add um I think pretty much covered it
59:22
there's there's an issue of how uh those kinds of companies deal with donations versus contracts as from their
59:30
perspective tax um but primarily as as Lane said
59:36
it is a more consistent I mean these kinds of companies Google and so forth have various kinds of relationships
59:45
um with Wikimedia movement funding Affiliates and sponsoring conferences
59:51
sometimes and so forth and this is trying to be deliberately not owning the
59:57
entire relationship with the rest of the organ of the movement uh we're just talking about API access
1:00:05
but in as much as that might affect that company's interest in donating
1:00:11
I think the Wikimedia movement is much more
1:00:16
that is much better served having a relationship of
1:00:23
of a a formal legal relationship like this does with a giant company
1:00:28
than going to them every year every two years begging for a donation because it's
1:00:36
quite different principle than donating the movement's focus on donation or the Wikimedia
1:00:42
Foundation's focus on donations from small donors the five dollars the ten
1:00:48
dollars that's crucial for our independence as a movement not having a
1:00:54
large proportion of the money coming from one big donor who then gets to
1:00:59
influence the organization politically and we have checks and balances on the maximum amount of any individual
1:01:07
company or total we can obtain as well but having a contractual clear
1:01:15
relationship that is long-term with these kinds of companies is much more
1:01:20
stable financially and much more there's more confidence in our
1:01:25
independence and what we owe and what they owe us than writing once every year or two years say
1:01:32
hey please could you uh donate some money because you know you use this a lot uh that's
1:01:40
they are making so much money off the Wikimedia knowledge that we should stand
1:01:46
up and actually say that they need to invest in our movement rather than
1:01:53
be look magnanimous every time they happen to donate a little bit here or
1:02:00
there is a question in the chat from Martin I
1:02:05
wanted to not forget um Ricardo could you speak to this the question is
1:02:11
um the additional features we mentioned earlier uh in the in the update about
1:02:17
how we're building out the software are these being built due to customer
1:02:22
request or is there a strategic plan for developments to the API
1:02:28
uh I I assume this is not an either or but a both yeah I think is yeah go ahead Lane no no
1:02:36
you go please yeah okay yeah it's a it's a big a mix of
1:02:42
of both um as a revenue a revenue generation company let's say like that we listen to
1:02:50
our customers and their requests as features we have a clear example for example the summary the lead section of a
1:02:56
web page is offered by the Wikimedia APIs giving a summary and the customers
1:03:02
requested us for example to give a different perspective on that that we
1:03:07
are starting to offer but as we are offering and releasing that at the same time we're giving back to the Wikimedia
1:03:14
and there there are conversations going on to the development we did will be put
1:03:20
back in the public API so that we continue to be using Liam's terms we continue to be the pipe and not any
1:03:26
content generation so that we don't allow companies to influence that content in any way or Etc that's my
1:03:33
perspective and uh that's how I see it so in terms of that is there is a
1:03:38
strategic plan we have product managers clearly designing where we want to go and how we are going to go but we
1:03:44
listen to our customers and to the companies we interact with in order to go after the revenue
1:03:56
tendency the question you had with regards to planet yes I see in the chat uh you're happy with with that answer
1:04:03
there is a roadmap listed and a quarterly update listed on the MediaWiki
1:04:09
page for Enterprise which is obviously technically focused and Chuck
1:04:15
will be publishing on the Wikimedia Enterprise news page uh in the next couple of weeks a more
1:04:22
technically oriented update about here are the new features available that's not describing the roadmap or the
1:04:29
Strategic plan in general but describing the features of now that are coming in the next week or two
1:04:37
um but on our MediaWiki page in general is a roadmap uh current development
1:04:42
priorities etc including down to things like the Phabricator board of what are the
1:04:49
individual bugs and things being worked on this week
1:04:55
depending on your level of strategic thinking about feature development
1:05:04
uh were there further questions from anyone because that is the list of questions that were written in the chat
1:05:12
to now Martin follow-up question what if a customer request exceeds the time of
1:05:17
your developers do you hire more developers if it makes sense to fulfill this request can you use the resources
1:05:23
of the Wikimedia Foundation's teams itself good practical question it is yeah I
1:05:30
mean we don't do every we don't do everything they ask for um and it and it tends to be more of a
1:05:35
sort of a trying to understand uh particularly of late some of the work that we're trying to do
1:05:42
in um making the uh as you might have gotten from the product update some of
1:05:48
our focus right now including like for example integrating Wikidata is under what we call the heading of you know
1:05:53
machine readability which means you know both like wikidata which is obviously designed to be machine readable so
1:06:00
integrating that but also looking at looking at the um looking at the the
1:06:05
what we can extract from the data say for example a Wikipedia page or
1:06:10
Wikipedia article um like what we can extract from that that would be useful right so not trying
1:06:16
to make it all machine readable all at once but saying okay we know certain parts of this page are useful and
1:06:21
necessary and what can we pull out of there and so one of the the product manager Stephanie is unfortunately not
1:06:27
here who's been working on that but part of the way that she's gone about it is she's just talked to a lot of current customers and potential customers
1:06:34
um that are very interested in that feature to try and understand okay what are the parts of the page that would be
1:06:40
most useful for you to have um so it's so it's really just more a
1:06:46
question of like okay starting to understand that allows us to prioritize um better but I don't know that we're
1:06:53
necessarily going to build them everything that they wanted uh uh so that's sort of how we do it I
1:06:58
would say we we are not looking to hire additional or not we are actually in the process of hiring some more developers
1:07:04
but beyond that that's sort of to build out the team for what we see in front of us
1:07:09
um but uh but we're not looking to sort of scale up significantly beyond that in terms of the engineering team rather
1:07:15
we're trying to make sure that the scope of work that we're doing is scale scaled to the number of Engineers that we have
1:07:21
um and as for working with resources from the foundation itself we generally try to shy away from that the overhead
1:07:28
charge that you may have that you may have seen in the financial report that I was discussing earlier it's not really about pulling in engineering or product
1:07:35
support it's more um uh sort of more business oriented support so it's about paying for the
1:07:41
legal team it's about paying the legal team's time it's about paying for the time of the folks who do our IT services
1:07:47
it's about paying for finance and accounting costs um
1:07:52
well I think we're generally speaking we're trying this I mean there's a lot of work to do on the foundation side to
1:07:59
support the technology over there and our goal is to um to our our goal on our team is to
1:08:05
pull over as much of the commercial activity on our APIs as possible
1:08:11
specifically so that it's less work for the folks over at the foundation to support commercial usage in the way that
1:08:18
Liam was describing earlier so so generally speaking we would prefer not to use the resources of the rest of the
1:08:23
foundation whenever possible there are certainly times when we talk to them about things that we need or aspects of
1:08:29
like parts of the foundation's core infrastructure services that we would like to see improved but we sort of see
1:08:35
ourselves as just one one voice and one stakeholder among many when those teams are making their decisions
1:08:41
we also don't get to keep our own money in the sense oh yeah the
1:08:47
money passes right there it's by yeah it goes to where the rest of the money the donations and so forth go to and then we
1:08:54
ask for resources like every other team and that's appropriate it means you don't get everything you want but it
1:09:00
means just like the fundraising team doesn't get everything it wants even though it was the one that got the money
1:09:06
in the first place uh um Amy I see your hand up to answer this question there's also a question in the
1:09:11
chat from Andrew I will get to that next thanks yeah I was just gonna just um
1:09:16
further clarify on that little point just um you know our customer it's not client services so you know our
1:09:23
customers they're not dictating a timeline for us on things that we're doing you know they might have requests
1:09:29
and we have conversations with them about it but of course it you know has to fit in with our our bigger strategy
1:09:35
as well and and we you know figure out what timeline works for the scope
1:09:41
um and size of our team and our own resources and um you know so you know
1:09:46
there's the question about like if uh if their request exceeds
1:09:52
um sort of our ability do we hire and you know so yeah we we don't do that and
1:09:57
they don't they don't dictate to us like when something has to be done by or you know it's different in that way like
1:10:05
what what a client services sort of a model might look like
1:10:14
okay thanks Amy uh Martin seems in the chat happy with with the detail of the response there's a question here from
1:10:20
Andrew would you like to read it yourself or should I read it out for you
1:10:25
yeah I was just curious um if the company
1:10:31
comes to the uh Enterprise team and asks for a specific API feature to be made
1:10:38
available to them would the people who spend hours every day making Wikimedia
1:10:48
what it is would they have a way to potentially object to the
1:10:55
creation of such features and will such objections be taken into consideration
1:11:01
by the team yeah I love this question um it's really it's interesting I mean
1:11:08
you know the way that we're trying to approach the product roadmap that we put together is by sort of you know again
1:11:15
sort of doing the research understanding what it is that companies are asking us for figuring out you know
1:11:20
uh where is what like the range of companies that we're trying to serve right now are trying to pull aboard as
1:11:25
customers you know where where is there the most overlap right where can we get the most most for our Resources by
1:11:31
focusing on like a particular feature that we know the vast majority of them want that's the ideal um and then we roll all of that up into
1:11:38
a product roadmap um which we do publish publicly and maybe you can drop the link in thank you
1:11:45
right there on Meta oh that's the principles um but yeah there's the so so kind of
1:11:51
two answers to this question one is like we're trying to publish it as a product roadmap and I think that if there's a
1:11:56
desire to sort of intervene at that point in any of the features I think that would be uh thank you uh I think
1:12:03
that would be a really good place to intervene um uh and then and then there's this more
1:12:09
abstract layer of the principles that we put together and I really I have to commend Liam for this he really in a
1:12:16
similar way that I was describing really did the leg work to understand early on in the first years like what the
1:12:22
community uh Community ideas or Community concerns were and what it what what we needed to
1:12:27
get into the principles both in terms of sort of positive or aspirational goals as well as you know defensive promises
1:12:34
things that we said that we would never do um so I would say in either of those
1:12:40
cases like we're perfectly happy to continue to have a discussion the reason I said I love this question it's like the team would uh be overjoyed to
1:12:46
continue a discussion about the principles I think like any document it should be a living document um I don't want to remove principles that are
1:12:54
keeping us from uh or that are keeping us holding us to our commitments uh I
1:12:59
still think there's plenty of room for a conversation around those to see if there are things that need to get added or modified in meaningful ways so
1:13:05
that's one level and then I would say on the product roadmap that we we are trying to publish that
1:13:12
um ahead of actual product development on a fairly regular basis
1:13:17
um sometimes we do a better job or Worse job on that but we're trying very hard as we figure out what it is that we want
1:13:23
to build and I think would be a great place to engage with us um I think we could also potentially
1:13:28
discuss we tend to do these public office hours around kind of announcements and events but that could
1:13:34
be another place where we could potentially start doing it in terms of product or feature discussions if that was something that folks in the
1:13:40
community would be interested in yeah I meant oh go ahead
1:13:46
so I was I was going to speak uh two exactly the same two levels that you
1:13:51
were referencing there so there is so under the broad heading of community
1:13:57
oversight of features or Community oversight and and interrogation or second opinion about software
1:14:05
I would say it's the same as anything else in that the Wikimedia Foundation product or technology departments build
1:14:13
there are various informal methods of
1:14:19
of raising concerns or expressing opinions on
1:14:24
high level areas like an annual plan down to an individual commit or an
1:14:30
individual feature because we have the Phabricator board and you can comment on a Phabricator ticket so those kinds of
1:14:36
things are all the same as any other given feature development
1:14:42
process for any other Wikimedia Foundation product um tool there's also the formal
1:14:50
methodology things like board you know appealing to the elected members on the
1:14:56
Wikimedia Foundation Board of Trustees which is your kind of official formal legal method of of
1:15:03
raising those kinds of concerns uh and you'll be unsurprised to know that I
1:15:08
spent a lot of my time answering questions on the talk page on meta asking
1:15:13
similar questions to the the ones we're discussing today in the chat uh hogu can
1:15:21
perhaps confirm that I'm fairly fast in answering questions and we can try and
1:15:28
get back quickly precisely because we know that this is a weird and unusual
1:15:34
thing inside the Wikimedia movement I would like to think that we are trying
1:15:40
to be scrupulously available and clear about what is happening
1:15:46
not that anyone else is being unclear deliberately but because there is more
1:15:52
scrutiny on the potential ways of
1:15:57
misuse of what this project is doing that our road maps our plans our
1:16:05
adherence to the values of the movement are more uh scrutinized than a lot of corners of
1:16:14
the Wikimedia movement the so that's one and two is the
1:16:20
principles that I link to the chat there which is if you go to the Wikimedia Enterprise page on meta and there is a
1:16:27
sub page on the link from the info box principles that refers to things like these are the guard rails we've set
1:16:33
ourselves and they're not in law but they're written there
1:16:39
be loud and large so anyone can hold us to account for it uh things like
1:16:46
when we try to write this in a way that's future proof so things like there is no exclusive contracts no
1:16:54
exclusive content so we're not building something for one customer
1:17:00
and then they get to exclude other potential customers their competition
1:17:05
whatever we build is available for everyone including the free users of the
1:17:11
trial service or uh non-profit organizations who wish to have the the
1:17:16
free version as well there is no exclusive content so we're not building
1:17:24
some getting some secret data feed
1:17:29
and selling that data feed so that the large commercial organizations get a better thing
1:17:36
there's no um special information within the data feed that is not already
1:17:41
available perhaps not as easily findable but not secret or new
1:17:47
those there's a bunch of other principles in there about financial transparency and so forth but I think
1:17:53
those two things are quite crucial for ensuring future proofing
1:18:00
the appropriateness of the features built by
1:18:05
the Wikimedia Enterprise team and of course we're always working with the with and under
1:18:12
the same privacy policy and terms of use policy Board of Trustees oversight
1:18:19
audit committee anyway so we can't make up
1:18:25
rules as we go along it's still part of the Wikimedia Foundation I hope that uh and and all the law that comes with
1:18:33
that I hope that answers your question that was helpful uh though to clarify
1:18:40
something that you'd said I wasn't necessarily objecting to uh or I wasn't
1:18:45
necessarily referring to a potential customer requesting exclusivity I was just
1:18:52
referring to the creation of or providing easy access to a feature in
1:19:02
general rather than to that specific company just like for example if
1:19:07
um I mean if you mentioned uh some
1:19:14
one example that you mentioned was um which pages had a high influx of IP
1:19:21
editors if you made that list then someone could go to that list and see so
1:19:28
here are all the IPS that are editing this page and here are where all these
1:19:33
IPS are based and then make some potential inference about those IPS uh
1:19:39
which see there's a lot of security uh there's a lot of data that is available
1:19:45
that becomes a security it's public but it becomes a security or privacy risk when made available at high speed or
1:19:52
aggregated yeah uh even though the data itself is already there and so we're
1:19:58
very careful not to do that unsurprisingly uh and we also we are
1:20:04
only talking about the Articles here we're not talking about user Pages we're not selling or
1:20:12
including um editor information it's the content not
1:20:18
the the use of the users themselves the so we can be the foundation legal
1:20:24
team is very working with us a lot about the formal requirements for privacy
1:20:32
and uh reader protection but equally there they are
1:20:38
considerations that apply already to the existing apis and existing database
1:20:44
dumps because all of this content is already available through apis and large commercial organizations are already
1:20:52
scraping as much as they possibly can from Wikipedia
1:20:57
sites so we are trying to make it a little bit more structured a little bit more
1:21:04
cleaner and thereby bring back some degree of control or power to the Wikimedia
1:21:12
movement in how the data is used and how the data is expressed
1:21:17
rather than just saying hey anyone it's it's available technically it's true from a legal or
1:21:24
from a copyright perspective um but the considerations are a bit more
1:21:31
technically and legally structured in in an API rather than just scrape everything
1:21:37
we also I should just briefly mention that because of that non-exclusivity
1:21:43
principle I mentioned before a lot of people are
1:21:48
initially concerned that this kind of service will benefit the biggest players
1:21:54
and increase commercial monopolies on access to data whereas although they
1:22:01
might be the first customers we feel that because the data we're providing is
1:22:06
the same to everyone and the price just depends on how much you're using it
1:22:11
this will benefit these smaller players to help
1:22:16
get them to be able to use Wikimedia data in a way that they have not been able to do before this will level the
1:22:23
playing field rather than increase the power of Monopoly of larger
1:22:28
organizations so we think it has an equity aspect to this too not merely for revenue but
1:22:35
companies and organizations small search engines that have never been able to use Wikipedia data because it's just too
1:22:42
damn hard will now be able to do so increasing the reach of the content and
1:22:48
getting more people to see more things through more diverse methods of access not just through the
1:22:55
one big company okay that's a little bit
1:23:03
more reassuring but um I mean the the reason I brought up IP users editing a
1:23:10
specific page was because when I joined that was an example that was mentioned in
1:23:17
response to my question so I mean that's
1:23:23
I don't know uh and then finally uh this isn't necessarily a question about
1:23:30
uh the Enterprise project itself but in general about the
1:23:36
structure of Wikimedia and how it communicates with its users and that is
1:23:42
the only reason I knew about Wikimedia Enterprise and by extension
1:23:48
this Zoom call is because I'm in a telegram
1:23:53
Group which notified me that there was a zoom call happening um though
1:24:00
uh the subsequent um Wikimedia uh meta web page for this link
1:24:09
said uh 9 pm UTC which I think is in half an hour so that slightly threw me
1:24:16
off as well and then the reason I'm in the telegram Group is because I was in
1:24:21
another telegram Group which said hey if you want to get all these updates about
1:24:27
the Wikimedia project then you should join this group and not the one that I was in previously and
1:24:35
I mean I can't imagine how many people are in my shoes I mean I've been on and
1:24:41
off more actively since early 2020 but on and off I've been a Wikipedia editor for
1:24:49
a little over a decade now and I feel like
1:24:54
so many initiatives whether it's this or whether it's some other feature
1:25:01
um are being rolled out with feedback from
1:25:06
at most I don't know a couple hundred distinct users whereas the number of
1:25:14
people who use the Wikimedia project in some capacity
1:25:21
or another is on a daily basis I would imagine at least in the hundreds of millions
1:25:29
um and so just this entire structure it's definitely not the fault of the
1:25:36
Enterprise project but I feel like that's something that
1:25:41
should be looked at at some point so I can I can speak to this but I
1:25:48
noticed uh Amy had her hand up from before so I want to check if you wanted to ah go back I'll be really yeah I'll
1:25:54
be really quick because I wanted to address Andrew's concern about the IP address thing because I think Andrew did
1:26:00
come in in the middle of a statement and I think lost some context um there
1:26:05
um missed some context rather um and I think I can address that quickly where I think the second issue
1:26:11
Liam definitely should handle it might be a much bigger response um um I can't remember exactly the example
1:26:17
Andrew but I think um it was more of maybe a hypothetical or Lane or Chuck maybe could clarify more
1:26:24
for me but the idea is just that there could be a um you know we're doing some work around like credit we're calling
1:26:29
kind of credibility or credibility signals or content Integrity but the idea would be that like maybe there's
1:26:35
something could get returned saying oh this article has suddenly had a lot of anonymous IP address edits on it
1:26:43
that might be something of concern that doesn't mean it's returning a list of what those IP addresses are just a
1:26:49
signal that there was a bunch of anonymous IP address IP address edits
1:26:54
on it and that's that's that piece of um data is could might be significant it
1:27:00
might be something to make a decision about whether or not you want to trust that particular version of that article
1:27:06
due to that or if that article should be like shut down and only for admin editing at that moment or something it's
1:27:12
not about returning a list of all those IP addresses that is not a feature that we would be building or ever be giving
1:27:18
to a customer um Chuck or Lane or anybody else want to
1:27:24
um say anything more about that please feel free but I hope that puts that any concern around that to rest
1:27:29
um Andrew because for sure for sure for sure we are not in the business of giving that kind of data yeah to
1:27:35
anybody yeah we don't even that's not even in the API data that we're flowing
1:27:40
through anyway um and that's what I was going to mention but you did it a little more eloquently than I could have but yeah
1:27:46
that's that's not a concern because we don't have that data to give
1:27:54
uh I didn't realize that's what my answer was potentially implying we were we were talking about doing
1:28:02
and the the question of comms Communications strategy in general is a
1:28:08
continually fraught one because you have this equal problem of wanting to tell and get as
1:28:14
many people to know about a thing as possible but like any organization uh there's so many
1:28:21
different things going on simultaneously that people suffer from overload of
1:28:27
being informed or being asked to consult or communicate about
1:28:32
everything simultaneously uh often it is frequently the simultaneous
1:28:39
concern raised from uh individual Wikipedia volunteers saying I wasn't
1:28:46
informed about X and I'm being told about
1:28:52
and asked to join consultation calls or information things about so many things
1:28:58
that I can't read them all and those are both true but also contradictory so the
1:29:06
Wikimedia Foundation has gotten better at this over the last couple of years with a more
1:29:12
um what they're calling a air traffic control making sure that announcements
1:29:17
and requests for comments and things are spread out rather than all on the same day and also different
1:29:25
that there are certain groups who are of interest to different kinds of announcements maybe in the Indian
1:29:31
Community is something that this announcement is important for and then another one is for the
1:29:37
Wikimedia um developer Community which is a different but in for some people it's an
1:29:44
overlap because they're in both of those groups um and so just trying to be able to
1:29:49
communicate to the relevant people but not to irrelevant people is a hard task to
1:29:57
achieve when you're trying to do be as accessible as possible without overloading people's attention or
1:30:04
pretending that they should really care about things that are not relevant to them well Wikimedia Enterprise is not
1:30:10
relevant to most people in the Wikimedia movement because it's a technical feature that
1:30:16
nearly anyone in the Wikimedia movement has no use for so we do not want to try and draw too
1:30:25
much attention to ourselves by broadcasting hey you know Wikimedia Enterprise has an announcement and
1:30:31
everyone should read it because most people don't care and it's not relevant and that's fine uh I put that message on the telegram
1:30:39
group at a couple of hours ago by way of a little bit of extra advertising uh on
1:30:45
the recently created announce telegram Channel but uh
1:30:52
things channels that have that are for announcements only notably the Meta
1:30:57
forum are notoriously bad at drawing with where I did put this report of this
1:31:05
event this meeting are notoriously bad at drawing uh the right attention because most announcements are
1:31:13
irrelevant to most people you just care about the one thing that you care about that announce channel on telegram was
1:31:21
created last month precisely because of a frustration in the general telegram
1:31:27
Channel that there were too many announcements and most people were annoyed by it being flooded with announcements so it was pushed to a side
1:31:33
Channel which people don't really follow so it's a it's a circular problem if you
1:31:40
can think of a place where this call and this project should be advertised
1:31:48
promote like raised awareness that it's not that where the audience would care
1:31:53
please tell me and I will promote things and raise awareness about this project there but I don't want to
1:32:00
over announce Wikimedia Enterprise to groups who who would just get annoyed if I was
1:32:07
talking about commercial apis to them too much
1:32:15
so in general I agree with virtually everything you said just now
1:32:20
um and uh for the the other telegram
1:32:25
group that I was in isn't even the General Wikipedia one I don't I didn't
1:32:31
know that there was a general Wikipedia telegram this was a group about
1:32:37
um consisting of people who wanted to
1:32:43
discuss a board selection process either in 2022
1:32:48
or maybe even the 2021 and that just has since
1:32:54
gone to include various other discussion topics
1:32:59
um but the um would the
1:33:06
with the question of announcements I think that um there are certain things
1:33:13
that should be announced to a limited audience but on the other
1:33:20
hand there are certain things that should be announced to
1:33:25
the general public on the
1:33:30
top of the Wikipedia article for example and
1:33:36
um every now and then
1:33:41
when I'm logged in the only things that I see at the top of an article that are
1:33:47
announcements are either um vote in such and such election after
1:33:56
candidates have already been announced
1:34:01
um or things like Wiki loves monuments for example which I agree I don't really
1:34:09
have much interest in that but um something like
1:34:16
I don't know where the Enterprise announcement would
1:34:23
fit in that because on one hand it is a really big change
1:34:30
and a really significant proposal but on the other hand like you said it's not something that most people would notice
1:34:38
um this announcement was made to a relatively limited group because it's a interim announcement it's a finance
1:34:44
update the announcement of the creation of this project which was a year ago uh
1:34:51
almost two years ago actually we had several rounds of much more wider discussion in the formation period in
1:35:00
the hey we want to do this what do you think period during the strategy process
1:35:06
a paid API or equivalent was mentioned in two of the eventual strategic
1:35:12
recommendations um so there were and there is a section on our on our FAQ about where was this
1:35:20
discussed before which I'll put a link in the chat too so it I don't want to
1:35:27
give you the impression that this meeting is the first or only time that
1:35:32
this project was discussed but there's always the feeling from it at an
1:35:38
individual level of the first time someone hears about something that's that they think
1:35:45
why wasn't I told about it before um I'm gonna put a link in the chat
1:35:53
if I can find it okay to go um to the
1:35:59
the previous rounds of conversation this one is deliberately more restrictive because the nature of the presentation
1:36:05
is is more restrictive and uh so the nature of the update is more restrictive it's an interim Finance update but I
1:36:12
would point out that the very first employee of the Wikimedia foundation in 2010 was employed
1:36:21
for the purposes of building a paid API which is now long since then that was
1:36:27
Brion Vibber long since uh closed as as an API but that helped the foundation
1:36:33
start so the concept of how do we deal with paid API Services is a very
1:36:38
long-standing one inside this inside this uh Community now we're just doing a
1:36:43
much more structured way uh this call is now an hour and 45
1:36:50
minutes long so I think that's probably enough uh for most people
1:36:56
um unless there was any final questions that are directly to the Enterprise project and finance update not Wikimedia
1:37:04
foundation in general
1:37:11
anything further
1:37:17
um I'll just say quickly that uh as someone who is in that group of this
1:37:24
isn't something that I was aware was even under consideration until about
1:37:31
an hour ago uh this has been very helpful and uh I appreciate taking the
1:37:39
time to answer what while I was here felt like exclusively
1:37:45
my questions um I appreciate your appreciation I would recommend you read that last link
1:37:51
I put in the chat too where has this been discussed because it links to the major kind of Milestones of this
1:38:00
project as it went from Theory to practice and notably some external blog
1:38:06
posts particularly the one from the Open Future institute that one I think is most important
1:38:12
because it's not written by us so therefore it's a bit more I mean it comes from an organized uh
1:38:18
friend of the family of the Wikimedia movement in open knowledge in the European um public policy space talking about how
1:38:25
this fits in the ecosystem of of reuse of open knowledge data sets and
1:38:31
how to do that in an appropriate way especially in a commercial environment so that's uh possibly a bit more neutral
1:38:37
point of view than just something that we have written ourselves Amy
1:38:43
um yeah sorry if um this was already said I had to take a quick call there um a few minutes ago but um Andrew in
1:38:49
case Liam didn't um already mention it this is being recorded and will be posted
1:38:55
um the whole the whole call so everything that you missed before you joined us so you can get more context
1:39:00
for all of it and there might have been some more information that was shared that will be Illuminating for you so
1:39:06
um uh you can go back and watch the record the whole recording if you'd like as well
1:39:12
I have tried to take time code notes but I didn't start my stopwatch so I I might got I might have got the uh the time
1:39:18
codes at all and that'll be on our project on our project page Liam I look
1:39:23
at this on our main page on meta Wikimedia Enterprise on meta there's also the videos of the previous versions of
1:39:29
this call um and I will link it from the talk page
1:39:34
where we said there is a finance announcement and there will be a call I will link the video into that as well
1:39:41
um yep if anyone actually watches an hour and 45 minutes of this call I will be impressed and hello to you in the future
1:39:49
if you did with that I think I will turn off the recording if that's all right with
1:39:55
everyone

Date
Source Own work
Author LWyatt (WMF)

Licensing

I, the copyright holder of this work, hereby publish it under the following license:
This file is licensed under the Creative Commons Attribution-Share Alike 4.0 International license.
You are free:
  • to share – to copy, distribute and transmit the work
  • to remix – to adapt the work
Under the following conditions:
  • attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  • share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.

File history


Date/Time Thumbnail Dimensions User Comment
current 13:31, 11 February 2023 1 h 39 min 58 s, 1,218 × 540 (493.05 MB) LWyatt (WMF) Uploaded own work with UploadWizard


Transcode status

Format Bitrate Download Status Encode time
VP9 480P 515 kbps Completed 14:25, 17 June 2024 1 h 49 min 39 s
VP9 360P 290 kbps Completed 13:23, 17 June 2024 48 min 35 s
VP9 240P 188 kbps Completed 13:20, 17 June 2024 45 min 4 s
WebM 360P 495 kbps Completed 10:03, 30 November 2023 21 min 38 s
QuickTime 144p (MJPEG) 966 kbps Completed 12:22, 25 October 2024 4 min 48 s


Metadata