(Please set yourself as task assignee of this session)
- Title of session: MediaWiki product ramp-up: direction and approach to sustainability
- Session description: 10 months ago we started the journey of ramping up MediaWiki as a product and thinking about ways to systemically tackle the many challenges around MediaWiki. This session will provide an overview on the work done so far (progress, surprises, decisions, open questions) and give an outlook on direction and concrete initiatives planned for the upcoming year. The presentation is similar to the one held in April at the MediaWiki users and developers conference, but will e.g. be enhanced by information on concrete projects we’re working on/are planning, which we hope to discuss more at the Hackathon.
- Username for contact: @Bmueller
- Session duration (25 or 50 min): 50
- Session type (presentation, workshop, discussion, etc.): presentation (+ discussion)
- Language of session (English, Arabic, etc.): English
- Prerequisites (some Python, etc.): /
- Any other details to share?: See https://www.mediawiki.org/wiki/MediaWiki_Product_Insights; especially https://www.mediawiki.org/wiki/MediaWiki_Product_Insights/Reports for more information!
- Interested? Add your username below:
- @kostajh
Notes from session:
MediaWiki product ramp-up: direction and approach to sustainability
Date and time: Saturday May 4, 2024 09:30
Relevant links
- Phabricator task: https://phabricator.wikimedia.org/T363972
- Session slides: https://commons.wikimedia.org/wiki/File:MediaWiki_product_-_direction_and_path_to_sustainability.pdf
Presenter
@Bmueller
Participants
Notes
Intro
One of our goals is to get more people to contribute to MediaWiki core. (See example suggestions at https://phabricator.wikimedia.org/tag/mediawiki-core-hackathon-2024/)
want to have a more sustainable future for MediaWiki - what do we need to do now to secure that future? Both in terms of code and in terms of people.
Our communities are aging and have fewer contributors - how can we make sure that more than just a few people have access to, and knowledge of, this code base?
text from slide:
MediaWiki, the software platform and interfaces that allow Wikipedia and other projects to function, needs ongoing support for the next decade in order to provide creation, moderation, storage, discovery, and consumption of open, multilingual content at scale.
What decisions and platform improvements can we make to ensure that MediaWiki is sustainable?
Investing in MediaWiki and developer experiences
Goals for Year 1 (July 2023-July 2024)
text from slides:
Build up dedicated product leadership and engineering group for MediaWiki
Develop a high-level product strategy for MediaWiki
Make progress on key multi-year initiatives to enable platform sustainability and evolution
what did we do differently to increase velocity? It's not always practicable to add more when there are so many other epic initiatives going on at the same time.
things like the move from bare metal to k8s, RESTBase deprecation, and more
Invest in consultancy and code review to enable people to contribute effectively [MediaWiki core]
bottlenecks - getting someone to review code, especially in the more complex or 'haunted' areas of the code - hard for staff and extra hard for volunteers
need to think of what's possible now, how can we do this more effectively
main bottleneck in MediaWiki core - people are afraid to touch things for fear of breaking them
need to understand/learn that there is a community that can help you to not be afraid of contributing to the Core code
Start tackling a number of open questions, from release to stewardship
Some of which include things that are oft referred to as the "elephants in the room"
hard questions, like having a small engineering team and a small group of volunteers - it's a miracle that we can do this at such a small scale! you're all doing an incredible job!
text from slides:
We started with questions:
How do we envision the future of MediaWiki as the essential platform for the Wikimedia projects and as an open source project?
How can we increase the sustainability of the platform?
What are our key needs and how can we serve these better?
We have huge lists of desires, and need to prioritize specific things
What does “core functionality” mean?
volunteers have been talking to Birgit and Mateus in interviews - asking what does core mean to you?
wide range of answers - people don't always answer with regard to code architecture, but more in terms of what is important to them, such as the core editor workflows. to increase transparency.
a simple question - what does core mean to you? :)
seems to be a constant negotiation around things - because there are so many different workflows that people use, and no defined shared agreements
How can we provide easier, faster and cohesive paths to feature development?
this is key to move forward to keep the software experience evolving, and to do it in a healthy balance with predictable/unchanging familiarity/consistency.
how do we look at the layers in the code - how to work in one without touching all the other layers
what are the technical barriers to faster development?
in product management (in WMF), there is a need that came out of the conversations with users - "great idea, but it takes us so long because of A, B, C things"
feedback loop should be shorter between community and WMF product team
we've grown organically over the years
What does effective and responsible stewardship look like
ownership
code ownership/stewardship
tooling layers (product stewardship)
there needs to be a curating look at what should be in there and what should not - what should we be serving now, and what in the future?
text from slides:
… and establishing a presence:
Storytelling
we started with the MW Insights to talk on a regular basis about MW and to let people know we're here
Listening and exploring to learn (and to solve)
millions of edits
being clear that we can't just change things in a few days - this all took several years to create in the first place
need to be realistic on what we created and how we can help/change it. Can't feasibly "rewrite the whole thing in Rust".
Helping people be effective in their work
what are the small things we can do?
conversations with people
making small actions with small improvements
helping people be more effective in their work and thank people for their work
Read: Monthly MediaWiki insights - https://www.mediawiki.org/wiki/MediaWiki_Product_Insights/Reports
text from slides:
A few things we learned (and turned into first actions)
wanted to publish a small report, but that is challenging because there is so much and it builds up so quickly!
Strengths:
We made stable technology decisions that enabled us to scale for a top website and kept everything running
wanted a stable ecosystem, looking through the lens of what can be stable for a while
We have expertise: There is always someone who knows how to do something!
Code quality is much better than 10 years ago
there is a lot of conversation about the quality and how to measure it, and ongoing efforts to improve it
… and more!
Challenges:
MediaWiki is a large, monolithic code base and entangled system. Understanding the system’s behaviour is hard, making changes is hard, onboarding people is hard.
not only a handful of people can do this
Read: Unraveling complexity: Mapping MediaWiki software components into user-driven workflows
https://www.mediawiki.org/wiki/MediaWiki_Product_Insights/Artifacts/Unraveling_Complexity:_Mapping_MediaWiki_Software_Components_into_User-Driven_Workflows
good to read and think about from a product decision aspect and thinking about the architecture behind it all
We have a number of strong contributors with significant knowledge on (parts of) MediaWiki core, but we need more people who can work on key areas in MediaWiki to ensure sustainability of the software. At the same time, it is hard to onboard in MediaWiki.
a big goal this fiscal year - to increase by 20% the number of people contributing to MW Core (with more than 5 patches)
trying to get people in the habit of contributing - and making sure the process to contribute is reasonably easy and habit-forming. wanted to find something fun - t-shirts!
produced the following two assets to try and help with understanding
Read: Contributor retention and growth
https://www.mediawiki.org/wiki/MediaWiki_Product_Insights/Contributor_retention_and_growth
View: MediaWiki Introduction 2023
https://www.mediawiki.org/wiki/User:Krinkle/MediaWiki_Introduction_2023
Lack of clarity on direction. What are the use cases MediaWiki should be built and optimised for and what not? What is “core functionality”? What should exist (or not) within core, extensions, or the interfaces between them? How does that translate into architectural needs and decisions? What about code review for volunteers? What about release, and other needs of the wider MediaWiki ecosystem?
these all came up in the volunteer and staff interviews
things like - you have a problem to solve (as an engineering team), you get advice from one subject matter expert and then get a second set of advice from another subject matter expert that is different from the first
hard to solve this 100%, we have a diversity and a spectrum of information and knowledge
there is a range of functionalities in MW Core - this can be confusing
-> Moving from “project” to “product” enables us to step by step bring clarity to these questions.
product management brings direction to the many ideas in the room and then sets priorities
engineers are often the ones who have the best ideas, product management helps put all those ideas into perspective - it's about data synthesis and prioritization
Product management is about saying yes and no - but it's also about what we need to optimize for
text from slides:
From project to product
MediaWiki needs to support and scale for:
More than 25 billion global page views per month
https://stats.wikimedia.org/#/all-projects
More than 50 million edits per month
https://stats.wikimedia.org/v2
Open content in hundreds of languages
A large collaborative contributor base of editors and many technical contributors
and across a broad scope of areas, from bots to gadgets to extensions to other tools.
how do we deal with things like the Queen dying, aka "The Michael Jackson effect" (from an earlier scaling challenge/problem) - traffic surges massively, and we need to have software that supports that type of scale
look at the highest use case and how it translates and how we can design for it all
Design scale:
Enable the creation, moderation, storage, discovery and consumption of open, multilingual content at scale, while protecting the privacy of its users.
what we prioritize and why - because of the scale
Manual: What is MediaWiki? does describe some of that.
https://www.mediawiki.org/wiki/Manual:What_is_MediaWiki%3F
Platform mission [draft]:
A well defined, secure, performant core platform product that offers curated pathways (APIs ..) to enable volunteer-powered multilingual knowledge creation, curation and consumption on and off the platform at scale and in service of Wikimedia’s mission.
How do people both consume, and contribute back, both on-platform and off-platform.
what does software need to deliver to maintain that level of use
Product focus areas
Sustainability
Increase sustainability of the MediaWiki platform (people and code) to ensure the software serves us well as the essential platform for Wikipedia and other key needs. Release is a part of this.
who is thinking about this and the release process?
Core capabilities and concepts
Define and evolve architecture of the core software to meet current and future needs of open knowledge production and encyclopedic content.
how do we get ahead of the game with emerging needs, so that we don't need to spend years on it?
what should be core and what should not?
historically the platform was seen from the engineers' perspective and the features from the product management perspective - how do we make this better without degrading the platform? keeping the balance is important
Integration interfaces
Define and evolve the pathways to expand and customize MediaWiki to clarify and streamline feature development, and empower technical contributors.
how do you build upon this and serve? - Hooks, APIs, etc.
text from slides:
Enabling more people to know MediaWiki and contribute effectively
Contributions to MediaWiki core, July 1st, 2023 - March 31st, 2024
4.6% WMDE
20% volunteers
74% WMF
A classic journey of a volunteer MediaWiki core contributor
Joins as contributor to Wikipedia
The page https://www.mediawiki.org/wiki/How_to_become_a_MediaWiki_hacker was originally called "m:How to become a Wikipedia hacker"
Runs into limitations on what they want to do
Starts to write and run code on-wiki
Starts contributing to an extension relevant to the field they contribute to, or starts writing patches for mediawiki/config
First patch to MW core
text from slides:
Stats:
MediaWiki Core -- July 1, 2022 - March 31, 2023 -- July 1, 2023 - March 31, 2024
Contributors who submitted > 5 changesets -- 61 -- 71
Average time to first review (days) -- 13.8 -- 6.4
Median time to first review (days) -- 0.9 -- 0.6
shout out to James Forrester - one of the top reviewers! (yay!)
lots of volunteers also contributed a lot of reviews!
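As a rough check on the stats above, the change between the two periods can be computed directly. This is a minimal sketch that uses only the figures from the slide; the helper name `pct_change` and the dictionary labels are made up for illustration.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures copied from the slide (July 1 - March 31 of each fiscal year).
stats = {
    "contributors with > 5 changesets": (61, 71),
    "average days to first review": (13.8, 6.4),
    "median days to first review": (0.9, 0.6),
}

# Round to one decimal place, matching the precision of the slide.
changes = {name: round(pct_change(b, a), 1) for name, (b, a) in stats.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

On these numbers, the count of contributors with more than 5 changesets grew by roughly 16%, while the average and median times to first review dropped by roughly 54% and 33%. In both periods the average is far above the median, which suggests that a small number of patches wait much longer for review than the typical patch does.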
text from slides:
Next steps
[DRAFT] Knowledge Platform in FY 24/25:
https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2024-2025/Product_%26_Technology_OKRs#WE_KRs
this is a draft - made several changes from last year, looking at various aspects - engineering, design, etc
two main objectives for next fiscal year:
MediaWiki platform evolution to better serve Wikipedia’s core needs (WE5) + developer workflows and services to enable Wikimedia’s developers (WE6)
5.1 Improve sustainability of the MediaWiki platform
About 1/3 of all edits on Wikimedia wikis come from tools on Toolforge.
things like Parsoid and keeping that current and not have too many versions
improvements we want to try - for example, the release process was one of the interventions, and we wanted it to be an OKR on its own.
5.2 Simplify and streamline feature development
Hooks, registries, providers - how do extensions and more hook into core?
what kind of patterns we can learn about to make things easier, more sustainable
first 6 months - do experiments to find those 'interventions' to be impactful and tangible
these are examples - not promises
extensions are currently optional - you can't rely on extension code in core. Does it need to be this way? can we make core more modular by using something like the extension framework to define modules?
thousands of hooks that are used in different ways for various workflows (notifications, content additions/deletions, etc).
we'll try to classify and categorize all these hooks
notification system - how can we make this better? Echo is still an optional extension, but a lot of extensions rely on Echo because it has now been made into its own product
several things bad with Echo, but they're medium bad :)
Echo's biz logic - why isn't it in Core?
aspirational things - but we're looking forward to doing it!
Likely to be the area we get the most discussion and engagement about. Quite abstract/broad in scope.
Interested? Talk to Moriel and/or Roan.
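To make the "how do extensions hook into core?" question more concrete, here is a minimal sketch of the general hook-registry pattern, including a per-hook category as one possible way to classify hooks. It is written in Python for illustration only and is not MediaWiki's actual hook API; the class `HookRegistry`, its methods, and the hook names `page_saved` and `user_mentioned` are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class HookRegistry:
    """Toy registry: core declares hook names, extensions attach handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[..., None]]] = defaultdict(list)
        self._categories: Dict[str, str] = {}

    def declare(self, hook: str, category: str) -> None:
        """Core declares a hook and assigns it to a category."""
        self._categories[hook] = category

    def register(self, hook: str, handler: Callable[..., None]) -> None:
        """An extension attaches a handler to a declared hook."""
        if hook not in self._categories:
            raise KeyError(f"unknown hook: {hook}")
        self._handlers[hook].append(handler)

    def run(self, hook: str, *args) -> int:
        """Call every handler for `hook`; return how many were called."""
        handlers = self._handlers.get(hook, [])
        for handler in handlers:
            handler(*args)
        return len(handlers)

    def by_category(self) -> Dict[str, List[str]]:
        """Group declared hook names by category."""
        grouped: Dict[str, List[str]] = defaultdict(list)
        for hook, category in sorted(self._categories.items()):
            grouped[category].append(hook)
        return dict(grouped)


registry = HookRegistry()
registry.declare("page_saved", "content")          # hypothetical hook name
registry.declare("user_mentioned", "notification")  # hypothetical hook name

log: List[str] = []
registry.register("page_saved", lambda title: log.append(f"reindex {title}"))
registry.register("page_saved", lambda title: log.append(f"notify watchers of {title}"))

called = registry.run("page_saved", "Main Page")
print(called, log)
print(registry.by_category())
```

Requiring hooks to be declared with a category before handlers can attach is one design choice among many; it makes the set of extension points enumerable, which is what a classification effort would need.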
5.3 Support current and future needs of encyclopedic content
we want to look at Parsoid for supporting wikitext
structure that people already know - templates, contents, structures
can parsoid be a __ to help performance, scaling, features
example - editing a template that is used thousands of times - that causes lots of changes in the backend, purging caches. Can we avoid having to reparse the whole page just to make a small edit?
think of the page as a composition of fragments
lots of ideas we'd like to explore - improving performance of the template, to update pages without performance problems
will also enable wikifunctions
Interested? Talk to Subbu
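The "page as a composition of fragments" idea can be illustrated with a small sketch: if each fragment is rendered and cached separately, a template edit only invalidates the fragments that use that template. This is a toy model in Python under simplified assumptions, not a description of how Parsoid or MediaWiki's parser cache work; the class `FragmentPage` and the `text:` / `tpl:` fragment notation are invented for the example.

```python
from typing import Dict, List


class FragmentPage:
    """Toy page made of ordered fragments, each rendered and cached separately."""

    def __init__(self, fragments: List[str], templates: Dict[str, str]) -> None:
        self.fragments = fragments   # e.g. ["text:Hello ", "tpl:infobox"]
        self.templates = templates   # template name -> current template body
        self._cache: Dict[int, str] = {}
        self.render_count = 0        # how many fragments were (re)rendered

    def _render_fragment(self, fragment: str) -> str:
        self.render_count += 1
        kind, _, value = fragment.partition(":")
        return self.templates[value] if kind == "tpl" else value

    def render(self) -> str:
        """Assemble the page, rendering only fragments missing from the cache."""
        parts = []
        for index, fragment in enumerate(self.fragments):
            if index not in self._cache:
                self._cache[index] = self._render_fragment(fragment)
            parts.append(self._cache[index])
        return "".join(parts)

    def update_template(self, name: str, body: str) -> None:
        """Change a template and invalidate only the fragments that use it."""
        self.templates[name] = body
        for index, fragment in enumerate(self.fragments):
            if fragment == f"tpl:{name}":
                self._cache.pop(index, None)


page = FragmentPage(
    ["text:Berlin is a city. ", "tpl:infobox", "text: See also: Hamburg."],
    {"infobox": "[infobox v1]"},
)
first = page.render()                             # renders all three fragments
page.update_template("infobox", "[infobox v2]")
second = page.render()                            # re-renders only the template fragment
print(second)
print(page.render_count)
```

In this toy model the second render touches one fragment instead of three. Real wikitext is harder, since templates can affect surrounding content, so this only shows the shape of the idea.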
5.4 Improve process for release and php upgrades in alignment with product strategy
6.1 Enable efficient decision making on developer workflows and services
Interested? Talk to Birgit
6.2 Better serve developers’ testing needs
TLDR: Improving Beta Cluster! Just making a new beta has been attempted before, and didn't work for known reasons
Instead, we're thinking about it from the perspective of: What are people using beta for, and can we make something more well-defined to support that use-case?
E.g. making PatchDemo better. -- More people, and hardware, to improve it.
E.g. Improving the deployment train. -- Hypothetical: What if Scap was a web-UI? - Codename: Spiderpig. (rationale for the name: scap is a pig and it's on the web, so call it a spider)
E.g. Beta Cluster is currently used for experimental testing of 'everything combined in the big pile of complexity' - it often takes a long time to determine if it's a bug in 'my' code, or just a bug in Beta Cluster. - Someone suggested "The most production-like environment is production..." -- Working on some experiments and research around that, codename "Group -1"
E.g. Continuous deployment. -- Investigate whether we can make +2 in gerrit = straight to production.
Interested? Talk to Bryan
6.3 Improve sustainability of Toolforge ecosystem
Interested? Talk to Slavina
Additional speakers for above OKRs: Roan and Moriel (5.2), Subbu (5.3), Bryan (6.2)
can talk to Slavina for Toolforge questions
text from slides:
Thank you all for listening.
What are your thoughts on the plan for the upcoming months?
What motivates you to contribute?
Regular updates: https://www.mediawiki.org/wiki/MediaWiki_Product_Insights/Reports