User:Leaderboard/StewardMark
This is an essay. It expresses the opinions and ideas of some Wikimedians but may not have wide support. This is not policy on Meta, but it may be a policy or guideline on other Wikimedia projects. Feel free to update this page as needed, or use the discussion page to propose major changes.
If it isn't obvious, StewardMark is not an official Meta-Wiki policy (or indeed that of any wiki, as far as I am aware).
StewardMark is an experimental scoring system that ranks the performance of each steward candidate using a model that is nearly the same as the support percentage currently used to determine whether a candidate passes, and that scales across multiple steward years. The model only considers steward elections from 2009 onwards, as the voting population of prior years is harder to compare.
Calculation
- Let x be the number of supports received by a candidate.
- Let y be the number of opposes received.
- Let z be the number of neutral votes received.
Then the StewardMark Sm of a candidate is defined by
The key difference is that some weightage is given to neutral users, because I believe that their opinions should also count. For most candidates this will mean that Sm < support %, and will mean the other way round for the rest.
StewardMark only applies to users that have not withdrawn or been disqualified.
Standardisation
This can be used to compare with scores from other contexts (say RfA scores from Wikipedia). A conversion table should be defined in any case. The "standardised" scale is a real number from 0 to 20, rounded to two decimal places.
The US grade system equivalent is meant to answer this question: if stewardship were a course and the election determined your grade, what would it be? Just as in a real college course, C is a bad grade and such students often have to retake, while passed candidates usually have a B or higher, again reflecting the real-world scenario.
Standardised scale (0 - 20) | StewardMark cutoff (/100) | US grade system equivalent |
---|---|---|
20 | 99.5 | A+ |
19 | 96.5 | A+ |
18 | 93 | A |
17 | 90 | A |
16 | 86 | A- |
15 | 81 | B+ |
14 | 77 | B |
13 | 73 | B- |
12 | 67 | C+ |
11 | 60 | C |
10 | 54 | C- |
9 | 45 | D+ |
8 | 40 | D |
7 | 35 | D- |
6 | 29 | F |
5 | 22 | F |
4 | 16 | F |
3 | 11 | F |
2 | 7 | F |
1 | 2 | F |
0 | 0 | F |
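As a rough illustration of how the conversion table might be applied: the non-integer values reported in the statistics (a StewardMark of 99.39 corresponding to 19.96, and 2.76 corresponding to 1.15) are consistent with linear interpolation between adjacent cutoffs, so the sketch below assumes that method. The names `standardise` and `CUTOFFS` are illustrative only and are not part of the original scheme.

```python
from bisect import bisect_right

# (StewardMark cutoff, standardised score) pairs from the table, ascending.
CUTOFFS = [
    (0, 0), (2, 1), (7, 2), (11, 3), (16, 4), (22, 5), (29, 6),
    (35, 7), (40, 8), (45, 9), (54, 10), (60, 11), (67, 12), (73, 13),
    (77, 14), (81, 15), (86, 16), (90, 17), (93, 18), (96.5, 19), (99.5, 20),
]
MARKS = [mark for mark, _ in CUTOFFS]


def standardise(stewardmark: float) -> float:
    """Convert a StewardMark (0-100) to the 0-20 scale, to two decimal places.

    Assumption: scores between two cutoffs are linearly interpolated.
    """
    if not 0 <= stewardmark <= 100:
        raise ValueError("StewardMark must lie between 0 and 100")
    # Index of the highest cutoff that does not exceed the given mark.
    i = bisect_right(MARKS, stewardmark) - 1
    if i >= len(CUTOFFS) - 1:
        return 20.0  # at or above the top cutoff of 99.5
    lo_mark, lo_score = CUTOFFS[i]
    hi_mark, hi_score = CUTOFFS[i + 1]
    fraction = (stewardmark - lo_mark) / (hi_mark - lo_mark)
    return round(lo_score + fraction * (hi_score - lo_score), 2)


print(standardise(99.39))  # 19.96
print(standardise(2.76))   # 1.15
```

Under this assumption, any StewardMark of 99.5 or above maps to a standardised score of 20.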
Statistics
The dataset includes all steward candidates from 2009 and later. Data correct as of the 2024 steward elections.
Statistical parameter | StewardMark (/100) | StandardScale (/20) |
---|---|---|
Mean | 68.4 | 13.08 |
Median | 79.45 | 14.61 |
Maximum | 99.39 | 19.96 |
Minimum | 2.76 | 1.15 |
Standard deviation | 27.7 | 5.15 |
Year | StewardMark mean | StandardMark mean | Number of candidates | StandardMark Stdev |
---|---|---|---|---|
2009 | 67.27 | 12.69 | 22 | 4.86 |
2010 | 48.32 | 9.49 | 25 | 6.72 |
2011 | 71.10 | 13.75 | 20 | 5.39 |
2012 | 83.24 | 15.52 | 9 | 1.80 |
2013 | 68.81 | 13.51 | 10 | 6.38 |
2014 | 82.29 | 15.67 | 10 | 3.70 |
2015 | 73.87 | 13.94 | 14 | 4.07 |
2016 | 62.74 | 11.61 | 10 | 2.43 |
2017 | 69.76 | 13.09 | 7 | 4.07 |
2018 | 70.04 | 13.15 | 10 | 4.27 |
2019 | 74.87 | 14.19 | 7 | 4.64 |
2020 | 66.16 | 12.81 | 14 | 5.36 |
2021 | 61.23 | 11.92 | 10 | 5.89 |
2022 | 79.7 | 15.45 | 7 | 5.08 |
2023 | 83.3 | 15.83 | 5 | 3.43 |
2024 | 73.3 | 14.09 | 11 | 5 |
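A per-year table of this shape could be produced from the raw data with a simple group-by. The sketch below is only an illustration: the input format (tuples of year, StewardMark, standardised score), the function name `yearly_summary`, and the sample rows are all hypothetical, and the use of the population standard deviation is an assumption, since the page does not say which form was used. Note that the StandardMark mean appears to be the average of the per-candidate standardised scores rather than a conversion of the StewardMark mean: the scale is not linear, so the two generally differ.

```python
from collections import defaultdict
from statistics import mean, pstdev


def yearly_summary(rows):
    """Group (year, stewardmark, standard_score) rows and summarise each year."""
    by_year = defaultdict(list)
    for year, stewardmark, standard in rows:
        by_year[year].append((stewardmark, standard))

    summary = {}
    for year in sorted(by_year):
        marks = [m for m, _ in by_year[year]]
        standards = [s for _, s in by_year[year]]
        summary[year] = {
            "stewardmark_mean": round(mean(marks), 2),
            # Mean of the already-standardised scores, not a converted mean.
            "standard_mean": round(mean(standards), 2),
            "candidates": len(marks),
            # Assumption: population standard deviation.
            "standard_stdev": round(pstdev(standards), 2),
        }
    return summary


# Hypothetical rows for illustration only; not taken from the real dataset.
sample = [(2009, 81.0, 15.0), (2009, 60.0, 11.0), (2010, 54.0, 10.0)]
for year, stats in yearly_summary(sample).items():
    print(year, stats)
```

Swapping `pstdev` for `statistics.stdev` would give the sample standard deviation instead, which requires at least two candidates per year.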
Raw data
See Raw data.
Takeaways
- Some steward candidates have done really well, with two candidates in the same year getting a StewardMark of over 99. When setting the conversion scale, one objective was to design it in such a way that it would be extremely, but not impossibly, difficult to get a perfect standardised score of 20. MF-Warburg came incredibly close to that with a StewardMark of 99.39/100.
- The distribution is skewed towards higher scores (the median of 79.45 is well above the mean of 68.4), which implies that most steward candidates do pretty well. About 50% of the candidates in the dataset passed.
- There are a couple of cases where a candidate who failed (for example, 2009's Putnik with a 77.73/100) has a higher StewardMark than a candidate who passed. The reason is that the failed candidate had fewer neutrals: the passing candidate might have just crossed the 80% support ratio but garnered more neutrals, which drag down the score. Such cases are rare though.
StewardMark from an en.wp perspective
A natural question is whether StewardMark is suitable for analysing en.wp adminship, given the large number of candidates who have attempted adminship. There are some important differences however:
- We must include withdrawn and SNOW cases, as they comprise a significant number of candidates.
- The results are different. For instance, about 3.4% of all candidates score a 100/100 StewardMark, and hence get a 20. On the other hand, mainly as a result of SNOW, one-eighth of all candidates get a zero. These extremes should be taken into account, and even then, en.wp adminship proposals score very well on the high end as compared to stewards.
The raw data for en.wp is available at User:Leaderboard/StewardMark/en.wp RFA raw data. Data last updated: March 2024.
Statistical parameter | StewardMark (/100) | StandardScale (/20) |
---|---|---|
Mean | 52.86 | 10.27 |
Median | 54.05 | 10.01 |
Maximum | 100 | 20 |
Minimum | 0 | 0 |
Standard deviation | 36.16 | 6.89 |
Year | StewardMark mean | StandardMark mean | Number of candidates | StandardMark Stdev |
---|---|---|---|---|
2008 | 50.66 | 9.9 | 591 | 6.92 |
2009 | 50.32 | 9.8 | 354 | 6.75 |
2010 | 47.76 | 9.42 | 231 | 6.69 |
2011 | 51.94 | 10.13 | 139 | 6.88 |
2012 | 46.64 | 9.05 | 95 | 6.73 |
2013 | 59.90 | 11.62 | 74 | 6.28 |
2014 | 51.35 | 10.02 | 62 | 7.65 |
2015 | 52.63 | 10.18 | 58 | 6.48 |
2016 | 57.22 | 11.08 | 36 | 7.39 |
2017 | 65.48 | 12.82 | 41 | 6.72 |
2018 | 66.80 | 12.92 | 18 | 6.84 |
2019 | 76.89 | 14.82 | 31 | 5.02 |
2020 | 74.28 | 14.28 | 24 | 5.77 |
2021 | 83.08 | 16.41 | 11 | 5.48 |
2022 | 77.41 | 15.12 | 20 | 6.38 |
2023 | 72.23 | 14.11 | 19 | 7.08 |
2024 | 78.97 | 15.54 | 5 | 6.27 |