Decide on wanted metrics for Maniphest in kibana
Closed, Resolved (Public)

Description

For the status on 2017-03-31, scroll down to T28#3147479


Summary (2014-04-30) of wanted data, extracted from comments below:

Note that for all items we have raw SQL queries handy.

Other questions


Original task summary:

Reported upstream: https://secure.phabricator.com/T6041

Key metrics that we are getting from http://korma.wmflabs.org/ and that we would still need, either through a Phabricator backend for Metrics Grimoire or something else:

  • Volume of contributors (any activity), authors, resolvers - implemented via T1003
  • Volume of tasks created, closed, open - implemented via T1003
  • Median age of unattended tasks (Needs triage AND Backlog AND no comments from !author)
  • Median age of open tasks by priority (excluding Needs Volunteer) - implemented via T1003
  • Lists of main contributors TBD
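As an illustration of the "median age of open tasks by priority" item, here is a minimal Python sketch. It assumes each task is a dict with hypothetical keys `status`, `priority` and `dateCreated` (epoch seconds), and that the priority label to skip is literally "Needs Volunteer". It is not how T1003 implements the metric.

```python
from collections import defaultdict
from statistics import median

SECONDS_PER_DAY = 86400


def median_age_by_priority(tasks, now, excluded=("Needs Volunteer",)):
    """Return {priority: median age in days} for open tasks.

    `tasks` is an iterable of dicts with the hypothetical keys
    'status', 'priority' and 'dateCreated' (epoch seconds).
    Priorities listed in `excluded` are skipped.
    """
    ages = defaultdict(list)
    for task in tasks:
        if task["status"] != "open" or task["priority"] in excluded:
            continue
        age_days = (now - task["dateCreated"]) / SECONDS_PER_DAY
        ages[task["priority"]].append(age_days)
    return {priority: median(values) for priority, values in ages.items()}


# Made-up sample data: "now" is day 100, tasks were created on the given day.
NOW = 100 * SECONDS_PER_DAY
SAMPLE_TASKS = [
    {"status": "open", "priority": "High", "dateCreated": 90 * SECONDS_PER_DAY},
    {"status": "open", "priority": "High", "dateCreated": 80 * SECONDS_PER_DAY},
    {"status": "open", "priority": "High", "dateCreated": 50 * SECONDS_PER_DAY},
    {"status": "open", "priority": "Normal", "dateCreated": 99 * SECONDS_PER_DAY},
    {"status": "resolved", "priority": "Normal", "dateCreated": 10 * SECONDS_PER_DAY},
    {"status": "open", "priority": "Needs Volunteer", "dateCreated": 1 * SECONDS_PER_DAY},
]

result = median_age_by_priority(SAMPLE_TASKS, NOW)
```

The median is used rather than the mean so that a handful of very old tasks does not dominate the number for a priority level.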

All this by project, with a possibility to aggregate data of different projects. See what we have currently at

Note that we will be limited by the data imported from Bugzilla -- see T246, T107254

Details

Reference
fl53

Related Objects

Event Timeline

Qgil renamed this task from Metrics for key Wikimedia projects software in Maniphest to Metrics for Maniphest. Jan 8 2015, 6:44 AM
Qgil updated the task description.

I think the concept of key Wikimedia software projects is interesting but has its drawbacks as well. All projects hosted in Phabricator deserve to be measured under the same rule, and these Key Wikimedia projects should concentrate a majority of activity anyway.

Highlighting good practices and black holes is useful regardless of whether projects are deployed in Wikimedia servers or not.

@kevinator said in T1003#844983:

When I saw the wiki with the number of active bugzilla users, I wondered if you knew about the graph extension that lets you embed graphs into wiki pages. This was also an excuse for me to play with the extension:

bugzilla users: https://www.mediawiki.org/wiki/Talk:KLeduc_(WMF)/test_graph_extension
the data: https://www.mediawiki.org/wiki/Talk:KLeduc_(WMF)/test_graph_extension_data
the extension: https://www.mediawiki.org/wiki/Extension:Graph

This might be a cheap way to get a visualization of the data.

Even if I didn't say THANK YOU! at the time (which I should have, thank you!), I have been thinking about this proposal since Kevin suggested it. And I think it is a good idea.

Since I started working on tech community metrics about a year and a half ago, I have learned two things:

  • The tough part is to identify the metrics that will actually influence plans and processes.
  • A monthly cadence of reports is enough.

If you agree with both statements, then you will probably agree with me that a small set of very important and closely followed metrics is more efficient than detailed dashboards with a lot of data points. You may also agree that a small data set updated on a monthly basis can be reasonably maintained manually, if we don't have any automation at hand.

For all these reasons, I'm happy to give Extension:Graph a try to bring some actual metrics to https://www.mediawiki.org/wiki/Community_metrics as soon as we have reports for two months.

Though I played a bit more with Phab's SQL in the last few days (phun phun phun!), I won't get to this task in Feb 2015 (nothing has been done on the "thinking and deciding about relevant datasets" part, which should actually happen first).
Hence adding to ECT Mar 2015 project.

Dropping some thoughts (and arrows). Input welcome. What makes sense could be moved to the desc.

Generally: Not even sure where and how to "publish" such metrics. Putting it all into the monthly email implemented in T1003 feels like overkill but is the obvious quick'n'dirty way as long as Facts in upstream isn't ready (plus when it comes to my skills I could probably come up with some SQL queries but definitely not with PHP code).

Remaining unresolved items from task description above

Median age of unattended tasks (Needs triage AND Backlog AND no comments from !author)

  • ⇘ We have numerous projects which are volunteer based and likely de-facto unmaintained (in Git/Gerrit & Maniphest). I am afraid that one single number across all projects is not meaningful and won't help us realize anything as there are too many factors involved (e.g. for a small project or for some team projects the task author might be (co-)maintainer and no comments from !author might ever be needed).

Lists of main contributors TBD

If that translates to "Hall of Fame" stuff, I like that because it's a pat on the back.
These might be people we want to make sure to invite and see at Hackathons etc.

  • ⇗ "Top 20 commenters"
  • ⇗ "Top 20 task creators"
  • ⇗ "Top 20 task closers/resolvers"

Project activity

  • ⇒ Projects with biggest and smallest (bus factor) diversity of users (absolute number) either being active on their tasks or only resolving tasks? Would need some relation or threshold as a small new project with smaller user base also has smaller numbers hence can easily be meaningless. This all does not convince me: Minimum number of tasks; Minimum age of project; Dividing by lines of code would be cool but no access to that plus no 1-to-1 between Maniphest projects and codebases.
  • ⇘ Projects with most activity on tasks; Projects that received the most created tasks: I don't think that tells us anything we want to know.
  • ⇘ Projects with least activity: Hard to measure, e.g. if a project is pretty new in Phab, and has only one task anyway, is that already little activity? Minimum number of tasks? Minimum age of project? Don't want to split hairs and go down that road here I think.

Acceptance of Phabricator by project teams

Not sure if that tells us anything we would like to know when it comes to "accepting" Phabricator as Wikimedia's project management tool, and not sure we even want that apart from existing stuff like T434.

  • ⇒ Number of projects (active and inactive)
  • ⇗ Number of active projects
  • ⇒ Number of projects with >1 workboard column (= likely to actively use columns?; as long as https://secure.phabricator.com/T7410 is not tackled)
  • ⇗ Number of projects that saw >0 moves of a task from one workboard column to another

(Items 1-3 are easy in SQL, 4th one more complicated)

Popularity of / interest in tasks

Could be one source of input for decisions. Still, software development is not a popularity contest and we do not want design by committee. Hence not sure if helpful at all on a cross-project level instead of per-project. Potential criteria:

  • ⇗ subscribers = CC list members (easy in SQL)
  • ⇒ number of comments (but is "heat" something that describes "interest"?) (slightly harder in SQL)
  • ⇘ tokens (Phab only, but tokens can mean really anything and nothing as explained in T85255#1020451)
  • ⇘ flags (Phab only, kind of bookmarks but we have not advertised them enough that I consider them meaningful yet)
  • ⇓ duplicates (we did not convert those relations to edges in Phab when migrating from BZ hence impossible to properly query for them)
  • ⇓ votes (BZ only, we don't use BZ anymore)

Adding the "Wikimedia-Hackathon-2015" project here.
Even though I'm going to work on this beforehand, there's larger potential to discuss "What are interesting key metrics" plus especially how to present and render such metrics in a more consumable way (Phabricator extension? Some external webpage?) than what I am personally able to do (email).

Generally: Not even sure where and how to "publish" such metrics.

Ok, http://korma.wmflabs.org/ is the right place, at least as long as Phabricator doesn't provide the reports we need. Please create a task to coordinate the next steps with @Dicortazar. I guess we will have two tracks:

  • Maniphest backend for Metrics Grimoire to provide the data provided for Bugzilla etc out of the box.
  • Additional metrics we would need, in korma or in reports.

Putting it all into the monthly email implemented in T1003 feels like overkill but is the obvious quick'n'dirty way

An email is better than nothing. I wonder whether https://www.mediawiki.org/wiki/Community_metrics#Reports could be used in the interim.

Median age of unattended tasks (Needs triage AND Backlog AND no comments from !author)

  • ⇘ We have numerous projects which are volunteer based and likely de-facto unmaintained (in Git/Gerrit & Maniphest). I am afraid that one single number across all projects is not meaningful and won't help us realize anything as there are too many factors involved (e.g. for a small project or for some team projects the task author might be (co-)maintainer and no comments from !author might ever be needed).

Ok, but what about meaningful metrics about Needs Triage? Do we agree that a higher % of triaged tasks is better, that projects with horrible % of non-triaged tasks should be highlighted?

Also, I found those "Top 20 tickets" lists at http://korma.wmflabs.org/browser/bugzilla_response_time.html useful, to at least be able to look at those theoretically forgotten tasks and see what can be done. 50 instead of 20 would give us more margin.

These might be people we want to make sure to invite and see them at Hackathons etc.

  • ⇗ "Top 20 commenters"
  • ⇗ "Top 20 task creators"
  • ⇗ "Top 20 task closers/resolvers"

I would say lists are cheap, and we could list 50. We may also want to consider a filter by affiliation, to allow fair promotion of independent / unknown contributors versus full-time employees.

Project activity

  • ⇒ Projects with biggest and smallest (bus factor) diversity of users (absolute number) either being active on their tasks or only resolving tasks? Would need some relation or threshold as a small new project with smaller user base also has smaller numbers hence can easily be meaningless. This all does not convince me: Minimum number of tasks; Minimum age of project; Dividing by lines of code would be cool but no access to that plus no 1-to-1 between Maniphest projects and codebases.

I think Bitergia has done some work around bus factor, but @Dicortazar should know. It can be useful to highlight objective maintenance problems especially in WMF projects. In general, I think it is useful to have data about "broken maintenance" even if we only use it for projects maintained by the WMF (because we have a choice to do something about it).

Acceptance of Phabricator by project teams

Not sure if that tells us anything we would like to know

Agreed. Active users + new accounts should be good indicators already. https://www.mediawiki.org/wiki/Community_metrics#Reports already shows a different trend than Bugzilla's.

  • ⇗ Number of active projects

Unsure. Defining what "active" means, yes. I mean something more sophisticated than "Not archived". Then again, I wonder how much the picture can be distorted by tag projects, and by the fact that updating one task updates all the projects associated with it, even if some of them might actually be dead.

  • ⇗ Number of projects that saw >0 moves of a task from one workboard column to another

How many workboards have been updated in a given month looks like a simple metric telling something about the level of use of Phabricator, yes.

Popularity of / interest in tasks

Could be one source of input for decisions. Still, software development is not a popularity contest and we do not want design by committee. Hence not sure if helpful at all on a cross-project level instead of per-project. Potential criteria:

  • ⇗ subscribers = CC list members (easy in SQL)
  • ⇒ number of comments (but is "heat" something that describes "interest"?) (slightly harder in SQL)
  • ⇘ tokens (Phab only, but tokens can mean really anything and nothing as explained in T85255#1020451)

What about how many different users have commented, instead of or in addition to the number of comments?

I think a Hall of Phame of open tasks with these attributes would be interesting and definitely popular. Instead of 3-4 lists, we could have a combined ranking of these three factors, which would drive away some false shots and perhaps some attempts to game the system.

I'm not missing anything else for this second iteration of Maniphest data, and we could live without some of these data points for sure. I propose that we call this "feature planning freeze" and we start nailing down the implementation details of this collection of metrics.

Status update: Bitergia has sent us a quote to develop a Maniphest backend for Metrics Grimoire, and to update http://korma.wmflabs.org/browser/its.html and related metrics with the new Phabricator data from December 2014 onwards.

I have pre-approved their proposal and I will work on the bureaucratic steps. Meanwhile, @Aklapper and @Dicortazar can start working on the details.

Aklapper raised the priority of this task from Low to High. Apr 13 2015, 12:01 PM
In T28#1125190, @Qgil wrote:

Putting it all into the monthly email implemented in T1003 feels like overkill but is the obvious quick'n'dirty way

As long as a T96238: Maniphest backend for Metrics Grimoire implementation is not around the corner, and T94578: Most basic Tech Community metrics are published and up to date has way way higher priority than this very task, we can stick to SQL queries for this very task, put some of the raw SQL results into the monthly Phabricator metrics emails sent to wikitech-l, and manually create graphs for some of that data, if wanted.
I will paste those SQL queries in a dedicated followup comment here so I don't lose them.

Median age of unattended tasks (Needs triage AND Backlog AND no comments from !author)

[...]

but what about meaningful metrics about Needs Triage? Do we agree that a higher % of triaged tasks is better, that projects with horrible % of non-triaged tasks should be highlighted?

Some projects (Mobile related ones and some others) do not seriously use priority.
We basically have such statistics on https://phabricator.wikimedia.org/maniphest/report/project/?order=total , we just cannot sort by that (if you install the wikimedia-maniphest-task.user.js Greasemonkey script you will see percentage numbers instead of absolute numbers and high percentage values are colored). Projects with a low number of open tasks obviously are a smaller problem than those with a larger number. So "I have this data already", it's just not easily accessible/consumable.
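To make the "percentage of untriaged tasks, but small projects matter less" idea concrete, here is a rough Python sketch. The input shape (`{project: (untriaged_open, total_open)}`) and the `min_open` threshold of 20 are assumptions for illustration only, not values agreed on in this task.

```python
def untriaged_share(open_counts, min_open=20):
    """Rank projects by their percentage of untriaged open tasks.

    `open_counts` maps a project name to a tuple
    (untriaged_open_tasks, total_open_tasks); this shape is hypothetical.
    Projects with fewer than `min_open` open tasks are skipped, since a
    high percentage on a tiny project is a smaller problem.
    Returns a list of (project, untriaged, total, percent), worst first.
    """
    rows = []
    for project, (untriaged, total) in open_counts.items():
        if total < min_open:
            continue
        percent = round(100 * untriaged / total, 1)
        rows.append((project, untriaged, total, percent))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# Made-up sample numbers.
SAMPLE_COUNTS = {
    "Project-A": (50, 100),
    "Project-B": (5, 10),   # below the threshold, skipped
    "Project-C": (30, 40),
}

ranking = untriaged_share(SAMPLE_COUNTS)
```

Sorting by percentage after applying a minimum size gives the sortable view that the report page currently lacks.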

  • ⇗ "Top 20 commenters"
  • ⇗ "Top 20 task creators"
  • ⇗ "Top 20 task closers/resolvers"

I would say lists are cheap, and we could list 50. We may also want to consider a filter by affiliation, to allow fair promotion of independent / unknown contributors versus full-time employees.

Yes. But affiliation = future / next-level stuff for MetricsGrimoire.

Project activity

  • ⇒ Projects with biggest and smallest (bus factor) diversity of users (absolute number) [...]

I think Bitergia has done some work around bus factor, but @Dicortazar should know.

@Dicortazar: If you know anything, share it. :)

I propose that we call this "feature planning freeze" and we start nailing down the implementation details of this collection of metrics.

+1, in combination / depending on T94578: Most basic Tech Community metrics are published and up to date.

Dropping SQL queries only in this comment; please ignore.

In T28#1125190, @Qgil wrote:

Also, I found those "Top 20 tickets" lists at http://korma.wmflabs.org/browser/bugzilla_response_time.html useful, to at least be able to look at those theoretically forgotten tasks and see what can be done. 50 instead of 20 would give us more margin.

IDs of those 50 open tasks with the longest time without any action/modification (not: comment):
SELECT id FROM phabricator_maniphest.maniphest_task WHERE status = "open" ORDER BY dateModified LIMIT 50;

These might be people we want to make sure to invite and see them at Hackathons etc.

I would say lists are cheap, and we could list 50.

  • ⇗ "Top 20 commenters"

Top 50 task commenters in last calendar month:
SELECT usr.username, COUNT(usr.username) AS comments FROM phabricator_user.user usr INNER JOIN phabricator_maniphest.maniphest_transaction trs WHERE usr.phid = trs.authorPHID AND trs.transactionType = "core:comment" AND FROM_UNIXTIME(trs.dateCreated,'%Y%m')=date_format(NOW() - INTERVAL 1 MONTH,'%Y%m') GROUP BY usr.username ORDER BY comments DESC LIMIT 50;

  • ⇗ "Top 20 task creators"

Top 50 task creators in last calendar month:
SELECT usr.username, COUNT(usr.username) AS created FROM phabricator_user.user usr JOIN phabricator_maniphest.maniphest_task tsk WHERE tsk.authorPHID = usr.phid AND FROM_UNIXTIME(tsk.dateCreated,'%Y%m')=date_format(NOW() - INTERVAL 1 MONTH,'%Y%m') GROUP BY usr.username ORDER BY created DESC LIMIT 50;

  • ⇗ "Top 20 task closers/resolvers"

Usernames of those 50 users who closed (in any way) the most tasks in the last calendar month (but also includes reopened ones):
SELECT usr.username, count(usr.username) AS closed FROM phabricator_user.user usr INNER JOIN phabricator_maniphest.maniphest_transaction trs WHERE trs.authorPHID = usr.phid AND FROM_UNIXTIME(trs.dateCreated,'%Y%m')=date_format(NOW() - INTERVAL 1 MONTH,'%Y%m') AND (trs.transactionType="mergedinto" OR (trs.transactionType="status" AND (trs.oldValue="\"open\"" OR trs.oldValue="\"stalled\"") AND (trs.newValue="\"resolved\"" OR trs.newValue="\"invalid\"" OR trs.newValue="\"declined\""))) GROUP BY usr.username ORDER BY closed DESC LIMIT 50;
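The "last calendar month" filter in these queries compares '%Y%m' strings. For readers who want to see the window explicitly, here is a small Python sketch that computes the same window as a pair of epoch-second bounds. It assumes UTC, whereas MySQL's FROM_UNIXTIME() uses the server's session time zone, so results near month boundaries may differ.

```python
from datetime import datetime, timezone


def previous_month_bounds(now):
    """Return (start, end) in epoch seconds for the calendar month before `now`.

    `now` must be a timezone-aware datetime. `start` is inclusive and
    `end` is exclusive (the first second of the current month).
    """
    first_of_this_month = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if first_of_this_month.month == 1:
        # January wraps around to December of the previous year.
        first_of_previous = first_of_this_month.replace(
            year=first_of_this_month.year - 1, month=12
        )
    else:
        first_of_previous = first_of_this_month.replace(
            month=first_of_this_month.month - 1
        )
    return int(first_of_previous.timestamp()), int(first_of_this_month.timestamp())


# Example: a timestamp in April 2015 yields the bounds of March 2015.
start, end = previous_month_bounds(datetime(2015, 4, 13, 12, 1, tzinfo=timezone.utc))
```

With such bounds, a condition of the form `dateCreated >= start AND dateCreated < end` compares the raw integer column directly. That is generally friendlier to indexes than wrapping the column in a function, though whether it matters here depends on the table's indexes.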

Acceptance of Phabricator by project teams

  • ⇗ Number of projects that saw >0 moves of a task from one workboard column to another

How many workboards have been updated in a given month looks like a simple metric telling something about the level of use of Phabricator, yes.

Number of projects which have seen at least one task being moved from one column to another column on the project's workboard in the last calendar month:
SELECT COUNT(DISTINCT (edge.dst)) FROM phabricator_maniphest.edge INNER JOIN phabricator_maniphest.maniphest_transaction WHERE FROM_UNIXTIME(maniphest_transaction.dateModified,'%Y%m')=date_format(NOW() - INTERVAL 1 MONTH,'%Y%m') AND maniphest_transaction.transactionType = "projectcolumn" AND edge.type = 41 AND edge.src = maniphest_transaction.objectPHID AND edge.dst = SUBSTR(maniphest_transaction.newValue, INSTR(maniphest_transaction.newValue, 'projectPHID')+14, 30);
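The SUBSTR/INSTR expression in this query relies on fixed offsets. The Python sketch below shows what those offsets do, assuming `newValue` is stored as compact JSON containing a `"projectPHID":"PHID-PROJ-..."` pair. That format is inferred from the offsets (11 characters for the key plus `":"` gives 14, and a project PHID is 30 characters long) and is not confirmed in this task; the sample value is made up.

```python
import json


def project_phid_by_offset(new_value):
    """Mirror SUBSTR(newValue, INSTR(newValue, 'projectPHID') + 14, 30).

    INSTR and SUBSTR are 1-based in MySQL, so the 0-based start index in
    Python is simply index + 14.
    """
    start = new_value.index("projectPHID") + 14
    return new_value[start:start + 30]


def project_phid_by_json(new_value):
    """Parse the value as JSON instead of relying on character offsets."""
    return json.loads(new_value)["projectPHID"]


# Hypothetical transaction value; real data may look different.
SAMPLE_NEW_VALUE = (
    '{"columnPHIDs":["PHID-PCOL-aaaaaaaaaaaaaaaaaaaa"],'
    '"projectPHID":"PHID-PROJ-abcdefghijklmnopqrst"}'
)

by_offset = project_phid_by_offset(SAMPLE_NEW_VALUE)
by_json = project_phid_by_json(SAMPLE_NEW_VALUE)
```

Both approaches agree on compact JSON. The offset approach would break if the stored value ever contained whitespace after the colon, which is one reason to prefer real JSON parsing when post-processing outside of SQL.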

Popularity of / interest in tasks

  • ⇗ subscribers = CC list members

Task IDs and number of subscribers for the 50 open tasks with the highest number of subscribers, ordered by number of subscribers:
SELECT tsk.id, COUNT(tsk.id) AS subscribers FROM phabricator_maniphest.maniphest_task tsk JOIN phabricator_maniphest.edge edg WHERE (tsk.status = "open" OR tsk.status = "stalled") AND tsk.phid = edg.src AND edg.type = 21 GROUP BY tsk.id ORDER BY subscribers DESC LIMIT 50;

  • ⇒ number of comments

The 50 open tasks which received the most comments in the last calendar month:
SELECT tsk.id,COUNT(tsk.id) AS comments FROM phabricator_maniphest.maniphest_task tsk INNER JOIN phabricator_maniphest.maniphest_transaction trs WHERE tsk.phid = trs.objectPHID AND trs.transactionType = "core:comment" AND (tsk.status = "open" OR tsk.status = "stalled") AND FROM_UNIXTIME(trs.dateCreated,'%Y%m')=date_format(NOW() - INTERVAL 1 MONTH,'%Y%m') GROUP BY tsk.id ORDER BY comments DESC LIMIT 50;

What about how many different users have commented, instead of or in addition to the number of comments?

The 50 open tasks which have the highest number of different users who have commented on them:
SELECT COUNT(DISTINCT (trs.authorPHID)) AS commentauthors, tsk.id FROM phabricator_maniphest.maniphest_transaction trs JOIN phabricator_maniphest.maniphest_task tsk WHERE trs.transactionType = "core:comment" AND trs.objectPHID = tsk.phid AND (tsk.status = "open" OR tsk.status = "stalled") GROUP BY tsk.id ORDER BY commentauthors DESC LIMIT 50;

Instead of 3-4 lists, we could have a combined ranking of these three factors, which would drive away some false shots and perhaps some attempts to game the system.

I leave inventing such a formula and its weighting for another night. :)
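For whoever picks this up later, one possible starting point is a simple rank-sum. This is only an assumption about what a "combined ranking" could look like, not a formula agreed on here: each task gets its position in each of the three lists (subscribers, comments, distinct commenters), the positions are summed with optional weights, and the lowest total wins. The dict shape and the sample numbers are made up.

```python
FACTORS = ("subscribers", "comments", "commenters")


def combined_ranking(metrics, weights=None):
    """Rank task IDs by the weighted sum of their per-factor positions.

    `metrics` maps a task ID to a dict with the keys in FACTORS.
    Position 1 is the highest value for a factor, so a lower total
    score means a more "popular" task. Ties within a factor are broken
    by dict order, which is a simplification.
    """
    weights = weights or {factor: 1.0 for factor in FACTORS}
    scores = {task_id: 0.0 for task_id in metrics}
    for factor in FACTORS:
        ordered = sorted(metrics, key=lambda t: metrics[t][factor], reverse=True)
        for position, task_id in enumerate(ordered, start=1):
            scores[task_id] += weights[factor] * position
    return sorted(scores, key=lambda t: (scores[t], t))


# Made-up sample: task 101 has many subscribers but little discussion.
SAMPLE_METRICS = {
    101: {"subscribers": 30, "comments": 5, "commenters": 3},
    102: {"subscribers": 10, "comments": 40, "commenters": 4},
    103: {"subscribers": 20, "comments": 20, "commenters": 10},
}

ranked = combined_ranking(SAMPLE_METRICS)
```

In the sample, task 101 tops the subscribers list but ends up last overall. That is the kind of "false shot" a combined ranking is meant to dampen.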

For the time being (= until T96238 is fixed), I'm not convinced we want to have long lists of items ("The 50 top...") in the monthly Phab email sent to communitymetrics@ (and then forwarded to wikitech-l@), so I've only added the one line "How many projects have seen moves on workboards" in https://gerrit.wikimedia.org/r/#/c/206518/ for now.

If we want to have those lists in the monthly email (comments welcome!) I am happy to set that up, and then manually remove those new elements before forwarding the current content of that email to wikitech-l@ as usual.

Aklapper lowered the priority of this task from High to Low. Apr 30 2015, 9:33 AM

Temporarily lowering priority here:

  • We can always get the raw data agreed on here via SQL queries if we/anybody wanted to have this "soon".
  • Not "soon": We are missing a Grimoire backend here to have this integrated in korma and that backend has higher priority now.

Hence nothing directly actionable right now.

Did someone work on this project during Wikimedia-Hackathon-2015? If so, please update the task with the results. If not, please remove the label.

Updated the task summary to reflect currently available data.
The "wanted items" listed are from April 2014 and are old. The list should be shortened a lot to identify which data we would actually use, instead of "might be nice to have", to avoid drowning in data. After that's done, specific subtasks need to be created.

Given the current focus on tasks to make existing Git/Gerrit data more reliable and the time it takes to implement new dashboards, this task will remain low priority and I don't see this happening in 2015.

Aklapper moved this task from Need discussion to Backlog on the wikimedia.biterg.io board.
Aklapper renamed this task from Metrics for Maniphest to Decide on wanted metrics for Maniphest in kibana. Jul 15 2016, 5:49 PM

Again that painful realization that we have no stats in Phab or Bitergia (yet) on which people filed the most Phab tasks in all of 2016 (we do have stats on who closed them, though). Let's try to make sure that's not the case anymore in 12 months.

Aklapper raised the priority of this task from Low to Medium.
Aklapper moved this task from Backlog to March on the Developer-Advocacy (Jan-Mar-2017) board.
Aklapper changed the task status from Open to Stalled. Mar 9 2017, 3:36 PM
Aklapper added a subscriber: He7d3r.

Waiting for T138002: Deployment of Maniphest panel here to see what would be the 'default' panel and displayed widgets. Cannot evaluate before knowing what we already have... :-/

Sharing which fields are currently available in the maniphest index on https://wikimedia.biterg.io:

T28.png (984×818 px, 146 KB)

Going through / updating the way too long list of proposed metrics in the task description here,

As this task was about deciding, closing as resolved.
If you see a strong need for specific data on wikimedia.biterg.io, file a separate enhancement request.

Thank you very much to everybody involved in this task! Wouldn't it deserve a mention in wikitech-l? Also a link in the monthly report sent to wikitech-l?

I'm happy to mention it once basic data quality issues (T157898 and T161235, and to some extent T157709 and a bit of T161308) have been fixed. First impressions...