Add Image: all wikis ran out of image recommendations
Closed, Resolved (Public)

Description

During a routine check of logs in Logstash, I noticed that fr.wikipedia has a disproportionately high number of "No recommendation found for page" errors for Add Image. While investigating those errors, I noticed that Special:Homepage suggests only 11 articles for the Add Image task (see also https://fr.wikipedia.org/wiki/Sp%C3%A9cial:NewcomerTasksInfo). Further investigation revealed that clicking on any of those articles displays an error message:

image.png (1×1 px, 331 KB)

The same problem is also present for all other Wikipedias, see T345188#9128178.

This means the Add Image task shows as available and is selectable by newcomers, but no suggested edits can actually be made with it. I think we should disable Add Image on those wikis and re-enable it once the issue is fixed. Filing directly into the sprint, as it affects an actively maintained feature.

Event Timeline

I generated the total number of image recommendations on all Growth wikis that have Add Image enabled:

[urbanecm@mwmaint1002 ~]$ get_image_count() {
> echo -en "$1\t"; mwscript extensions/GrowthExperiments/maintenance/listTaskCounts.php --wiki="$1" --output json --tasktype image-recommendation | jq '.taskTypeCounts."image-recommendation"'
> }
[urbanecm@mwmaint1002 ~]$ while read WIKI; do get_image_count "$WIKI"; done < wikis.txt > T345188-data.tsv
[urbanecm@mwmaint1002 ~]$

The output is:

wiki      available suggestions
arwiki    4
bnwiki    1
cswiki    5
elwiki    0
eswiki    5
fawiki    0
idwiki    5
ptwiki    233
rowiki    5
trwiki    6
viwiki    3
zhwiki    15

I spot-checked a few wikis in this list, and all of them appear to be practically out of image tasks. That includes ptwiki, whose reported number is higher than for the other wikis by more than an order of magnitude: all of its suggestions seem to be false, and I haven't been able to get an actual suggestion there so far.
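
For repeated checks, the spot check could be partly automated. The following is a minimal sketch, assuming a TSV file with one `<wiki><TAB><count>` row per line (the shape the loop above writes). The function name `flag_low_counts` and the `MIN_TASKS` threshold of 100 are hypothetical; the task does not define what count makes Add Image viable.

```shell
# Hypothetical cut-off, for illustration only; override via the environment.
MIN_TASKS="${MIN_TASKS:-100}"

flag_low_counts() {
    # $1: path to a TSV file with one "<wiki><TAB><count>" row per line
    awk -F'\t' -v min="$MIN_TASKS" '
        # jq prints "null" when the task type is missing; report it as unknown
        $2 !~ /^[0-9]+$/ { printf "%s\tunknown\n", $1; next }
        # numeric count below the threshold
        $2 + 0 < min     { printf "%s\tlow\t%d\n", $1, $2 }
    ' "$1"
}
```

Usage: `flag_low_counts T345188-data.tsv`. A count-based check like this would not catch a case such as ptwiki, where the reported number is high but the suggestions appear to be false, so it complements manual spot checks rather than replacing them.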

This issue affects all wikis. Rephrasing title+description accordingly.

Urbanecm_WMF renamed this task from Add Image: fr.wikipedia and pl.wikipedia ran out of image recommendations to Add Image: all wikis ran out of image recommendations. Aug 29 2023, 5:24 PM
Urbanecm_WMF updated the task description.

Change 953314 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] Growth: Disable Add an image on all wikis

https://gerrit.wikimedia.org/r/953314

Change 953314 merged by jenkins-bot:

[operations/mediawiki-config@master] Growth: Disable Add an image on all wikis

https://gerrit.wikimedia.org/r/953314

Mentioned in SAL (#wikimedia-operations) [2023-08-29T17:53:27Z] <urbanecm@deploy1002> Started scap: Backport for [[gerrit:953314|Growth: Disable Add an image on all wikis (T345188)]]

I checked the available documentation, and this seems to be within Structured-Data-Backlog's area of responsibility. https://wikitech.wikimedia.org/wiki/Add_Image#High-level_summary mentions that the data pipeline is supposed to load "that dataset into the CirrusSearch index", which doesn't seem to be happening (based on my hasrecommendation:image queries and the T345188#9128178 data). Adding the team tag.
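
One way such a search-index check could be scripted is through the MediaWiki search API, asking only for the total hit count. This is a sketch, not necessarily how the queries above were run: the function names are hypothetical, the host (for example `cs.wikipedia.org`) has to be supplied by the caller, and the number returned is only an approximation of what GrowthExperiments would offer, since the extension may apply further filtering.

```shell
# Build a search API URL that reports how many pages match
# hasrecommendation:image. $1 is the wiki host, e.g. cs.wikipedia.org.
build_count_url() {
    printf 'https://%s/w/api.php?action=query&list=search&srsearch=hasrecommendation%%3Aimage&srinfo=totalhits&srlimit=1&format=json\n' "$1"
}

# Fetch the total hit count for one host.
# Requires curl, jq and network access.
count_image_recommendations() {
    curl -sS --fail "$(build_count_url "$1")" | jq '.query.searchinfo.totalhits'
}
```

Usage: `count_image_recommendations cs.wikipedia.org`. A result close to zero would suggest that the recommendations dataset has not reached the search index for that wiki.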

Mentioned in SAL (#wikimedia-operations) [2023-08-29T18:00:14Z] <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:953314|Growth: Disable Add an image on all wikis (T345188)]] (duration: 06m 47s)

Based on the comment in: T345141: No ALIS for 2023-08-14 snapshot

Latest search indices update running now.

Does that mean this task is unblocked now?

Not yet, alas. Still only 68 recommendations for ruwiki :/

As @Cparle said, not yet, unfortunately.

In addition to this, I can only see 19 suggestions for cswiki, 59 for arwiki, 7 for bnwiki and 79 for eswiki. Certainly higher numbers than what I captured in T345188#9128178, but not sufficiently high for the task to be viable, unfortunately.

Thanks. I marked both T345545 and T345141 as subtasks of this one, as they're necessary for Growth to be able to re-enable the task. This makes it easier for us to track the status. Feel free to adjust the subtask list if one of them is not actually needed to unblock Growth.

The search indices have been updated.

Change 955049 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] Revert "Growth: Disable Add an image on all wikis"

https://gerrit.wikimedia.org/r/955049

Thanks for letting us know! Indeed, the recommendations are back now. Prepared a patch for re-enabling Add an image and I'll deploy it later today.

Thank you all for the efforts made to solve this problem!
What was the cause of this issue? Can it happen again?

What was the cause of this issue?

The indirect cause was a missing input dataset, see T345208: [Spike] Identify and mitigate risks associated with MediaWiki History pipeline.
This led to a domino effect:

  • the image suggestions data pipeline got skipped for 2 weeks, waiting for that missing dataset
  • it successfully ran the 3rd week, but didn't generate article-level suggestions
  • the direct cause is still unknown. It's likely related to the production platform, since the same run in a test instance yielded the expected results.

Can it happen again?

I think that we're not safe from future failures until the production platform stabilizes.
We plan to agree on data uptime as part of T338949: [L] Define SLOs/SLAs for image-suggestions pipelines.

Change 955049 merged by jenkins-bot:

[operations/mediawiki-config@master] Revert "Growth: Disable Add an image on all wikis"

https://gerrit.wikimedia.org/r/955049

Mentioned in SAL (#wikimedia-operations) [2023-09-11T08:42:16Z] <urbanecm@deploy1002> Started scap: Backport for [[gerrit:955049|Revert "Growth: Disable Add an image on all wikis" (T345188)]]

Mentioned in SAL (#wikimedia-operations) [2023-09-11T08:44:32Z] <urbanecm@deploy1002> urbanecm: Backport for [[gerrit:955049|Revert "Growth: Disable Add an image on all wikis" (T345188)]] synced to the testservers mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option)

Mentioned in SAL (#wikimedia-operations) [2023-09-11T08:52:43Z] <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:955049|Revert "Growth: Disable Add an image on all wikis" (T345188)]] (duration: 10m 27s)

Etonkovidova claimed this task.