Developer of Tool-inteGraality. Maintainer of Tool-wikiloves and Wiki-Loves-Monuments
User Details
- User Since
- Oct 6 2014, 10:01 PM (533 w, 5 d)
- Availability
- Available
- IRC Nick
- JeanFred
- LDAP User
- Jean-Frédéric
- MediaWiki User
- Jean-Frédéric [ Global Accounts ]
Nov 9 2024
The service was returning 403 Forbidden (at least something different!). When I tried to stop/start, it started as a PHP service; I then ran the command logged above in SAL, and edited webservice.template to specify web: python3.9.
@dhinus on IRC took some steps some 8 hours ago:
<dhinus> the pod is indeed in CrashLoopBackOff, and has already restarted 44 times
<dhinus> kubectl describe pod shows "Back-off restarting failed container webservice in pod wikiloves-6849f4ccb4-9w6b6_tool-wikiloves"
<dhinus> I will try the stop+start myself while I'm here
<dhinus> the pod is now rescheduled on tools-k8s-worker-nfs-74 and it seems more healthy
Asking for help on Telegram, I was told this might be T362867#10292196
Yesterday, just in case, I recreated the virtual-environment using
toolforge webservice python3.9 shell webservice-python-bootstrap --fresh
Two days ago, I ran webservice restart a few times, which did not help
The UWSGI logs are a vast loop of
*** Starting uWSGI 2.0.19.1-debian (64bit) on [Mon Nov 4 14:27:45 2024] ***
*** Starting uWSGI 2.0.19.1-debian (64bit) on [Mon Nov 4 14:27:58 2024] ***
*** Starting uWSGI 2.0.19.1-debian (64bit) on [Mon Nov 4 14:28:27 2024] ***
*** Starting uWSGI 2.0.19.1-debian (64bit) on [Mon Nov 4 14:29:12 2024] ***
*** Starting uWSGI 2.0.19.1-debian (64bit) on [Mon Nov 4 14:30:20 2024] ***
*** Starting uWSGI 2.0.19.1-debian (64bit) on [Mon Nov 4 14:32:05 2024] ***
*** Starting uWSGI 2.0.19.1-debian (64bit) on [Mon Nov 4 14:35:01 2024] ***
*** Starting uWSGI 2.0.19.1-debian (64bit) on [Mon Nov 4 14:40:26 2024] ***
*** Starting uWSGI 2.0.19.1-debian (64bit) on [Mon Nov 4 14:45:46 2024] ***
*** Starting uWSGI 2.0.19.1-debian (64bit) on [Mon Nov 4 14:51:11 2024] ***
*** Starting uWSGI 2.0.19.1-debian (64bit) on [Mon Nov 4 14:56:35 2024] ***
...
Oct 10 2024
Sep 25 2024
Sep 23 2024
Sep 18 2024
Sep 1 2024
Aug 28 2024
Aug 21 2024
I got something working, will wait overnight to see if it went well, and will then send to Gerrit the necessary changes.
Aug 19 2024
Status report (see also https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL)
- Recreated the venv under k8s
- Tried to run a one-off job using toolforge jobs run update-monuments-min --command /data/project/heritage/bin/update_monuments_min.sh --image python3.7
Jul 27 2024
Jun 24 2024
Jun 21 2024
Could you elaborate ? @Trioslosdios850 :)
Jun 19 2024
May 24 2024
Oh, geez, we never followed up on that >_>. Adding at least MIT sounds good to me.
Thanks for the answer!:
May 15 2024
For background − the tool looks up categories with the format Images from <contest> <year> in <country>.
The reason is that the tool expects the category names to be « Category:Images from Wiki Loves XYZ 20XX in the Democratic Republic of the Congo » (note the “the”), but the WLE categories are named « Category:Wiki Loves Earth 20XX in Democratic Republic of the Congo » (without the “the”).
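To make the mismatch concrete, here is a minimal sketch of how a category title following the pattern described above could be assembled. The function name and the year used in the example are illustrative assumptions, not taken from the tool's code.

```python
def expected_category(contest: str, year: int, country: str) -> str:
    """Build a title following 'Images from <contest> <year> in <country>'.

    Hypothetical helper for illustration; the tool's real lookup may differ.
    """
    return f"Category:Images from {contest} {year} in {country}"


# The leading "the" has to be part of the country string for the titles to match.
print(expected_category("Wiki Loves Earth", 2024, "the Democratic Republic of the Congo"))
```

With this pattern, a category created without the leading "the" is simply a different title and will not be found.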
May 4 2024
The story has been touched before in T284129#7144208 or T318944#8289726: since 2017 the tool expects the category name to be Images from Wiki Loves XYZ in Armenia & Nagorno-Karabakh (e810058). I was under the impression that this was supposed to be the standard pattern.
Also, slight refinement for the positive query:
May 3 2024
Ok, the code is now in a shape where I can do this. I’m only missing the SPARQL queries :) Let’s continue to take https://www.wikidata.org/wiki/User:Jean-Fr%C3%A9d%C3%A9ric/T251008 as example − what would be the correct query for the last column?
Apr 27 2024
Mar 26 2024
Mar 17 2024
Mar 14 2024
Closing as resolved from my perspective. Please reopen if need be. Thanks!
Please see https://wikitech.wikimedia.org/wiki/Tool:Wikiloves#Scope : this tool currently does not work for global events.
Feb 10 2024
This tool can be deleted actually. I just disabled it in the toolsadmin console.
Done in April 2023 via 1b084c487a90 & 442fa0d0fdd2.
Jan 29 2024
Nov 21 2023
This was due to T326266 (Remove the WMCS statsd/Graphite service): the cloudmetrics0003 host was removed, and pystatsd has the interesting behaviour of crashing out if the statsd host is unavailable (https://github.com/jsocol/pystatsd/issues/130).
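One defensive pattern for this kind of failure, sketched here under assumptions: build the metrics client through a small wrapper that falls back to a no-op object when the host cannot be resolved. `NullStats` and `make_stats_client` are hypothetical names, and this is not necessarily how the tool was fixed.

```python
import socket


class NullStats:
    """No-op stand-in exposing the metric methods a caller is assumed to use."""

    def incr(self, stat, count=1):
        pass

    def timing(self, stat, delta):
        pass


def make_stats_client(factory):
    """Return factory(), or a NullStats if building the client raises OSError.

    socket.gaierror (name resolution failure) is a subclass of OSError.
    """
    try:
        return factory()
    except OSError:
        return NullStats()


def unreachable_host():
    # Simulates what happens when the statsd host name no longer resolves.
    raise socket.gaierror(-2, "Name or service not known")


client = make_stats_client(unreachable_host)
client.incr("harvest.runs")  # silently ignored instead of crashing
```

In real use the factory would be something like `lambda: statsd.StatsClient(host, port)`, with host and port coming from the tool's configuration.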
Oct 29 2023
💡 (thanks to “Fictional characters whose birth/death date is in the current decade” from the Query Service example page)
BIND(YEAR(?date) as ?year).
BIND(xsd:integer(?year/10) as ?decade).
FILTER(?decade = 200).
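For readers less used to SPARQL, the filter above keeps dates whose year, divided by ten and reduced to an integer, equals 200. The same arithmetic in Python, as an illustrative sketch (the function name is made up):

```python
def decade_index(year: int) -> int:
    """Integer part of year / 10; for positive years this equals floor division."""
    return year // 10


# A decade index of 200 corresponds to the years 2000 through 2009.
print([y for y in (1999, 2000, 2009, 2010) if decade_index(y) == 200])  # [2000, 2009]
```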
This took me a long long time, but I think I’m mostly done.
3ba5e84 solves this.
Oct 22 2023
I quickly commented out the wd_item part
# {
#     "dest": "wd_item",
#     "source": "wikidata",
#     "check": "checkWD"
# },
so that hopefully next harvest does not crash
@Lokal_Profil What do you think? Is there a proper way to map to wd_item and have it nullable somehow? Or shall we just revert that mapping?
Checked the logs quickly:
ERROR: Unknown error occurred when processing country ua in lang uk (1048, "Column 'wd_item' cannot be null")
So, yeah, this is definitely linked to 0a8c490 :-/
Oct 12 2023
Oct 5 2023
Syntax-wise, I checked the JSON rendering (example) for inspiration − the sitelinks are keyed as “frwiki” or “bnwikivoyage”. That looks good enough; but can I then use that in a SPARQL query? The SPARQL seems to use URLs, e.g.
?article schema:about ?item.
?article schema:isPartOf <https://en.wikipedia.org/>.
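One way to bridge the two notations would be to translate a sitelink key into the site URL used with schema:isPartOf. The sketch below is a naive illustration: the suffix table is a deliberately incomplete assumption, and special keys (for example commonswiki) would need dedicated handling.

```python
# Hypothetical, incomplete mapping from sitelink key suffix to project domain.
PROJECT_DOMAINS = {
    "wiki": "wikipedia.org",
    "wikivoyage": "wikivoyage.org",
    "wikisource": "wikisource.org",
    "wikiquote": "wikiquote.org",
}


def sitelink_key_to_url(key: str) -> str:
    """Turn a key such as 'frwiki' into 'https://fr.wikipedia.org/'."""
    # Check longer suffixes first; 'wiki' is only a fallback for Wikipedia keys.
    for suffix in sorted(PROJECT_DOMAINS, key=len, reverse=True):
        if key.endswith(suffix) and len(key) > len(suffix):
            lang = key[: -len(suffix)].replace("_", "-")
            return f"https://{lang}.{PROJECT_DOMAINS[suffix]}/"
    raise ValueError(f"Unrecognised sitelink key: {key}")


print(sitelink_key_to_url("frwiki"))
print(sitelink_key_to_url("bnwikivoyage"))
```

The resulting string could then be interpolated between angle brackets in the schema:isPartOf triple.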
Oct 3 2023
Oct 2 2023
Closing as invalid, as there is nothing much I can do there from the service side.
Sep 18 2023
Four years later, finally took the time to look into it properly :)
Aug 29 2023
Aug 28 2023
Harvesting has been stable for a few days now − closing as Resolved 🎉
Aug 27 2023
Aug 25 2023
@Lokal_Profil Thanks! I have the two STRICT_TRANS_TABLES open patches manually applied on the server, so I’ll only be able to deploy your changes once they are merged.
Aug 24 2023
Grepping through the logs for errors, only 2:
ERROR: Unknown error occurred when processing country de-he in lang de (1048, "Column 'wd_item' cannot be null")
Monuments Database is back to 1.7M monuments 🎉 https://commons.wikimedia.org/wiki/Commons:Monuments_database/Statistics
Aug 23 2023
Harvesting ran today without issue − until the very last step:
2023-08-23_18:22:51 Update monuments_all table...
ERROR 1292 (22007) at line 514: Truncated incorrect DECIMAL value: ''
Edited the code in place on Toolforge to add a sql_mode argument to the pymysql connection object. If that works out, I’ll submit a Gerrit patch.
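As a rough sketch of what such a change could look like: PyMySQL's `connect()` accepts a `sql_mode` keyword, so a relaxed mode string can be passed when the connection is opened. Which modes to keep is an assumption here (this version only drops STRICT_TRANS_TABLES), and the helper names are hypothetical.

```python
def without_strict(sql_mode: str) -> str:
    """Return the comma-separated mode list minus STRICT_TRANS_TABLES."""
    modes = [m for m in sql_mode.split(",") if m and m != "STRICT_TRANS_TABLES"]
    return ",".join(modes)


def connect_relaxed(current_mode: str, **params):
    """Open a PyMySQL connection with the relaxed mode (never called in this sketch)."""
    import pymysql  # third-party; imported lazily so the helper above runs without it

    return pymysql.connect(sql_mode=without_strict(current_mode), **params)


server_mode = (
    "STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,"
    "NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"
)
print(without_strict(server_mode))
```

Note that a mode set this way applies only to connections opened through this code path; other clients, such as the mysql command line, keep the server default.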
Ah, so the SQL mode setting simply does not stick. Running again:
MariaDB [s51138__heritage_p]> SELECT @@SQL_MODE, @@GLOBAL.SQL_MODE;
+-------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------+
| @@SQL_MODE                                                                                | @@GLOBAL.SQL_MODE                                                                         |
+-------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------+
| STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION | STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION |
+-------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------+
Hmmm, running SELECT @@SQL_MODE, @@GLOBAL.SQL_MODE; again in s51138__heritage_p I’m getting
Harvesting ran over night − still 84 errors :/
ERROR: Unknown error occurred when processing country ir in lang fa (1406, "Data too long for column 'image' at row 1")
ERROR: Unknown error occurred when processing country se-arbetsl in lang sv (1406, "Data too long for column 'id' at row 1")
ERROR: Unknown error occurred when processing country de-nrw-bm in lang de (1406, "Data too long for column 'beschreibung' at row 1")
ERROR: Unknown error occurred when processing country es-ct in lang ca (1265, "Data truncated for column 'prot' at row 1")
ERROR: Unknown error occurred when processing country ro in lang ro (1406, "Data too long for column 'adresa' at row 1")
ERROR: Unknown error occurred when processing country be-vlg in lang fr (1406, "Data too long for column 'classement' at row 1")
ERROR: Unknown error occurred when processing country ie in lang en (1265, "Data truncated for column 'number' at row 1")
ERROR: Unknown error occurred when processing country hu in lang hu (1265, "Data truncated for column 'site' at row 1")
ERROR: Unknown error occurred when processing country ch2 in lang de (1406, "Data too long for column 'fotobeschreibung' at row 1")
ERROR: Unknown error occurred when processing country gb-eng in lang en (1406, "Data too long for column 'name' at row 1")
ERROR: Unknown error occurred when processing country rs in lang sr (1265, "Data truncated for column 'site' at row 1")
ERROR: Unknown error occurred when processing country be-wal in lang fr (1406, "Data too long for column 'nom_objet' at row 1")
ERROR: Unknown error occurred when processing country uy in lang es (1406, "Data too long for column 'monumento' at row 1")
ERROR: Unknown error occurred when processing country es in lang ca (1265, "Data truncated for column 'prot' at row 1")
ERROR: Unknown error occurred when processing country gb-nir in lang en (1406, "Data too long for column 'hb' at row 1")
ERROR: Unknown error occurred when processing country aq in lang en (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country ch-old in lang en (1265, "Data truncated for column 'kgs_nr' at row 1")
ERROR: Unknown error occurred when processing country no in lang no (1265, "Data truncated for column 'id' at row 1")
ERROR: Unknown error occurred when processing country us-ca in lang en (1265, "Data truncated for column 'refnum' at row 1")
ERROR: Unknown error occurred when processing country fr in lang fr (1406, "Data too long for column 'notice' at row 1")
ERROR: Unknown error occurred when processing country it-bz in lang de (1406, "Data too long for column 'beschreibung' at row 1")
ERROR: Unknown error occurred when processing country ca-prov in lang en (1366, "Incorrect integer value: '––' for column `s51138__heritage_p`.`monuments_ca-prov_(en)`.`idm` at row 1")
ERROR: Unknown error occurred when processing country th in lang th (1265, "Data truncated for column 'site' at row 1")
ERROR: Unknown error occurred when processing country il in lang he (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country de-he in lang de (1406, "Data too long for column 'beschreibung' at row 1")
ERROR: Unknown error occurred when processing country pt in lang pt (1406, "Data too long for column 'designacoes' at row 1")
ERROR: Unknown error occurred when processing country fr-object in lang fr (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country mx in lang es (1406, "Data too long for column 'id' at row 1")
ERROR: Unknown error occurred when processing country be-bru in lang nl (1406, "Data too long for column 'bouwdoor' at row 1")
ERROR: Unknown error occurred when processing country au in lang en (1366, "Incorrect double value: '' for column `s51138__heritage_p`.`monuments_au_(en)`.`lon` at row 35")
ERROR: Unknown error occurred when processing country be-wal in lang nl (1406, "Data too long for column 'descr_nl' at row 1")
ERROR: Unknown error occurred when processing country gb-sct in lang en (1265, "Data truncated for column 'hb' at row 1")
ERROR: Unknown error occurred when processing country es-vc in lang ca (1265, "Data truncated for column 'prot' at row 1")
ERROR: Unknown error occurred when processing country hr in lang hr (1406, "Data too long for column 'arhitekt' at row 1")
ERROR: Unknown error occurred when processing country jp-nhs in lang en (1406, "Data too long for column 'comments' at row 1")
ERROR: Unknown error occurred when processing country za in lang en (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country pa in lang es (1406, "Data too long for column 'descripcion' at row 1")
ERROR: Unknown error occurred when processing country ar in lang es (1406, "Data too long for column 'direccion' at row 1")
ERROR: Unknown error occurred when processing country sr in lang commons (1366, "Incorrect double value: '' for column `s51138__heritage_p`.`monuments_sr_(nl)`.`lon` at row 2")
ERROR: Unknown error occurred when processing country pl in lang pl (1406, "Data too long for column 'nazwa' at row 1")
ERROR: Unknown error occurred when processing country in in lang en (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country es in lang es (1406, "Data too long for column 'lugar' at row 1")
ERROR: Unknown error occurred when processing country wlpa-es-ct in lang ca (1406, "Data too long for column 'descripcio' at row 1")
ERROR: Unknown error occurred when processing country be-wal in lang en (1406, "Data too long for column 'descr_nl' at row 1")
ERROR: Unknown error occurred when processing country us in lang en (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country at in lang de (1406, "Data too long for column 'beschreibung' at row 1")
ERROR: Unknown error occurred when processing country dk-bygning in lang da (1265, "Data truncated for column 'systemnrbyg' at row 1")
ERROR: Unknown error occurred when processing country by in lang be-tarask (1406, "Data too long for column 'name' at row 1")
ERROR: Unknown error occurred when processing country eg in lang ar (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country de-nrw-k in lang de (1406, "Data too long for column 'beschreibung' at row 1")
ERROR: Unknown error occurred when processing country mt in lang de (1406, "Data too long for column 'beschreibung' at row 1")
ERROR: Unknown error occurred when processing country ug in lang en (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country ca-fed in lang en (1406, "Data too long for column 'address' at row 1")
ERROR: Unknown error occurred when processing country tn in lang fr (1406, "Data too long for column 'monument' at row 1")
ERROR: Unknown error occurred when processing country il-npa in lang he (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country nl in lang nl (1265, "Data truncated for column 'type_obj' at row 1")
ERROR: Unknown error occurred when processing country cn in lang en (1406, "Data too long for column 'designation' at row 1")
ERROR: Unknown error occurred when processing country be-vlg in lang en (1406, "Data too long for column 'address' at row 1")
ERROR: Unknown error occurred when processing country pe in lang es (1406, "Data too long for column 'direccion' at row 1")
ERROR: Unknown error occurred when processing country ee in lang et (1406, "Data too long for column 'aadress' at row 1")
ERROR: Unknown error occurred when processing country nl-gem in lang nl (1406, "Data too long for column 'objnr' at row 1")
ERROR: Unknown error occurred when processing country ph in lang en (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country it in lang it (1265, "Data truncated for column 'site' at row 1")
ERROR: Unknown error occurred when processing country gh in lang en (1406, "Data too long for column 'id' at row 1")
ERROR: Unknown error occurred when processing country sk in lang de (1406, "Data too long for column 'beschreibung-de' at row 1")
ERROR: Unknown error occurred when processing country be-vlg in lang nl (1406, "Data too long for column 'adres' at row 1")
ERROR: Unknown error occurred when processing country de-by in lang de (1406, "Data too long for column 'beschreibung' at row 1")
ERROR: Unknown error occurred when processing country iq in lang ar (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country es-gl in lang gl (1406, "Data too long for column 'notas' at row 1")
ERROR: Unknown error occurred when processing country fr in lang ca (1265, "Data truncated for column 'prot' at row 1")
ERROR: Unknown error occurred when processing country cz in lang cs (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country dk-fortids in lang da (1265, "Data truncated for column 'fredningsnummer' at row 1")
ERROR: Unknown error occurred when processing country am in lang hy (1406, "Data too long for column 'id' at row 1")
ERROR: Unknown error occurred when processing country gb-wls in lang en (1406, "Data too long for column 'notes' at row 1")
ERROR: Unknown error occurred when processing country wlpa-at in lang de (1406, "Data too long for column 'Beschreibung' at row 1")
ERROR: Unknown error occurred when processing country ru in lang ru (1406, "Data too long for column 'description' at row 1")
ERROR: Unknown error occurred when processing country ua in lang uk (1265, "Data truncated for column 'site' at row 1")
ERROR: Unknown error occurred when processing country cl in lang es (1366, "Incorrect integer value: 'S/N' for column `s51138__heritage_p`.`monuments_cl_(es)`.`id` at row 1")
ERROR: Unknown error occurred when processing country ch in lang de (1406, "Data too long for column 'anzeige-adresse' at row 1")
ERROR: Unknown error occurred when processing country pt-wd in lang pt (1366, "Incorrect double value: '' for column `s51138__heritage_p`.`monuments_pt-wd_(pt)`.`lon` at row 3")
ERROR: Unknown error occurred when processing country co in lang es (1406, "Data too long for column 'id' at row 1")
ERROR: Unknown error occurred when processing country ca-muni in lang en (1366, "Incorrect integer value: '––' for column `s51138__heritage_p`.`monuments_ca-muni_(en)`.`idm` at row 1")
ERROR 1292 (22007) at line 514: Truncated incorrect DECIMAL value: ''
2023-08-23_04:30:04 Restart the categorization job...
ERROR: Unknown error occurred when processing country in-com in lang commons Language 'commons' does not exist in family wikipedia
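To get an overview of a log like the one above, the errors can be tallied by MySQL error code. This is a minimal sketch: the regular expression is an assumption based on the line format shown in the log, and the three sample lines are copied from it.

```python
import re
from collections import Counter

# Matches "... processing country <code> in lang <lang> (<errno>, ..."
ERROR_RE = re.compile(r"processing country (\S+) in lang (\S+) \((\d+), ")


def count_by_code(lines):
    """Count harvest errors per MySQL error number."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            counts[int(match.group(3))] += 1
    return counts


sample = """
ERROR: Unknown error occurred when processing country ir in lang fa (1406, "Data too long for column 'image' at row 1")
ERROR: Unknown error occurred when processing country es-ct in lang ca (1265, "Data truncated for column 'prot' at row 1")
ERROR: Unknown error occurred when processing country ro in lang ro (1406, "Data too long for column 'adresa' at row 1")
""".strip().splitlines()

print(count_by_code(sample))
```

Run against the full log, this would show how the errors split between 1406 (data too long), 1265 (data truncated) and 1366 (incorrect value).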
Aug 22 2023
(The annoying thing is that I don’t have yet support to test such things in my local docker-compose setup)
Aug 21 2023
It seems to me that "Data too long for column X" means bad data in the source tables. The harvesting used to ignore that, now it does not. The proper fix is to correct the source data, but we can’t do all that.
One example of
(1406, "Data too long for column 'image' at row 1")
One example of
(1366, "Incorrect double value: '' for column s51138__heritage_p.monuments_pk_(en).lon at row 1")
would be
REPLACE INTO monuments_pk_(en) (source, number, prov_iso, description, address, district, lon, monument_article, registrant_url) VALUES (//en.wikipedia.org/w/index.php?title=List_of_cultural_heritage_sites_in_Balochistan,_Pakistan&oldid=1139504719, BA-2, PK-BA, [[Nindo Damb]], Ornach Valley, Tehsil Wadh, [[Killa Abdullah District]], , Nindo_Damb, BA-2)
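In the statement above the lon value is empty, which strict mode rejects for a DOUBLE column. One possible mitigation, sketched here as an assumption rather than the fix actually applied, is to turn blank strings into None before building the query, since PyMySQL sends None as NULL. It only helps if the target column is nullable, and the helper name is hypothetical.

```python
def blank_to_none(row: dict) -> dict:
    """Replace empty or whitespace-only string values with None."""
    return {
        key: (None if isinstance(value, str) and value.strip() == "" else value)
        for key, value in row.items()
    }


# Field values taken from the example row above.
row = {"number": "BA-2", "prov_iso": "PK-BA", "lon": "", "monument_article": "Nindo_Damb"}
print(blank_to_none(row))
```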
@Bodhisattwa Could you take care of adding the necessary configuration for the remaining years ?
Jul 4 2023
Hey @AndrewTavis_WMDE & @Manuel , I only used the dashboard to get a nice visualization of the external ID galaxy − see this slide (the red bubble is the video-game related IDs)
Jun 20 2023
I was pointed to this ticket by @Lydia_Pintscher: I wanted to update a presentation slide that uses https://wikidata-analytics.wmcloud.org/app_direct/WD_ExternalIdentifiersDashboard (that’s the only dashboard I can remember using)
Jun 14 2023
Likely related to the MariaDB upgrade T301949
Jun 13 2023
Looks like this is happening since April 7th