User Details
- User Since: Mar 9 2019, 11:20 PM (302 w, 6 d)
- Availability: Available
- LDAP User: Unknown
- MediaWiki User: Don-vip
Yesterday
The error is not always the same; with this one, for example, I get a different one:
Wed, Dec 25
Mon, Dec 23
Sun, Dec 22
Fri, Dec 13
Nov 26 2024
Nov 18 2024
If it helps, I'm still facing problems; the last one was three minutes ago with https://commons.wikimedia.org/wiki/File:Telline_(Donax_trunculus)_(Ifremer_00673-78543).jpg
Nov 11 2024
Nov 9 2024
It's becoming more and more frequent
Nov 8 2024
Same problem with 6 new files:
- https://commons.wikimedia.org/wiki/File:Vagues_sur_le_littoral_s%C3%A9tois_(Ifremer_00631-74358_-_32014).jpg
- https://commons.wikimedia.org/wiki/File:Vagues_sur_le_littoral_s%C3%A9tois_(Ifremer_00631-74358_-_32015).jpg
- https://commons.wikimedia.org/wiki/File:Vagues_sur_le_littoral_s%C3%A9tois_(Ifremer_00631-74358_-_32017).jpg
- https://commons.wikimedia.org/wiki/File:Vagues_sur_le_littoral_s%C3%A9tois_(Ifremer_00631-74358_-_32018).jpg
- https://commons.wikimedia.org/wiki/File:Vagues_sur_le_littoral_s%C3%A9tois_(Ifremer_00631-74358_-_32019).jpg
- https://commons.wikimedia.org/wiki/File:Vagues_sur_le_littoral_s%C3%A9tois_(Ifremer_00631-74358_-_32020).jpg
Nov 7 2024
Hi @Slst2020! I'm sorry I didn't see your last reply. The Maven part of the build currently takes less than a minute, and the whole build less than 2 minutes; it's perfect:
Oct 28 2024
@dcaro no it happened only once for me. Feel free to close the ticket :)
Oct 22 2024
Oct 15 2024
Yes, sorry, I just saw it. FYI I just relaunched the job and it worked :)
Oct 11 2024
Oct 5 2024
Would per-format canaries make it possible to allocate different hardware resources per file format? TIFF files would benefit from more memory, for example.
Oct 3 2024
With the power of buildpacks I was even able to update to Java 23 very easily :) https://gitlab.wikimedia.org/toolforge-repos/spacemedia/-/commit/c00aee97338ec2cb0af4051e2d82555aa8033e8f
Sep 5 2024
I managed tonight to:
- download jdk21 from Adoptium at build time on GitLab CI and cache it using the GitLab cache mechanism (see the sketch below). It's fast enough for me
- use jdk21 with the toolforge build service / buildpacks. It's also pretty fast
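For illustration only, here is a minimal sketch of the download-and-cache step; the Adoptium API URL, the cache directory and the file names are assumptions, and the actual GitLab CI wiring (the `cache:` section of `.gitlab-ci.yml`) is not shown:

```python
# Hypothetical sketch: fetch a Temurin JDK 21 archive once and reuse it from a
# cached directory on later CI runs. URL, paths and file names are assumptions.
import os
import tarfile
import urllib.request

# Directory declared in the GitLab CI "cache:" section so it survives between jobs.
CACHE_DIR = ".jdk-cache"
ARCHIVE = os.path.join(CACHE_DIR, "jdk21.tar.gz")
# Adoptium "latest GA" binary endpoint for Linux x64 (assumed URL layout).
URL = ("https://api.adoptium.net/v3/binary/latest/21/ga/"
       "linux/x64/jdk/hotspot/normal/eclipse")

def ensure_jdk(dest: str = "jdk21") -> None:
    """Download the JDK archive only if it is not already cached, then unpack it."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    if not os.path.exists(ARCHIVE):
        urllib.request.urlretrieve(URL, ARCHIVE)
    with tarfile.open(ARCHIVE) as tar:
        tar.extractall(dest)

if __name__ == "__main__":
    ensure_jdk()
```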
Sep 4 2024
Aug 26 2024
For me the errors are gone (the toolforge job service works, I was able to build and deploy my tool; no more DNS errors, everything looks fine).
Wow, thank you Zache for finding this. I think this is definitely the way to go.
Aug 25 2024
Probable duplicate of T373243
Same for my tool (pod spacemedia-6fdcc8d798-8sncn). Started to fail at 2024-08-25T17:38:18.469Z with error message "java.net.UnknownHostException: tools.db.svc.wikimedia.cloud"
I don't see any name resolution problems on the bastion or on my Cloud VPS instances.
Aug 18 2024
A new one; lots of issues this week, 600s of replag right now on s4 :(
Aug 17 2024
Aug 14 2024
Aug 10 2024
Aug 9 2024
Aug 4 2024
Aug 3 2024
OK, thank you, let's consider it fixed then. We'll try to solve the remaining issues one by one.
Aug 2 2024
Status update: the system has been completely updated and restructured as follows:
- Reduced the number of celery workers to 4 workers per encoding instance
- Doubled the number of instances from 3 to 6
Thank you! I need this quota even after the migration to fix T365154: one of the many errors causing the permanent outages was too many Celery workers per encoding instance, causing OOM errors under heavy load. In order to keep the same number of workers with a more stable system, I reduced the number of workers per instance to increase their available memory, and doubled the number of instances (from 3 to 6).
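Purely as an illustrative sketch (the module name, broker URL and task are made up, not the actual video2commons code), this is roughly how the worker concurrency per encoding instance can be capped in a Celery app:

```python
# Minimal sketch, assuming a hypothetical "encoding_tasks" module and a local
# Redis broker; the real video2commons configuration may differ.
from celery import Celery

app = Celery("encoding_tasks", broker="redis://localhost:6379/0")

# Cap the number of worker processes per instance so each one keeps enough
# memory for large encodes (fewer workers per host, more hosts overall).
app.conf.worker_concurrency = 4

@app.task
def encode(video_id: str) -> str:
    # Placeholder for the actual encoding work.
    return f"encoded {video_id}"
```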
Update completed!
Jul 26 2024
@fnegri thanks a lot, it works! Is there a procedure for me to follow to avoid this problem? I am going to recreate the encoding06 instance.
Jul 25 2024
@Andrew can I get support from the WMCS team? I've got something that doesn't work and I don't understand why.
Jul 24 2024
Thank you Andrew!
Update: OK I understand that NFS and role::labs::lvm::srv are not related:
@Andrew in fact I need help.
I see the old encoding instances use the role::labs::lvm::srv puppet role to get more disk space.
I understand this is outdated and that we now use Cinder volumes directly from Horizon. Should I go this way? Does it mean the NFS server will no longer be necessary? I'm not sure I understand exactly how the storage part works.
@Everyone: I wasn't able to work on v2c over the past two weeks; starting right now I'm going to update the whole infra to hopefully get rid of this problem for good.
Jul 23 2024
Jul 22 2024
Sure, it's don-vip@github
Hi Andrew!
Sorry for the delay. Thanks a lot for the maintenance of the nfs server :)
For the rest of the activities, it's ok, I hope to complete them this week :)
I won't hesitate to ask for help if I face difficulties.
Jul 19 2024
Thank you @dcaro! I managed to perform an analysis on spacemedia:
Jul 15 2024
Jul 14 2024
I managed to fix almost all files except these two:
Jul 10 2024
I found a way to solve the issue :)
Jul 9 2024
I've set up a local MediaWiki instance (1.39, using the Ubuntu package) with the default config:
Tried Brasilia 20170312a0416 Mediana B151413.tif
Jul 8 2024
Thank you @Urbanecm_WMF!
I see the files have been imported, but the thumbnails have not been generated:
Jul 5 2024
Jul 4 2024
Hi @Urbanecm_WMF !
It's weird; I can download files without any problem from Cloud VPS:
@Xqt I reviewed the code in gerrit and am OK with it. I'm pretty confident it will work :)
Jul 3 2024
Thank you!
Jul 2 2024
Sorry Andrew, still one request!
Yes, it works! https://codeclimate.com/github/toolforge/video2commons
Thank you :)
Hello Andrew, one last request (I hope).
My plan to solve this, with current status:
Jul 1 2024
Is it possible to make this behaviour configurable? It broke video2commons in a horrible way: since we serialize API errors in our Redis instance, we now have keys that contain whole serialized videos (hundreds of megabytes of binary data). It's really useful to have most of the parameters (filename, MIME type and so on), but we clearly don't want file contents in the string representation.
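Purely as an illustration of the kind of defensive workaround possible on the video2commons side (not something the tool actually does, and the helper name is made up), one could clamp oversized error strings before writing them to Redis:

```python
# Hypothetical helper: clamp the status text stored in Redis so a stray error
# string carrying file contents cannot blow up the database.
MAX_STATUS_BYTES = 4096  # assumed limit; a few KB is plenty for a status line

def safe_status(text: str, limit: int = MAX_STATUS_BYTES) -> str:
    """Return the status text, truncated with a marker if it is too large."""
    if len(text) <= limit:
        return text
    return text[:limit] + "... [truncated]"
```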
Caused by T333957 change in Pywikibot: https://github.com/wikimedia/pywikibot/commit/afab98e41f4366a98659ae97ba43c9428fa590ce
Found a big problem. After connecting to the Redis instance, it appears the db1 database contains 3.6GB of data for ~6000 keys, which is a lot for a database that only contains the current status of each task (the text displayed in the frontend).
If we look at the 15 biggest keys, they are between 86 MB and 435 MB, which is absolutely not expected (they usually contain only a few bytes/kilobytes of text).
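For reference, a minimal sketch of how such a scan can be done with redis-py; the host, port and db index are assumptions based on the description above:

```python
# Sketch: list the largest keys in Redis db1, assuming local access to the instance.
import redis

r = redis.Redis(host="localhost", port=6379, db=1)

sizes = []
for key in r.scan_iter(count=1000):
    # MEMORY USAGE reports the approximate bytes used by the key and its value.
    size = r.memory_usage(key) or 0
    sizes.append((size, key))

for size, key in sorted(sizes, reverse=True)[:15]:
    print(f"{size / (1024 * 1024):8.1f} MB  {key!r}")
```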
The (truncated) contents of the biggest key are:
Credentials mistake fixed (thanks to Andrew). Now video2commons should be fully up again.
Thank you again!
Jun 30 2024
... and to solve the mistake I made :/ v2c is temporarily unable to upload new files :(