Wikipedia:Bots/Requests for approval/718 Bot 2
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.
- Operator: east718
- Automatic or Manually Assisted: Fully automatic
- Programming Languages: Python plus Twisted
- Function Summary: Convert all images on the English Wikipedia to the more efficient PNG format when necessary.
- Edit period: One very long run, then once a week.
- Already has a bot flag: yes
- Function Details: Just like it says on the can, this bot will attempt to optimize all images on the English Wikipedia. On the first run, I'll let it loose on all images; from then on, it'll only attempt to convert ones tagged with {{ShouldBePNG}}, {{badJPEG}}, and {{badGIF}}. Now for the technical details: it'll attempt to convert the image using imagemagick's convert, then downsize it with optipng -o7 plus advpng -z4, pngcrush -brute, and pngout. If the lightest of these three resultant PNGs is smaller than the original image, the bot will upload it, preserving the image page and adding on all of the history associated with the old image. Lastly, it will update all references from the old image to the new PNG and tag the old image with {{PNG version available}}. All free images will remain until a human decides to clear out the PNG duplicate backlog, and all fair-use images will eventually be killed off by the bots. east.718 at 05:36, June 17, 2008
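The selection rule described above (build several optimized PNGs, keep the lightest, upload only if it beats the original) could be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the bot's actual code: the function names `build_candidates` and `pick_upload` and the working file names are hypothetical, the command lines simply mirror the tools and flags named in the request, and all five tools are assumed to be installed on PATH.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Iterable, Optional


def build_candidates(source: Path, workdir: Path) -> list:
    """Produce the three candidate PNGs named in the request (hypothetical helper).

    Exit codes are deliberately not inspected here; a tool that fails simply
    leaves no output file, and pick_upload() skips missing files.
    """
    base = workdir / "base.png"
    subprocess.run(["convert", str(source), str(base)])

    # Candidate 1: optipng followed by advpng, both rewriting a copy in place.
    chain1 = workdir / "optipng_advpng.png"
    if base.is_file():
        shutil.copy(base, chain1)
    subprocess.run(["optipng", "-o7", str(chain1)])
    subprocess.run(["advpng", "-z4", str(chain1)])

    # Candidate 2: pngcrush reads the base file and writes a separate output.
    chain2 = workdir / "pngcrush.png"
    subprocess.run(["pngcrush", "-brute", str(base), str(chain2)])

    # Candidate 3: pngout likewise takes an input and an output path.
    chain3 = workdir / "pngout.png"
    subprocess.run(["pngout", str(base), str(chain3)])

    return [chain1, chain2, chain3]


def pick_upload(original: Path, candidates: Iterable[Path]) -> Optional[Path]:
    """Return the lightest existing candidate, or None if none beats the original."""
    existing = [c for c in candidates if c.is_file()]
    if not existing:
        return None
    best = min(existing, key=lambda p: p.stat().st_size)
    # Upload only on a strict reduction in file size.
    if best.stat().st_size < original.stat().st_size:
        return best
    return None
```

Keeping the size comparison in its own function means the "upload if and only if smaller" rule can be checked without running any of the external tools.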
Discussion
I like this idea and am inclined to trial it if I don't hear objections soon. Can you resize large unfree images during the conversion? MBisanz talk 06:08, 17 June 2008 (UTC)
- This is not a task which I feel is appropriate for a bot. east.718 at 16:05, June 17, 2008
- Ok, I understand, just checking, I do see where certain images would be worse resized. MBisanz talk 21:36, 17 June 2008 (UTC)
Is it worth trying to convert JPEG images? I'd expect that JPEG artifacts would compress especially poorly in PNG format. --Carnildo (talk) 08:30, 17 June 2008 (UTC)
- I expect so, but it's my computing cycles being wasted. :) If I manage to downsize even a pittance of a thousand JPEGs, I'll have done some good here. east.718 at 16:05, June 17, 2008
Why are you uploading them as 'new' images? Can't you just replace the existing one with the new version? Also, why don't you run this on Commons as well? -- maelgwn - talk 11:57, 17 June 2008 (UTC)
- Well, uploading a PNG over a JPEG or GIF is kind of silly, no? Notwithstanding that, MediaWiki will automatically rename the file anyway. east.718 at 16:05, June 17, 2008
Shouldn't this task be restricted to GIFs? PNG was designed as a replacement for the GIF format, not for JPEGs. JPEGs should remain as JPEGs. Also, how are you planning on handling animated GIFs? Does your bot specifically detect and ignore them? Are you also planning on converting all SVG images? If so, what would be the point? Kaldari (talk) 22:14, 17 June 2008 (UTC)
The {{ShouldBePNG}} template states that:
- This template should not be used for
- images for which only a JPEG source is available; recompressing with PNG will not remove artifacts and will produce larger files
- animated images. PNG does not support animation so GIF should be used instead
- images which contain strictly vector (non-raster) data. SVG should be used in this case.
I would recommend applying the same criteria to this task, i.e. only converting non-animated GIFs (and maybe {{BadJPEG}}s). Kaldari (talk) 22:24, 17 June 2008 (UTC)
- To answer your questions one by one:
- True, recompressing JPEGs will not remove artifacts, but it will only often (not always) produce larger files; images will get the reup treatment if and only if there is a reduction in filesize. The artifacting problem is a whole different beast that is far removed from what this bot is intended to do; this task will neither resolve nor exacerbate the problem in the slightest.
- Animated GIFs, multi-layered or indexed XCFs, and vector images will be completely ignored. PNGs will also be skipped over, but I might try that with a later bot.
- Most bitmaps can be expressed as vector data given the effort anyway, but I can skip over all images already tagged with {{ShouldBeSVG}}. Alternatively, I can attempt conversion as usual and preserve the tag, which is the current behavior; again, this doesn't affect the problem of the image being rasterized to begin with.
- Thanks for the questions and ideas! Is there anything I've missed or can help with? east.718 at 23:58, June 17, 2008
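The skip rules listed in this reply could be expressed as a small predicate along these lines. It is only a sketch: the `ImageInfo` record and `should_attempt` function are hypothetical names, and how the bot actually detects animation, layer counts, or indexed colour is not stated in the discussion, so those values are taken here as already-known inputs.

```python
from dataclasses import dataclass


@dataclass
class ImageInfo:
    """Hypothetical summary of an image; detection of these fields is out of scope."""
    extension: str            # with leading dot, e.g. ".gif"
    is_animated: bool = False
    layer_count: int = 1
    is_indexed: bool = False


def should_attempt(info: ImageInfo) -> bool:
    """Return True if a PNG conversion should be attempted for this image."""
    ext = info.extension.lower()
    # PNGs are skipped over, and vector images are ignored entirely.
    if ext in {".png", ".svg"}:
        return False
    # Animated GIFs are ignored because PNG does not support animation.
    if ext == ".gif" and info.is_animated:
        return False
    # Multi-layered or indexed XCFs are ignored as well.
    if ext == ".xcf" and (info.layer_count > 1 or info.is_indexed):
        return False
    return True
```

Images tagged {{ShouldBeSVG}} are not filtered out here, matching the behaviour described as current in the reply (conversion is attempted and the tag is preserved).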
- Thanks for taking the time to answer my questions. I think I'm satisfied that you've thought this through sufficiently. Kaldari (talk) 15:23, 18 June 2008 (UTC)
The only thing that pops into my mind is that there are a handful of images (just a handful) in Category:Images which should be in PNG format that require renaming (tagged with {{rename media}}, some with a suggested title, some without). I can think of no better time to rename them than when a bot is re-uploading them anyway. It would certainly add another layer of complexity to this task, but I thought I would throw it out there. - AWeenieMan (talk) 00:47, 18 June 2008 (UTC)
- I was thinking about this, but came to the conclusion that this is also unsuitable for a bot. A while back, I tried surreptitiously running a mass-deletion bot under my main account that would find and remove duplicate images, and the one crippling (and unfixable) flaw was that it wasn't able to choose which filename should be preferred. The same problem pops up here: a bot just isn't smart enough to figure out that moving Descriptive_filename_12.jpg to a8fh3jkg9f3j39f.pdf or HAGGER?????.jpg isn't appropriate. To distill somewhat, the {{rename media}} tag is applied with human judgment, and that's where the inherent failure in the system is. east.718 at 02:32, June 19, 2008
- I would agree with that...we have a separate process for renaming and I think that's appropriate. The one thing that could possibly be taken into consideration here is that the {{rename media}} contains a field for the new filename, including extension. If this bot converts an image with the rename template, the template should be carried to the new file - but possibly the file extension in {{rename media}} should be changed to .png. For example, if this bot converts Image:ASDGGFCHJGV.gif, and the old image had {{rename media|Picture.gif}}, the new image Image:ASDGGFCHJGV.png should have a template that now says {{rename media|Picture.png}}. Hopefully I explained this correctly. Kelly hi! 02:38, 19 June 2008 (UTC)
- Yep, that's a great idea, and one which I've thrown into the code now. east.718 at 02:40, June 19, 2008
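Kelly's suggestion amounts to a small text substitution on the image page. A rough sketch of how it might look, assuming the suggested filename is the first parameter of {{rename media}} and that only GIF and JPEG extensions need retargeting (the function name `retarget_rename_tag` is hypothetical, and the bot's real implementation is not shown in this discussion):

```python
import re

# Matches {{rename media|<name>.<gif|jpg|jpeg>}} with an optional further parameter.
_RENAME_TAG = re.compile(
    r"(\{\{\s*rename media\s*\|\s*)([^|{}]*?)\.(?:gif|jpe?g)(\s*(?:\||\}\}))",
    re.IGNORECASE,
)


def retarget_rename_tag(wikitext: str) -> str:
    """Change the suggested filename's extension in {{rename media}} to .png."""
    return _RENAME_TAG.sub(r"\1\2.png\3", wikitext)
```

For example, `retarget_rename_tag("{{rename media|Picture.gif}}")` yields `{{rename media|Picture.png}}`, while a bare `{{rename media}}` with no suggested title is left unchanged.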
- Approved for trial (20 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. BJTalk 02:53, 19 June 2008 (UTC)
Rather than sic it on random images, I decided to cherry-pick the test sample to cover all possible bases.
- Image:718test1a.jpg was a poorly optimized JPEG, which the bot correctly moved to Image:718test1a.png, replacing {{badJPEG}} with {{PNG version available|718test1a.png}}.
- Image:718test1b.jpg was a well optimized JPEG which remained untouched, save the removal of the {{badJPEG}} tag.
- Image:718test1c.gif was a poorly optimized GIF which was used in User:east718/test. The bot correctly moved it to Image:718test1c.png, copying over all entries in the history and replacing its usage on the test page while tagging the original with {{PNG version available|718test1c.png}}.
- Image:718test1d.gif was an animated GIF with a {{badGIF}} tag that remained completely untouched.
- Image:718test1e.svg was a vector image and also remained untouched.
There was one bug: the wikitext in the edit summary portion of the history in Image:718test1a.png got parsed. I squashed this, as evidenced in Image:718test1c.png. I can haz approval plz? :) east.718 at 04:34, June 19, 2008
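The discussion does not say how the parsing bug was fixed. One common way to stop wikitext in a copied edit summary from being parsed is to wrap it in nowiki tags before writing it to the page; the sketch below assumes that approach, and the function name `escape_summary` is hypothetical.

```python
import re

# A literal closing tag inside the summary would end the wrapper early.
_CLOSING_NOWIKI = re.compile(r"</nowiki\s*>", re.IGNORECASE)


def escape_summary(summary: str) -> str:
    """Wrap an edit summary in nowiki tags so links and templates show as plain text."""
    safe = _CLOSING_NOWIKI.sub("&lt;/nowiki&gt;", summary)
    return "<nowiki>" + safe + "</nowiki>"
```

With this, a summary such as `[[Foo]] {{bar}}` would be written to the history listing as literal text instead of rendering as a link and a template.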
- Edits appear proper, Approved. MBisanz talk 04:39, 19 June 2008 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.