Commons:Bots/Requests/Embedded Data Bot (adminbot)
Embedded Data Bot (talk · contribs) (adminbot)
Operator: Zhuyifei1999
Bot's tasks for which permission is being sought: Because human admins have real lives and can't monitor bot speedy requests every single hour, this bot request for approval is for an extension to the original task: to delete such files on sight according to COM:CSD#F9 if and only if all of the following conditions are satisfied (to reduce false positives to a minimum):
- The file ending is determined precisely with a parser according to the format specifications. i.e. not determined with a remuxer or converter like ffmpeg or jpegtran.
- The MIME detection of the embedded part returns any of "application/x-rar", "application/zip", "application/x-7z-compressed", or other archival formats (if abuse is found), subject to change if the MIME types themselves change (i.e. if "application/x-rar" is renamed to "application/rar" in the future, this list changes as well).
- Has a hit on Special:AbuseFilter/166, via the API query action=query&list=abuselog&aflfilter=166&afltitle=... (If the false negative rate goes too high because of this condition I'll file another BRFA to remove this condition).
- The file has only one entry in its upload history. This is to prevent anyone making the bot delete an arbitrary file just by overwriting with a file subject to deletion.
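The four conditions above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the bot's actual code: it handles PNG alone, it stands in for real MIME detection with a simple magic-number prefix check, and the names png_end_offset, sniff_archive, abuselog_params and eligible_for_deletion are hypothetical. The filter hit count and the upload history length are assumed to have been fetched elsewhere (for example via the abuselog query whose parameters are built below).

```python
import struct
from urllib.parse import urlencode

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Magic-number prefixes for the archive types named in the request.
# Assumption: a prefix check stands in for the bot's real MIME detection.
ARCHIVE_MAGIC = {
    b"Rar!\x1a\x07": "application/x-rar",
    b"PK\x03\x04": "application/zip",
    b"7z\xbc\xaf\x27\x1c": "application/x-7z-compressed",
}


def png_end_offset(data: bytes) -> int:
    """Walk the PNG chunk list and return the offset just past IEND."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        pos += 8 + length + 4
        if chunk_type == b"IEND":
            if pos > len(data):
                raise ValueError("truncated IEND chunk")
            return pos
    raise ValueError("no IEND chunk found")


def sniff_archive(trailing: bytes):
    """Return a MIME type if the trailing bytes start like a known archive."""
    for magic, mime in ARCHIVE_MAGIC.items():
        if trailing.startswith(magic):
            return mime
    return None


def abuselog_params(title: str, filter_id: int = 166) -> str:
    """Build the query string for the abuse log lookup quoted in the request."""
    return urlencode({
        "action": "query",
        "list": "abuselog",
        "aflfilter": filter_id,
        "afltitle": title,
        "format": "json",
    })


def eligible_for_deletion(data: bytes, filter_hits: int, upload_count: int) -> bool:
    """All conditions must hold; any failure leaves the file for a human admin."""
    trailing = data[png_end_offset(data):]
    return (
        sniff_archive(trailing) is not None  # embedded part looks like an archive
        and filter_hits > 0                  # at least one hit on the abuse filter
        and upload_count == 1                # never overwritten
    )
```

Requiring every condition at once trades false negatives for false positives: a file that fails any single check is simply left tagged for manual review.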
Automatic or manually assisted: Automatic
Edit type (e.g. Continuous, daily, one time run): Continuous
Maximum edit rate (e.g. edits per minute): 6 per minute
Bot flag requested: (Y/N): N
Programming language(s): python: pywikibot
Zhuyifei1999 (talk) 09:56, 14 January 2017 (UTC)
Discussion
- FWIW, I've granted @Steinsplitter: access to the bot on tool labs. --Zhuyifei1999 (talk) 09:56, 14 January 2017 (UTC)
- Support Zhuyifei1999 is doing valuable work with this task, and the admin request is reasonable. --Krd 11:34, 14 January 2017 (UTC)
- I don't have objections, but it would be a good idea to hear opinions about the error rate from administrators who deleted such files in the past. --EugeneZelenko (talk) 15:01, 14 January 2017 (UTC)
- Well, the conditions laid out above are much stricter than those that actually add {{Embedded data}} tags, and some false positives are visible in Special:Contributions/Embedded_Data_Bot. Anyways, pinging @Jdx, Didym, Ronhjones, Srittau, and Hedwig in Washington: @Josve05a and Herbythyme: (sorry if I missed anyone) who did some related deletions, @Ninjastrikers: who speaks their language and actively monitors the abuse, and @Revent: whom I talked to about this on IRC. --Zhuyifei1999 (talk) 16:11, 14 January 2017 (UTC)
- Support Yes please. The sooner the better. Noted that there have been 35 embedded files in the last 24h. I know Telenor are bringing in the Fair Use Policy next week, and this might slow down the attacks (as the Telenor users will not be able to download anything over 150MB per day), but it's probably only a matter of time before another loophole is found. I suspect filter 166 could be tweaked to improve its function - all recent files have been the first download of a new user, I think 166 allows 7 days old - could be reduced. Ronhjones (Talk) 16:29, 14 January 2017 (UTC)
- Filter 166 is currently set to upload within 7 days of account creation, with less than 50 edit count prior to upload. This time range is obviously to prevent sleeper accounts. As for adding the condition of first upload (which is currently 50), surely you don't want them to go undetected by the filter after simply uploading a small file, right? I'd suggest a number no smaller than 10, preferably no smaller than 20, if we really need to reduce it. The latest five false positives (well, not-yet-deleted is more accurate) from the filter are Agatyr's File:1665_Girl_with_a_Pearl_Earring.jpg, ProgramaConecta's File:Alisson_Wanderfillk.png, Shameem_Reza's File:সাহেব_বাড়ি।.jpg, Fizwizviz's File:Musee-education-tunis-1.ogg, and B235R's File:Actin_UR5_Robotic_Tool_Path.gif, all within first 5 edit/uploads (well, except the last, but his sixth upload triggered the filter anyways), so I doubt a reduction in edit count requirement will reduce the filter's false positive rate.
- For comparison, the bot uses a condition of the user having an edit count of no more than 200. The reason isn't the false positive rate, but the workload. --Zhuyifei1999 (talk) 16:59, 14 January 2017 (UTC)
- Support for the reasons outlined above. Zhuyifei1999 is also very receptive to suggestion and additions, so I am confident that all potential problems that arise will be addressed. Sebari – aka Srittau (talk) 16:55, 14 January 2017 (UTC)
- Strong support This should have been an RFA instead, since what is requested here is the admin flag and not approval for a new bot task, but either way it doesn't matter too much. Zhuyifei1999 is a very trusted bot operator and an admin, in fact he is the one who revived FlickrreviewR and Panoramio review bot, so he is definitely trusted to run this kind of adminbot. (I know this is not a vote, I just want to provide my support...) --★ Poké95 08:11, 15 January 2017 (UTC)
- Support I support this task, and admin rights for it. It would help to post a link to this request on rfa if it hasn't already been done. --99of9 (talk) 00:45, 17 January 2017 (UTC)
- Strong support as per above comments. Ninja✮Strikers «☎» 04:04, 17 January 2017 (UTC)
- Strong support This is basically, as noted, a request for both Zhuyifei AND Steinsplitter, as bot operators. The bot will, per this request, only delete files that are explicitly positive for containing an embedded archive, with limits that mainly serve to reduce what must be checked. Such files would be deleted as CSD F9 violations anyhow. - Reventtalk 08:24, 17 January 2017 (UTC)
- Support Sure! Thanks for getting rid of this pirated material. Regards, Yann (talk) 09:45, 17 January 2017 (UTC)
- Support Storkk (talk) 10:17, 17 January 2017 (UTC)
- Support Good work! - Jcb (talk) 10:57, 17 January 2017 (UTC)
- Support. -- Geagea (talk) 14:22, 17 January 2017 (UTC)
- Support - I generally don't like admin bots however in this case it is a great idea. Go careful but many thanks for the help. --Herby talk thyme 15:10, 17 January 2017 (UTC)
- Support A very reasonable adminbot use by two trusted admins. Pi.1415926535 (talk) 17:04, 17 January 2017 (UTC)
- Comment @Zhuyifei1999: Would it be better to have all versions of the file be by the same user, rather than having to have one version? That would prevent uploaders trying to evade the bot by uploading multiple versions of the same image. Pi.1415926535 (talk) 17:05, 18 January 2017 (UTC)
- I was thinking about this earlier. The downside would be that I won't want anyone to abuse this bot to delete their own images either; it would add the complexity of checking if the latest version and the version before that < 7d, and the file is unused (COM:CSD#G7). That's do-able, but complex. --Zhuyifei1999 (talk) 17:16, 18 January 2017 (UTC)
- Support matanya • talk 18:50, 17 January 2017 (UTC)
- Support Very sensible, trusted operator. ~ Rob13Talk 23:23, 17 January 2017 (UTC)
- Comment I suspect AbuseFilter 166 (Large files by newbies) will be obsolete once Telenor's 150 MB/day data limit kicks/kicked in as pirates will have to split archives. Dispenser (talk) 05:58, 18 January 2017 (UTC)
- Support --sasha (krassotkin) 12:23, 18 January 2017 (UTC)
- Support Natuur12 (talk) 13:08, 18 January 2017 (UTC)
- Strong support Definitely. Jianhui67 talk★contribs 14:00, 18 January 2017 (UTC)
- Support --DCB (talk) 19:22, 18 January 2017 (UTC)
- Support Thought I already commented earlier. Sure thing, grant admin flag. --Hedwig in Washington (mail?) 03:25, 19 January 2017 (UTC)
- Approved and made the bot an admin. I'm taking the liberty to close this discussion before the 7-day period is up—which, incidentally, has never actually been an obligatory requirement for bot requests—as there is an overwhelming community approval for the bot to be granted administrator privileges and because it is helpful for the project to have the bot functioning as soon as possible. Thanks everyone for participating in this vote and discussion. odder (talk) 15:00, 19 January 2017 (UTC)