Wikidata talk:Bots

When do you need to create a bot account?

This page currently doesn't mention any criteria for when a mass-editing operation would need a bot account. This is relevant because QuickStatements enables regular users to perform mass-editing operations. The only information I could find was on the QuickStatements help page, which currently says "Very large runs or potentially-controversial runs should go through the approval process described in Wikidata:Bots.", but that is not a well-defined criterion. Silver hr (talk) 20:52, 16 July 2022 (UTC)

Unattributed proxy edits

Picking up on this thread, I propose that we add:

Bovlb (talk) 18:03, 30 August 2022 (UTC)

I think we should rather phase out the use of proxy bot accounts completely. As far as I am aware, OAuth allows tools to make edits from the Wikimedia account of the tool user anyway.
Btw. we do have a related situation at User talk:Reinheitsgebot#Who is triggering edits of this account?. —MisterSynergy (talk) 18:32, 30 August 2022 (UTC)
Eliminating proxy edits entirely would also meet my needs. Bovlb (talk) 20:47, 30 August 2022 (UTC)
Using proxy bots used to have three advantages:
  • Before the introduction of OAuth, it was the only possibility. This is no longer true. (Extra grant needs to be requested through OAuth for the tool to be able to edit, but users should be comfortable with granting it if they want to use an editing tool.)
  • Their edits can be marked as bot edits. This is what you want to prohibit.
  • They can use higher API limits than ordinary users. This is what would remain, although I’m not sure if bots can actually take advantage of this, since they need to respect replication lag. It’s also a question if it’s an advantage or disadvantage that any logged-in user may quickly edit many pages.
  • Bots have two additional rights that autoconfirmed users don’t have: suppressredirect (not creating redirects from source pages when moving pages) and nominornewtalk (minor edits to discussion pages do not trigger the new messages prompt). However, these apply only to wikitext pages, not to entities, so they’re mostly uninteresting for Wikidata bots, especially proxy bots.
Considering that the only advantage that would remain is the higher API limits, and even that is of questionable value, I’m also for entirely banning proxy bots. However, I think such an important policy change should be discussed at a more visible place, e.g. on Wikidata:Project chat, so that all interested people can take part. —Tacsipacsi (talk) 08:08, 31 August 2022 (UTC)
  • On "API limits": bot accounts have the right "apihighlimits", which allows them to read data from the API more efficiently in some scenarios. However, they do not have "noratelimit" any longer: the maximum edit rate for both bots and regular users is 90 edits per minute. Bot accounts cannot edit faster than regular ones. —MisterSynergy (talk) 08:58, 31 August 2022 (UTC)
    Oh right, so bots can only query a bit more quickly. Thanks to continuation, this is probably a negligible difference, and even if/when not, nothing stops the tool from querying through a bot account; we want to ban only proxied edits. Then there’s really no reason to use proxy bots. —Tacsipacsi (talk) 18:42, 1 September 2022 (UTC)
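
The point about continuation can be sketched as follows: a client keeps merging the "continue" object from each response into its next request until no such object is returned, so a lower per-request limit only means more round trips, not fewer results. This follows the continuation convention of the MediaWiki Action API, but fetch is a stand-in for whatever function performs the HTTP request, and fake_fetch, its "offset" parameter and its page size are invented purely for the demonstration.

```python
def query_all(fetch, params):
    """Yield every response batch, following "continue" until it is absent."""
    request = dict(params)
    while True:
        response = fetch(request)
        yield response
        if "continue" not in response:
            break
        # Merge the continuation values into the original parameters.
        request = {**params, **response["continue"]}


def fake_fetch(request, _data=tuple(range(7)), _limit=3):
    """Pretend API (hypothetical): returns _limit items per call."""
    start = int(request.get("offset", 0))
    response = {"items": list(_data[start:start + _limit])}
    if start + _limit < len(_data):
        # More data remains, so tell the client where to resume.
        response["continue"] = {"offset": str(start + _limit)}
    return response


items = [
    item
    for response in query_all(fake_fetch, {"action": "query"})
    for item in response["items"]
]
print(items)  # [0, 1, 2, 3, 4, 5, 6]
```

With a larger _limit the same items come back in fewer calls, which is the only difference a higher read limit makes in this model.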

Formally describing bot tasks

We have quite a lot of bots. I recently created a bot that builds a better overview of our various bots by scraping the bot User:* profiles for {{Bot}}. One bot can perform many different tasks. I would like to make the individual bot tasks discoverable by the properties involved, e.g. show me all bots that add official website (P856) as a main statement, or all bots that use point in time (P585) as a qualifier, or all bots that edit lexemes.

Currently bot tasks are only described in free text ... so this would require us to introduce a way to formally describe the tasks of a bot. I therefore suggest the introduction of a new tasks parameter for {{Bot}} which would accept a JSON array where each contained object has the following properties:

  • description: English description of the task in plaintext (no wiki markup). Mentions of properties are automatically linkified.
  • space: The space in which the edit is performed. Acceptable values are the entity types (Item, Property, Lexeme, Sense, Form) or Wikitext, which denotes that the edit changes regular wikitext pages.

Additionally a task may specify one of the following:

  • tasks that add or remove claims can specify which properties they use with "properties": { "mainStatement": [...], "qualifier": [...], "reference": [...] }
  • "fingerprint": true specifies that the task edits labels, descriptions and/or aliases
  • "sitelinks": true specifies that the task adds or removes sitelinks
  • "sitelink_badges": true specifies that the task adds or removes sitelink badges
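
As a rough illustration of the proposed format, the following Python sketch checks a tasks array against the fields listed above. The names validate_tasks and is_property_id and the error messages are assumptions made for this example. The proposal does not define a validator, does not say how properties are written inside the arrays (IDs such as "P856" are assumed here), and leaves open whether unknown keys should be rejected (they are ignored here).

```python
import json

# Allowed values for "space", taken from the proposal.
SPACES = {"Item", "Property", "Lexeme", "Sense", "Form", "Wikitext"}
# Keys allowed inside "properties".
ROLES = {"mainStatement", "qualifier", "reference"}
# Optional boolean flags.
FLAGS = {"fingerprint", "sitelinks", "sitelink_badges"}


def is_property_id(value):
    """Assumption: properties are referenced by ID, e.g. "P856"."""
    return isinstance(value, str) and value[:1] == "P" and value[1:].isdigit()


def validate_tasks(raw):
    """Return a list of problems found in a JSON-encoded tasks array."""
    try:
        tasks = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(tasks, list):
        return ["top-level value must be an array"]

    problems = []
    for i, task in enumerate(tasks):
        if not isinstance(task, dict):
            problems.append(f"task {i}: must be an object")
            continue
        if not isinstance(task.get("description"), str):
            problems.append(f"task {i}: description must be a string")
        space = task.get("space")
        if not isinstance(space, str) or space not in SPACES:
            problems.append(f"task {i}: unknown space {space!r}")
        props = task.get("properties", {})
        if not isinstance(props, dict):
            problems.append(f"task {i}: properties must be an object")
            props = {}
        for role, ids in props.items():
            if role not in ROLES:
                problems.append(f"task {i}: unknown role {role!r}")
            elif not (isinstance(ids, list) and all(map(is_property_id, ids))):
                problems.append(f"task {i}: {role} must list property IDs")
        for flag in sorted(FLAGS & task.keys()):
            if not isinstance(task[flag], bool):
                problems.append(f"task {i}: {flag} must be true or false")
    return problems
```

An empty list means no problems were found; anything else is a human-readable list that could be shown to the bot operator.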

The JSON would reside directly in the wikitext, making it easy to scrape. For humans visiting the page, the JSON would be rendered via Module:BotTasks, as shown in the following examples.
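
To illustrate why keeping the JSON in the wikitext makes scraping straightforward, here is a minimal Python sketch that pulls a tasks array out of a page's wikitext. It assumes the parameter is written as |tasks= followed directly by the JSON array. The helper name extract_tasks and the sample wikitext (including its operator parameter value) are hypothetical, and a real scraper would probably use a proper wikitext parser instead of a regular expression.

```python
import json
import re

# Matches the start of a "tasks" template parameter, e.g. "| tasks = ".
TASKS_PARAM = re.compile(r"\|\s*tasks\s*=\s*")


def extract_tasks(wikitext):
    """Return the decoded tasks array, or None if there is no tasks parameter.

    Raises json.JSONDecodeError if the parameter value is not valid JSON.
    """
    match = TASKS_PARAM.search(wikitext)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (here, the rest of the template).
    tasks, _end = json.JSONDecoder().raw_decode(wikitext, match.end())
    return tasks


sample = """{{Bot|operator=ExampleUser
| tasks = [
  {"description": "Adds descriptions for various languages",
   "space": "Item", "fingerprint": true}
]
}}"""

print(extract_tasks(sample))
```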

This is just my first idea of how to formally describe bot tasks ... feedback is very much welcome!

--Push-f (talk) 08:42, 8 December 2022 (UTC)

Examples

|       |             | Properties involved in the edit |           |           |
| Space | Description | Main statement | Qualifier | Reference |
|-------|-------------|----------------|-----------|-----------|
| Item | Add software version identifier (P348) to items that have source code repository URL (P1324) set to a GitHub.com repository | software version identifier (P348) | publication date (P577) | reference URL (P854), retrieved (P813), title (P1476), publication date (P577) |
| Item | Add official website (P856) to items that have source code repository URL (P1324) set to a GitHub.com repository | official website (P856) | | reference URL (P854), retrieved (P813) |

|       |             | Properties involved in the edit |           |           |
| Space | Description | Main statement | Qualifier | Reference |
|-------|-------------|----------------|-----------|-----------|
| Item | Add pronunciation audio (P443) claims for records made on lingualibre.org | pronunciation audio (P443) | | reference URL (P854) |
| Form | Add pronunciation audio (P443) claims for records made on lingualibre.org | pronunciation audio (P443) | language of work or name (P407) | reference URL (P854) |

|       |             | Properties involved in the edit |           |           |
| Space | Description | Main statement | Qualifier | Reference |
|-------|-------------|----------------|-----------|-----------|
| Item | Adds descriptions for various languages | This task edits labels, descriptions and/or aliases. | | |
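
The kind of lookup that motivates this proposal, such as "all bots that add official website (P856) as a main statement", could then be a simple filter over the collected task descriptions. The sketch below assumes the tasks have already been gathered into a dict keyed by bot name and that properties are referenced by ID. The bot names ExampleBotA and ExampleBotB and the function bots_using are hypothetical.

```python
def bots_using(tasks_by_bot, prop, role):
    """Return sorted names of bots with a task that uses prop in the given role."""
    return sorted(
        bot
        for bot, tasks in tasks_by_bot.items()
        if any(prop in task.get("properties", {}).get(role, []) for task in tasks)
    )


# Hypothetical data modelled on the example rows of this section.
tasks_by_bot = {
    "ExampleBotA": [
        {
            "description": "Add official website (P856)",
            "space": "Item",
            "properties": {"mainStatement": ["P856"], "reference": ["P854", "P813"]},
        },
    ],
    "ExampleBotB": [
        {
            "description": "Add pronunciation audio (P443)",
            "space": "Form",
            "properties": {
                "mainStatement": ["P443"],
                "qualifier": ["P407"],
                "reference": ["P854"],
            },
        },
    ],
}

print(bots_using(tasks_by_bot, "P856", "mainStatement"))  # ['ExampleBotA']
print(bots_using(tasks_by_bot, "P854", "reference"))  # ['ExampleBotA', 'ExampleBotB']
```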

Discussion

Somehow this feels too static in my opinion:

  • My own bots currently have more than 10 tasks; I am also co-maintaining Deltabot and PLbot meanwhile, with more than 50 different scripts
  • Some tasks involve non-content namespaces
  • Some tasks involve actions such as "patrol", or "protect", "delete", etc. (admin-bot); some interact with sitelinks and badges, or terms in the widest sense; some may use "undo" or "rollback"
  • Some tasks may decide what to do on-the-fly

It would be quite an ask to provide a definite list of things the bots edit during operation. —MisterSynergy (talk) 09:24, 8 December 2022 (UTC)

I guess by non-content namespaces you mean regular wiki pages? I already accounted for those with "space": "Wikitext".
Right, I think it's okay if we leave out admin actions such as "patrol", "protect" and "delete" for now. Most bots aren't admin bots anyway.
I just added three other options, "fingerprint", "sitelinks" and "sitelink_badges" ... note that I am not proposing to model these in detail (e.g. which bot edits which labels/descriptions/aliases in which languages, or which sitelinks are edited) ... I think it's good enough to be able to distinguish a bot that only edits properties from a bot that only edits something in the fingerprint or something about sitelinks.
I don't know what you mean by "terms in the widest sense".
So yes, I don't think this scheme has to cover everything; I think it's already valuable if it can describe most tasks of the average bot.
--Push-f (talk) 16:28, 8 December 2022 (UTC)

Is a bot flag required for a bot that is expected to make very few edits (if any)?

Please see phab:T370842 and wikipedia:Wikipedia:Administrators'_noticeboard#Bot_to_inform_temp_users_of_expiry for context. On Wikidata, it appears that this feature is almost never used, and my question is whether I still need to go through the bot approval process for this. Leaderboard (talk) 07:12, 17 August 2024 (UTC)

I thought we automatically accept global bots. Ymblanter (talk) 18:42, 18 August 2024 (UTC)
@Ymblanter: meta:Global bots is a specific flag which requires a two-week global discussion period, which this bot does not have. (also: global bots are disabled on this wiki anyway) Leaderboard (talk) 06:11, 19 August 2024 (UTC)
Then I would say it would be good to go through a request, mostly to see whether the community thinks the task is worthwhile to perform. Ymblanter (talk) 06:48, 19 August 2024 (UTC)
Deploying a bot on Wikidata to notify users of rights which will expire sounds like a waste of effort to me. Multichill (talk) 20:13, 20 August 2024 (UTC)
Hi @Multichill: you are right in that this is pretty much never used on Wikidata (I wasn't able to find one in limited testing), but I intend to run the bot globally on as many wikis as possible and hence this process (because nothing stops this from being used in the future). Ideally this is something that shouldn't require any kind of approval at all, but the rules don't work like that. Leaderboard (talk) 18:26, 21 August 2024 (UTC)

Are bots subject to edit rate limits?

And if so, is there an approval process for getting the noratelimit right, short of becoming a sysop? 2600:1003:B13C:72FD:9AC6:700D:1BF1:6E08 21:47, 14 November 2024 (UTC)

Yes, accounts with the bot flag are rate-limited at 90 edits per minute, just like regular user accounts, and there is no approval process for getting the noratelimit right.
Admins have this right only because certain admin functions might otherwise not work properly. —MisterSynergy (talk) 23:50, 14 November 2024 (UTC)
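
For bot operators, a practical consequence is that a client may as well pace itself to stay within the limit. The following Python sketch spaces out calls evenly so that no more than 90 start per minute. The class name EditThrottle is hypothetical, and the clock and sleep functions are injectable only so that the example can be tested without actually waiting. It is a client-side courtesy measure under these assumptions, and it does not react to server-side signals such as replication lag.

```python
import time


class EditThrottle:
    """Space out calls so that at most max_per_minute start per minute."""

    def __init__(self, max_per_minute=90, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute  # minimum gap between edits
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next edit may start

    def wait(self):
        """Block until the next edit is allowed, then reserve that slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

A bot would call wait() on a shared EditThrottle instance immediately before submitting each edit; the first call returns at once and later calls sleep only as long as needed.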