Wikidata:Project chat/Archive/2022/02

Editing WikiProject

Hello. I've initiated a WikiProject page as an editor and hope to engage others on the project. My question is this: is it alright for me to note my own contributions to the project (including public domain materials) myself on the WikiProject page?  – The preceding unsigned comment was added by Lyneedit (talk • contribs).

@Lyneedit: This appears to be a reference to Wikidata:WikiProject Global Discourse. Normally a WikiProject will be started because there is already a group of established editors with the same interest. It's somewhat unusual for a WikiProject to be created by a new user in their first edit. It's also not typical to centre a WikiProject around a patented process. The project page does not give any indication of how it is intended to contribute to Wikidata's goals. Instead, it gives the appearance that you are not here to support the project, but rather to push your own external concerns. A better approach for you would be to first find your feet as a Wikidata editor, and then open a discussion here about whether there would be any interest in such a project. Cheers, Bovlb (talk) 21:04, 31 January 2022 (UTC)
@Bovlb: Thank you for the response. I did not intend to leave my question unsigned. Judging from the preview, my handle and timestamp will appear after this comment. Global discourse is a goal being pursued in a number of disciplines by a multitude of individuals, including myself. Productive discourse involves avoiding assumptions, or snap judgements. I'll continue to strive to contribute to both human and machine learning on the subject, in addition to becoming educated on the nuances of the expanding Wiki universe. I'll take to heart your suggestion on finding my feet as a Wikidata editor and opening a discussion here. Best, Lyneedit (talk) 00:01, 1 February 2022 (UTC)

Porn Ids being added

I know a lot of Wikimedians don't care for certain off-wiki forums, but there is a very real problem identified in the latest blog from Wikipediocracy [1]. If you don't want to look at the evil website I will summarize the main point: Wikidata allows users to add identifiers that lead to porn repositories, and they can be added to the pages of living persons with no evidence. The blog points out that one porn site has dozens of supposed pornographic films of Marilyn Monroe, when she never actually appeared in any porn films. I was easily able to find other examples of these identifiers being added to people who are not pornographic actors. I question why Wikidata would link to sites like XHamster, YouPorn, etc. at all. These sites are not reliable sources of biographic information; they quite often have fake videos with look-alikes or impersonators. Is fake celebrity porn really an important data point that Wikidata should be hosting? If so, is there some way to better control when and where it is added? An edit filter or something of that nature? Beeblebrox (talk) 23:46, 11 January 2022 (UTC)

These domains should all be blocklisted immediately. To the extent that any identifiers like this should be added, it should be by autoconfirmed or even admin users. —Justin (koavf)TCM 02:17, 12 January 2022 (UTC)
  • These domains are already blocklisted, but since only the identifier part is added, the blocklist does not apply. The full URLs are only generated for convenience in the web UI, they are not found in the database.
  • Edit restrictions to certain user groups would likely be implemented with edit filters if deemed appropriate by the community.
  • However, for all 23 identifiers which could be problematic, there are currently only 5 unpatrolled edits from within the past 30 days by non-autoconfirmed users. More patrolling would help, as so often, but it's really not a huge task in this case.
MisterSynergy (talk) 09:44, 12 January 2022 (UTC)
CC @Matlin (1) and @Trade (2, 3, 4), who apparently added some of the identifiers in question. Maybe we shouldn't have them on mix'n'match. Bovlb (talk) 17:35, 12 January 2022 (UTC)
Due to the number of identifiers many of these people have in Mix'n'Match it's unfortunately easy to miss the name of the property. The best way to solve it would be to hide pornographic catalogs by default (there is a dedicated toggle group that can be used to include them).
@MisterSynergy:, could you make a filter that stops the identifiers from being added to items without the correct occupations so this doesn't happen again?
I do regret voting for the Xhamster identifier as it doesn't really contain anything of worth. --Trade (talk) 22:36, 12 January 2022 (UTC)
Not a filter expert here, so I’d rather not try this by myself. However, I don’t think we can make this dependent on occupation claims or pretty much any other data in the items. This can only be verified using constraints *after* a claim has been added. We also need to consider that filters are relatively expensive performance-wise, since they are being executed at every single edit.
Regardless of the situation, I don’t think that these identifiers will go away any time soon, but there is definitely need for improvement. Users interested in these properties should volunteer to patrol changes by not-autoconfirmed editors; the properties need a clear set of relatively strict constraints; constraint violations need to be removed quickly. —MisterSynergy (talk) 22:51, 12 January 2022 (UTC)
There are definitely gonna be some on-edge cases. It's not always black/white @MisterSynergy:--Trade (talk) 00:12, 13 January 2022 (UTC)
Hi @Trade:. Would you agree that the xhamster-ID property should be deleted? Could you also explain why you added a YouPorn ID to Emily Ratajkowski (Q5372335) (here)? If you follow the link you added, it appears clear that the pornographic content you wanted to link to was either deleted or never existed (?) SashiRolls (talk) 19:28, 21 January 2022 (UTC)
I was going through the catalogue and did not realize they also had people outside the adult industry back then. I do agree the xHamster ID has little worth. @SashiRolls:--Trade (talk) 21:32, 21 January 2022 (UTC)
In my case, it is only a matter of manually syncing MNM matches with Wikidata. I don't want to blame someone else, but please take a look at these matches: https://mix-n-match.toolforge.org/#/entry/96446458, https://mix-n-match.toolforge.org/#/entry/59522573, https://mix-n-match.toolforge.org/#/entry/52971684. If they are removed only from Wikidata and not from MNM, they will be recreated every time someone syncs MNM matches with Wikidata.
The community, all of us, should make frequent use of MNM and of the add-on scripts that help to check MNM matches. There are several of them.
Indeed, the author of this blog post seems not to understand what properties are. They aren't always a source of biographical information. They aren't search queries either. Of course, not every porn site has its own ID for actors. And last, but not least, questions about ethics (is porn good or not) and moral panic are out of the project's scope. Matlin (talk) 11:01, 13 January 2022 (UTC)
While you are correct, that is because Wikidata doesn't track "what is true about X", but "what sources A, B, C (etc.) say about X". The question is - do we care what these particular sources have to say in this case? Fundamentally, we do still care about what is true, and not all sources are equally reliable. If a site is unreliable, as quite a few of these seem to be, it's not "moral panic" to say we shouldn't use it, especially when getting it wrong about a sensitive topic is considerably more misleading to the end-user than it might be in other contexts. Theknightwho (talk) 16:14, 13 January 2022 (UTC)
Requiring the item to be a porn actor makes no sense, as the "instance of" statement may itself be unsourced. "Marilyn Monroe, when she never actually appeared in any porn films" - (1) This supposes we have a precise and objective definition of what is porn and what is not porn, but it seems that is not the case. (2) Porn sites may cover a very broad range of topics, much more than what we usually consider "porn" - such as something tagged as "sexy", as Monroe is.--GZWDer (talk) 17:47, 13 January 2022 (UTC)
While I realize what qualifies as pornography and what does not varies widely depending who you ask, the identifiers explicitly say that the person so tagged is a porn performer who appeared in a video hosted by a specific website, for example xHamster performer ID. It is literally impossible that Monroe appeared in a production made for XHamster, yet by adding this identifier, Wikidata is explicitly saying she did. It's just wrong. Wrong as in ethically wrong, but also as in obviously incorrect. Beeblebrox (talk) 20:14, 13 January 2022 (UTC)
I'm really hoping this thread doesn't just get lost. Here's another example: this page claims Jennifer Aniston is an XHamster actress. And you can click on the link and be brought to a page (VERY NSFW) entitled Jennifer Aniston Nude Porn Videos (hilariously, an edit filter disallows adding a link here, when her Wikidata entry links right to the exact same page). The first entry at the moment is a lovely item entitled "Jenifer Anderson-ULTIMATE FAP CUMPILATION" which appears to be stills and short clips from television and movies cobbled together for those who would care to pleasure themselves to that. If that's not your thing, just below it is a video of a woman in a French maid outfit who vaguely resembles Aniston having all types of sex with a young man that I don't remember being on Friends at all... These links are lies, nothing on the page is an actual pornographic movie with Aniston in it, which is what the link explicitly says it will link to. These are living people and it's shameful that Wikidata would willingly host such falsehoods about them. Beeblebrox (talk) 04:36, 15 January 2022 (UTC)
Strong agree. It's embarrassing to the project, not just because it obviously is, but also because the more people care about a sensitive topic, the more it matters when Wikidata gets information about that topic wrong. In any event, the number of qualifiers and technicalities required to make any of these statements even close to factual just means they'd fall well short of the standards required in Wikidata:Living people. Get rid immediately. Theknightwho (talk) 01:11, 16 January 2022 (UTC)
I have removed it for you, @Beeblebrox:, because it was the right thing to do. SashiRolls (talk) 19:36, 21 January 2022 (UTC)
Actually @GZWDer:, I believe that what you say is incorrect. Insofar as these various porn IDs are potentially privacy violations (revenge porn, misidentification, underage models, etc.) these properties are identified as belonging to the living people protection class (P8274) and therefore must have a reference to a reliable third-party source. You might find this article in Vice about xHamster's review policy helpful in considering the larger issues involved.
Also, parenthetically, I had a look at Betty White (Q373895) and noticed that Ms. White does not have a CBS ID, ABC ID, NBC ID, leading directly to the commercial catalogue of these content providers. She does have a Disney A to Z ID (P6181), but if you click on it I think you'll find it's very different from an xHamster pornstar ID (P8720), which leads directly to content...
Last night, I started a discussion concerning possible modifications to the Living People policy at Wikidata with the idea of discouraging these direct links to (potentially misleading) commercial pornographic content. Anyone reading should feel free to express their opinion directly on the Living People policy page. (section: 8474 direct links to commercial pornographic content)
SashiRolls (talk) 08:46, 15 January 2022 (UTC)
It looks like an RfC on this subject might be useful given the divergence of points of view on the utility of linking to porn repositories at all. SashiRolls (talk) 07:29, 20 January 2022 (UTC)
If some of the sites are known to host mislabeled content en masse, then should Wikidata still include identifiers to those sites, out of the possibility of connecting incorrect information into the database? C933103 (talk) 14:07, 25 January 2022 (UTC)
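The sourcing requirement raised in this thread can be checked with a query. The following is a minimal sketch, assuming the xHamster pornstar ID property (P8720) still exists and that the query runs on the Wikidata Query Service, where the p:, ps: and prov: prefixes are predefined. It lists statements for that property that carry no reference at all; the LIMIT is an arbitrary choice for illustration.

```sparql
# Statements of P8720 (property named in the thread) that have no reference.
# Assumption: run on query.wikidata.org, where these prefixes are predefined.
SELECT ?item ?itemLabel ?value WHERE {
  ?item p:P8720 ?statement .                 # full statement node
  ?statement ps:P8720 ?value .               # the identifier value itself
  FILTER NOT EXISTS { ?statement prov:wasDerivedFrom ?reference . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
LIMIT 100                                    # arbitrary cap for illustration
```

Swapping P8720 for another of the identifiers under discussion would give the same check for that property. A list like this shows only that a reference is missing, not whether an existing reference is reliable.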

Wikimedians

At "Requests for deletions" I continue to see nomination of entries for Wikimedians. My own entry was deleted last year without a nomination, I only noticed because of the broken links to Commons. Can we firm up the rules of whether Wikimedians get entries or get deleted? We currently have an ad hoc policy that leads to some deleted and some retained based on no objective policy. I see a need for entries for people that contribute images. 70 years after their death their images will be in the public domain and be freed from the restrictions of their current licenses. I don't see us being overwhelmed with contributors doxing themselves. I am not against having a minimum number of contributions to various projects. We need objective rules to avoid ad hoc deletion. --RAN (talk) 18:23, 25 January 2022 (UTC)

Which part of WD:N does your 'I see a need for entries for people that contribute images' satisfy? By & large WD:N has guided the curation of the other 96 million items; unsure why it cannot do same for Wikimedians. --Tagishsimon (talk) 20:45, 25 January 2022 (UTC)
Then maybe it is time to update Wikidata:Notability to specifically address Wikimedians and the 70 year pma issue. --RAN (talk) 21:45, 25 January 2022 (UTC)
There is an objective rule for most of these cases: “can be described using serious and publicly available references” (WD:N #2). User-generated content and self-provided information can’t be enough to establish notability. Emu (talk) 13:44, 26 January 2022 (UTC)
I see no reason to change WD:N to explicitly include Wikimedians. If they don't meet any of the existing criteria, they shouldn't have items. I don't see what contributed images have to do with that. --Dipsacus fullonum (talk) 19:20, 26 January 2022 (UTC)
As someone who often suggests deleting data objects of Wikimedians, I have to agree with the two previous speakers Emu and Dipsacus fullonum. If the Wikimedians meet the normal notability criteria, then they are welcome to stay. However, if they do not meet these requirements, then they must be deleted like any other data object. I see no basis for giving Wikimedians a special right here. Doing this only shows that Wikidata or the Wikimedia Foundation wants to promote their own "stuff" and misuses Wikidata for promotional purposes just like other companies and people. --Gymnicus (talk) 09:02, 28 January 2022 (UTC)
@Gymnicus The reason for concern is that QIDs of those Wikimedians are sometimes used as parameters in author templates on Wikimedia Commons. Hence, deletion of such an item causes hundreds or thousands of file metadata on Commons to give an error and connection to the original author is lost. Vojtěch Dostál (talk) 09:25, 31 January 2022 (UTC)
Also ping @Dipsacus fullonum who seemed to have questioned "what contributed images have to do with that". (I try to explain that above.) Vojtěch Dostál (talk) 09:26, 31 January 2022 (UTC)
You seem to be arguing that such items fulfill a structural need, because they are required as parameters of templates on Commons. Can you give examples of such usage? Would this argument extend to any Wikimedian who has ever uploaded a file to Commons or is there some natural limit? Bovlb (talk) 22:33, 1 February 2022 (UTC)

P180 vandalism

please see depicts (P180), and restore the last ok revision (2022-01-24‎ by FlyingAce). Also check the rest of Special:Contributions/Solman9. --2003:E5:3712:1C00:716F:B9C1:CF8F:530A 09:48, 2 February 2022 (UTC)

As this is not the first time they have made inappropriate changes to properties, I have blocked them from the property namespace for 3 months — Martin (MSGJ · talk) 16:00, 2 February 2022 (UTC)

Is there a property for immigrants?

subject Moïse Kabagambe (Q110795628) is from the Democratic Republic of Congo, but immigrated to Brazil in 2011. Which property can I use to add this information? Tet (talk) 19:27, 2 February 2022 (UTC)

  • You can use "residence=Democratic Republic of Congo" and "residence=Brazil" with start and stop dates. If the person gains citizenship, you add in the two countries with start and stop dates if you know when they gained citizenship. If no dates are known, I just add the birth country first, "country of citizenship=Democratic Republic of Congo" with "series ordinal=1" then "country of citizenship=Brazil" with "series ordinal=2". I hope this helps. --RAN (talk) 21:32, 2 February 2022 (UTC)
RAN Thanks! Tet (talk) 01:00, 3 February 2022 (UTC)
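The modelling suggested above can be read back with a query once the statements exist. This is a minimal sketch, assuming residence is P551, start time is P580 and end time is P582, and that Q110795628 carries such statements (it may not). It is meant for the Wikidata Query Service, where the prefixes are predefined.

```sparql
# Residences of one person with optional start/end qualifiers.
# Assumptions: P551 = residence, P580 = start time, P582 = end time.
SELECT ?place ?placeLabel ?start ?end WHERE {
  wd:Q110795628 p:P551 ?statement .          # each residence statement
  ?statement ps:P551 ?place .                # the place of residence
  OPTIONAL { ?statement pq:P580 ?start . }   # start date, if recorded
  OPTIONAL { ?statement pq:P582 ?end . }     # end date, if recorded
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
ORDER BY ?start
```

The qualifiers are wrapped in OPTIONAL so that a residence without known dates still appears in the results.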

Properties showing Q numbers instead of labels

If you look at Sundance Festival Awards (Q23688051), it has several values for the has part(s) (P527) property. All of the included values have English-language labels (my default language), but some of them show up just as Q numbers and some have labels showing in the display.

Shows up as a labeled item in the "has part" section: Sundance Film Festival World Cinema Dramatic Grand Jury Prize (Q969394)
Shows up as a Q# with no label showing: Sundance Audience Award: U.S. Documentary (Q2366108)

Why does it do this? Why does the label not show? I thought there might be a delay since I just added the English language labels recently, but some showed right away and 24 hours later others aren't showing at all. Is there another thing that needs to happen for them to show their labels when referenced as part of a triplet?

Screenshot below:

Screenshot showing some items in triplet with labels showing and others with just Q numbers displayed

Kenirwin (talk) 23:11, 2 February 2022 (UTC)

Solved by purging the page. (Can be done by adding ?action=purge at the end of the item URL - i.e. https://www.wikidata.org/wiki/Q23688051?action=purge ) --Tagishsimon (talk) 23:27, 2 February 2022 (UTC)

Property:P140

Property:P140 was "religion" and is now "religion or world view" so that philosophies like "atheism" can be included. At English Wikipedia this was so divisive, adding atheism to the religion category, that it ended with purging religion from infoboxes, so I understand the change here. We had the property "movement" previously for non religious world views. We should probably decide now which belongs to which before they get scrambled. Are we restricting world view to just a few things like "atheism" or going to move things like "vegetarianism" and "veganism" to "religion or world view" from "movement"? I am not for, or against, the change, just looking for clarity. --RAN (talk) 21:17, 2 February 2022 (UTC)

@Richard Arthur Norton (1958- ): I wish I'd known about movement (P135) while we were discussing this on the various property proposals and talk pages and in Wikidata:Properties for deletion for all these years. However, the English label "movement" doesn't capture the general concept of a non-religious worldview and I would argue it should mainly be applicable to less all-encompassing schools of thought. Veganism or vegetarianism are not views of "the world" but of what a person should eat or drink. Religions are generally at a more comprehensive level, addressing questions like what is the purpose of life, what is the foundation of morality, etc. So at least to me the distinction here seems clear. Note that political affiliation was also discussed which could fit under movement (P135) I suppose but definitely should not be religion or worldview (P140). ArthurPSmith (talk) 18:33, 3 February 2022 (UTC)

SPARQL tricks

I enjoy discovering new things about my favorite querying language, especially things that are either not covered in available documentation or just not very obvious. Just today I discovered that I could use STR(prefix:) e.g. STR(wd:) to get the URL prefix string without having to type out all of it.

Do you have any tricks to share? —Infrastruktur (talk) 08:22, 3 February 2022 (UTC)
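One use of the STR(wd:) trick mentioned above is stripping the prefix to get bare Q-ids. This is a minimal sketch for the Wikidata Query Service; the two items are simply ones mentioned elsewhere on this page, chosen for illustration.

```sparql
# Use STR(wd:) instead of typing out "http://www.wikidata.org/entity/".
SELECT ?item ?qid WHERE {
  VALUES ?item { wd:Q23688051 wd:Q969394 }        # illustrative items
  BIND(STRAFTER(STR(?item), STR(wd:)) AS ?qid)    # keep only the part after the prefix
}
```

STRAFTER returns the part of the first string that follows the second, so ?qid holds the plain identifier (for example "Q969394").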

Query Syntax Question

Hello! My apologies if I'm posting this in the wrong venue. I'm learning how SPARQL works and was curious why the "female scientists" query here always times out for me? I've tried making small tweaks to the query, and it seems the only way I can get it to work is if I remove the asterisk on line 6 and limit the language to AUTO and English. I've tried looking at language documentation but am still fuzzy on the role of the asterisk/why it would cause the time-out.

Thank you so much for your help!

The asterisk in wdt:P106/wdt:P279* is saying 'items that have an occupation of, or an occupation which is a subclass to any number of levels of, the occupation'. wdt:P106/wdt:P279 says, merely, 'items that have an occupation which is an immediate subclass of the occupation'. So the first checks the whole subclass tree, the second only the first level of subclasses. See also https://www.w3.org/TR/sparql11-query/#propertypaths
A couple of queries to show the difference in number of results (13k vs. 81k), and speed (2s vs 20s), looking only at P106 and P21 (and being heavy-handed in controlling the order in which the query runs):
SELECT ?item ?itemLabel ?lastnameLabel ?birthdate ?deathdate ?nationalityLabel ?itemDescription WHERE {
  { ?item wdt:P106/wdt:P279* wd:Q901 . } hint:Prior hint:runFirst true.
  ?item wdt:P21 wd:Q6581072 .
}

SELECT ?item ?itemLabel ?lastnameLabel ?birthdate ?deathdate ?nationalityLabel ?itemDescription WHERE {
  { ?item wdt:P106/wdt:P279 wd:Q901 . } hint:Prior hint:runFirst true.
  ?item wdt:P21 wd:Q6581072 .
}
Beyond that, WDQS seems to need /slightly/ more than 60s to run the full query, even if it's reformatted as a named subquery (which for this sort of query tends to be the best means of optimising the query), so best we can say is that WD now has more women scientist items than Blazegraph, its report engine, can report on using this query.
# Female scientists

SELECT ?item ?itemLabel ?lastnameLabel ?birthdate ?deathdate ?nationalityLabel ?itemDescription WITH { SELECT ?item WHERE {
  ?item wdt:P21 wd:Q6581072 . hint:Prior hint:runFirst true.
  ?item wdt:P106/wdt:P279* wd:Q901 .  hint:Prior hint:gearing "forward".
} } as %i
WHERE
{
  INCLUDE %i
  optional { ?item wdt:P734 ?lastname . }
  optional { ?item wdt:P569 ?birthdate . }
  optional { ?item wdt:P570 ?deathdate . }
  optional { ?item wdt:P27 ?nationality . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "nl,en,fr,de,es,it,no" }
  }
There's a twitter thread here - https://twitter.com/Tagishsimon/status/1454103058112294913 - in which Adam Shorland, a WMDE Staff Engineer, looks in more detail at the size of the task of providing labels for a query producing about as many results as this one.
Wikidata:Request a query is probably where you want to be, both to ask questions like this, but also to study years-worth of query questions & answers. Wikidata:SPARQL query service/query optimization provides some approaches to optimization. --Tagishsimon (talk) 23:24, 3 February 2022 (UTC)

Desktop Improvements update and Office Hours invitation

Hello. I wanted to give you an update about the Desktop Improvements project, which the Wikimedia Foundation Web team has been working on for the past few years.

The goals of the project are to make the interface more welcoming and comfortable for readers and useful for advanced users. The project consists of a series of feature improvements which make it easier to read and learn, navigate within the page, search, switch between languages, use article tabs and the user menu, and more.

The improvements are already visible by default for readers and editors on 24 wikis, including Wikipedias in French, Portuguese, and Persian.

The changes apply to the Vector skin only. Monobook or Timeless users are not affected.

Features deployed since our last update

  • User menu - focused on making the navigation more intuitive by visually highlighting the structure of user links and their purpose.
  • Sticky header - focused on allowing access to important functionality (logging in/out, history, talk pages, etc.) without requiring people to scroll to the top of the page.

For a full list of the features the project includes, please visit our project page. We also invite you to our Updates page.

The features deployed already and the table of contents that's currently under development


How to enable the improvements

Global preferences
  • It is possible to opt-in individually in the appearance tab within the preferences by unchecking the "Use Legacy Vector" box. (It has to be empty.) Also, it is possible to opt-in on all wikis using the global preferences.
  • If you think this would be good as a default for all readers and editors of this wiki, feel free to start a conversation with the community and contact me.
  • On wikis where the changes are visible by default for all, logged-in users can always opt-out to the Legacy Vector. There is an easily accessible link in the sidebar of the new Vector.

Learn more and join our events

If you would like to follow the progress of our project, you can subscribe to our newsletter.

You can read the pages of the project, check our FAQ, write on the project talk page, and join an online meeting with us (27 January (Thursday), 15:00 UTC).

Thank you!!

On behalf of the Wikimedia Foundation Web team, SGrabarczuk (WMF) (talk) 22:11, 24 January 2022 (UTC)

@SGrabarczuk (WMF) I liked the polished look, but the way important buttons (Contributions, User page, Discussion, Watched list) are hidden made me switch back. Plus phab:T300182 Vojtěch Dostál (talk) 18:12, 27 January 2022 (UTC)
Hello, @Vojtěch Dostál. Thanks for your comment. The watchlist icon is directly available now, next to the notifications icons and the user menu. Regarding the other links, I'll pass your feedback to the team. By the way, what's the resolution of your screen? SGrabarczuk (WMF) (talk) 06:10, 1 February 2022 (UTC)
@SGrabarczuk (WMF) 1920x1080. Vojtěch Dostál (talk) 06:43, 1 February 2022 (UTC)
It seems to interact poorly with the preview gadget (CC @Bene*) which now covers the label/description box instead of keeping to the right hand side. Hopefully that's something that could be fixed, given how much whitespace there is available.  – The preceding unsigned comment was added by bovlb (talk • contribs).
@SGrabarczuk (WMF): I had to go back to legacy. The problem with the preview gadget is a deal breaker. Bovlb (talk) 17:28, 4 February 2022 (UTC)
Not useful to me, since the pages I read / edit never have a dynamically-generated table of contents. Where I do edit items with a table of contents, squeezing it into the left side of the screen would interfere with page readability. --EncycloPetey (talk) 16:42, 5 February 2022 (UTC)

Stop allowing unregistered edits

Imo, unregistered/anonymous (IP-based) edits are creating more harm than benefit. It takes a lot of time to clean up. They often go unnoticed for months and harm the data integrity/quality when doing SPARQL queries. I think if people want to do anonymous edits, they can still sign up and edit something immediately. But it will deter most, as it is an extra step. I personally don't know any website that allows anonymous edits of data. Germartin1 (talk) 07:18, 27 January 2022 (UTC)

Pinging @vigneron: : any opinion about that? Personally, I do not like IP editing but some people will stress that valid statements from IP are good too. Perhaps a better watch list for recent IP statements?
Imo, evidence to support the sentence 1 assertion would be handy. Right now we just have a controversial policy proposal predicated on a dubious assumption and buttressed by confirmation bias. --Tagishsimon (talk) 08:22, 27 January 2022 (UTC)
Usually 5–10% of edits by unregistered users and newcomers (i.e. not yet autoconfirmed) are being reverted. Maybe we still miss some vandalism here, but the vast majority of these contributions are fine and valuable. —MisterSynergy (talk) 08:46, 27 January 2022 (UTC)
Good to hear that, where do you get this number? Germartin1 (talk) 09:16, 27 January 2022 (UTC)
This is the fraction of unpatrolled edits which have a mw-reverted tag attached to them. —MisterSynergy (talk) 09:24, 27 January 2022 (UTC)
Doesn't that exclude rollbacks then, since they automatically mark the edits as patrolled? - Nikki (talk) 09:57, 27 January 2022 (UTC)
By "unpatrolled edits" I am referring to all edits which are not autopatrolled. In other words: all edits by users without the (auto)confirmed status, regardless of whether they have been manually patrolled meanwhile or not.
My measure is missing (1) undetected vandalism, (2) cases where the tag is missing for some reason (rare, but possible), and (3) cases where the vandalism was overwritten without using revert functionality (also not that common). Based on what I see during my daily patrolling routine, I do not think that the estimation of 5–10% is completely off. Clearly most of the unpatrolled edits are valuable. —MisterSynergy (talk) 11:13, 27 January 2022 (UTC)
Why are you counting the percentage of reverted edits that have not been patrolled? That percentage is basically meaningless. If 5-10% of what has not been checked by patrollers are reverted, I can only imagine how much bigger the number of reverted patrolled edits made by IPs is. Darwin Ahoy! 08:52, 28 January 2022 (UTC)
It's 5–10% of all edits that need to be checked. The actual number usually sits more at the lower end of this range, but given we deal with *some* uncertainty, I prefer to provide a range that extends to some meaningful larger fraction.
We explicitly patrol around 30% of all edits which need patrol, and sometimes even way beyond that. However, patrolling is a bit of a tedious task since most unregistered editors are indeed editing with good faith, and remarkably good skills. The fact that we do not explicitly patrol the other ~70% does not mean that this is all garbage. We simply do not have the means to patrol them efficiently. —MisterSynergy (talk) 09:53, 28 January 2022 (UTC)
I could imagine that IP editing is a bridge to getting registered editors: The ease of contributing without friction gets someone excited, then they create an account because they want their contributions associated with their account. Toni 001 (talk) 10:20, 27 January 2022 (UTC)
 Support Lectrician1 (talk) 11:32, 27 January 2022 (UTC)
 Oppose IP edits can actually be very beneficial to the addition of data on Wikidata. There is an infinite amount of data that can be contributed to Wikidata, and in no way will we ever be able to develop the tools to add all of the data, or expect people to create an account to add it either.
The example that convinced me that IP edits are needed is an idea for an interface that allows people to view and edit music-related data derived from Wikidata. Unfortunately, we do not have all the data on Wikidata we need to reliably show users what they're looking for so it makes sense to give them the ability to edit as well. This way, they can potentially benefit thousands of other users of the interface just by making a single edit. The only problem is that this is prone to vandalism. So, given the potential great benefit of IP edits, I'd say we should avoid preventing them as much as possible and look to alternative solutions like an edit review system where registered users approve edits made by IPs. Lectrician1 (talk) 18:27, 2 February 2022 (UTC)
That vandalism goes unnoticed for months is more an argument that patrolling on this site is not done right or is insufficient. I think automatically granting patroller to anyone with 50 edits does more harm than good since it is too easy to abuse, and I will leave out the how, so as not to give the wrong people any ideas. This site needs more patrollers but this limit needs to be tenfold higher in my opinion. Infrastruktur (talk) 15:00, 27 January 2022 (UTC)
@Infrastruktur If data was actually used like on high-usage Wikipedias, we would have more people monitoring vandalism.
Then again, the problem is that vandalism shouldn't be happening in the first place and stopping IP edits is a very effective way to do this. Patrollers should not be wasting their time reverting things when they could be contributing instead. Lectrician1 (talk) 16:57, 27 January 2022 (UTC)
It really would be nice to get some reliable statistics on this somehow. I just checked my own watchlist (which includes thousands of items) and out of 16 IP edits in the last week, all but 3 were reverted (several by me because the item was on my watchlist!), so that's about 80% vandalism in that small sample. On the other hand I certainly have seen a few good edits from IP addresses once in a while and it would be a little sad to lose those. So I do think we need real stats on this. ArthurPSmith (talk) 18:03, 27 January 2022 (UTC)
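As a rough illustration of why a sample of 16 edits is too small to settle the question, the sketch below computes a 95% Wilson score interval for the watchlist figure quoted above (13 of 16 IP edits reverted). The function name is hypothetical, and the calculation assumes the watchlist behaves like a random sample of IP edits, which it almost certainly does not.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Return the Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Figures from the comment above: 16 IP edits, all but 3 reverted.
reverted, total = 13, 16
low, high = wilson_interval(reverted, total)
print(f"observed: {reverted / total:.0%}, 95% interval: {low:.0%} to {high:.0%}")
```

The interval is wide, which is consistent with the call for real statistics rather than watchlist impressions.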
A while ago I made a tool at wdpd.toolforge.org that has more insight in many ways than what is available elsewhere. All information is being updated every 30 minutes, and it applies to all unpatrolled changes of the past 30 days (aka the "recent changes").
However, I explicitly mention that the tool emerged from a PAWS notebook that I was originally using, and its UI is pretty crappy since this really isn't my field of expertise. Until now I merely use the tool for my personal patrolling routines, but I have intentionally not advertised it much yet. MisterSynergy (talk) 20:29, 27 January 2022 (UTC)
Yes I have the same experience, I think those who really want to edit will just make an account. Sometimes even I do IP edits if I'm too lazy to log in. Germartin1 (talk) 04:50, 28 January 2022 (UTC)
Actually, I remember that a lot of edits that are marked patrolled are IPs changing labels or descriptions in a language I do not know. I don't ever patrol these and it's likely others don't as well. This could make judging the quantity of actual vandalism created by IPs hard. Lectrician1 (talk) 20:18, 27 January 2022 (UTC)
@Germartin1 Have you heard about Wikipedia? That's a website that allows anonymous edits of data in more than 99% of its language versions. Ainali (talk) 20:39, 27 January 2022 (UTC)
@Ainali Yeah and that (the enwiki) should ban them too. All someone (or I) has to do is run a study updating this 15-year-old one. Based on the anecdotal experience I've heard from people who monitor Huggle, an extreme proportion of IP edits are vandalism on the English Wikipedia nowadays. Lectrician1 (talk) 21:02, 27 January 2022 (UTC)
@Lectrician1 Well, I was just surprised that the statement "I personally don't know any website that allows anonymous edits of data." slipped by for so long. That seems unlikely, given it was made on one of the Wikimedia projects. Ainali (talk) 22:16, 27 January 2022 (UTC)
 Support we can check the effect over a few months. We could gather more refined data but it's generally discussed that IP quality here is worse than on other wiki platforms (not the first time I hear this complaint) and we have the example of ptwikipedia. If they survived with no issue for more than 1 year with no IP editing in ns0, we could probably do that as well, considering that on our platform the constructive role of IPs is probably lower. You could argue that it is similar of course, but definitely not better... which means we know that we can start with a test and see how it goes comparing the numbers before and after, with no terrible impact.
However, we should address first the phabricator regarding the role of interlinks. This is the biggest con, more than some strong ideological or personal view about the role of IPs in our ecosystem.
Of course we have some alternatives: we could maybe try to reestablish a decent auto-patrolled flag given after in-depth community evaluation, and shape the architecture of content-related interactions around that, focusing patrolling on less trusted users, still allowing IPs but drastically reducing the namespaces or portions of items where not only IPs but also recent users are allowed to edit freely (or at least at a limited pace). This way the "ideological" idea of excluding IPs that seems terrible in the eyes of some users is replaced by something more subtle and closer to the existing flag architecture and limitations on other Wikipedias. There is however probably some confusion between (auto)confirmed status and autopatrolled status. An autopatrolled status purely or massively based on automatic metrics is much closer to autoconfirmed, for example. In general, the lack of literacy and the confusion about flags, compared with the local Wikipedias where some users are mostly active or come from, make it very complicated to have such a discussion. I am active on many Wikipedias and I can feel how people talk about flag architecture with the limited view they were in contact with at the beginning of their wiki experience. When these experiences are combined on a multilingual platform, it's very difficult to discuss them properly, since many users make stronger assumptions about certain aspects based on their original experience on some local Wikipedia that is still their main reference. Flag architecture and the combination of possible constraints amount to dozens of very different scenarios and it's really difficult to agree; our personal experience is much more fragmented than we think. So I believe that instead of re-discussing that, testing a strong IP limitation is at the moment an easy way to address the problem while producing collective feedback. If it works and shows limited impact on overall metrics, we could go further with that.
In any case I am not a patroller, so let the patrollers decide. --Alexmar983 (talk) 23:34, 27 January 2022 (UTC)
Okay, I'm an active patroller and I find this response particularly unhelpful. Some remarks/answers:
  • Regarding rights: both "autopatrol" and "patrol" are identical to "(auto)confirmed" at Wikidata. In my opinion there is no reason to consider changes here since this seems already pretty balanced. If you want to discuss rights, consider adding "rollback" automatically to all accounts that have shown a certain experience (to be defined: # of edits, time since registered, etc). This would make the pool of potential users involved in patrolling much larger, compared to the current rollback-by-request approach; some Wikipedias assign this right automatically to experienced users as well.
  • There are currently no efficient ways to partially protect item pages from modifications by IP users. Abuse filters would probably work, but this would be extremely fragile and inefficient. In other words: I think there would be dev input required to make the software fit for a limited IP editing scenario. That said, I can't tell which part of items should not be editable by IP editors any longer. Problematic editing is not clearly concentrated to some limited area.
  • Regarding the patrol process itself, you have not even mentioned the most relevant issues:
    • Tooling; the patrol function is poorly accessible in the web UI, thus serious patrollers need to familiarize themselves with a couple of tools; some need to be installed (user scripts).
    • Workflow; at Wikipedias, a revision-based patrolling workflow works pretty well, but at Wikidata this is not really the case. With "atomic edits", users often make a series of edits to a given page, and the quality can often only be assessed when all changes are reviewed and patrolled at the same time. However, patrol nevertheless fundamentally works on revisions here at Wikidata; we would need a page-based and/or user-based patrol workflow.
    • Filtering; since this is a multilingual project involving diverse content from all over the world and beyond, it is often difficult to patrol an arbitrary unpatrolled edit. Filtering is key to making the workflow efficient, since it breaks the overall patrol workload (5000–7000 edits per day) down into smaller, doable portions and lets patrollers see only changes to fairly familiar content that are predominantly actionable for them. We do already have some tools to filter edits, but unfortunately they are not well known.
That said, I would prefer to have a discussion based on facts, not feelings. Everyone seems to have a strong opinion based on some level of sporadic interaction with instances of vandalism. While it is true that we should oversee unpatrolled changes much closer and some cases of vandalism remain undetected for way too long, the situation is much better than apparently anticipated by many of us. —MisterSynergy (talk) 01:23, 28 January 2022 (UTC)
My comments are not based on "sporadic" interaction and I am not really a "feeling" person, but whatever. Beware: the architecture of flags is very peculiar on different platforms and sometimes a switch or knob looks similar on the outside but not its effect. That's why I think it's better to try a clear selective option and build on that, such as testing IP removal. We can afford the test, we know from ptwiki. But since this is "particularly unhelpful" go on.--Alexmar983 (talk) 02:02, 28 January 2022 (UTC)
The first paragraph of your initial comment is pure speculation, and the language you have been using is a clear indicator of this. You can prove me wrong by providing references for your claims.
The effects of disabling IP editing cannot be measured within a few months since the conversion to registered editors is happening on a fairly low level. For instance, if you were to cut the conversion rate by 50% for a few months or even a year, you would likely not be able to notice any difference. However, in the long run these missing editors accumulate to a considerable deficit, but it is way too late to react then. For Wikipedias, for instance, new editors registered in the past year usually make up fewer than 5% of the community. It takes a while, clearly longer than a year, until missing new editors are noticeable. —MisterSynergy (talk) 10:03, 28 January 2022 (UTC)
I might tell you what your language indicates to me, but that would sound like speculation. And the second part of your comment lacks any numbers or facts, which is not a problem to me (it's pretty obvious it should be this way, this platform is peculiar). Just be aware this is the inevitable level of this discussion on both sides, so let's continue this chat based on.... feelings.
I have worked for years in helping newcomers on Wikidata (literacy classes, cross-platform introductions, welcome messages) and I don't expect the long-term loss of content from limiting IP users for a few months to be that tragic. I could feel the problem on some local Wikipedia for historic reasons... If this were a local Wikipedia with its established history, we would have at least a couple of IPs already commenting in this discussion by now. Also, I never received a comment from IPs on my talk page here, nor witnessed a comment in the many property creation discussions. Our IP users do not seem as engaged in this community as they are on the other platforms, so I don't expect the disruption of their long-term contribution by a limited test to be so critical. Even if not adopted, I would expect, with a correct use of warnings before and during the test, the result to be more a shrug of the shoulders than a structural loss. Just link to a landing page from a preliminary warning and collect the feedback of the IPs if you want to be safe and avoid some long-term damage. But for a test of a few months the scenario depicted in your comments looks more like a worst-case possibility than an average projection. It's useful as a basis for discussing how to refine the test to minimize its impact, more than as a reason to avoid doing it at all.--Alexmar983 (talk) 05:12, 29 January 2022 (UTC)
The 5% estimation comes from German Wikipedia. Since ~2010, they gather around 200 new users per year that make 50+ edits annually; in total, there are ~5000 users that make 50+ edits per year. In fact, since most Wikipedias are ~20 years old, the 5% estimation is not surprising at all. It could be somewhat more for Wikidata (only ~10 years old).
My numbers related to vandalism in Wikidata, IP editing, and patrolling come from my patrolling efforts. I have been the most prolific patroller here for quite a while in terms of use of the "patrol" function (I have done 60% of all patrol actions during the past 365 days, or ~352.200 of ~586.700 as of now). For my patrolling efforts, I have developed a bunch of scripts with a data table of ~30 fields for all unpatrolled edits as a backbone that allows me to have more insight into the problem than what is available anywhere else. Some of that insight is shared at wdpd.toolforge.org. I am not speculating here. —MisterSynergy (talk) 06:25, 29 January 2022 (UTC)
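The figures quoted in the comments above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes a constant pool of active editors, no attrition, and that every recruit lost to a restriction never joins later, none of which is claimed in the discussion. The function name is hypothetical.

```python
# Figures quoted for German Wikipedia in the comment above.
ACTIVE_EDITORS = 5000        # users making 50+ edits per year
NEW_EDITORS_PER_YEAR = 200   # new users per year reaching 50+ edits

# Share of the active community made up by one year's newcomers.
new_editor_share = NEW_EDITORS_PER_YEAR / ACTIVE_EDITORS


def shortfall_share(cut_fraction: float, years: float) -> float:
    """Share of the active community missing after recruitment is cut.

    Assumption (not from the source): the pool size stays constant and
    lost recruits are never recovered.
    """
    missing = NEW_EDITORS_PER_YEAR * cut_fraction * years
    return missing / ACTIVE_EDITORS


# Patrol actions quoted for the past 365 days.
patrol_share = 352_200 / 586_700

print(f"newcomers per year: {new_editor_share:.1%} of active editors")
print(f"50% cut for 1 year: {shortfall_share(0.5, 1):.1%} missing")
print(f"50% cut for 5 years: {shortfall_share(0.5, 5):.1%} missing")
print(f"share of patrol actions: {patrol_share:.1%}")
```

Under these assumptions a one-year halving of recruitment removes only a small slice of the community, which is the point being made about how slowly such a deficit becomes visible.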
Your language indicated to me that you are modelling assumptions originally built on a dewikipedia-centric view, and your example proves it. Your insight is on patrolling, not on the relationship between patrolling and the quality of content, or the general overall behavior of users. This is the self-referential aspect I was talking about. This is what happened on ptwiki if I reconstructed it correctly. Some historic users working as "professional" patrollers inside the community opposed the change, then it was done, and nothing serious happened. Apparently, they were not as expert as they imagined at predicting the output, which is normal when you only see one aspect very very well. And that was a Wikipedia of "discursive" content; in a structured database attracting users outside the standard Wikipedia ecosystem, I would not expect it to be worse.
Try this hypothetical question: if it works like on ptwiki, not really altering the overall metrics of growth, that is we have less patrolling and growth is not altered, what are you planning to do in your free time here? I can tell you what we can do if it does not work: create a real autopatrolled flag since we are not dewikipedia, or we can be more like dewikipedia and not allow IP edits to be shown unless approved.--Alexmar983 (talk) 13:45, 29 January 2022 (UTC)
  • During patrolling I do see all facets of IP/newbie editing, the good ones and the bad ones. What I see is pretty much at odds with your claims here.
  • Regarding growth: as I mentioned earlier, it takes years until we’d realize that something is wrong, and it might even be difficult to relate it to the removal of IP edits since the editor base changes slowly. Not worth the risk.
  • If we were to remove IP editing, we would immediately lose ~15.000–25.000 different IP editors and ~100.000 edits per month of which the vast majority have high quality. The damage would be there the very moment we’d change this.
  • German Wikipedia uses "flagged revisions", an approach which many consider as a failed one. The underlying software is in poor shape. I do not consider it an option for Wikidata (and would support removal in German Wikipedia as well); since Wikidata content is distributed to several secondary places (WDQS, connected wikis, some SQL databases, etc.), it would likely not be feasible to introduce flagged revisions here anyways.
  • My available time for Wikidata allows me to engage in microtasks such as patrolling where cases are immediately actionable, as opposed to more complex editing that often requires some offline preparation and a longer timeframe of available time. If there was no patrolling to do any longer, I would probably simply spend less time here.
MisterSynergy (talk) 14:16, 29 January 2022 (UTC)
You know, let's add a few more things. For the future, when I might link this discussion.
The "facts" aspect exists only when you show data or find a way to have them. If you are not going to do any of those things (they are very boring, I did crunch some numbers in the past), there is really no need to point out if someone is giving you an impression. Your comments are mostly impressions, which is normal in a brainstorming. No problem from my side in assuming they are not sporadic, of course.
Let's try to express how it looks to me with an example. It's like those discussions in real life about police funding, tools used by cops, invasive monitoring, perception of crime... it's very difficult to have a broader view and in the end there might be no effective control of problems without mastering people's patterns (i.e. the root of behaviors of citizens). Without that, some self-referential social structure might evolve: police might say how unhelpful it is to listen to a workshop about social strategies when they first need more guns, for example. To me, a community that sees no difference between autopatrolled and auto-confirmed starts with a handicap in understanding user patterns, and like other wiki communities this will take its time to adjust and evolve into some stable situation, maybe simply compensating for the gap with bigger control structures. Sometimes communities might not realize how time-consuming these structures have become. I was talking with some ptwiki users and one of the good things about removing IP editing is that part of their patrolling architecture and discussions was downsized, sparing much more time to focus on content.
The problem in our case compared to a standard Wikipedia is that we might not have the time to go through all the passages required to refine the patrolling system so that it more or less works, providing some efficient and cost-effective output. Normally, I would not care about that on the Wikipedias, but:
(i) we grow much faster. On a Wikipedia you can wait 5 or 10 years for people to admit that overfocusing on patrolling techniques might not be the core point. In that similar time range we explode and risk some backlog (maybe different from those on Wikipedias).
(ii) this database is used more and more and cannot afford too many problems with undetected vandalism; most of the databases that are interacting with us are based on some "trust" in the users editing them and not on how great and effective they are at cleaning up. This relationship is more structured, intimate and strong than the one that Wikipedia has with its own various types of sources. We need such external IDs for notability and third-party sources in all the content that goes beyond providing metadata for our Wiki sister platforms, and they need us at a growing pace. New and very active users creating content might work on both sides... so these two worlds are going to collide and we might not have time to experience all the phases that a community requires to master a proper patrolling workflow.
(iii) These processes of refinement of patrolling are quite slow even when people speak the same language and come from a collective history; here these discussions suffer even more.
That's why I feel that trying some simpler solution now might be worth the effort. I like refined metrics, I studied them for years, but in our case, I would go more with a simple option such as removing IP editing in ns0.--Alexmar983 (talk) 03:40, 28 January 2022 (UTC)
Thanks for comments Alex, I  Support your proposal, testing the effects for a few months/weeks sounds good. And to see if there is an increase in new user accounts. Germartin1 (talk) 04:54, 28 January 2022 (UTC)
 Support --Derzno (talk) 07:50, 28 January 2022 (UTC)
 Support For 3 main factors. 1) it's a huge privacy and security hazard, even for registered users who log out accidentally; 2) It has been common to find high profile vandalism made by IPs lasting for weeks or months. My experience is that this is especially prevalent with IPs 3) it's very difficult, and often impossible to communicate with IPs. This is especially true for the Wikipedia app, which uses Wikidata as a playground for Wikipedia users, telling them they are editing Wikipedia when in fact they are playing with Wikidata descriptions. Not only does the app completely block any communication with its users, but it's considerably difficult to even understand that we are editing using an IP. But just the first point would be more than enough for that kind of edit never to have been allowed here. Darwin Ahoy! 09:10, 28 January 2022 (UTC)
ad 1) well, IP edits are pretty common on practically all Wikipedias. WMF is actively working on a solution to make them less visible in some way, so the privacy hazard should not be a major issue here for much longer.
ad 2) new registered accounts have a very similar editing pattern as unregistered editors, but they are more difficult to track since they hide the information an IP is exposing
MisterSynergy (talk) 10:10, 28 January 2022 (UTC)
Common or not, it still is a huge privacy and security hazard that should never have been allowed in, and should be fixed ASAP, with or without a WMF solution. Darwin Ahoy! 23:23, 28 January 2022 (UTC)
BTW, that you are tracking "the information an IP is exposing" is definitive proof that things are very, very wrong here. You should not know that information at all. In Brazil, a twitter account was set up to expose government workers editing from certain IP addresses associated to their workplaces. In at least one case, one person was the target of a disciplinary process after editing in Wikipedia using the workplace network. That is not an argument to keep IPs here, that's a very strong reason to ban them away, and the sooner the better. Darwin Ahoy! 14:25, 29 January 2022 (UTC)
If we discuss this problem in terms of vandalism prevention, this is indeed a relevant factor. Different IPs used by a single individual can to a large extent be related to each other, and users as well as administrators engaged in counter-vandalism activities use this information pretty much every day. If we were to remove IP editing, the vandals would to some extent use throw-away accounts which are much more difficult to combat.
That said, meta:IP Editing: Privacy Enhancement and Abuse Mitigation describes the foundation’s efforts to change visibility of IPs of anonymously editing editors. This will happen in the rather near future anyways. While I am not aware of all the details, I think there will be user groups in the future which will still be able to see full IP addresses, in order to combat vandalism efficiently in the future as well. This visibility change on Wikimedia-level should really address all the concerns regarding privacy—this is not something we should be doing here at Wikidata locally. —MisterSynergy (talk) 14:38, 29 January 2022 (UTC)
 Strong oppose Show us some statistical evidence and open an RFC. We shouldn't be making such drastic decisions based on opinion and anecdotal evidence. I say this as someone who has seen my fair share of vandalism from IP edits, but it's confirmation bias to assume that means most are vandalism. Also the third and the final sentence of the OP directly contradict one another. --SilentSpike (talk) 12:00, 28 January 2022 (UTC)
I'm not sure why statistical evidence is so important. The case is more complex than X% of edits are "bad". For me it doesn't matter if 90% or 20% of edits are reverted in the end. 2% is already too much. For me, the overall quality of Wikidata is more important for which I'm willing to sacrifice some good edits, as mentioned above that not even half of IP-edits are even patrolled. I understand that not everyone will share this point of view.
The only statistical evidence for a "registration only scenario" is to have a control group, which obviously doesn't exist. That's why I propose a trial for a few weeks/months and see the effect. But if someone can come up with a number of a quality ratio between IP-edits and registered users, that would be helpful. Germartin1 (talk) 15:42, 28 January 2022 (UTC)
"the point of view" is the background of our usual "social ecosystems" to which we are more or less accustomed as average wikimedians: the belief that patrolling is inevitable, and a necessary cost to pay in order to allow some flexible and multi-faceted freedom of editing. In a way, ptwiki simply deconstructed this vision more than usual and adopted a model that weighs pros and cons in a different way. For users who have been experiencing the "average" wiki experience, patrolling is a core aspect and you need a lot of discussions about it; all of this is inevitable and delicate and worth many battles, a crossroads of how the vision of the projects is shaped. This evolved into dozens of different patrolling models all supported by very accurate data when they were adopted or confirmed or improved. The nature of Wikidata however drastically encourages interaction with third parties, who do not share this balance of values. They don't care if you can invest ideas and energies in patrolling tools; they know big databases exist and grow in any case with lower freedom of input. This is the cultural "battle" that you see here. I am mostly surprised how civilized it is. It usually gets more tense on local Wikipedias.--Alexmar983 (talk) 06:25, 29 January 2022 (UTC)
 Strong support. Perfektsionist (talk) 15:17, 28 January 2022 (UTC)
 Strong support Vandalism damages the credibility of the project. It has a negative effect on external use of Wikidata as well as interfering with queries. As the number of external identifiers increases, the amount of maintenance increases, so vandal edits can easily be covered by "good" bot edits and remain undetected for a long time. As for my humble watchlist (89k items), IP vandalism or test edits are pretty common. Test edits are not as dangerous as vandalism, but there is still a reasonable risk of external use of a "bad" snapshot version. Granular protection for Wikidata items is unfortunately still not available.
In my opinion, registration is not a big obstacle for "good" users, and it also ensures that they know that they are registering for a separate project with its own rules and it is not just a game.--Jklamo (talk) 17:46, 28 January 2022 (UTC)
 Strong support Most of the label/description vandalism is from IP users, so disallowing anonymous editing might be useful against hit-and-run IP edits. -CrystalLemonade (talk) 20:22, 28 January 2022 (UTC)
 Question: Given that one of Wikidata's main goals is to provide services to client projects such as the Wikipedias, how would that be affected by a ban on unregistered users? For example, what about page moves? Would it cause a move away from hosting infobox data here instead of locally? Bovlb (talk) 21:17, 28 January 2022 (UTC)
 Oppose we need hard data to make such a large change. i've seen lots of anons that have made useful contributions and lots of vandals with accounts. I would propose that we as a community attempt to patrol all edits in some randomly chosen period and then compute statistics based on this. the data (or lack thereof) presented so far is unconvincing to me. BrokenSegue (talk) 01:08, 29 January 2022 (UTC)
@BrokenSegue: Ok but if we had the data where would be the threshold that would support this change? Germartin1 (talk) 01:20, 29 January 2022 (UTC)
@Germartin1: honestly I don't know. but if a large majority of anonymous edits were deemed vandalous and we failed to revert most of them I would possibly support. This decision is far too large to be made without a formal RfC and more supporting material. BrokenSegue (talk) 01:24, 29 January 2022 (UTC)
@BrokenSegue: What about the time between the vandalism and reversion, and the time/cost it takes to revert them, which could be used for more useful things. Yes I know this chat is just to get an idea of what others think and not binding. Germartin1 (talk) 01:33, 29 January 2022 (UTC)
i don't see how we could quantify those things easily. people who volunteer to patrol are choosing to donate time to do this already and so I don't see how we can account for the cost/alternate uses of that time. already lots of effort is put into areas of wikidata which are probably not "worth" that investment. maybe another good result of a survey of vandalism would be to see how reasonable a anti-vandalism bot would be to make. BrokenSegue (talk) 02:29, 29 January 2022 (UTC)

I have to say I am surprised that the balance is already shifting so much in favor of removing IP editing. I expected this discussion to be closed quickly, resuming it in one year for example, but maybe we have the chance to go forward with some test sooner than I imagined. It could be possible to create a specific permanent page on this topic and gather more feedback with some mass messages to all active users and share more opinions. I have been organizing classes in real life, there are dozens of users that I have trained currently active here, they take care of specific group of articles in their watchlists, they perform de facto patrolling not because they like to do so, but because it's inevitable. They care about the quality of the items they edit (for example they need transclusion of our metadata on some external database), I feel that there might be a silent majority who prefer a more selective editing to preserve quality. Who knows if this is the same in all our fields, maybe it's like that for biographies while for other topics the IP is still much more welcome. I am also curious if we could start a test on more structured items such as those with more than 10s Wikiprojects links or 20s statements or specific instances.--Alexmar983 (talk) 05:32, 29 January 2022 (UTC)

@Alexmar983: I would appreciate if you can do it ("specific permanent page") or RfC, as I'm less experienced in that domain. Germartin1 (talk) 09:03, 29 January 2022 (UTC)
I have never requested RfC, I don't know myself. Mostly focused on content creation and literacy classes than general discussion here.
Also m:Limits_to_configuration_changes#Prohibited changes is not against this rule of IP editing; the discussion was already settled with ptwiki. Everyone can still edit ptwiki, registration is not an impossible task, it's an even lower step than for example getting a revision approved on dewikipedia. It's actually closer to the standard experience of newbies on various P2P platforms on the web. If a "normal" and established Wiki can do it, of course we can do so. Our main problem is interaction with local Wikis and NOT the choice of editing structure related to our internal dynamics. But for example if we do this change maybe surgically removing IP editing from the rest of labels and metadata might be easier. Still the meta page has no real effect on this issue. It's up to the community to decide.--Alexmar983 (talk) 13:32, 29 January 2022 (UTC)

Question : would it be technically feasible for an IP to edit, that edit would be 'visible but not hard set until patrolled/revised', kinda greyed ; when patrolled/revised then accepted/hard set (something like in the ruwiki)?Bouzinac💬✒️💛 13:39, 29 January 2022 (UTC)

I briefly addressed this in Special:Diff/1569776028. Short answer: probably not feasible. —MisterSynergy (talk) 14:17, 29 January 2022 (UTC)
@Bouzinac We had a similar thing in wiki.pt previously to banning the IPs ("validation"). It worked miserably, making a mess in the article history, and severely increasing the backlog of maintenance. It was entirely discarded. Darwin Ahoy! 14:29, 29 January 2022 (UTC)
Thanks, seems a rather evil idea. One last question: is the number of IP editors on ptwiki (roughly) the same as on Wikidata? Bouzinac💬✒️💛 15:46, 29 January 2022 (UTC)
  •  Oppose for now. Let's make a data-driven decision, I agree with what BrokenSegue has written. Vojtěch Dostál (talk) 16:42, 29 January 2022 (UTC)
  •  Oppose You can’t make decisions like this without compelling evidence. There is no compelling evidence that the benefits could outweigh the downsides at the moment. --Emu (talk) 21:20, 29 January 2022 (UTC)
  •  Support When IP vandalism is reported on the administrators' noticeboard in more than one thread and by more than one user in the space of 24 hours and yet no one takes action, when two Wikipedias took action in minutes or a couple of hours, it is time to cut the problem at the root. Tm (talk) 23:18, 29 January 2022 (UTC)
    I know that there are some IPs that make good or excellent contributions but, according to my subjective experience and "feelings" (no hard data), IPs are more a nuisance and\or vandalism than a positive contribution to this project, together with what appears to be enough people patrolling the IPs edits explain the bad signal-to-noise ratio. Tm (talk) 23:49, 29 January 2022 (UTC)
    That happens with registered users as well. We have too few administrators, among other problems. Emu (talk) 00:23, 30 January 2022 (UTC)
    it often doesn't make sense to take action if the vandalism is "done". BrokenSegue (talk) 01:51, 30 January 2022 (UTC)
  •  Comment Sadly, some tasks that would integrate combating vandalism into the system are still overdue: phab:T116923, phab:T190529, phab:T213630, etc. --Matěj Suchánek (talk) 09:51, 4 February 2022 (UTC)
  •  Oppose with full ban,  Support with ban against creation of items. Most medium and large scale wikis don't allow creation of pages for IPs. This can at least reduce a lot of their spam Amir (talk) 16:10, 5 February 2022 (UTC)
    A lot of "page creation" comes from existing Wikipedia articles. From the Wikipedia perspective, this is more like an edit operation. We need to be sure that we don't impose restrictions that will interfere with our ability to support client projects. Bovlb (talk) 21:38, 5 February 2022 (UTC)

Query Service news

In case you missed it before:

--- Jura 11:44, 5 February 2022 (UTC)

Report for double-registered query

Hi, Wikidata editors, I have found the double-registered query, Q2122042 and Q4389469. Both of the entries refer to a specific animal.

I have never touched the administrative action on Wikidata (I'm a Japanese Wikipedian) and I'm busy now. I hope this report will support Wikidata project.--Sethemhat (talk) 14:05, 5 February 2022 (UTC)

The items have the same image, but their taxon name (P225) and taxon rank (P105) are different.
Also, if its parent taxon (P171) is correct, Japanese river otter (Q4389469) is of the species Eurasian otter (Q29995) and not Japanese otter (Q2122042).
So image (P18) probably needs fixing. --- Jura 14:23, 5 February 2022 (UTC)
The Japanese otter subspecies of Hokkaido (Lutra lutra whiteleyi) is considered to be the Hokkaido subspecies of the Eurasian otter, but its taxonomic position is not clear, because there are too few existing specimens to fully examine.--Afaz (talk) 03:16, 6 February 2022 (UTC)
Actually, p18 are different too. --- Jura 11:54, 6 February 2022 (UTC)

Automated import/update of claim from Wikipedia possible?

Hi all,
I have a question/idea regarding an automatic import/update of data in Wikidata: Some basketball players (e.g. Nihad Đedović (Q781758)) have a statement called "Basketball Bundesliga ID" (P5724). This one has been out of date for a few months now, as the German basketball Bundesliga restructured their website. Now, many player pages on the German Wikipedia include a template with the updated ID as a parameter, e.g.

* {{BBL-Spielerprofil|UUID=01b076ff-0366-48ad-a32c-2d993f6e6439|name=Nihad Djedovic}}

So the UUID here would be the new ID for Wikidata. On the German Wikipedia this ID was updated with the help of some kind-of bot, Google and some manual work. This is the only reliable source for the new ID AFAIK, there is no mapping or anything from the old ID to the new ID. So a bot would need to check if the German Wikipedia page includes a template called "BBL-Spielerprofil". If yes, then it should fetch the UUID parameter and update the P5724 property in Wikidata. What do you think about this, is this a good idea? --Bthfan (talk) 15:30, 6 February 2022 (UTC)

@Bthfan Is this what you're looking for? You can run the import yourself from this URL. However I think the property's format as a regular expression (P1793) may not be correctly set, nor is format constraint (Q21502404). Vojtěch Dostál (talk) 15:52, 6 February 2022 (UTC)
An important point to note here is that you should not remove the old IDs as they were correct at one point in time and may assist in retrieving information from Internet Archive's way back machine or enable cross referencing with other archives that recorded the old IDs. Instead, the old IDs must be deprecated with reason for deprecated rank (P2241) set as withdrawn identifier value (Q21441764). The new IDs can be inserted alongside the deprecated ones. From Hill To Shore (talk) 16:12, 6 February 2022 (UTC)
Ah, yeah, @Bthfan A new property proposal should first be started. I am reverting my test edit linked above. Vojtěch Dostál (talk) 16:18, 6 February 2022 (UTC)
OK, thanks for the information! Then I'll propose a new ID and continue from there on. --Bthfan (talk) 16:35, 6 February 2022 (UTC)
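
The template check described in this thread can be sketched in a few lines of Python. This is only an illustration, not the bot that was used on the German Wikipedia: the function name `extract_bbl_uuid` and the UUID format check are assumptions, and fetching the page wikitext and writing to Wikidata are left out. As noted above, any old ID would be deprecated rather than removed, and the new IDs would first need their own property.

```python
import re

# Matches a {{BBL-Spielerprofil|...}} call without nested templates.
# The template name comes from the thread; everything else is a sketch.
TEMPLATE_RE = re.compile(
    r"\{\{\s*BBL-Spielerprofil\s*\|(?P<params>[^{}]*)\}\}", re.IGNORECASE
)
# Assumed format of the new ID: a lowercase, hyphenated UUID.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def extract_bbl_uuid(wikitext):
    """Return the UUID parameter of the first BBL-Spielerprofil template, or None."""
    match = TEMPLATE_RE.search(wikitext)
    if match is None:
        return None
    for param in match.group("params").split("|"):
        name, sep, value = param.partition("=")
        if sep and name.strip().upper() == "UUID":
            value = value.strip().lower()
            # Reject values that do not look like a UUID instead of importing them.
            return value if UUID_RE.match(value) else None
    return None
```

Pages where the function returns `None` (no template, no UUID parameter, or a malformed value) would be left for manual review.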

von Rohr or Von Rohr for noble family entries

We are currently about 50-50 in our entries for noble families that use "von", we have "von Surname" (no capital case) or "Von Surname" (capital case) for noble family entries, which should we harmonize on? --RAN (talk) 00:18, 7 February 2022 (UTC)

Are there any conventions here on spelling in different territories? There are cases of families holding land in two different territories and having two slightly different names to reflect the culture of each territory. Before agreeing that all instances are merged one way or the other, we need to confirm that nothing similar applies here. From Hill To Shore (talk) 06:22, 7 February 2022 (UTC)
I would recommend reading en:Tussenvoegsel, there are differences between the Netherlands and Belgium. Sjoerd de Bruin (talk) 08:41, 7 February 2022 (UTC)
Please refrain from harmonizing, both names may well exist as different names, different families. There is so much history in family names and so much to tell about them, it takes really good knowledge to decide to merge one into the other.
For Flemish names, the capital in 'Van der Velde' might be a reflection of a noble class, whilst 'van der Velde' used to be more for 'normal' people. German capitalisation is also worth a good research, not a quick overall decision. RonnieV (talk) 10:07, 7 February 2022 (UTC)

Q110496050 and Q106170847: Same item

Lulli (Q110496050) and Lulli (Q106170847)

Both these items refer to the same thing, i.e., Lulli, a 2021 Brazilian film. I, however, do not know which item should be merged into what. Thanks, Caehlla2357 (talk) 04:36, 7 February 2022 (UTC)

Merged. Conventionally the higher-numbered QId is merged into the lower number. --Tagishsimon (talk) 05:54, 7 February 2022 (UTC)

Wikidata weekly summary #506

Dumbarton Oaks museum object data

Hello, wikidata community: I and a few of my colleagues in the Dumbarton Oaks library are starting a project to transform object data from our museum content management system into wikidata items. We've just put in a project proposal for the object IDs (Wikidata:Property_proposal/Dumbarton_Oaks_object_ID) so thank you in advance to whoever takes the time to review it. Bettinche (talk) 20:35, 7 February 2022 (UTC)

What if errata for a source has URL but no WD item?

Assume there is a book, or maybe a scientific article. It has an erratum or corrigendum. If it is published as a separate publication in a scientific journal, then it simply has (or may have) its own item, so we can just act like this:

⟨ Supercollider physics (Q21709583) ⟩ corrigendum / erratum (P2507) ⟨ Erratum: Supercollider physics (Q27346421) ⟩

But what if the errata are only posted online? What is the correct way to add a link to them?

⟨ Genius At Play (Q104403703) ⟩ described at URL (P973) ⟨ https://siobhanroberts.com/genius-at-play/errata/ ⟩
applies to part, aspect, or form (P518) ⟨ erratum (Q1348305) ⟩

Maybe this? --colt_browning (talk) 11:28, 8 February 2022 (UTC)

Merge

Please, merge Wiktionary:Forum (Q59653077) and Project:Village pump (Q16503).  – The preceding unsigned comment was added by 217.117.125.72 (talk • contribs) at 11:10, 25 April 2020‎ (UTC).

Is the Caspian Sea a lake?

I'm not sure which wikiproject this would fall under, but I would appreciate the perspective of some geographers at Talk:Q5484. Bovlb (talk) 17:42, 2 February 2022 (UTC)

Certainly a lake, I do not see how it could be classified as not a lake. Ymblanter (talk) 19:53, 2 February 2022 (UTC)
Ymblanter, A sea can be defined as a large body of water fed by various inflows but without an outflow. Similarly, a lake can be defined as a large body of water fed by various inflows but with at least one outflow. From my understanding, Caspian Sea (Q5484) has inflows but no outflows, so meets that definition of a sea. Another definition of "sea" is a large body of salt water connected to an ocean. Caspian Sea (Q5484) does not meet the second definition. Some sources will use one definition and call it a sea while other sources will use a different definition and call it a lake. Wikidata's role here is not to choose which source is correct but to map all claims made by legitimate sources. From Hill To Shore (talk) 21:27, 2 February 2022 (UTC)
@From Hill To Shore: The 1st definition seems strange because it says that Urmia Lake (Q199551) is a sea. But the 2nd one doesn’t seem better, because oceans may not be connected among themselves. 217.117.125.83 08:00, 3 February 2022 (UTC)
I did not make those definitions and they do not need to be applied consistently. As with most things in life, definitions have changed over time and are favoured by one group or another at different times. There is also the complication that the definition in one language doesn't always match the definition of an equivalent word in another. The key point though is that we record the claims made by valid sources, not what individual editors think makes sense within their limited world view. See National Geographic as one source that talks about defining seas. From Hill To Shore (talk) 08:41, 3 February 2022 (UTC)
Wikiproject Lakes: The issue of lake vs sea was covered in the press in 2018 (Sea vs Lake - Wilson Center, among other sources: 1, 2). There is also a peer-reviewed article from 2013 calling it a lake. I'm not sure of the current consensus, if any, but there can be conflicts between scientific classification and geopolitical terms for Q's. Compare why a tomato is both fruit and vegetable: tomato (Q20638126) is defined as a vegetable for purposes of trade or use. The options here may be either stating both and ensuring that instance of has qualifiers that let queries distinguish between the contested descriptions, or creating separate Q's for the different concepts, linked to each other, with conditions differentiating them based on the geographic object. This comes down to how this concept is best modeled for the geographic object query. Wolfgang8741 (talk) 20:57, 8 February 2022 (UTC)

Query service down?

https://query.wikidata.org/ seems to be broken currently — Martin (MSGJ · talk) 11:43, 10 February 2022 (UTC)

✓ Done--Lucas Werkmeister (WMDE) (talk) 12:35, 10 February 2022 (UTC)

How to query for the user who created an item?

Help much appreciated (even if it would mean to use some SQL or MWAPI call) - --Jneubert (talk) 15:08, 10 February 2022 (UTC)

That would be api.php?action=query&prop=revisions&rvlimit=1&rvdir=newer in the action API, I think. Lucas Werkmeister (WMDE) (talk) 16:05, 10 February 2022 (UTC)
Thanks a lot, Lucas! --Jneubert (talk) 07:58, 11 February 2022 (UTC)
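
For anyone wanting to try the suggestion above, here is a minimal Python sketch. Only `action=query`, `prop=revisions`, `rvlimit=1` and `rvdir=newer` come from the reply; the `titles`, `rvprop` and `format` parameters, the endpoint constant and both function names are additions made for illustration. The sketch builds the request URL and reads the user name out of an already-decoded JSON response, and deliberately performs no network call.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Wikidata action API.
API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def first_revision_url(item_id):
    """Build a query URL asking for the oldest revision of one item page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": item_id,           # e.g. "Q42" (added for illustration)
        "rvlimit": 1,                # only one revision ...
        "rvdir": "newer",            # ... starting from the oldest
        "rvprop": "user|timestamp",  # added: which fields to return
        "format": "json",            # added: response format
    }
    return API_ENDPOINT + "?" + urlencode(params)


def creator_from_response(data):
    """Extract the user of the first revision from a decoded JSON response."""
    pages = data["query"]["pages"]
    page = next(iter(pages.values()))
    return page["revisions"][0]["user"]
```

The user of the oldest revision is taken here as the item creator; redirects and deleted revisions are not handled.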

Population: P1082 or P4179?

For many items about geographic entities, we have the number of inhabitants in population (P1082). This property requires a qualifier identifying the date this number was measured, point in time (P585), and the documentation indicates that the most recent total value should be marked as preferred. Many places (etc.) have a number of counts of inhabitants in P1082, each with a time qualifier and preferably with a source.

Last year I worked on the population for Slovenian communities, adding the number of inhabitants for 2020 and also female population (P1539) and male population (P1540). The numbers are used on the Dutch Wikipedia, for displaying the number of inhabitants in the infobox and (sometimes) in the text. It is easy to implement this on other Wikipedias as well, maybe starting with the Slovenian. Newer numbers can be found at [2] and I would like to add these to Wikidata, allowing our readers to get more recent information.

After having imported the Slovenian numbers, I started adding the number of inhabitants for the French communities, from 1968 till 2016. The script I use clearly identifies when a number already exists and does not override the existing value. It also signals problems with communities that changed their name in the course of time without adding numbers. I do solve these problems by hand. The script has run till the INSEE number 57645, but manual solving of issues stopped at 16062.

I realised that it would be much better to run this script using a bot account than my personal account, so I asked for permission to run RonniePopBot. During this request, Multichill posed the question: "Adding historic population numbers to items will make these items a lot larger. I'm not sure what the current consensus is. Store the current number in population (P1082) and the full set of historic numbers in tabular population (P4179)?" This somehow stalled the request, and the processing of this data was stopped.

I would like to update the 8000 remaining values for France as soon as possible. P1082 is the fastest solution for me, as that script is available and I just need to sort out the rejected values. I am confident that I can make a switch to P4179, storing all values found in Wikidata in a text file, uploading this to Wiki Commons, linking that file to Wikidata using P4179 and then removing all but the most recent values from P1082. I can do the same for Slovenian, whilst adding the even more recent numbers.

But what is the view of the community on this? P1082 is easy to find and understand for all contributors, P4179 is far less known and therefore far less used. A huge number of geographical entities have multiple values in P1082, and I have seen many contributors adding values over there. Adding a new number requires inserting a new number with source and qualifiers, labelling this one as preferred and removing the other preferred label. Using P4179 requires the update of a Commons file, inserting a new number with source and qualifiers, labelling this one as preferred and the complete removal of the prior value.

I would like the community to take a clear decision on using P1082 and P4179, so my bot can start processing the remaining values and (if wanted) solve the current situation. Thanks, RonnieV (talk) 17:35, 6 February 2022 (UTC)

I think any sort of continuous function (e.g. population, subscribers, birth rate, etc.) we're sampling over time is better as tabular data and long term we should aim to migrate to this. Perhaps the latest value could be kept accessible for easy downstream use, but that does introduce data duplication which is then liable to discrepancy.
I'm a little unsure what you mean by updating the tabular data requires removal of prior value, surely only the file needs to have the new data inserted appropriately? SilentSpike (talk) 18:28, 6 February 2022 (UTC)
If we put all values only in a file, linked in P4179, we just need to update that file to have all data available.
The current practice is that the most recent value is in P1082 and marked as 'important'. Removing all values from P1082 might give problems with many scripts on many Wikipedias (and maybe for other users). Having just the current value in P1082 does mean we have to store data twice and remove it from P1082. Or can we safely empty P1082, and does it fall back to the most recent value in the tabular data behind P4179? Thanks, RonnieV (talk) 19:18, 6 February 2022 (UTC)
See my comment at Wikidata:Property_proposal/Historical_Population. --- Jura 19:00, 6 February 2022 (UTC)
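
To make the P1082 bookkeeping described in this thread concrete, here is a rough Python sketch. It assumes a simplified in-memory representation (a list of dicts with `value`, `point_in_time` and `rank` keys) rather than the real Wikibase data model, and the function name `add_population` is hypothetical. It adds a new count and keeps the preferred rank on the statement with the most recent date only.

```python
def add_population(statements, value, point_in_time):
    """Return a new statement list with the added count and updated ranks.

    `point_in_time` is an ISO date string such as "2020-01-01", so plain
    string comparison orders the dates correctly. Deprecated statements
    are left untouched.
    """
    updated = [dict(s) for s in statements]  # copy, do not mutate the input
    updated.append({"value": value, "point_in_time": point_in_time, "rank": "normal"})
    candidates = [s for s in updated if s["rank"] != "deprecated"]
    latest = max(candidates, key=lambda s: s["point_in_time"])
    for s in candidates:
        s["rank"] = "preferred" if s is latest else "normal"
    return updated
```

With P4179, the same new figure would additionally (or instead) have to be written into the tabular data file on Commons, which this sketch does not cover.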

population of a country

What would be a good way to model this? It seems to me that Q42884 (and similar items) somehow keep getting edited to include various aspects, which may or may not include the population of a country.

Accordingly, wouldn't it be better to have a distinct item that refers to the actual population of a country (or region)? Sample Q110834829.

Various values that are, were or could be used for instance of (P31) (with current English labels and descriptions from the items):

  • nationality (Q231002) - a legal identification of a person in international law, establishing the person as a subject, a national, of a sovereign state
  • human population (Q33829) - human that live in the same locality
  • ethnic group (Q41710) - socially defined category of people who identify with each other
  • national demonym (Q81058955) - demonym for citizens or residents of a country
  • people (Q2472587) - plurality of persons considered as a whole, from a government perspective

Also for subclass of (P279) (also with current English labels and descriptions):

  • person (Q215627) - being that has certain capacities or attributes constituting personhood (avoid use with P31; use Q5 for humans)
  • inhabitant (Q22947) - person who lives in a certain place
  • human (Q5) - common name of Homo sapiens, unique extant species of the genus Homo

Maybe I missed some. To simplify the discussion, I skipped the values that refer to other places, e.g. Europe.

How about the following (for P31/P279):

  • For the item Q42884 (for a group of people):
✓ OK ethnic group (Q41710) main P31 of item
 Not OK human population (Q33829) covered by different item
 Not OK nationality (Q231002) as it refers to a specific country which may or may not be held to people of the group
 Not OK national demonym (Q81058955) as the item isn't about the name or the word
 Question people (Q2472587) possibly, as it could be an alternative qualification for any group
The following are for P279:
 Not OK person (Q215627) the concept is meant to refer to humans only
 Not OK inhabitant (Q22947) as that is a distinct concept
✓ OK human (Q5), but possibly redundant with some of the others
 Not OK parent classes based on the location of the place
 Not OK un-referenced parent classes based on some view of the group being part of some other group
✓ OK referenced parent classes based on some view of the group being part of some other group
✓ OK human population (Q33829): main P31 of item
 Not OK ethnic group (Q41710): covered by different item
 Not OK nationality (Q231002): status of some of the residents only
 Not OK national demonym (Q81058955) as the item isn't about the name or the word
 Question people (Q2472587) possibly, as it could be an alternative qualification for any group
The following are for P279:
 Not OK person (Q215627) the concept is meant to refer to humans only
✓ OK inhabitant (Q22947)
 Not OK human (Q5), redundant with "inhabitant"
✓ OK parent classes based on the location of the country/place
✓ OK national demonym (Q81058955)

Alternatively, we could just merge some or all of these concepts.

@DayakSibiriak, Infovarius, UWashPrincipalCataloger:. --- Jura 10:59, 7 February 2022 (UTC)

Ethnic or hereditary groups are quite different from inhabitants of a country, even if they can largely overlap, so distinct items makes sense to me. ArthurPSmith (talk) 19:11, 7 February 2022 (UTC)
A definition for the item Germans Q42884 is confused, mixes an Ethnic group with Multiethnic citizens/residents of Germany. For the Germans as ethnic group:
  • ethnic group (Q41710) as instance of
  • people (Q2472587) as instance of
  • inhabitant (Q22947) as subclass of (Northwestern Europe, or Indian subcontinent etc, not for any ethnicity, some with no main land)
  • indigenous people [of] (Q103817) as subclass of (for any ethnic group). DayakSibiriak (talk) 10:17, 8 February 2022 (UTC)
    Interesting points.
    @DayakSibiriak: Q42884 is indeed confusing, but the question isn't limited to that item. Compare, e.g., with Q121842.
    For the "group of people":
    • As there is a dedicated property "indigenous to" (P2341), I'd avoid using "indigenous people" (Q103817) as P31 value.
    • About "inhabitant" (Q22947): wouldn't we end up listing every country? I don't think that works well in P31. If it should be included, the link from the place to the items would probably be preferable. --- Jura 09:55, 9 February 2022 (UTC)
      Well if only an ethnic ethnicity and data Different from the "People of Germany" item. Agree, the property Inhabitant is not so good, if only nor is limited as Mainly inhabitant of Northwestern Europe. If each country, better without. DayakSibiriak (talk) 10:19, 9 February 2022 (UTC)

Dead external links: normal or deprecated rank?

Assume an item has a statement official website (P856): https://example.com. Then assume that the website has changed its domain name (and the old URL is dead), so we add the correct statement official website (P856): https://example.net. What to do with the statement containing the old, now-dead URL?

Help:Ranking says that historical data should have Normal rank and the up-to-date data should have the Preferred rank (and I have seen people following this protocol). In that case, one could set e.g.

Normal rank: ⟨ subject ⟩ official website (P856) ⟨ https://example.com ⟩
end time (P582) ⟨ 2022-01 ⟩

On the other hand, one could set the Deprecated rank for old URLs and Normal for the up-to-date one (and I have seen people following this protocol). Indeed, link rot (Q1193907) is an instance of Wikibase reason for deprecated rank (Q27949697) (although not a member of list of Wikidata reasons for deprecation (Q52105174)), so one could set

Deprecated rank: ⟨ subject ⟩ official website (P856) ⟨ https://example.com ⟩
reason for deprecated rank (P2241) ⟨ link rot (Q1193907) ⟩

Which option is correct? Should link rot (Q1193907) be added to list of Wikidata reasons for deprecation (Q52105174) or stop being an instance of Wikibase reason for deprecated rank (Q27949697)? Also: should I post a link to this question on any talk pages? (which ones?) --colt_browning (talk) 11:21, 8 February 2022 (UTC)

@Colt browning: The first approach - preferred rank for the current value - is the correct model here now. Entries should only be deprecated if they were never correct; a website URL that was the correct official website for that entity in prior years should remain normal rank. ArthurPSmith (talk) 19:25, 8 February 2022 (UTC)
Pending something like phab:T210961, preferred is indeed the correct model. Jean-Fred (talk) 06:42, 9 February 2022 (UTC)
Wasn't there also a proposal to make URL datatype "smarter" about it? Maybe time to bring it up on Wikidata:Report_a_technical_problem once more. --- Jura 10:57, 9 February 2022 (UTC)
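
The practical effect of the preferred/normal model recommended above can be illustrated with a short Python sketch of "best rank" selection: preferred statements win if there are any, otherwise normal ones are used, and deprecated ones are never returned. The list-of-dicts representation and the function name `best_rank_values` are simplifications made up for this example, not the Wikibase data model.

```python
def best_rank_values(statements):
    """Return the values a data consumer would normally see for one property."""
    preferred = [s["value"] for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    # No preferred statement: fall back to every normal-rank statement.
    return [s["value"] for s in statements if s["rank"] == "normal"]


# The example from the question, modelled the first way:
official_website = [
    {"value": "https://example.com", "rank": "normal", "end_time": "2022-01"},
    {"value": "https://example.net", "rank": "preferred"},
]
current = best_rank_values(official_website)
```

Under this model the old URL stays queryable as historical data, while consumers that only ask for the best-ranked value receive the current one.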

Adding language in item

Sorry for my bad English...

I am a registered user (I am editing from an IP now because I can only open the Project chat page from an IP), and I can’t add more than 3 languages on an item. Is there some option to activate on my account that removes this limitation? --151.49.120.97 19:10, 8 February 2022 (UTC)

@151.49.120.97 There is no such limitation, and never has been (according to my knowledge).
What exactly are you trying to do and what error are you seeing? Michgrig (talk) 06:33, 9 February 2022 (UTC)
@Michgrig you are right, my mistake... --2001:B07:6442:8903:207B:3278:FB13:1E2C 14:58, 9 February 2022 (UTC)

Leadership Development Task Force: Your feedback is appreciated

You can find this message translated into additional languages on Meta-wiki.

The Community Development team at the Wikimedia Foundation is supporting the creation of a global, community-driven Leadership Development Task Force. The purpose of the task force is to advise leadership development work.

The team is looking for feedback about the responsibilities of the Leadership Development Task Force. This Meta page shares the proposal for a Leadership Development Task Force and how you can help. Feedback on the proposal will be collected from 7 to 25 February 2022. --YKo (WMF) (talk) 04:53, 9 February 2022 (UTC)

Alexandre Quinet (Q26713506)

The dates given for birth and death looked wrong (While 1850-1950 isn't impossible, those are very round numbers, and they didn't really seem to fit in with him making prominently displayed photos in 1873, even if that's, again, possible). I checked Gallica (1837-1900), but the funny thing is when I checked the source given for the 1850-1950 dates, it also said 1837-1900. Any idea what happened? Accidental import of approximate dates without the qualifiers? And is there any easy way to check for similar mistakes? Adam Cuerden (talk) 19:18, 7 February 2022 (UTC)

If it's the same as Wikidata:Bot_requests/Archive/2020/10#Cleanup_VIAF_dates, it was mostly fixed a while back. 1850 would be 19th century. --- Jura 20:03, 7 February 2022 (UTC)
That would make sense. Ah, well. It's fixed for him - even got a precise death date by checking the other record. I imagine the bot was thrown by the birth only being a year. Adam Cuerden (talk) 20:21, 7 February 2022 (UTC)
And Ayack then went in and... just absolutely did amazing work. Adam Cuerden (talk) 20:57, 7 February 2022 (UTC)

How to record movements of a person or a thing in wikidata?

I was looking for a way to describe item or person movements across their lives, many attributes describe the approximate location of a person e.g. residence, but in many cases for historical items or persons would be beneficial to be able to track their itinerary. Is there any generic statement for instance was_in that can be used for such a thing? This is an example of usage: was_in New York start time 2 January 1939 end time 28 November 1954 has cause emigration

Thanks for your help. Giacomo  – The preceding unsigned comment was added by MRCGCM53 (talk • contribs).

@MRCGCM53: residence (P551) with start time (P580) and end time (P582) qualifiers is usually suitable for this. Vojtěch Dostál (talk) 12:04, 9 February 2022 (UTC)
@MRCGCM53, Vojtěch Dostál: work location (P937) might be another option if you know they worked somewhere during a particular period but are not sure exactly where they lived (for residence (P551)). ArthurPSmith (talk) 18:13, 9 February 2022 (UTC)

Cleanup templates?

When coming across an item that is in need of general rethinking or refactoring, or that needs specific work affecting a number of claims about it, which the passing editor doesn't have time or domain knowledge to fix — I haven't found a way to flag more than a single claim as having potential issues.

Is there a precedent for flagging entire entities or properties for cleanup in a way that's visibly associated with those entries? (Visibly on the site for people browsing to it or its talk page; attached as a 'cleanup flag' property to the P/QID?) Sj (talk) 18:13, 9 February 2022 (UTC)

I often find or suspect problems with items that I lack the time or expertise to handle. Reporting it to (the appropriate) Project Chat is one approach, but it's heavy weight, and not always useful. I sometimes wish there was a quick way to add an item to a specialized queue, especially for languages. Bovlb (talk) 23:59, 9 February 2022 (UTC)
  • There is {{Edit request}} which you can leave on the item talk page if you want others to have a look. Not much used, though.
  • The approach in general does not appear to be viable for Wikidata. Problems are usually not tied to individual entities and need to be fixed at many places, usually using some form of automated editing.
  • "People browsing item (talk) pages" is a rather unusual behavior here. The web UI is not an interface for data users (the equivalent to readers at Wikipedia), thus pages are *really* rarely visited in the browser. One could, of course, collect the flags and maintain another set of worklists, but we do not really have a shortage of these.
MisterSynergy (talk) 00:24, 10 February 2022 (UTC)
A problem we have for some known problems is that people trying to clean them up get complaints for doing so. --- Jura 00:33, 10 February 2022 (UTC)

Historic logos: To keep or not to keep

Hello, I'm having a minor argument with a Wikidata contributor about whether to keep old logos. My proposal is to qualify old logos with start time (P580) and end time (P582) and rank the current logo as preferred. Nikon1803 claims that it would be required to replace old logos with the new one, thus only keeping the latest logo and removing all superseded ones. I would appreciate if a third party could give their opinion on this matter. Thanks. --Nw520 (talk) 00:50, 15 February 2022 (UTC)

@Nw520: Wikidata's general rule is that no data is deleted from data objects, even if they appear to be out of date. Therefore, in my opinion, your variant would be the correct variant here. --Gymnicus (talk) 07:27, 15 February 2022 (UTC)
Concur. @Nikon1803: it's well established that WD keeps old 'true' data, ideally qualifying it with start & end dates, and adds a preferred rank statement to convey the current 'truth'. A statement showing an old logo should be left in place, should not be removed. --Tagishsimon (talk) 08:27, 15 February 2022 (UTC)
Thank you very much, Gymnicus and Tagishsimon. It should now be sufficiently obvious what is common modeling practice on Wikidata. --Nw520 (talk) 11:44, 15 February 2022 (UTC)

Typical descriptions

There are typical descriptions for templates, disambiguations, lists and categories but they’re different between items because they were added by different bots. They probably should be the same, because if one description is better than another it should be used everywhere, and it’s really hard to harmonize descriptions before merging such items. 217.117.125.83 12:33, 8 February 2022 (UTC)

I don't see any practical value in harmonising descriptions and it won't be completely possible anyway given the need for a unique label-description for every item. Groups of the same item type can be found by queries of the statements, regardless of the labels or descriptions. From Hill To Shore (talk) 19:27, 8 February 2022 (UTC)
  • Maybe we could just display and output default descriptions, Wikibase being a database?
It will also save plenty of capacity on query server, e.g. ca. 137 million triples for descriptions of Wikimedia disambiguation page (Q4167410) instances, or ca. 500 to 700 million triples for all mentioned types. --- Jura 13:53, 9 February 2022 (UTC)
Added it at Wikidata:Report_a_technical_problem#descriptions_for_categories,_templates,_disambiguation. --- Jura 08:38, 10 February 2022 (UTC)

P171 how to describe an item or person in wikidata

A way to describe item or person movements across their lives, many attributes describe the approximate location of a person e.g. residence, but in many cases for historical items or persons would be beneficial to be able to track their itinerary. Is there any generic statement for instance was_in that can be used for such a thing? This is an example of usage: was_in New York start time 2 January 1939 end time 28 November 1954 has cause emigration

Thanks for your help.  – The preceding unsigned comment was added by 105.112.51.101 (talk • contribs) at 00:19, 10 February 2022 (UTC).

This question was already asked above and received an answer. If you need more information or are not happy with the answer, continue the previous discussion. Deleting the previous discussion is not acceptable.[3] From Hill To Shore (talk) 05:32, 10 February 2022 (UTC)

333K QuickStatements edits

I have prepared 332,886 UK heritage items to be updated using QuickStatements. I have a list of them here (columns are: Q ID, English label of instanceof, current description). Currently the descriptions are the location of a given object. My intention is to alter the descriptions such that instead of 'current description' we would have 'English label of instanceof in current description'. Example edit. I made sure data is 100% sanitized and the batch is now ready to be run.
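The transformation described above can be sketched in Python as follows. This is only a minimal illustration, not the batch that was actually run: the function name, the sample rows, and the rule for skipping descriptions that already start with the type are assumptions, and the output uses the tab-separated QuickStatements V1 form (item, `Den` for the English description, quoted text) as I understand it.

```python
def build_description_commands(rows, lang="en"):
    """Yield QuickStatements V1 lines that prepend '<type> in ' to a description.

    rows: iterable of (qid, type_label, current_description) tuples,
    mirroring the three columns of the list mentioned above.
    """
    for qid, type_label, description in rows:
        prefix = f"{type_label} in "
        # Assumption: leave alone descriptions that already start with the type.
        if description.startswith(prefix):
            continue
        new_description = prefix + description
        # V1 syntax: item <TAB> D<language code> <TAB> "quoted text"
        yield f'{qid}\tD{lang}\t"{new_description}"'


if __name__ == "__main__":
    # Hypothetical sample rows using the Wikidata sandbox items.
    sample = [
        ("Q4115189", "cottage", "Kings Langley, Hertfordshire"),
        ("Q15397819", "church", "church in Exampleton"),
    ]
    for command in build_description_commands(sample):
        print(command)
```

Descriptions that themselves contain double quotes would need extra handling before being fed to QuickStatements; the sketch does not attempt that.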

Given the magnitude, I have three questions:

  1. Do I need to get a formal approval from somewhere to run this batch? (Or is an informal discussion with the creator of items sufficient?)
  2. Is there a rule stating I have to use a bot account for batches exceeding X edits?
  3. What is the best place to ask about whether the edits I intend to do are beneficial, and the way I intend to do them is not disruptive?

Thank you. Gikü (talk) 12:48, 8 February 2022 (UTC)

I've done something similar for Scottish geolocatable items, via quickstatements, and probably touching tens of thousands of items. I didn't consider I needed permission before doing so. I don't think there's consensus on the answer to the question. It does seem to be the case that thousands of NHLE items have poor descriptions in the form location where it would be beneficial for them to state type of thing in (standardised form of(location)). fwiw, the minimal pattern I adopted for descriptions was along the lines of type in local authority area, Scotland, UK (e.g. "mountain in Argyll and Bute, Scotland, UK"). Current NHLE descriptions omit both England and UK, which is suboptimal for an international database. I don't think including the outbound postcode in the description (e.g. WD4 in Queen Anne Cottage And Elizabeth Cottage (Q26393732)) is useful. So, I'd be interested to see your planned descriptions, but broadly I support the idea that this should be done much as you're planning to do. (You also need to deal with items having a plurality of P31 values.) --Tagishsimon (talk) 14:08, 8 February 2022 (UTC)
I tend to think that formal bot approval is not required for one-shot imports or jobs. At the same time, I think an announcement at a dedicated project page is usually a good idea (here Wikidata:WikiProject UK and Ireland and/or Wikidata:WikiProject Built heritage). Vojtěch Dostál (talk) 14:19, 8 February 2022 (UTC)
This reminds me of lighthouses, some of which I fixed a while back [4], much like Tagishsimon suggested, except maybe for the UK part. You might want to do some filtering for descriptions that are already cleaned-up. --- Jura 14:38, 8 February 2022 (UTC)
Thank you @Vojtěch Dostál: I'll let members of both mentioned projects know.
Thanks @Tagishsimon: I am trying to be minimally invasive so for now I did not plan to alter anything in the location formatting – I only intend to prepend the '<type of thing> in'. But I believe members of the projects mentioned above by Vojtěch Dostál may have an informed idea about these postcodes and if they have a consensus on removing them I can use this QuickStatements batch to do that too. I will also suggest adding the country name, either after or in place of the postal code. Thank you in particular for bringing up the multiple P31 claims issue. My solution for this I have to confess is rather primitive: I just take one of them at random and use it – in my opinion we do not lose anything by omitting an additional P31 when describing an object, and it's still more informative than just having the location.
@Jura1: Thank you, I made sure I excluded from my batch items with already complete descriptions like Ulting Wick (Q26266381) or Boultibrooke House (Q29487765). Gikü (talk) 14:52, 8 February 2022 (UTC)
@Gikü As there isn't really a dedicated project for this, maybe the property talk page(s) could do as well.
BTW, unless there is some special use of postal codes in the UK (to disambiguate two places with the same name or to identify the location more clearly), I would remove them from the descriptions. --- Jura 09:45, 9 February 2022 (UTC)
@Jura1: It's not unknown to use the first half of a UK post code as a disambiguator, similar to, but not nearly as common as the US practice of appending two-letter US state abbreviations. Bovlb (talk) 23:55, 9 February 2022 (UTC)
It's more common in London, I've found, but yes it does happen from time to time. Theknightwho (talk) 17:22, 10 February 2022 (UTC)

New development roadmap for Wikidata and Wikibase for Q1 2022

Hi everyone,

I wanted to let you know that we just published the plan of the Wikidata development team for Wikidata and Wikibase for the first quarter of 2022!

Wikidata:Development plan

Here are some highlights for 2022:

In 2022 we will continue developments to help editors increase the quality of Wikidata’s existing data and contribute new high-quality data. Among the initiatives is to build up feedback loops with data re-users to get them more actively involved in improving the data on Wikidata.

We will be improving Special:NewLexeme to make it easier for editors to create new Lexemes. More people need access to knowledge and technology presented in their own language and we believe that language data is a fundamental building block in reaching that goal.

More people should benefit from the data Wikidata provides. We will be releasing the new REST API to make it easier for programmers to access our data.

On the Wikibase side of things, we want to enable more projects with fewer resources to be able to independently onboard themselves into the Wikibase Ecosystem. We will be launching Wikibase.cloud, offering Wikibase as a Service. It will be based on the code used to run WBStack but will be managed and maintained by Wikimedia Deutschland.

We are also conducting market research to get a better understanding of how organizations that could provide valuable data for the ecosystem make decisions when choosing software.

In addition, we’ve also published a status update about what was achieved for each of the 2021 development goals.

Please note that the development plan only presents the main projects that the development team will work on during the first quarter of 2022. Development may continue for some of these projects beyond that period. Critical and ongoing tasks (e.g. maintenance of the software and fixing pressing bugs) are not mentioned, but will be included in the workflow over the year. At the beginning of each quarter the roadmap will be updated to include the development estimates for that quarter. We will be sending notifications on our usual communication channels upon each update.

If you have any questions or feedback, feel free to add them on this talk page: Wikidata talk:Development plan.

Cheers,

- Mohammed Sadat (WMDE) (talk) 09:48, 10 February 2022 (UTC)

Dear @Mohammed Sadat (WMDE), in September 2020, WMDE conducted a process review of the support provided by the development team to the WD community.
One of the "main fields of suggestions from the community" was to be "able to suggest and vote for the most important features and bugs to fix". An action was added in "Action plan for later in 2021": "Find better processes to include community requests in our (already very packed) roadmap"
So, according to this plan, could you please tell us how community requests were added to the 2022 roadmap? I don't remember having seen an announcement related to this topic.
Thanks. Ayack (talk) 15:21, 10 February 2022 (UTC)

Entries for interviews

Do we have a property for "interviewer" and "interviewee"? I see something specific to talk shows, "talk show guest"; do we have something more generic? --RAN (talk) 00:37, 10 February 2022 (UTC)

You could use participant in (P1344) qualified by subject has role (P2868) interviewer (Q46034607) for an item about the person. The inverse would be participant (P710) qualified by object of statement has role (P3831) interviewer (Q46034607) for an item about the interview. From Hill To Shore (talk) 07:00, 10 February 2022 (UTC)
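The two statement shapes suggested above can be sketched as QuickStatements V1 lines generated from Python. The property and item IDs are the ones quoted in the reply; the function name and the sandbox items in the usage example are hypothetical, and the qualifier layout (extra property/value pairs appended to the statement, tab-separated) is my understanding of the V1 syntax rather than something confirmed in this thread.

```python
INTERVIEWER = "Q46034607"  # interviewer, as cited in the reply above


def interviewer_statements(person_qid, interview_qid):
    """Return QuickStatements V1 lines for both directions of the relation."""
    return [
        # On the person's item: participant in (P1344) the interview,
        # qualified by subject has role (P2868) = interviewer.
        f"{person_qid}\tP1344\t{interview_qid}\tP2868\t{INTERVIEWER}",
        # On the interview's item: participant (P710) the person,
        # qualified by object of statement has role (P3831) = interviewer.
        f"{interview_qid}\tP710\t{person_qid}\tP3831\t{INTERVIEWER}",
    ]


if __name__ == "__main__":
    # Hypothetical usage with the two Wikidata sandbox items.
    for line in interviewer_statements("Q4115189", "Q15397819"):
        print(line)
```

An interviewee would presumably follow the same pattern with a different role item; no such item is named in the thread, so it is left out here.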

Asking for a change on a protected item

Could someone remove "Courbessac" as a French alias of Nîmes (Q42807)'s description? It is a distinct geographical entity, that has btw its own item... Thanks in advance, 92.184.105.99 10:48, 11 February 2022 (UTC)

✓ Done Ayack (talk) 11:16, 11 February 2022 (UTC)

Constraints

Is it okay to add to structure replaces (P1398) and structure replaced by (P167) the following constraints?

property constraint (P2302)conflicts-with constraint (Q21502838)property (P2306)replaces (P1365)constraint scope (P4680)constraint checked on main value (Q46466787)

property constraint (P2302)conflicts-with constraint (Q21502838)property (P2306)replaced by (P1366)constraint scope (P4680)constraint checked on main value (Q46466787)

Infrastruktur (talk) 15:42, 11 February 2022 (UTC)

Refactor WD:PFD

In my opinion, marking this page as a translation source page is clearly a bad idea, as a TA needs to "mark for translation" every time a new request is added. I suggest moving the content of this page to two subpages: Wikidata:Properties_for_deletion/Current requests and Wikidata:Properties_for_deletion/On hold, while replacing this title with the header of this page. This structure is similar to c:Commons:Undeletion_requests and c:Commons:Deletion_requests. Stang 22:35, 11 February 2022 (UTC)

I agree that this is not very practical. It should be enough to put "=On hold=" and "These discussions have been closed but are awaiting deletion." on a subpage and make it translatable there. --Ameisenigel (talk) 08:15, 12 February 2022 (UTC)

Tracking category for mobile-unfriendly pages and templates?

Could I create a category for pages and templates that are not mobile-friendly and need to be fixed at some point? For example, Wikidata:Property proposal and Help:Contents are not mobile friendly. Lectrician1 (talk) 16:32, 12 February 2022 (UTC)

Arcade video games and arcade games

Just noticed that Q192851 and Q15613992 are quite mixed up between languages. To give only one example, the first one is en:Arcade video game in English but fr:Jeu d'arcade (arcade game) in French, while the second one is en:Arcade game in English but fr:Jeu vidéo d'arcade (arcade video game) in French. It does not help that the English Wikidata label for the first one is “arcade game machine” (which could be mixed up with Q1349717, “video game arcade cabinet”) and not “arcade video game” like the Wikipedia article. Nclm (talk) 12:11, 7 February 2022 (UTC)

@Jean-Frédéric: ^^ Multichill (talk) 19:17, 7 February 2022 (UTC)
Q15613992 was originally [5] about the genre of video games like those used in video arcade machines. This is still reflected in P31 and the external identifiers, if not the sitelinks. I suggest creating new items to express “arcade game superclass-of (arcade video game, pinball machine game, electro-mechanical arcade game, ...)”. ⁓ Pelagicmessages ) 20:13, 13 February 2022 (UTC)
(Belated answer) Agree with @Pelagic: ; I have now created arcade game (Q113726751) for the « arcade game superclass-of (arcade video game, pinball machine game, electro-mechanical arcade game, ...) », and reverted arcade (Q15613992) back to being about the genre. Sitelinks still have to be sorted out. cc @Nclm:. Jean-Fred (talk) 12:04, 5 September 2022 (UTC)

Sandbox for testing a script?

Hi all,

We would like to write a script to upload the data of the Israeli Film Archive historical newsreel collection to Wikidata. Where can I find best practices related to this, and is there a sandbox area to test the script?

Cheers, Keren - WMIL (talk) 13:32, 13 February 2022 (UTC)

Also here on Wikidata itself, a series of sandbox items, e.g. Q4115189, Q15397819, etc.
If the output is of reasonable quality, you could create a limited number of new items directly. (Please review and fix them afterwards, or ask for feedback on Wikidata talk:WikiProject Movies).
For bot use, please see Wikidata:Bots. --- Jura 14:04, 13 February 2022 (UTC)

What about verbs?

I am a relative newcomer to Wikidata, though I see its immense potential. In looking to construct knowledge graphs and contribute to the project, I don't see a simple way to express verbs or actions in the property/predicate part of the triple, which is quite simple to do in a labeled property graph like Neo4j. How, for example, might one express "John loves Jane"? Or "Mark follows Mary"? Is there even a way to express a verb like "love" or "follow"? And what about the reciprocal "is loved by" and "is followed by"? Or is it better to say "<<:John :has-emotion :love>> :for :Mary" (a form of reification) and handle love as an item/value? What am I missing here?  – The preceding unsigned comment was added by Mloparco (talk • contribs) at 20:25, 11 February 2022 (UTC).

I tend to think of triples as subject-verb-object just as much as (or maybe even more than) subject.property=object. A lot of our relations/properties are of the has-a or has-quality type, but the verb “to have” is involved. For example “sunflower colour yellow” or “sunflower has colour yellow”? ⁓ Pelagicmessages ) 08:36, 12 February 2022 (UTC)
I am unclear on what Mloparco is trying to get Wikidata to do here. Is this a proposal to link verbs into the existing data or is it a proposal to take the data and add verbs afterwards? If it is the former, we run into the problem of being a multi-lingual project with different sentence structures. Language structures include subject-verb-object, subject-object-verb, verb-object-subject, verb-subject-object, object-verb-subject and object-subject-verb. Whatever mechanism is employed to add verbs for one structure can't be at the expense of other sentence structures. From Hill To Shore (talk) 11:54, 12 February 2022 (UTC)
@From Hill To Shore: I interpreted Mloparco's question to mean that Wikidata properties are not very “verbish”. Where in real life we might say that “Douglas Adams wrote Hitchhiker's Guide” (active, past-tense verb), in Wikidata (with English labels) it looks like “Douglas Adams [has] notable work Hitchhiker's Guide” (Douglas Adams (Q42)notable work (P800)The Hitchhiker's Guide to the Galaxy pentalogy (Q25169)). Of course “notable work” has a narrower scope than “author of”, but it's the only example I could think of right now.
Aside from whether P-items are adjectives or verbs, another aspect of their question that I haven't addressed is that we don't always have specific P-items, and so create constructions like “((Q-subject general-P-item Q-object) P-qualifier Q-specific-concept)”. Q-specific-concept would be a noun, hence the reification.
. ⁓ Pelagicmessages ) 21:01, 12 February 2022 (UTC)
Wikidata is not about storing English-language sentences but stores information in a different structure. Objects in the Q-namespace aren't words and thus can't be verbs. We do have verbs in the lexeme namespace. When it comes to relationships between people we represent them with spouse (P26) and unmarried partner (P451).
Learning Wikidata is about learning how we structure knowledge here. ChristianKl 11:15, 14 February 2022 (UTC)

Updates on the Universal Code of Conduct Enforcement Guidelines Review

You can find this message translated into additional languages on Meta-wiki.

Hello everyone,

The Wikimedia Foundation Board of Trustees released a statement on the ratification process for the Universal Code of Conduct (UCoC) Enforcement Guidelines.

The Universal Code of Conduct (UCoC) provides a baseline of acceptable behavior for the entire movement. The UCoC and the Enforcement Guidelines were written by volunteer-staff drafting committees following community consultations.

The revised guidelines were published 24 January 2022 as a proposed way to apply the policy across the movement. There is a list of changes made to the guidelines after the enforcement draft guidelines review. Comments about the guidelines can be shared on the Enforcement Guidelines talk page on Meta-wiki.

To help to understand the guidelines and process, the Movement Strategy and Governance (MSG) team will be hosting Conversation Hours on 4 February 2022 at 15:00 UTC, 25 February 2022 at 12:00 UTC, and 4 March 2022 at 15:00 UTC. Join the conversation hours to speak with the UCoC project team and drafting committee members about the updated guidelines and voting process.

The timeline is available on Meta-wiki. The voting period is March 7 to 21. All eligible voters will have an opportunity to support or oppose the adoption of the Enforcement guidelines, and share why. See the voting information page for more details.

Many participants from across the movement have provided valuable input in these ongoing conversations. The UCoC and MSG teams want to thank the Drafting Committee and the community members for their contributions to this process.

Sincerely,

Movement Strategy and Governance
Wikimedia Foundation --YKo (WMF) (talk) 03:49, 14 February 2022 (UTC)

Q109116286

illness (Q109116286) can be merged with disease or illness, anyone speak Polish that can decide which one? The only link is to quotes for disease and illness. --RAN (talk) 03:57, 14 February 2022 (UTC)

Notified participants of WikiProject Poland – How would you decide here? --Gymnicus (talk) 07:51, 14 February 2022 (UTC)
Are you sure you meant Polish, not Czech? The only sitelink in this element is to Czech Wikiquote. Anyway, in Polish we have the same word for both disease and illness. As far as I can understand Czech, which I don't know, but Polish and Czech are both West Slavic languages, the quotes in the linked page use the word in a meaning which is closer to illness, rather than disease in English. Powerek38 (talk) 08:14, 14 February 2022 (UTC)
@Powerek38: Oh hubs. Sorry for the unnecessary ping. I should have looked into the data object myself. --Gymnicus (talk) 11:42, 14 February 2022 (UTC)
I'd say merge to disease (Q12136) but the sitelink is already there too. So maybe Wikimedia duplicated page (Q17362920). Vojtěch Dostál (talk) 10:05, 14 February 2022 (UTC)

Mineral water composition

Hi, I am working on the mineral water items. I want to include their mineralogical analysis, the typical list that usually appears on the bottle label. Which property would be better to use: contains (P4330), has part(s) of the class (P2670), or another more specific one, always with the qualifier has part(s) of the class (P2670)? This kind of content/structure information could also be used for the components of soil (Q36133) or alloy (Q37756), in these cases with a proportion (P1107) qualifier. Thanks, Amadalvarez (talk) 07:26, 14 February 2022 (UTC)

Maybe has listed ingredient (P4543)? --- Jura 09:16, 14 February 2022 (UTC)
Wonderful !. Sorry, I didn't know. Thanks, Amadalvarez (talk) 10:13, 14 February 2022 (UTC)

Wikidata weekly summary #507

Rank of only value in claim is preferred

Am I correct that it should be changed to normal? 217.117.125.83 12:10, 8 February 2022 (UTC)

Ultimately it does not matter if it's the only statement. Though it could cause confusion if another value is added without reconsidering the ranks. SilentSpike (talk) 17:03, 8 February 2022 (UTC)
A single statement with preferred rank may be an indication that another statement with a different rank has been deleted. It is worth checking the history to see if a statement needs to be restored. Is this about a particular item? From Hill To Shore (talk) 21:48, 8 February 2022 (UTC)
Originally, I found it in Kaluga Oblast (Q2842) and similar items, where it was caused by a bot error, but I extrapolated the problem to all such items. 217.117.125.83 15:47, 15 February 2022 (UTC)

I would like to know if there is a reason why there is no property containing the Wikipedia page of an item.

It looks like this information is only saved on the Wikipedia side: https://en.wikipedia.org/w/api.php?action=query&titles=Douglas_Adams&prop=pageprops&format=json

Once this property exists, it should be easy: 1. to submit its value for every item; 2. to create a bot which synchronizes the Wikidata and Wikipedia databases.

For a private project of mine I need this information for almost every item of Wikidata. So I'll start working on the first point: I'll create a raw csv file with both Wikipedia's id, and Wikidata's id.

Concerning the second point (on-going synchronization between the two databases), I don't know yet when I will work on it.  – The preceding unsigned comment was added by Aleph-g (talk • contribs) at 20:18, 14 February 2022‎ (UTC).

Sitelink data is stored on the WD side in three or so triples
  • ?article schema:about ?item .
  • ?article schema:isPartOf ?wmf_site .
  • ?article schema:name ?sitelink .
and so for this data, the predicates are schema:about, schema:isPartOf and schema:name, rather than being derived from a property; and they are predicates of the ?article, not of the ?item. (Obvs, you can use the inverse predicate ?item ^schema:about ?article . ) For Q42 see https://w.wiki/4qEc ; for documentation see https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#Sitelinks and, more generally, the whole of that page's description of the RDF representation of WD. --Tagishsimon (talk) 20:44, 14 February 2022 (UTC)
And, to follow on: this seems to obviate your proposed bot. Any WP article with an idea of its WD item, will be found as a sitelink on that item. --Tagishsimon (talk) 20:49, 14 February 2022 (UTC)
Thank you for your answer.
Is this information part of the JSON dump? If not, where can I download it, apart from the RDF dump? I'm already importing all the items of WD from the JSON dump, and an API request to access the information for each item is not possible in my case.
Another way to rephrase my question is: why is the WP page of a WD item not saved as a property, available in the JSON dump and part of the data model?
--Aleph-g (talk) 06:30, 15 February 2022 (UTC)
@Aleph-g: I'm not v.familiar with the JSON dump, but its documentation suggests sitelink data is included - https://doc.wikimedia.org/Wikibase/master/php/md_docs_topics_json.html#json_sitelinks and https://doc.wikimedia.org/Wikibase/master/php/md_docs_topics_json.html#json_example ... and that mirrors what I see when I look at the JSON of a single item - https://www.wikidata.org/wiki/Special:EntityData/Q110923225.json (sitelinks are at the foot of the page). As I noted above, sitelinks are part of the RDF Data Model. I don't know what the decision process was, but there is quite a lot in any item that is not represented by a predicate derived from a Pnnn property, such as label (rdfs:label), description (schema:description), alias (skos:altLabel), sitelinks (as discussed), rank, bestrank, badges, #statements, #sitelinks &c &c. All of this data is available in the JSON dump, by my reading of the documentation, and is equally available in the RDF. --Tagishsimon (talk) 12:29, 15 February 2022 (UTC)
I finally found what I was looking for. The Wikipedia page of an item is under the sitelinks key of the item, in an object pointed to by a key like enwiki for the English version, frwiki for the French one, and so on, in its title key.
What confused me previously is that I saw a lot of links for other wiki projects (wikivoyage, wikiquote, wikinews, etc.) and missed the simple enwiki, which stands for the main Wikipedia project, and that the title of the Wikipedia article is given, whereas I was looking for a numeric value.
The example in the documentation also differs from the JSON dump, in which entities do not have the following keys:
  • pageid
  • ns
  • title
  • modified
Now everything is clear to me. Thank you for your patience. Aleph-g (talk) 21:17, 15 February 2022 (UTC)
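The lookup described in this thread (sitelinks, then a site key such as enwiki, then title) can be sketched in Python. This is a minimal illustration only: the function names are hypothetical, and the line-handling assumes the JSON dump is one large array with a single entity per line followed by a trailing comma, which is how I understand the dump layout.

```python
import json


def parse_dump_line(line):
    """Parse one line of the JSON dump; return None for the array brackets."""
    line = line.strip().rstrip(",")
    if line in ("[", "]", ""):
        return None
    return json.loads(line)


def wikipedia_title(entity, site="enwiki"):
    """Return the article title for the given site key, or None if absent."""
    sitelink = entity.get("sitelinks", {}).get(site)
    return sitelink["title"] if sitelink else None


if __name__ == "__main__":
    # Hand-written sample line, trimmed to the keys used here.
    sample = (
        '{"id": "Q42", "sitelinks": {"enwiki": '
        '{"site": "enwiki", "title": "Douglas Adams", "badges": []}}},'
    )
    entity = parse_dump_line(sample)
    print(entity["id"], wikipedia_title(entity))
```

Note that this yields the article title rather than a numeric page ID, matching what was found above.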

Format the spreadsheet to import the data??

Hello,

New in the Wikidata import. I'm working on Wikidata:Dataset Imports/Swissvotes (I will need help!).

I don't fully understand the purpose of formatting the original spreadsheet. How will importing the data regularly (about once a week) work? Is there a way afterwards to specify a mapping from the original fields to the fields that Wikidata requires?

Thanks! Gagarine (talk) 23:02, 14 February 2022 (UTC)

@Gagarine: I'll try and help you with this — Martin (MSGJ · talk) 20:34, 15 February 2022 (UTC)

Problem description:

Facts:

  • English Wikipedia has the article «Reserve Officer Training in Russia». Russian Wikipedia has the article «Военная кафедра». Belarusian Wikipedia has the article «Ваенная кафедра». These articles are linked to each other in Wikidata (itemid Reserve Officer Training in Russia (Q17087022)).
  • Russian Wikipedia has the article «Военный учебный центр». This article also exists in Wikidata (itemid Military training center (Q4479347)), but it is not linked to any article in other Wikipedias.
  • On 1 January 2019, when the Russian Federal Law of 3 August 2018 №309-FZ entered into force, the military departments (военные кафедры) were abolished, and from now on, the reserve officer training in Russia is conducted at military training centers (военные учебные центры).

My actions:

Since military departments no longer exist, and reserve officer training in Russia is conducted at military training centers only, I thought it was right to replace the Russian article «Военная кафедра» with the article «Военный учебный центр» in itemid Q17087022, but this was not possible. Merging itemid Q17087022 and itemid Q4479347 was not possible either. So I placed a request for deletion of itemid Q4479347 (current status «on hold»).

Objections:

The user Ksc~ruwiki considers that the English article «Reserve Officer Training in Russia» describes the process of additional military education of students of Russian civilian institutions of higher education, while the Russian articles «Военная кафедра» and «Военный учебный центр» and the Belarusian article «Ваенная кафедра» describe the organizational systems of military educational units of Russian and Belarusian civilian institutions of higher education.

Possible solutions:

1. First way:

  • delete itemid Q4479347
  • replace Russian article «Военная кафедра» by the article «Военный учебный центр» in itemid Q17087022

2. Second way:

  • remove Russian article «Военная кафедра» from itemid Q17087022
  • remove Belarusian article «Ваенная кафедра» from itemid Q17087022
  • create the article «Подготовка офицеров запаса в России» in Russian Wikipedia
  • link English article «Reserve Officer Training in Russia» to Russian article «Подготовка офицеров запаса в России» in itemid Q17087022
  • create the article «Military departments of civilian universities (Soviet Union and post-Soviet area)» in English Wikipedia
  • link English article «Military departments of civilian universities (Soviet Union and post-Soviet area)» to Russian article «Военная кафедра» and Belarusian article «Ваенная кафедра» in new Wikidata itemid
  • create the article «Military training centers of civilian universities in Russia» in English Wikipedia
  • link English article «Military training centers of civilian universities in Russia» to Russian article «Военный учебный центр» in itemid Q4479347

K8M8S8 (talk) 10:18, 15 February 2022 (UTC)

Kenneth Arthur Chapman (footballer)

Born 25 April 1932 Coventry Died 22 May 2019 Daventry Northamptonshire  – The preceding unsigned comment was added by 45.238.254.240 (talk • contribs).

How do I get to display English, Portuguese and Brazilian as my main languages?

I know this isn't exactly about Wikidata, but it's hindering my contributions to the project. Basically, I am fluent in these languages. The problem is, when the description box about an item shows up, only three languages show up (I don't know if you can change this). As you know, Portuguese and Brazilian Portuguese are considered different here, but most of the time the text in one can be applied to the other. That said, I can't manage to see both languages at the same time. Is there any solution to this? Tet (talk) 01:23, 16 February 2022 (UTC)

Create a Babel box like on my page. AntisocialRyan (Talk) 03:34, 16 February 2022 (UTC)
AntisocialRyan oh yeah, I know how to use Babel to display languages in your own profile. What I mean is the description of the QID in each language. It looks like this. I want it to display the 3 languages mentioned above so I can edit in these languages. Tet (talk) 04:24, 16 February 2022 (UTC)
It helped me to see both variants of Portuguese. I don't quite see what problem would remain if you add both variants to the Babel box, like {{#babel:pt-br-N|pt-4|en-3|es-1}}. --Wolverène (talk) 04:57, 16 February 2022 (UTC)
Oh, so the description box for QIDs is based on your Babel information. TIL. Thanks for the help, I added the pt as one of the languages I know. Tet (talk) 07:15, 16 February 2022 (UTC)
Sorry, I should've been more clear, it's not exactly self-explanatory. AntisocialRyan (Talk) 16:10, 16 February 2022 (UTC)

How should Wordle variants and clones be treated?

There are some ~500 Wordle variants and clones right now, but only one of them has a specific QID: Parig (Q110754784), for Western Armenian users. This site has been collecting all of them and is CC0. I think it's possible to use OpenRefine with the database from the Gitlab repo, but I want to be sure these entries wouldn't be nominated for deletion first. Tet (talk) 01:29, 16 February 2022 (UTC)

I think if they have press coverage they would be fine, that's just my opinion though, they'd have to be notable of course. Popular clones like Dordle, Lewdle, Squabble, Worldle, etc. sound notable to me. AntisocialRyan (Talk) 03:37, 16 February 2022 (UTC)

Values for Wikimedia Deutschland

Wikimedia Deutschland is engaging in a dialogue of values and inviting the community to contribute its perspective

Where do you want to go?

In the fall and winter of 2021 and in the spring of 2022, Q8288 (WMDE) will deal with the values of the association. In a process that will run until March 2022, a framework of values is to be developed that sets out the central guiding principles for WMDE as an organization.

In this Dialogue on organizational values by WMDE, a proposal has been developed over many discussions. The Wikidata community, as a central group in lively exchange with WMDE, is asked for its perspective on the proposal talk page.

The texts describing the values should be generally valid and abstract. Different stakeholders such as members, communities, or the employees should be able to identify with them. At the same time, however, the values will only be binding for the work of Wikimedia Deutschland, i.e. the office and the board.

This value framework should reflect the identity of Wikimedia Deutschland. It should be a support for the board and the employees of WMDE in strategic decisions or difficult questions and help to communicate well what is important for WMDE.

In doing so, the proposal was not started from scratch, but built on many materials that are already there. In addition, important foundations have been laid in workshops and processes over the past few years. Now it is up to the community to decide whether the proposed values are a good fit for WMDE as an organization and how they could be implemented in practice. The WMDE Board will then decide on the final values framework in April 2022 and present it to the WMDE General Assembly in May.

The team is happy about every participation, also via mail, if you don't like to express your thoughts online.

Thank you for reading and best regards, Christoph Jackel (WMDE) (talk) 16:45, 16 February 2022 (UTC)

Item edit request

On Voeren (Q460479) please delete Voerenvlag.gif and add Voeren vlag.svg in flag image (P41). Many thanks in advance!!! --93.34.231.169 09:25, 17 February 2022 (UTC)

✓ Done — Martin (MSGJ · talk) 11:46, 17 February 2022 (UTC)

merging 2 items with 5,154 authors

Hi,

A 2015 paper has 5,154 authors (a record according to Physics paper sets record with more than 5,000 authors (Q59090049)). Somehow it has 2 items at the moment - Combined Measurement of the Higgs Boson Mass in p p Collisions at √s=7 and 8 TeV with the ATLAS and CMS Experiments (Q106988069) and Combined Measurement of the Higgs Boson Mass in p p Collisions at √s=7 and 8 TeV with the ATLAS and CMS Experiments (Q21558717). Is there a way to safely merge them? The potential for creating a mess is just huge. DGtal (talk) 11:08, 17 February 2022 (UTC)

I would just merge them in the usual way. If something goes wrong you can always rollback to a previous version. I notice the series ordinal (P1545) don't match on the two items, so there would be a huge cleanup task on those properties. — Martin (MSGJ · talk) 11:29, 17 February 2022 (UTC)
That's a big part of the problem and I don't think anyone has the spare time to check 5000 authors. DGtal (talk) 11:45, 17 February 2022 (UTC)
The newer item is unused, and has little edit history. Perhaps it is not needed? — Martin (MSGJ · talk) 11:47, 17 February 2022 (UTC)
I just compared the items and you are correct. The only thing better in the new item is that the title is more precise ( √s=7 instead of s = 7). I would either just delete the newer item or turn it into a redirect page without actually merging anything. DGtal (talk) 12:14, 17 February 2022 (UTC)
✓ Done — Martin (MSGJ · talk) 15:24, 17 February 2022 (UTC)
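The series ordinal (P1545) mismatch mentioned above could be checked mechanically before a merge instead of by eye. The following is only a minimal sketch: `ordinal_mismatches` is a hypothetical helper, the input shape (ordinal mapped to author label) is an assumption, and the sample data is made up rather than taken from either item.

```python
# Hypothetical helper: compare author ordinals of two items before a merge.
# Each item is given as {series ordinal (P1545): author label}; a real run
# would first have to extract these pairs from the items' author statements.

def ordinal_mismatches(authors_a, authors_b):
    """Return the sorted ordinals whose author differs or is missing on one side."""
    ordinals = set(authors_a) | set(authors_b)
    return sorted(o for o in ordinals if authors_a.get(o) != authors_b.get(o))


# Made-up sample data, not taken from Q21558717 or Q106988069.
older = {1: "Author A", 2: "Author B", 3: "Author C"}
newer = {1: "Author A", 2: "Author C", 3: "Author B", 4: "Author D"}

print(ordinal_mismatches(older, newer))
```

An empty result would suggest the ordinals agree; a long one indicates the kind of cleanup task described above.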

Rollout of the new audio and video player

Please help translate to your language

Hello,

Over the next months we will gradually change the audio and video player of Wikis from Kaltura to Video.js and with that, the old player won’t be accessible anymore. The new player has been active as a beta feature since May 2017.

The new player has many advantages, including better design, consistent look with the rest of our interface, better compatibility with browsers, ability to work on mobile which means our multimedia will be properly accessible on iPhone, better accessibility and many more.

The old player has been unmaintained for eight years now, is home-brewed (unlike the new player, which is a widely used open source project) and uses deprecated and abandoned frameworks such as jQuery UI. Removing the old player’s code also improves performance of the Wikis for anyone visiting any page, by significantly reducing the complexity of the dependency graph of our ResourceLoader modules (see this blog post). The old player has many open bugs that we will be able to close as resolved after this migration.

The new player will solve a lot of old and outstanding issues but also it will have its own bugs. All important ones have been fixed but there will be some small ones to tackle in the future and after the rollout.

What we are asking now is to turn on the beta feature for the new player and let us know about any issues.

You can track the work in T100106

Thank you, Amir 17:59, 17 February 2022 (UTC)

references of identifiers

I could go on and on about the subject, but I don't think that's going to help that much. So I'll just leave it at this simple question: Do you think identifiers need references? I would appreciate regular participation in the discussion. --Gymnicus (talk) 09:12, 1 February 2022 (UTC)

I am in favor of using references at least in cases when the connection was established by someone else (external identifier hub). For example, I received some links between NL CR AUT ID (P691) and abART person ID (P6844) from The Fine Art Archive (Q107456632). Vojtěch Dostál (talk) 10:41, 1 February 2022 (UTC)
I think references for identifiers are good when a semiautomatic import, of whichever size, extracts IDs from a source and imports them to Wikidata; the source may be the identifier itself containing the link to a Wikidata item. When a user manually adds an external identifier to a Wikidata item, it is useless that he adds some sort of reference in my opinion. References containing only retrieved (P813), which I have seen sometimes, are substantially useless in my opinion. --Epìdosis 10:46, 1 February 2022 (UTC)
@Epìdosis: „When a user manually adds an external identifier to a Wikidata item, it is useless that he adds some sort of reference in my opinion.“ – So by this manual addition of an identifier Special:Diff/1515983069, the reference would be useless, do I understand that correctly? – Basically, I agree with you, but there are also identifiers where I still see such editing as useful. These identifiers include, for example, Dewey Decimal Classification (P1036) (Special:Diff/1512659095) and EU VAT number (P3608) (Special:Diff/1488576109). --Gymnicus (talk) 11:07, 1 February 2022 (UTC)
Yes; but I agree with the cases you cite, there the references seem effectively useful. --Epìdosis 12:19, 1 February 2022 (UTC)
@Vojtěch Dostál: Do you have an explicit example? I think I kind of know what you mean by that. But an example would perhaps underline it again. --Gymnicus (talk) 10:51, 1 February 2022 (UTC)
@Gymnicus Here, switch to format:MARC, you can see various identifiers in fields 0247 hard-coded by librarians when creating this entry. Vojtěch Dostál (talk) 12:32, 1 February 2022 (UTC)
Well, then I misunderstood you and I find such a reference nonsensical. If the identifier itself “links” to the data object, I don't have to confirm this here in the Wikidata object, because an identifier should stand for itself. Also, the property NL CR AUT ID (P691) is an authority file, and according to the page Help:Sources/Items not needing sources an authority file does not need a reference. --Gymnicus (talk) 08:20, 4 February 2022 (UTC)
  • Yes, references for identifiers are really useful, it's very valuable to know whether they come from the source directly, via a third party, or are added by a bot or script based on some heuristics (very error prone, always need to be double checked). I also find it helpful to see when they were added. Referenced identifiers are great. Moebeus (talk) 13:21, 1 February 2022 (UTC)
    @Moebeus: Based on your comment, I have several questions:
    1. How do you know from the references alone that it was added by a bot or a manual script?
    2. What do you mean by adding directly from a source or through a third party?
    3. What does the information tell you when the "reference" was added? On this question, I'm referring to the following edit you made: Special:Diff/1515983069
    I'd appreciate answers to these questions so I can better understand your comment. --Gymnicus (talk) 15:24, 6 February 2022 (UTC)
    1. In my corner of WD we're lucky enough that the main bot (Soweego) identifies itself when adding references (by using certain specific reference properties that manual editors typically don't use) .
    2. Directly: This identifier was found looking at the source (e.g. Spotify). Indirectly: Some other site that is not Spotify says the Spotify identifier is....
    3. The retrieved date? On October 23. 2021 this was the Spotify Id for artist N.R.F.B according to Spotify itself.

Moebeus (talk) 03:18, 7 February 2022 (UTC)

  • Need? No. Sometimes helpful? Yes. Sometimes there's no sensible reference to use. But if you are importing a ton of identifiers then you probably should be adding references. BrokenSegue (talk) 00:20, 2 February 2022 (UTC)
    @BrokenSegue: So you are also of the opinion that in principle no references are needed for identifiers. But under certain circumstances, which you also mention in your comment, you still find them useful. Now the follow-up question arises for me: Do you think that the special circumstances when references for identifiers are useful should be specified or not? --Gymnicus (talk) 16:05, 6 February 2022 (UTC)
    @Gymnicus: Should be specified in like help pages? I mean sure. Feel free to add this to one of them. BrokenSegue (talk) 16:12, 6 February 2022 (UTC)
    @BrokenSegue: No, not through a help page. With Help:Sources and Help:Sources/Items not needing sources there are already two help pages and the second page in particular is of no interest to anyone here. In relation to this I would like to quote MisterSynergy: “Mind that this is a help page, not a policy.” – What I mean to say is that what is written on a help page is not mandatory. You can also see that from the fact that they don't or don't have to stick to the things written there by "trusted users of the community". So the question is: What is the use of a help page? From my point of view it is useless. What is needed is a policy. In other words, the redirect Wikidata:References needs to become a proper Wikidata page where the rules are set. Otherwise everyone can continue as before. Except, of course, for those who oppose such useless references as Special:Diff/1515983069, Special:Diff/269168911 and Special:Diff/42873943. Of course they are criminalized. --Gymnicus (talk) 16:58, 6 February 2022 (UTC)
    @Gymnicus: I would be fine codifying my views as policy assuming consensus could be reached. Deleting references on identifiers is bad even if the references do seem a bit redundant in some cases. BrokenSegue (talk) 17:02, 6 February 2022 (UTC)
  • I realized I miss more fine-grained items (maybe also properties) to precisely state the source of the identifier. For example, I have done reconciliations based on birth year + death year + name. Maybe it would be useful to have means to indicate this in references, probably in conjunction with based on heuristic (P887). Vojtěch Dostál (talk) 09:03, 2 February 2022 (UTC)
  • "Identifiers need references?" No. But also: "Identifiers must not have references?" Definitely also no. In particular on identifier claims, references are a useful and valuable tool to aid the editorial process of identifier curation. Automated processes can have flaws (in Wikidata and in external databases), and manual processes tend to be erratic sometimes, so it can be useful to annotate how the identifiers have been assigned. Data users can of course opt to use identifiers without the references attached to it. —MisterSynergy (talk) 08:39, 4 February 2022 (UTC)
  • So often the reference would be the URL that the identifier links to. In that case I don't think it's useful doubling up. If somewhere else also supports the mapping between the item and the external ID, then definitely that should be added. ⁓ Pelagicmessages ) 10:51, 4 February 2022 (UTC)
  • As a concrete example, MusicBrainz often gives additional identifiers (e.g. Twitter, Facebook). If we're copying identifiers from there, it's useful to record that fact because: Those identifiers should be reevaluated if we decide that the MB identifier is wrong; and it allows us to reconsider the reliability of MB in the future. Bovlb (talk) 21:42, 5 February 2022 (UTC)
    @Bovlb: If I shorten your statement a bit and formulate it provocatively, I could say: The references state that the information in the data object could be incorrect because we do not know whether the specified identifier MusicBrainz artist ID (P434) is correct. – So it would have to be our job to confirm this statement through a manual check. However, this manual check would then also mean deleting the previous reference, because otherwise it would still mean that, for example, the identifier Facebook username (P2013) depending on the identifier MusicBrainz artist ID (P434) could still be wrong. However, the manual check removes this doubt, which is why the reference must also be removed. Of course, the qualifier point in time (P585)(date of verification) should be added during the manual check. --Gymnicus (talk) 17:42, 6 February 2022 (UTC)
    @gymnicus: We're not really in the business of verification here, but instead of representing what appropriate sources say. If I have one identifier, then an item is identified; if I have two identifiers and someone claims they refer to different people, then I have a potential conflation. (My father always told me: Never set to sea with two chronometers; take either one or three.) If I can trace the second identifier to a specific source equating the two, then I have more information to resolve the problem. Bovlb (talk) 21:03, 6 February 2022 (UTC)
    @Bovlb: Of course, verification is a task for us Wikidata users. As an administrator, you deal with it four more times than normal users. For the deletion requests, you must verify the sources and you must also verify that the source is credible. That's why I find it a bit strange that you now say that this is not an issue here. The fact that verification of data, which doesn't just have to be about identifiers, is an issue here can be seen from the fact that we have the option to assign statements to a preferred, normal or disapproved rank. In order to be able to do this, we have to verify and evaluate the data. --Gymnicus (talk) 14:49, 13 February 2022 (UTC)
    @Bovlb please my contact me $1 112.204.172.159 23:31, 17 February 2022 (UTC)
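The distinction drawn earlier in this thread (identifier taken directly from the source, copied via a third party, matched by a heuristic, or carrying only a retrieval date) can be sketched as a small classifier. This is an illustration only: the flat `{property: value}` shape is a simplification of real reference data, `Q_SOURCE` and `Q_OTHER` are placeholder IDs, and the rule order is an assumption rather than any agreed policy. P248 is stated in; P887 (based on heuristic) and P813 (retrieved) are the properties named above.

```python
def describe_reference(ref_props, identifier_source):
    """Classify one reference on an external-identifier statement.

    ref_props: simplified {property id: value} mapping for the reference.
    identifier_source: item ID of the database that issues the identifier.
    """
    if "P887" in ref_props:                      # based on heuristic
        return "heuristic"
    if ref_props.get("P248") == identifier_source:
        return "direct"                          # stated in the source itself
    if "P248" in ref_props:
        return "third party"                     # stated in some other database
    if set(ref_props) == {"P813"}:
        return "date only"                       # only a retrieval date
    return "unspecified"


# Placeholder IDs and values, purely illustrative.
print(describe_reference({"P248": "Q_SOURCE", "P813": "2021-10-23"}, "Q_SOURCE"))
print(describe_reference({"P248": "Q_OTHER"}, "Q_SOURCE"))
print(describe_reference({"P813": "2021-10-23"}, "Q_SOURCE"))
```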

Relation for (school or else) merges and resplits which is not immediate replace

SchoolA, SchoolB and SchoolC merged into CollegeD, then merged into UniversityE, which has FacultyA, FacultyB and FacultyC split from the merged CollegeD.

SchoolA --replaced by--> CollegeD --replaced by--> UniversityE, that's okay. UniversityE --has part-->FacultyA, that's also fine.

However, there is a straight relation between SchoolA and FacultyA, and my question is what it should be? Follows (as I have picked now)? Anything else? (This should also be written somehow into follows and replaces (and reverse pairs) pages I think.) grin 12:20, 16 February 2022 (UTC)

I'm not aware of a property for this. However we do have merged into (P7888) which would be better for the SchoolA and CollegeD. There is also separated from (P807) if appropriate. — Martin (MSGJ · talk) 12:35, 16 February 2022 (UTC)
Merged into is only good for CollegeD -> UniE, because UniE existed before, but CollegeD was created to hold Schools A, B and C. Separated from would mean the original entity stays, but it's not the case (CollegeD was destroyed in the process). Still, this is covering rather the immediate relation (replaces/replaced), not the one throughout the replacement. (The connection is there because CollegeD was not integrated, so A, B and C were working separately both geographically and by processes, so "separating them" into faculties was rather obvious, since they always have been separate, only pushed and pulled into named entities on the whim of daily politics; the legal status and the actual organisation took different paths.) grin 16:42, 16 February 2022 (UTC)
Interestingly the reverse of merged into does not exist, the page references this which is not a property, just the label. grin 16:50, 16 February 2022 (UTC)
I don't think it is incorrect to say that the three schools were merged into a college. It does not mean the college had to exist previously. — Martin (MSGJ · talk) 21:57, 16 February 2022 (UTC)
No, it is the property which says that: "the subject was dissolved and merged into the object, that existed previously". grin 11:10, 18 February 2022 (UTC)
Ah, I see! Well this seems unnecessary. I have put a note about this on the talk page — Martin (MSGJ · talk) 12:40, 18 February 2022 (UTC)
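To make the modelling question in this thread concrete, the example can be written as triples and traversed. This is a toy sketch, not a proposal: the node names are the placeholders from the example, P1366 (replaced by) and P527 (has part) stand in for the statements described, and `find_path` is a hypothetical helper.

```python
from collections import deque

# Statements from the example, as (subject, property, object) triples.
# P1366 = replaced by, P527 = has part.
STATEMENTS = [
    ("SchoolA", "P1366", "CollegeD"),
    ("CollegeD", "P1366", "UniversityE"),
    ("UniversityE", "P527", "FacultyA"),
    ("UniversityE", "P527", "FacultyB"),
]


def find_path(start, goal, statements):
    """Breadth-first search for a chain of statements leading from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for s, p, o in statements:
            if s == node and o not in seen:
                seen.add(o)
                queue.append((o, path + [(s, p, o)]))
    return None


for step in find_path("SchoolA", "FacultyA", STATEMENTS):
    print(step)
```

The traversal reaches FacultyB from SchoolA just as easily as FacultyA, so the chain alone cannot say which faculty continues which school. That is the gap a direct statement between SchoolA and FacultyA would fill.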

Protection Lock

I want to add protection to my pages  – The preceding unsigned comment was added by Iliasalikpt (talk • contribs) at 08:04‎, 18 February 2022 (UTC).

Why? What do you mean by "my pages"? See also WD:Protection policy. --Matěj Suchánek (talk) 14:01, 18 February 2022 (UTC)
@Matěj Suchánek: Maybe he doesn't want his created data objects to be deleted. But most of them don't look notable, so I've already suggested deleting them. --Gymnicus (talk) 14:26, 18 February 2022 (UTC)

Wikispecies page moves, wikidata links, and redirects

Hello, (following the wikispecies discussion here), there is an issue with taxon page moves in wikispecies, following taxonomic updates, and wikidata links to the wikispecies pages in question. E.g. (1), wikispecies:Oceanodroma matsudairae (Q785281) was moved to wikispecies:Hydrobates matsudairae (Q28122588), and the wikidata link to wikispecies was moved from the former item to the latter, so the wikispecies pages and the wikidata pages for each name are aligned/wikispecies is linked to the "right" wikidata item, but the updated wikispecies page is now isolated since it is the only link to the wikidata item "Hydrobates matsudairae", while the wikidata item for the old name "Oceanodroma matsudairae" has links to pages in 27 different language wikipedias; assuming we want wikispecies to be linked in to wikipedia, unless we can somehow link redirects, this effectively discourages page moves (i.e., retention of superseded taxonomy), or encourages linking wikispecies to the "wrong" wikidata item. E.g. (2) wikispecies:Bullockornis planei (Q22111932) was moved to wikispecies:Dromornis planei (Q106704224), without removing the wikidata link prior to the page move - as a result of the "Automatic Update from Connected Wiki", the wikispecies page Dromornis planei is (or in fact was, as this has since been amended, following the related wikispecies discussion) linked to the wikidata item "Bullockornis planei", while the wikidata item "Dromornis planei" has no link to the updated wikispecies page.

To say wikispecies users should fix the links when they move pages is not a solution; there are users there who are unaware of or unwilling to engage with wikidata, so they will be moving pages and causing many issues like e.g. 2. (One more engaged user has reported they spent three hours fixing wikidata links for the move of three species; obviously not everyone is prepared to do that.) What is needed is the ability to link a wikispecies redirect page for the legacy taxon name combination to the corresponding wikidata item. There is a commons category wikidata property which means eg more than one enwiki page can have a link in the left margin to one and the same commons category. A similar wikispecies property could be used to link multiple wikidata items to one and the same wikispecies page (perhaps itself auto-updating upon a wikispecies taxon page move). This, however, presumably would mean a wikispecies user wouldn't see in the left margin links to the various language wikipedia articles, were they linked to a different/number of different related wikidata items. Intentional, direct users of wikidata would also benefit from, in e.g. 2, having a link they can click on for further information regardless of which of the two items they happened to look at. Thank you, Maculosae tegmine lyncis (talk) 18:21, 18 February 2022 (UTC)

P.S. Is there a reason why wikispecies has items for both Q330710 "Anas formosa" and Q27461190 "Sibirionetta formosa", rather than one item which includes the information that it was named first x (then a, then b) then y? If there was only one item, this links issue would presumably go away. Maculosae tegmine lyncis (talk) 18:21, 18 February 2022 (UTC)

This folder suggests the problem isn't insurmountable: wikispecies:Category:Redirects connected to a Wikidata item; eg Q20072390, where I just redirected a wikispecies page connected to a wikidata item to a wikispecies page which was already connected to a wikidata item; how can I/we override/bypass the software feature which prevents new redirects being linked to wikidata, when there is no software issue with redirects that are already linked? Thank you, Maculosae tegmine lyncis (talk) 23:13, 18 February 2022 (UTC)

Blocking IP range?

Currently I’m working on Bavarian cultural heritage monuments (200k). While fixing issues, I noticed that an IP keeps introducing the same problems again and again, which are not in line with the common structure, the use of properties and the usual way of working. I’m getting tired of fixing them permanently without any chance of discussing this and telling the user that they have to change their way of working. The IP edits all come from the same range, like this or this. Is it possible to block this IP range? The remaining work on cultural heritage monuments is still heavy enough, and I don’t want to keep cleaning up these problems. @Ordercrazy: fyi. --Derzno (talk) 17:17, 18 February 2022 (UTC)

I sympathize with the problem of trying to communicate with an ever-changing IP address, but these two IPs are not close enough for a range block. Can you give more examples, and some specific diffs? Also, this request should really be on WD:AN. Bovlb (talk) 22:42, 18 February 2022 (UTC)
@Bovlb:, I've collected a couple more issues caused by these IPs. As I said, it's painful and frustrating that such changes can be made from IPs without any discussion. I guess the IP can see how and who is working on these subjects and also had the chance to start a discussion on my page. I don't see this as typical vandalism; the edits look to me like those of an experienced user who normally has an account but no motivation to discuss anything. In the end it creates additional work, and a block should be a heads-up to the IP saying "STOP!". By the way, I've now added this to WD:AN as well. --Derzno (talk) 05:16, 19 February 2022 (UTC)

Merge request

I created Q110965517 accidentally, as Q7552504 didn't have the alternate name I was looking for. Special:MergeItems doesn't work for me, help please? -- Zanimum (talk) 19:08, 19 February 2022 (UTC)

See Help:Merge. You can activate the merge gadget in your preferences. From Hill To Shore (talk) 19:14, 19 February 2022 (UTC)

Have we had a previous ruling about public figures not wanting their image displayed in their Wikidata entry?

Have we had a previous ruling about public figures not wanting their image displayed in their Wikidata entry? I could see if it was a purposely unflattering image or even one taken stealthily at a private event, but this image was posed for. We remove private information like telephone numbers and email and exact dates of birth, when requested, but is a posed-for image the same thing? See: CGP Grey (Q5006102) and discussion at Talk:Q5006102. --RAN (talk) 22:30, 6 February 2022 (UTC)

My community and I at one point talked to a privacy specialist, and it was clear from that conversation that a celebrity has less privacy than the average person. An image of a celebrity is fine as long as it is taken in a public place, the celebrity is not being harassed, and there is no other non-famous person in the image, such as their children. Removing images like that is a matter of respect, not privacy.--Snævar (talk) 18:36, 12 February 2022 (UTC)
@Snævar To the law of which jurisdiction do you refer? The US? It’s certainly not universal … --Emu (talk) 16:10, 13 February 2022 (UTC)
The EU (European Union) one. Snævar (talk) 23:23, 19 February 2022 (UTC)

How to remove spam references?

References to two search pages in an external database have been added to thousands of items. It would take weeks to remove them manually, so I wonder if there is any way to locate and delete them all in one operation.
The links are:
https://www.hafen-hamburg.de/de/schiffe
https://www.hafen-hamburg.de/de/schiffe/mehrzweckschiffe
--Cavernia (talk) 21:20, 15 February 2022 (UTC)

@Cavernia: Here's one way to find them (the columns returned are the item, the property, the ID of the statement using that property, the hash of the reference on that statement, and the URL on the reference with that hash):
select ?item ?prop ?stmt ?ref ?url {
  values ?url { <https://www.hafen-hamburg.de/de/schiffe> <https://www.hafen-hamburg.de/de/schiffe/mehrzweckschiffe> }
  ?item ?prop ?stmt . ?stmt prov:wasDerivedFrom ?ref. ?ref pr:P854 ?url . 
}
It should be possible with a Pywikibot run to remove most of these. (@MB-one: as the one who added some, if not all, of these URLs.) Mahir256 (talk) 22:01, 15 February 2022 (UTC)
But not without resorting to programming I guess? If you instead use Wikibase-CLI there is a remove-reference command that you can use to remove a specific reference from a specific statement. This should be lot easier to use to make a batch operation. Infrastruktur (talk) 22:35, 15 February 2022 (UTC)
It needs Python programming and some pywikibot experience, yes. However, the task is not complicated, and with PAWS we have a fully managed environment with login and stuff already set up, so that only the script needs to be written—a couple of lines of code in this case.
With wikibase-CLI, I guess you need some sort of input preparation and looping over it as well, so technically you are already pretty close to "programming" as well. Every type of advanced edit automation effectively requires at least basic programming skills. —MisterSynergy (talk) 23:52, 15 February 2022 (UTC)
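As a rough idea of what the "couple of lines" might look like, the sketch below only covers the selection step: it walks an item's JSON and collects the statement ID and reference hash of every reference whose reference URL (P854) is one of the two search pages. The entity layout follows the standard Wikibase JSON as far as I know; `references_to_remove` is a hypothetical helper and the sample item is made up. The resulting pairs are the kind of input that a reference-removal call (for example the `wbremovereferences` API module, or the wikibase-cli command mentioned above) works on; the actual write step is deliberately left out.

```python
SEARCH_PAGE_URLS = {
    "https://www.hafen-hamburg.de/de/schiffe",
    "https://www.hafen-hamburg.de/de/schiffe/mehrzweckschiffe",
}


def references_to_remove(entity, urls=SEARCH_PAGE_URLS):
    """Yield (statement id, reference hash) for references whose P854 is in urls."""
    for claims in entity.get("claims", {}).values():
        for claim in claims:
            for ref in claim.get("references", []):
                for snak in ref.get("snaks", {}).get("P854", []):
                    # "somevalue"/"novalue" snaks have no datavalue at all.
                    if snak.get("datavalue", {}).get("value") in urls:
                        yield claim["id"], ref["hash"]
                        break


# Made-up item JSON with one matching and one unrelated reference.
sample = {
    "claims": {
        "P2043": [
            {
                "id": "Q0$EXAMPLE-1",
                "references": [
                    {"hash": "aaa", "snaks": {"P854": [
                        {"datavalue": {"value": "https://www.hafen-hamburg.de/de/schiffe"}}]}},
                    {"hash": "bbb", "snaks": {"P248": [
                        {"datavalue": {"value": {"id": "Q0"}}}]}},
                ],
            }
        ]
    }
}

print(list(references_to_remove(sample)))
```

Because only the matching reference hash is selected, qualifiers and other references on the same statement stay untouched.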
Thanks. As I understand it, QuickStatements can be used to remove a statement, but not to remove a reference from a statement while keeping the statement. A possible workaround is then to delete the statement, and then add it again without the reference. Is it then possible to make a query that returns the values in one column? --Cavernia (talk) 10:14, 16 February 2022 (UTC)
I don't think this is a good workaround. You would need to take care of all potential qualifiers and other references that should not be removed. —MisterSynergy (talk) 11:17, 16 February 2022 (UTC)
Maybe what's actually needed is adding an archive URL? Websites change. @MB-one: fyi --- Jura 12:12, 16 February 2022 (UTC)
@MisterSynergy: I see your concern. I tried it with instance of (P31), filtered out statements without qualifiers, checked if any entries had other references, and ran it through QuickStatements. It worked fine.
@Jura1: This is not the case here. --Cavernia (talk) 16:30, 16 February 2022 (UTC)
I now found a way to extract all 8,831 URLs from this database. I'm not sure if it's a good idea to make a property of this, but at least it will make it easier to replace the references. --Cavernia (talk) 21:09, 18 February 2022 (UTC)
@Mahir256: Is it possible to filter for items that only contain references to these two URLs? If so, I would not have to fear the risk of deleting other references. --Cavernia (talk) 12:42, 20 February 2022 (UTC)
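One possible way to narrow the earlier query is a `FILTER NOT EXISTS` clause that drops every statement carrying any other reference node. The sketch below builds such a query as a string; `only_these_urls_query` is a hypothetical helper, the query has not been run against the live query service here, and it works per statement rather than per item. A reference node that holds further details besides the URL would still match.

```python
def only_these_urls_query(urls):
    """Build a SPARQL query for statements whose single reference cites one of urls."""
    values = " ".join(f"<{u}>" for u in urls)
    return "\n".join([
        "SELECT ?item ?prop ?stmt ?ref ?url {",
        "  VALUES ?url { " + values + " }",
        "  ?item ?prop ?stmt . ?stmt prov:wasDerivedFrom ?ref . ?ref pr:P854 ?url .",
        "  # Exclude statements that carry any other reference node.",
        "  FILTER NOT EXISTS { ?stmt prov:wasDerivedFrom ?other . FILTER(?other != ?ref) }",
        "}",
    ])


print(only_these_urls_query([
    "https://www.hafen-hamburg.de/de/schiffe",
    "https://www.hafen-hamburg.de/de/schiffe/mehrzweckschiffe",
]))
```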

Discussion about the deletion

@Cavernia Why would you remove the references? What's wrong with it? MB-one (talk) 20:18, 16 February 2022 (UTC)
As they said, these references point to the search form of the source website. This is not ideal since it expects data users to anticipate what to do on the external website in order to verify the claims.
Consider for instance Q52353739#Q52353739$F881E060-EA21-4884-86E2-3DD26914C318: it should link to <https://www.hafen-hamburg.de/de/schiffe/cap-beatrice-27795/> instead of <https://www.hafen-hamburg.de/de/schiffe> in the reference.
I think a repair job is more appropriate here. Instead of a removal, these URLs should be fixed (and data ideally re-compared). Additionally, www.hafen-hamburg.de seems fit for an identifier property as well which could make the repair job potentially much easier. —MisterSynergy (talk) 20:31, 16 February 2022 (UTC)
Can someone block the account deleting references from Wikidata while the discussion is still ongoing. --- Jura 20:38, 16 February 2022 (UTC)
They are not references, that is why we want to remove them. Cavernia (talk) 12:55, 17 February 2022 (UTC)
@Cavernia: I am not sure who the "we" is that you are talking about as I see no consensus for removal. There is a proposal above to repair the references and removing them just makes things harder. If you still have more references that you are planning to remove, I would advise you to stop until you gain consensus. From Hill To Shore (talk) 13:11, 17 February 2022 (UTC)
In the meantime, please revert the deletions. --- Jura 14:35, 17 February 2022 (UTC)
@From Hill To Shore: Please see the comment from MisterSynergy which explains why these links don't point to a page containing the actual information, which means they should never have been imported in the first place. According to the guidelines «references are used to point to specific sources that back up the data provided in a statement». This is the consensus behind this removal. Removing the links doesn't prevent us from importing the correct references if we can manage to match the URLs to IMO number.
In addition, these imports have added thousands of duplicate/wrong values, about 50 item mismatches (see Donau (Q1240814) or Clara (Q24026984)) as the user has made a comparison of ship name instead of a unique identifier like IMO ship number (P458). I've spent many hours correcting these errors manually. --Cavernia (talk) 19:05, 17 February 2022 (UTC)
@Cavernia: You are misrepresenting the discussion as MisterSynergy went on to say, "I think a repair job is more appropriate here." At the moment you have yourself supporting removal, Jura objecting to removal and MisterSynergy proposing an alternative course of action. That is a classic case of having no consensus but leaning towards not removing them immediately. You should continue discussing the issue here to clarify consensus before resuming removal of references. Edit warring or editing against consensus will likely lead to you being blocked; continuation of discussion is in your own best interest. From Hill To Shore (talk) 19:21, 17 February 2022 (UTC)
@From Hill To Shore: You need to see the bigger context here. The original imports in 2020 were against consensus in so many ways, but when it was discovered, it was too late to revert the entire batches. Yes, the best solution is to replace these links with real references. The problem is, there is no easy way to obtain a list which matches IMO to URL. I'm working with some scripting alternatives to extract the data, but it will still take a couple of days before I have the complete list and can start importing. In any case, QuickStatements can't replace or update references, which means that the wrong links must be deleted before importing the correct ones. --Cavernia (talk) 20:20, 17 February 2022 (UTC)
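The matching step described here, pairing items with ship pages by IMO ship number (P458) rather than by name, could look roughly like the following. Everything in it is an assumption for illustration: `plan_reference_urls` is a hypothetical helper, the input shapes are invented, and the item IDs, IMO values and URLs are placeholders, not real data.

```python
def plan_reference_urls(wikidata_ships, port_ships):
    """Match ships by IMO number only.

    wikidata_ships: {item id: IMO number}
    port_ships: list of {"imo": ..., "name": ..., "url": ...} rows from the database
    Returns ({item id: url}, [item ids without a match]).
    """
    url_by_imo = {ship["imo"]: ship["url"] for ship in port_ships}
    matched, unmatched = {}, []
    for item, imo in wikidata_ships.items():
        if imo in url_by_imo:
            matched[item] = url_by_imo[imo]
        else:
            unmatched.append(item)
    return matched, unmatched


# Placeholder data: two different ships share a name, which is exactly the
# case where matching on names goes wrong.
wikidata_ships = {"Q_A": "IMO-1", "Q_B": "IMO-2", "Q_C": "IMO-9"}
port_ships = [
    {"imo": "IMO-1", "name": "Clara", "url": "https://example.org/ship-1"},
    {"imo": "IMO-2", "name": "Clara", "url": "https://example.org/ship-2"},
]

print(plan_reference_urls(wikidata_ships, port_ships))
```

Items that end up in the unmatched list would need a manual look before anything is imported for them.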
I am not stopping you and I don't need to see the wider context. I am merely pointing out that you are editing against the current consensus and one editor has already asked for you to be blocked in the discussion above. If you want to ignore my suggestion of clarifying consensus before you continue, that is up to you. However, you will have no one to blame but yourself if you do get blocked. From Hill To Shore (talk) 20:56, 17 February 2022 (UTC)
Your allegation is wrong. I haven't deleted any single reference since MisterSynergy and Jura1 posted the opinions you refer to. You are welcome to participate constructively in this discussion, but please drop the blocking jargon. --Cavernia (talk) 22:58, 17 February 2022 (UTC)
@Cavernia: I think you have misunderstood my position here. You stated above that "we" agreed to remove the references. I replied to point out that you had no one currently in agreement with you. You then selectively quoted another commenter to imply that you had support. I then pointed out to a later comment from the same person that appeared to retract that support and I suggested that you should continue the discussion to gain consensus. At that point, you could have acknowledged what I said and clarified that you would continue the discussion to find consensus; instead, you told me that in my asking you to find consensus that I didn't see the big picture and that you are continuing to prepare your edit batch. As you weren't addressing anything that I said, I chose to end the conversation by repeating my advice and leaving you to your own devices. You have now accused me of laying an "allegation" against you. What allegation have I made? I stated that you have no consensus, I advised you to find consensus and I pointed out that another editor had already asked for you to be blocked. There is no allegation. If you choose to engage with the other editors to clarify consensus then that is great. If you choose not to then that is another editor's problem to resolve. I'll chalk this up to a miscommunication and move on and I suggest you do the same. From Hill To Shore (talk) 23:22, 17 February 2022 (UTC)
This is your allegation: "you are editing against the current consensus"
No, I was not. --Cavernia (talk) 23:37, 17 February 2022 (UTC)
This will be my last reply here and I would appreciate it if you stop selectively quoting what people have said. You need to read the whole text to gain full understanding. As I said in the text you partially quoted, "If you want to ignore my suggestion of clarifying consensus before you continue, that is up to you." I had not said you had restarted your earlier editing and I had clearly referenced that you had paused by saying, "before you continue." As I said before, I am willing to put this down to a misunderstanding and a miscommunication. I suggest you do the same but I won't be replying on this matter again on this page. From Hill To Shore (talk) 23:46, 17 February 2022 (UTC)

How to repair reference deletions

The first step seems to be to revert the batches by Cavernia. --- Jura 09:45, 18 February 2022 (UTC)

No. The import of the correct references is soon ready. Cavernia (talk) 17:51, 18 February 2022 (UTC)
The first batch is now imported. I started with length (P2043), see Wilson Malm (Q56451834) as an example. --Cavernia (talk) 20:54, 18 February 2022 (UTC)

Items with P31/P279* Q1969448.

I raised this at the ontology project a while ago, but have been reminded of it recently. I notice that items which match this pattern (instances of term (Q1969448) and its subclasses) seem to be incorrectly modelled since we tend to model the concept a term refers to rather than the term itself (in the Q namespace). I've yet to find a counter example where it makes sense and in practice this issue can lead to some long distance unintended results where items are instances of term (Q1969448) through transitivity of the subclass tree.

Am I right in thinking this is a bad modelling practice? If so, should we consider instances of terms to be problematic and aim to prevent them? --SilentSpike (talk) 21:56, 19 February 2022 (UTC)

@SilentSpike I am no ontologist but I think it's a bit weird to say something is instance of (P31) : term (Q1969448) because what isn't, right? IMO it would be nice to discourage the use of this instance or subclasses of it. Vojtěch Dostál (talk) 15:05, 20 February 2022 (UTC)
Thanks for raising the issue. To beginners in the area of ontology and terminology the distinction between a term (some words) and a concept (something a term refers to) might not be clear, as exemplified by the statement term (Q1969448) said to be the same as (P460) concept (Q151885) (currently on the item term (Q1969448)).
I suggest we clarify that typically terms would be modeled by lexemes and concepts by items, both linked by item for this sense (P5137).
I can think of a few cases where items are indeed instances of term (Q1969448): when an article discusses a term (say, the historical evolution of its meaning). But in that special case care should be taken not to conflate the term and its concept.
In general I agree that instances of term (Q1969448) should raise suspicion. Toni 001 (talk) 09:05, 21 February 2022 (UTC)
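The pattern in the section heading reads as a SPARQL property path: one instance of (P31) step followed by any number of subclass of (P279) steps. A minimal sketch of how such a query could be assembled, assuming the wd: and wdt: prefixes that the Wikidata Query Service predefines; the function name and the default limit are illustrative only and not part of any existing tool.

```python
def build_instance_query(class_qid: str, limit: int = 100) -> str:
    """Return a SPARQL query listing items that match P31/P279* <class_qid>."""
    # Reject anything that does not look like a QID before it reaches the query.
    if not (class_qid.startswith("Q") and class_qid[1:].isdigit()):
        raise ValueError(f"not a QID: {class_qid!r}")
    return (
        "SELECT ?item WHERE {\n"
        # P31 once, then zero or more P279 hops up the subclass tree.
        f"  ?item wdt:P31/wdt:P279* wd:{class_qid} .\n"
        "}\n"
        f"LIMIT {limit}"
    )


query = build_instance_query("Q1969448")
print(query)
```

The resulting text could be pasted into the query service to list affected items; the limit keeps the result small, since the transitive path can match a very large set.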

Remember to Participate in the UCoC Conversations and Ratification Vote!

You can find this message translated into additional languages on Meta-wiki.

Hello everyone,

A vote in SecurePoll from 7 to 21 March 2022 is scheduled as part of the ratification process for the Universal Code of Conduct (UCoC) Enforcement guidelines. Eligible voters are invited to answer a poll question and share comments. Read voter information and eligibility details. During the poll, voters will be asked if they support the enforcement of the Universal Code of Conduct based on the proposed guidelines.

The Universal Code of Conduct (UCoC) provides a baseline of acceptable behavior for the entire movement. The revised enforcement guidelines were published 24 January 2022 as a proposed way to apply the policy across the movement. A Wikimedia Foundation Board statement calls for a ratification process where eligible voters will have an opportunity to support or oppose the adoption of the UCoC Enforcement guidelines in a vote. Wikimedians are invited to translate and share important information. For more information about the UCoC, please see the project page and frequently asked questions on Meta-wiki.

There are events scheduled to learn more and discuss:

You can comment on Meta-wiki talk pages in any language. You may also contact either team by email: msg(_AT_)wikimedia.org or ucocproject(_AT_)wikimedia.org

Sincerely,

Movement Strategy and Governance
Wikimedia Foundation
--YKo (WMF) (talk) 04:38, 21 February 2022 (UTC)

Data Reuse Days: schedule still open for contributions

Hello all,

A follow-up about the Data Reuse Days, online event dedicated to the use of Wikidata's data that will take place on March 14-24. We already have a pretty exciting schedule with plenty of projects reusing Wikidata that will be presented, as well as discussions on best practices to retrieve data and workshops related to the use of Wikidata on Wikipedia and Commons.

We are still looking for community contributions, and it is not too late to propose something for the event! We are especially looking for sessions related to the use of Wikidata on the other Wikimedia projects, lexicographical data, your favorite tools to retrieve and reuse Wikidata's data, as well as practical, hands-on workshops or editathons to make the program more interactive and give people the occasion to gather and try using data for their projects.

We have plenty of time left in the two lightning talks sessions, on March 14 and March 21, and we also dedicated the two week-end days, March 19 and 20, to community sessions. If you have any ideas or suggestions, or if you would like to schedule a session, feel free to write on the talk page or to contact me directly.

Cheers, Lea Lacroix (WMDE) (talk) 09:22, 21 February 2022 (UTC)

Needed update of a protected item

Please add Saadia Tamelikecht (Q110929119) to Q701465 as a new sub-prefect, from 2021 (source). Thanks in advance... 92.184.105.69 16:39, 15 February 2022 (UTC)

Anyone, please...? 92.184.98.19 13:53, 21 February 2022 (UTC)

Wikidata weekly summary #508

New development roadmap for Wikidata and Wikibase for Q1 2022

Hi everyone,

I wanted to let you know that we just published the plan of the Wikidata development team for Wikidata and Wikibase for the first quarter of 2022!

Wikidata:Development plan

Here are some highlights for 2022:

In 2022 we will continue developments to help editors increase the quality of Wikidata’s existing data and contribute new high-quality data. Among the initiatives is to build up feedback loops with data re-users to get them more actively involved in improving the data on Wikidata.

We will be improving Special:NewLexeme to make it easier for editors to create new Lexemes. More people need access to knowledge and technology presented in their own language and we believe that language data is a fundamental building block in reaching that goal.

More people should benefit from the data Wikidata provides. We will be releasing the new REST API to make it easier for programmers to access our data.

On the Wikibase side of things, we want to enable more projects with fewer resources to be able to independently onboard themselves into the Wikibase Ecosystem. We will be launching Wikibase.cloud, offering Wikibase as a Service. It will be based on the code used to run WBStack but will be managed and maintained by Wikimedia Deutschland.

We are also conducting market research to get a better understanding of how organizations that could provide valuable data for the ecosystem make decisions when choosing software.

In addition, we’ve also published a status update about what was achieved for each of the 2021 development goals.

Please note that the development plan only presents the main projects that the development team will work on during the first quarter of 2022. Development may continue for some of these projects beyond that period. Critical and ongoing tasks (e.g. maintenance of the software and fixing pressing bugs) are not mentioned, but will be included in the workflow over the year. At the beginning of each quarter the roadmap will be updated to include the development estimates for that quarter. We will be sending notifications on our usual communication channels upon each update.

If you have any questions or feedback, feel free to add them on this talk page: Wikidata talk:Development plan.

Cheers,

- Mohammed Sadat (WMDE) (talk) 09:48, 10 February 2022 (UTC)

Dear @Mohammed Sadat (WMDE), in September 2020, WMDE conducted a process review of the support brought by the development team to the WD community.
One of the "main fields of suggestions from the community" was to be "able to suggest and vote for the most important features and bugs to fix". An action was added in "Action plan for later in 2021": "Find better processes to include community requests in our (already very packed) roadmap"
So, according to this plan, could you please tell us how community requests were added to the 2022 roadmap, because I don't remember having seen an announcement related to this topic.
Thanks. Ayack (talk) 15:21, 10 February 2022 (UTC)
@Mohammed Sadat (WMDE), Lydia Pintscher (WMDE): Could you please answer me? Taking into account the expectations of the contributors seems fundamental in our project. Thank you. Ayack (talk) 08:33, 21 February 2022 (UTC)
Sorry for the delayed response Ayack, this section got archived and fell off my radar.
You are absolutely right about the feedback we gathered from the process review in 2020 -- particularly on improving how we plan our development activities to include more community voices. We are yet to implement a structured process of collecting and weighing all of the feedback we receive from the community in order to determine what would be the most important task to be working on at any given time, for example, either collaborating more with the WMF Community Wishlist Survey team or setting up our own. In the meantime, I can assure you that we do make a large effort to get a good sense of what is important to the community, and have tried to be guided in our development planning by the inputs we receive from multiple sources all year round, including WD:RATP, WD:PC, on social media, at the Wikidata pingpony sessions, at conferences, during office hours and bug triage hours and in a lot of 1:1 conversations. Given our current limited development capacity Manuel has been working on changing some of our processes so we can take on more medium size projects instead of just tiny and huge ones. We will keep you updated as we make progress in this effort. -- Mohammed Sadat (WMDE) (talk) 17:20, 22 February 2022 (UTC)
Thank you @Mohammed Sadat (WMDE) for your answer. Yes, having a structured and transparent process is key because the majority of users do not use the channels you mentioned. The example of the WMF Community Wishlist Survey shows how far the often simple expectations of contributors are from large centrally decided development plans. Ayack (talk) 21:23, 22 February 2022 (UTC)

Wiki Loves Folklore is extended till 15th March

Please help translate to your language

Greetings from Wiki Loves Folklore International Team,

We are pleased to inform you that Wiki Loves Folklore, an international photographic contest on Wikimedia Commons, has been extended till the 15th of March 2022. The contest focuses on the folk culture of different regions, in categories such as, but not limited to, folk festivals, folk dances, folk music and folk activities.

We would like to have your immense participation in the photographic contest to document your local Folk culture on Wikipedia. You can also help with the translation of project pages and share a word in your local language.

Best wishes,

International Team
Wiki Loves Folklore

MediaWiki message delivery (talk) 04:50, 22 February 2022 (UTC)

Is there a process to propose a significant Identifier change to existing property?

I was looking at C-SPAN person ID (P2190) and noticed it uses strings as the ID and one of the examples is broken. I posted to Property_talk:P2190#Should this property change its Identifier from the string to numeric ID, but what I haven't found is any written guidance of what to do other than post on the talk page to propose a major change to a property. The RFC says there should be significant discussion prior and Property Proposals are currently written in a way to handle new properties. Does an existing process exist and if so, where is it documented and linked? Wolfgang8741 (talk) 05:27, 23 February 2022 (UTC)

You have done exactly the right thing by proposing it on the property talk page. You can also contact relevant WikiProjects and post here (which you have done) — Martin (MSGJ · talk) 08:27, 23 February 2022 (UTC)

Property Proposal with no discussions

What happens to property proposals with no discussions? Will they be created eventually or will they hang there forever? I recently proposed a property but no one is supporting or opposing its creation. Can I support it myself and have it created? If not, is there any place where I can ask for a review to garner support? Hsuaniwu (talk) 09:51, 23 February 2022 (UTC)

This may be a result of Property proposals not displaying (see above); I suggest posting a link here; and on the talk pages of any relevant wiki projects. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:35, 23 February 2022 (UTC)

What's up with Steam?

Items using Steam application ID (P1733) don't offer a simple value normalization (wdtn) where other external identifiers do. It has a formatter URL, so how come? Infrastruktur (talk) 17:26, 21 February 2022 (UTC)

@Infrastruktur: The property Steam application ID (P1733) works for me as it should. The link is shown to me and I can also follow it. --Gymnicus (talk) 17:56, 21 February 2022 (UTC)
Yes, but the wdtn value doesn't show up in queries. This is a bit of a mystery. Infrastruktur (talk) 18:08, 21 February 2022 (UTC)
The normalized value is derived from formatter URI for RDF resource (P1921), if present. Toni 001 (talk) 08:11, 22 February 2022 (UTC)
Thanks. I updated the documentation on the RDF dump format to make this clear, but there was at least one other section that basically worded it as the formatter url, so that might need rephrasing. Infrastruktur (talk) 21:43, 23 February 2022 (UTC)
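Toni 001's explanation can be sketched as follows: the normalized (wdtn) value exists only when the property has a formatter URI for RDF resource (P1921), and is then formed by putting the identifier into that pattern. This is an illustration rather than the actual Wikibase code; the function name is made up, the example.org pattern is a placeholder (not the real P1921 value of any property), and the sketch assumes the usual $1 placeholder convention of formatter properties while ignoring any escaping the real export may apply.

```python
from typing import Optional


def normalized_uri(identifier: str, rdf_formatter: Optional[str]) -> Optional[str]:
    """Return the normalized URI for an identifier, or None if no P1921 pattern exists."""
    if rdf_formatter is None:
        # No formatter URI for RDF resource on the property: no wdtn triple,
        # even if the property has an ordinary formatter URL.
        return None
    # Assumption: "$1" marks where the identifier goes in the pattern.
    return rdf_formatter.replace("$1", identifier)


# Hypothetical pattern, for illustration only.
print(normalized_uri("620", "https://example.org/app/$1"))
print(normalized_uri("620", None))
```

This matches the behaviour described in the thread: a property with only a formatter URL still links correctly in the interface, but queries on wdtn return nothing for it.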

Recent Change Stream for Wikidata

As we can see in the list of streams available at https://stream.wikimedia.org/?doc, there are no streams available for Wikidata to get hold of recent changes. Is that correct? If so, when can we expect such a stream to be implemented, or are there alternative ways to get recent changes for Wikidata? If not, please explain how to use a stream to get recent changes for Wikidata. Thank you.  – The preceding unsigned comment was added by 203.244.219.15 (talk • contribs) at 10:38, 23 February 2022 (UTC).

As far as I understand, all the listed streams contain events for all wikis at once; if you’re only interested in streams for one wiki, you can filter them yourself, e.g. on the domain or database. See also wikitech:Event Platform/EventStreams § Filtering. --Lucas Werkmeister (WMDE) (talk) 17:13, 23 February 2022 (UTC)
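The client-side filtering suggested above might look like the sketch below. It assumes the stream is delivered as server-sent events whose data: lines carry one JSON object each, and that recent-change events expose the database name in a wiki field and the domain in meta.domain; the function name is invented for the example, and the field names should be checked against the current event schema before relying on them.

```python
import json
from typing import Iterable, Iterator


def wikidata_changes(sse_lines: Iterable[str]) -> Iterator[dict]:
    """Yield only the events that belong to Wikidata from raw SSE lines."""
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip "event:", "id:" and blank separator lines
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # ignore partial or malformed payloads
        if not isinstance(event, dict):
            continue
        by_database = event.get("wiki") == "wikidatawiki"
        by_domain = event.get("meta", {}).get("domain") == "www.wikidata.org"
        if by_database or by_domain:
            yield event


sample = [
    "event: message",
    'data: {"wiki": "enwiki", "title": "Foo"}',
    'data: {"wiki": "wikidatawiki", "title": "Q42"}',
    "",
    'data: {"meta": {"domain": "www.wikidata.org"}, "title": "Q1"}',
]
titles = [event["title"] for event in wikidata_changes(sample)]
print(titles)
```

In practice the lines would come from a streaming HTTP response on the recent-changes stream rather than from a list; the sample list only keeps the sketch runnable offline.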

Q9046234 Renaming an object?

Sorry, don't even know where to properly ask. I've got an issue with said object. It refers to West Ozark Township in Missouri, USA that, as far as I can ascertain, does not exist under that name anymore. In its place there's Northview Township. My knowledge of Missourian administrative protocol and Wikidata as such is insufficient to know what to do with the abovementioned WD object named West Ozark Twp. Change the name? Create a new object? And what really happened with the township/-s? Can't edit on en:wp due to some block. Any insight would be appreciated. --G-41614 (talk) 14:25, 20 February 2022 (UTC)

I tried a web search and the first three relevant hits for “Northview Township” in MO say it's in Christian County, whereas Q9046234 is in Webster County. At least two sites have maps placing it south of Springfield adjacent to Fremont Hills. However the co-ords on Q9046234 place it near Bumgarden Ford, southeast of (another?) Northview. ⁓ Pelagic ( messages ) 17:23, 23 February 2022 (UTC)
The second Northview is east of Springfield, on the railway line to Marshfield. References are sparse. G-41614, do you have any sources saying that Northview was previously called West Ozark Township? ⁓ Pelagic ( messages ) 17:35, 23 February 2022 (UTC)
Aha, https://www.google.com/maps/place/West+Ozark+Township,+MO,+USA/@37.2697239,-92.9919855,11z/data=!4m2!3m1!1s0x87c57d91fd52101b:0x26fd4816646251fd?hl=en gives me an outline that stretches from Holman almost to Marshfield, with Northview at north-centre. ⁓ Pelagic ( messages ) 17:45, 23 February 2022 (UTC)
Hello, Pelagic. When I posted my inquiry I had not known about the second Northview Twp - the issue's with the one in Webster County, east of Springfield. I reached my possible conclusion by comparing the maps at the Census-profiles for Northview Twp and West Ozark Twp, respectively. What the google.maps-link shows as West Ozark, the Census shows as Northview Twp., the Northview in google.map being Northview Populated Place, GNIS-ID 735753. If you check the West Ozark Twp. at the Census, there's no data for 2020, only 2010. A few days ago I could still load the map, right now it's not working. In the Census-profile map for Northview Twp., there's East Ozark Twp. right to the east of Northview Twp., so ... question remains, did that really happen, and what to do with Q9046234? Anyway, thank you for your effort so far, --G-41614 (talk) 21:27, 23 February 2022 (UTC)
According to Missouri 2010 population and housing use counts (pdf) p.28, West Ozark Township was created at some point between the 2000 census and the 2010 census by changing the census boundaries of (or outright deleting) existing townships. If it has disappeared from the 2020 census then they have probably redrawn the boundaries again (unless it was a straight name change). If it was a name change with mostly the same boundaries as the previous township, then we keep the same item but set two statements of official name (P1448) with the old and new name. If the old township has been deleted or had major boundary changes, we set the existing item with dissolved, abolished or demolished date (P576) and create replacement items (depending on how many ways the old territory was split). We then link the old item and new item(s) with replaces (P1365) and replaced by (P1366). From Hill To Shore (talk) 22:14, 23 February 2022 (UTC)
Thanks, that should clarify the situation. Without the map at the Census-profile to West Ozark loading I can't show it, but the boundaries were similar. Not as in mostly, but rather exactly. So I guess I'll look into that official name (P1448)-thingy. Thank you for your information! Regards, --G-41614 (talk) 22:50, 23 February 2022 (UTC)
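The two modelling paths From Hill To Shore describes can be written down as a small decision sketch. It only plans which statements would be added; it does not edit Wikidata. The function name, the tuple layout and the NEW1/NEW2 item placeholders are hypothetical, and the dissolution date is left as a placeholder because the thread does not give one.

```python
from typing import List, Sequence, Tuple

Statement = Tuple[str, str, str]  # (subject item, property, value)


def plan_township_update(
    old_item: str,
    old_name: str,
    new_name: str,
    boundaries_kept: bool,
    replacement_items: Sequence[str] = (),
) -> List[Statement]:
    """List the statements to add for a renamed or dissolved township."""
    if boundaries_kept:
        # Plain rename: keep the item and record both official names (P1448).
        return [(old_item, "P1448", old_name), (old_item, "P1448", new_name)]
    # Dissolved or heavily redrawn: close the old item and link its successors.
    statements: List[Statement] = [(old_item, "P576", "<dissolution date>")]
    for new_item in replacement_items:
        statements.append((old_item, "P1366", new_item))  # replaced by
        statements.append((new_item, "P1365", old_item))  # replaces
    return statements


rename = plan_township_update(
    "Q9046234", "West Ozark Township", "Northview Township", boundaries_kept=True
)
print(rename)
```

If the boundaries really are unchanged, as G-41614 reports, only the first branch applies and no new item is needed.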

Q110910880 (Gustine L. Hurd)

I think I did an okay job setting this up, though if anyone wants to have a lookover, I'd appreciate it. I rarely create datapoints from scratch. Adam Cuerden (talk) 09:55, 24 February 2022 (UTC)

Looks good to me! — Martin (MSGJ · talk) 15:25, 24 February 2022 (UTC)
@Adam Cuerden: Good work. You could perhaps add some more sources, such as for his given and family name; and his occupation. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:59, 24 February 2022 (UTC)

Changes to English descriptions of Roman Catholic clergy...any thoughts?

I have pending a QuickStatements batch ([6]) of around 8,500 edits to English descriptions which will correct "roman catholic"/"roman-catholic" to "Roman Catholic" and "roman-catholic church" to "Roman Catholic Church". Due to the large number of edits I thought it would be prudent to mention it here first in order to see if anyone has a comment or question. I performed a similar fix last year ([7]) and there were no issues (the "errors" were either duplicate items or same-named individuals which I then had to fix by hand...this is likely to happen again with items such as Giacomo (Q3761961)/Giacomo (Q96587945)/Giacomo (Q108546954)). If there are no objections after a few days then I will submit the batch to be completed.
--Quesotiotyo (talk) 23:43, 21 February 2022 (UTC)

In principle sure, but is there a reason the description sometimes reads “Roman Catholic bishop” and sometimes “bishop of the Roman Catholic Church” (apart from it being the starting point)? --Emu (talk) 07:43, 22 February 2022 (UTC)
Notified participants of WikiProject Religions. Maybe it would be better to standardize on "Roman Catholic bishop", which is more concise; capitalizing "Roman Catholic" is surely correct, of course. --Epìdosis 07:51, 22 February 2022 (UTC)
I agree.  Bargioni 🗣 15:16, 22 February 2022 (UTC)
The usual convention is a title is capitalized if it is immediately before the person's name. So "Roman Catholic Bishop Christopher J. Coyne" but "The Roman Catholic bishop of the Diocese of Burlington is Christopher J. Coyne". I would not want a bot to presume that editors got it wrong; if the bot can't positively identify the context well enough to get the capitalization correct, it should not be run. Jc3s5h (talk) 17:24, 24 February 2022 (UTC)
Your comments confuse me. I am not changing the capitalization of "bishop" or similar titles. The only changes will be to capitalize "Roman Catholic (Church)" where it is not and to remove any hyphens that do not belong. There is no bot involved with these edits.
--Quesotiotyo (talk) 20:59, 24 February 2022 (UTC)

I will go ahead and run the batch since there is agreement about "Roman Catholic (Church)" needing to be capitalized. I will leave any changes to the exact wording of the descriptions to those with more knowledge in this area. Thanks to everyone for the feedback.
--Quesotiotyo (talk) 21:09, 24 February 2022 (UTC)

So in other words, you decided to do what you like and not to care about the results of the discussion. --Emu (talk) 21:58, 24 February 2022 (UTC)
I don't know. My reading of the discussion was that people supported the change User:Quesotiotyo suggested but thought more could be done. There's always a risk of bikeshedding preventing actual work from being done. It's sometimes ok to work incrementally especially when the change isn't that far reaching (only 8k edits). BrokenSegue (talk) 22:25, 24 February 2022 (UTC)
No, not at all. The only changes I made were the ones that I had originally proposed (to which no one objected), which capitalized "Roman Catholic" and "Roman Catholic Church" where they had been lowercase and removed the hyphen from "Roman-Catholic" (also some double spaces were trimmed, which I neglected to mention). I cannot answer your question about the difference between "Roman Catholic bishop" and "bishop of the Roman Catholic Church". There may very well be a distinction that I am not aware of, so I did not want to make any changes to the wording where I was not sure. Also, there are currently 185 pairs of people with the same English label where one is "Roman Catholic bishop" and the other "bishop of the Roman Catholic Church". These will either need to be merged if they are duplicates or additional information will need to be added to both descriptions to disambiguate the two. I spent a couple of hours doing this yesterday ([8]) after my batch finished but there are still more to do (there are currently 17 bishops named "Giovanni" with virtually identical descriptions if anyone is looking for a challenge! :) )
--Quesotiotyo (talk) 20:10, 25 February 2022 (UTC)
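The text changes described in this thread (capitalizing "Roman Catholic" and "Roman Catholic Church", dropping the hyphen, trimming double spaces) could be expressed roughly as below. This is a sketch of the rule only, not the QuickStatements batch that was actually run; the function and pattern names are invented for the example.

```python
import re

# "roman catholic" or "roman-catholic" in any case, optionally followed by " church".
_ROMAN_CATHOLIC = re.compile(r"\broman[ -]catholic(?P<church> church)?\b", re.IGNORECASE)
_MULTI_SPACE = re.compile(r" {2,}")


def fix_description(description: str) -> str:
    """Normalize capitalization and hyphenation of 'Roman Catholic (Church)'."""

    def _capitalize(match: re.Match) -> str:
        return "Roman Catholic Church" if match.group("church") else "Roman Catholic"

    fixed = _ROMAN_CATHOLIC.sub(_capitalize, description)
    # Collapse runs of spaces; the rest of the wording is left untouched.
    return _MULTI_SPACE.sub(" ", fixed).strip()


for text in ("roman-catholic bishop", "bishop of the roman-catholic church"):
    print(fix_description(text))
```

Titles such as "bishop" are deliberately not touched, which is the distinction raised in the discussion between fixing the proper noun and rewording the description.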

Representing the structure of international NGOs like Greenpeace, Oxfam or Rotary International

How would you represent the structure of international NGOs like Greenpeace (Q81307), Oxfam (Q267941) or Rotary International (Q109179)? It could be done via part of (P361)/has part(s) (P527) or parent organization (P749)/has subsidiary (P355). Since most of the time it's not an owner structure, but more an alliance, both ways are only in part correct. Best Newt713 (talk) 15:41, 25 February 2022 (UTC)

Notified participants of WikiProject Nonprofit Organizations Newt713 (talk) 15:43, 25 February 2022 (UTC)
I too would be interested in the general rules governing use of part of (P361) / parent organization (P749) in items of organizations. Vojtěch Dostál (talk) 16:15, 25 February 2022 (UTC)

Property proposals not displaying

Wikidata:Property proposal/Authority control (for example) does not display all current proposals, because of template transclusion maxima being exceeded.

Category:Property proposal authority control contains all past proposals of that type, so is not useful for finding current proposals.

Category:Open property proposals contains proposals of all types, not just those relating to authority control, so is also less than optimal.

Can we resolve the template issue in some way, or use more targeted categories, or both? Or split Wikidata:Property proposal/Authority control and other such pages into parts? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:35, 22 February 2022 (UTC)

The parser output says it's hitting the maximum page size which is set to 2M. This is too big. There is some styling that could be pushed to the main stylesheet, that might help. Edit: How about just splitting the category alphabetically in two? Infrastruktur (talk) 22:20, 22 February 2022 (UTC)
I think splitting by type (say, "AC for people" vs "AC other") would be better. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:06, 24 February 2022 (UTC)

Note that this issue also means that the counts on the table on Wikidata:Property proposal are wrong. For example, it currently says "Authority control: 47", but that is only the number of proposals transcluded on the sub-page, and excludes another seventeen (i.e. an error of 27%). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:06, 24 February 2022 (UTC)

The counts are supposed to be updated by the bot [9], but they haven't been updated for a while...
As for the template expansion limit, I'd blame the {{Ping project}} template. I can see some proposal pages use the same one multiple times. Past uses could perhaps be suppressed using <nowiki>...</nowiki> or {{Tl}}. --Matěj Suchánek (talk) 13:03, 26 February 2022 (UTC)

Failed to merge

Hi. Apparently I did something wrong, since I cannot proceed with the merging of Q9076263 and Q1661427. Could anyone do it, please? --TwoWings (talk) 15:17, 26 February 2022 (UTC)

Merged. --Tagishsimon (talk) 19:14, 26 February 2022 (UTC)

Books with multiple ISBNs

I just introduced a conflict at Q57233214 by adding a second ISBN to the item. I gather from this that wikidata objects are supposed to only refer to a single edition/format of a book. Ok, fine. Now what? Do I make a second, nearly identical wikidata item for the paperback version of this book, and link both of them back to the en-wiki article I just created for it? (Are multiple wikidata items "allowed" to link to the same wikipedia article? I've never noticed this before.) Is there some way I should be describing ISBNs - i.e., is there a way to indicate "this wikidata item is for the paperback book" vs "this wikidata item is for the cloth-bound book"? Thanks in advance for the help. -- Asilvering (talk) 18:49, 26 February 2022 (UTC)

@Asilvering: Yes the idea is that you would make an item for every edition of the work. Generally Wikipedia articles are about the work not an edition of a work. So you would change The Queer Art of Failure (Q57233214) to be a literary work (Q7725634) and then make a new item for the edition you have in mind. You would connect the two with edition or translation of (P629). You may find Wikidata:WikiProject_Books helpful. BrokenSegue (talk) 19:03, 26 February 2022 (UTC)
@BrokenSegue Perfect, thank you! I think I've correctly set up the work itself and the other edition and linked them now. -- asilvering (talk) 19:33, 26 February 2022 (UTC)
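The work/edition split BrokenSegue describes can be pictured as one work item plus one edition item per ISBN, each edition pointing back to the work through edition or translation of (P629). The sketch below uses plain dictionaries rather than any Wikidata API; the function name is made up, the ISBN strings are placeholders, and the use of P212 for the ISBN-13 is an assumption not stated in the thread.

```python
from typing import Dict, List

# The work item linked from the Wikipedia article, typed as literary work (Q7725634).
WORK = {"id": "Q57233214", "claims": {"P31": ["Q7725634"]}}


def editions_for(work_id: str, isbns: List[str]) -> List[Dict]:
    """Build one edition record per ISBN, each linked to the work via P629."""
    editions = []
    for isbn in isbns:
        editions.append(
            {
                "id": None,  # a new item; Wikidata assigns the QID on creation
                "claims": {
                    "P629": [work_id],  # edition or translation of
                    "P212": [isbn],  # assumed property for ISBN-13
                },
            }
        )
    return editions


# Placeholder identifiers, not real ISBNs.
editions = editions_for(WORK["id"], ["<cloth ISBN>", "<paperback ISBN>"])
print(len(editions))
```

Keeping the ISBNs on separate edition items avoids the single-value conflict that prompted the question, while the Wikipedia sitelink stays on the work item alone.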

Mills

Mill house (left) and windmill

We have >4500 instances of mill (Q44494) in Wikidata and many more are instances of subclasses of mill. Most of them are probably buildings but that item seems to be for a "device that breaks solid materials" rather than a building as a whole. Many languages do not distinguish between the device and the building which houses the device. Should Wikidata keep the concepts separate? Currently, mill (Q44494) has subclass of (P279) : machine (Q11019) but also industrial building (Q12144897). Separately, we have mill house (Q107196975) which is not linked from any other item. Any thoughts on this would be appreciated. Vojtěch Dostál (talk) 16:49, 25 February 2022 (UTC)

Would support a clear distinction between the machine and the building; & agree that many/most of our mills are indeed mill buildings. Mill house = "the residence of a miller, often attached to a mill, brewery or distillery" - https://canmore.org.uk/thesaurus/1/435/MILL%20HOUSE - at least in UK usage; so probably should be linked to a mill with e.g. connects with (P2789) --Tagishsimon (talk) 17:59, 25 February 2022 (UTC)
@Vojtěch Dostál, Tagishsimon: I have Schoterveense Molen (Q2718906) down the street. The machine is the building so no distinction to be made here. With mills you have to make the distinction between the source of energy (water, wind, horses, etc.) and what the energy is used for (grinding things, pumping water, sawing, etc.). This is not easy to model or at least we never completed it for the cases in the Netherlands. mill (Q44494) should probably be replaced by a more specific item.
As for mill house (Q107196975). This is probably what we call in Dutch a molenaarswoning: A house close to the (wind)mill for miller and family to live in. I'll update that item. Multichill (talk) 15:04, 26 February 2022 (UTC)
building != machine (paper mill (Q918088))
But the machinery is not always the building. You make a good point about the source of power, and your building/machine point holds, for the most part, for watermills, windmills, &c. It starts to fail for what we might call dark satanic mills, which is to say for factory buildings which have milling facilities within them (and in which the mill building might later be used for other industry, for accommodation, for office space &c). (And there's another subclass point: what is milled - corn, paper, flax &c &c) --Tagishsimon (talk) 15:20, 26 February 2022 (UTC)
@Tagishsimon: be careful, in English mill is also a synonym for factory (Q83405) like for example steel mill (Q2069494) and paper mill (Q918088) (the example picture). Let's not mix in those. That would make it even more complicated.
But even without the factories it's a very general and broad concept so any statements are likely not to match everything.
See https://petscan.wmflabs.org/?psid=21533870 for de:Kategorie:Wassermühle in Deutschland that can probably be changed to watermill (Q185187) and also a couple in France. Multichill (talk) 15:55, 26 February 2022 (UTC)
Looks like we did do some modeling, just forgot about it:
Multichill (talk) 16:04, 26 February 2022 (UTC)
The topic of the thread is the distinction - where it exists - between the building and the machine; about the fact that entire buildings are being coded as a "device that breaks solid materials". "Let's not mix in those", for obvious reasons, does not cut it. --Tagishsimon (talk) 17:46, 26 February 2022 (UTC)

Thank you all for your comments. I still think that mill (Q44494) was mostly created with tabletop mills or coffee mills in mind and a new item for "buildings used as mills or harboring a mill" should be created. Can we find agreement there? If yes, we can further discuss if this new item should be connected to mill (Q44494) via has part(s) (P527), subclass of (P279) or even has use (P366). Finally, thanks for the edits at mill house (Q18759904), now it's much clearer. Vojtěch Dostál (talk) 20:20, 26 February 2022 (UTC)

No, it wasn't. In Dutch we talk about a "molen" (mill) and "malen" (grinding) even when it's about pumping of water. We have a word "maalvaardig" (Q2504911) to indicate if it's functional (able to grind). All sorts of different types are listed on nl:Molen from small to big. Multichill (talk) 22:04, 26 February 2022 (UTC)

Coming soon

- Johanna Strodt (WMDE) 12:39, 28 February 2022 (UTC)

Wikidata weekly summary #509

Bad sources

Can we agree that we should not use Conservapedia (Q17963), Infogalactic (Q55075031) or Metapedia (Q693899) as a source in any item not directly related to the websites themselves or their founders? --Trade (talk) 17:57, 25 February 2022 (UTC)

@Trade: I wasn't even aware of the second and third ones you mention. Do you have some examples of incorrect sourcing to those sites? If something's a matter of fact, which is what most of wikidata contains, then presumably any source supporting it should be ok. But I agree some sources may be troublesome. ArthurPSmith (talk) 17:53, 28 February 2022 (UTC)
Infogalactic (Q55075031) is a fork of English Wikipedia; it doesn't really add anything. I do object to linking to Conservapedia on 2016 United States presidential election (Q699872). Partly, I also wanted to start a discussion on notable sites that should not be used. @ArthurPSmith:--Trade (talk) 18:16, 28 February 2022 (UTC)
Hmm, it's not used as a reference, but there's a described by source (P1343) link. That does seem less than useful. ArthurPSmith (talk) 18:28, 28 February 2022 (UTC)

Pending proposal

Hi,

why has nobody created this pending proposal?

Nomen ad hoc (talk) 18:11, 28 February 2022 (UTC).

@Nomen ad hoc This is because there is a considerable backlog of properties ready for creation. Vojtěch Dostál (talk) 18:56, 28 February 2022 (UTC)

Expanding notability to include use of Wikidata items in Wikimedia projects

Hi all, I've proposed adding "Wikidata information is being directly used on another Wikimedia project" to Wikidata:Notability, or changing the third point to "It fulfills a structural need, for example: it is needed to make statements made in other items more useful, or it is used in Wikimedia content such as tables, lists, or references." If you're interested, please have a look and comment at Wikidata_talk:Notability#Wikidata_information_is_being_directly_used_on_another_Wikimedia_project. Thanks. Mike Peel (talk) 20:02, 28 February 2022 (UTC)

Gadget for finding archived discussions

Is there any equivalent to w:User:SD0001/find-archived-section here on Wikidata? Sdkb (talk) 20:26, 28 February 2022 (UTC)

Wikidata:Mismatch Finder is ready for testing!

Quick introduction to and demo of how the Mismatches tool works (short video).

Hello,

As you may know, the Wikidata development team has been working on a tool that lets editors review mismatching data between Wikidata and external databases. The tool is now ready to be used, and you can access it here and read more details on Wikidata:Mismatch Finder. We hope that this tool can be useful to people who are working on data quality and matching external databases with Wikidata, and we are looking forward to your feedback if you give it a try!

What is the purpose of Mismatch Finder?

The tool helps highlight differences in the data between Wikidata and other databases, in order to improve data quality in Wikidata and make the whole linked open data web more robust. The tool itself doesn’t check these databases automatically: it is necessary for someone to compare an external database to Wikidata first and then upload a list of possible mismatches into the Mismatch Finder, so they can be analyzed and processed by Wikidata editors.
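The comparison step described above happens outside the tool. As a rough illustration only, it could be sketched as below. The function names, the column names, and the sample values are hypothetical; the actual file format expected by the Mismatch Finder is described in its user guide and may differ.

```python
import csv
import io

# Hypothetical column names, chosen for illustration only.
FIELDNAMES = ["item_id", "property_id", "wikidata_value", "external_value"]


def find_mismatches(wikidata_values, external_values, property_id):
    """Return one row per Item where both sources have a value and they differ."""
    rows = []
    for item_id, wd_value in sorted(wikidata_values.items()):
        ext_value = external_values.get(item_id)
        # Items missing from the external database are not mismatches.
        if ext_value is not None and ext_value != wd_value:
            rows.append({
                "item_id": item_id,
                "property_id": property_id,
                "wikidata_value": wd_value,
                "external_value": ext_value,
            })
    return rows


def to_csv(rows):
    """Serialize the mismatch rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Illustrative values, not real data from any database.
    wikidata = {"Q42": "1952-03-11", "Q1": "example"}
    external = {"Q42": "1952-03-12", "Q1": "example"}
    print(to_csv(find_mismatches(wikidata, external, "P569")))
```

Only Items present in both sources with differing values are reported; a real comparison would also need to normalize values (date precision, units, and so on) before comparing.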

By providing such a tool, we hope to help Wikidata editors spot and fix mistakes in Wikidata, and to support organizations reusing Wikidata’s data, who now have a convenient way to contribute back by reporting lists of possible mismatches.

How to use the tool to check mismatches?

On the Mismatch Finder tool page, you can check Items by entering a list of Q-IDs (for example taken from a SPARQL query). After clicking on “Check Items”, the tool will check if there are mismatches for these Items in the mismatch store, and display any issue that was found with a specific part of the data.
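As a minimal sketch of turning a SPARQL query result into such a list of Q-IDs: the function below reads the standard SPARQL JSON results format, where each entity is returned as a URI ending in its Q-ID. The function name and the variable name `item` are assumptions for illustration, as is the idea of pasting one Q-ID per line into the form.

```python
import re

# A Q-ID is "Q" followed by a positive integer without leading zeros.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def qids_from_sparql_json(result, variable="item"):
    """Extract unique Q-IDs, in order of appearance, from SPARQL JSON results."""
    qids = []
    seen = set()
    for binding in result["results"]["bindings"]:
        cell = binding.get(variable)
        if cell is None:
            continue  # variable unbound in this row
        # Entity URIs look like http://www.wikidata.org/entity/Q42
        candidate = cell["value"].rsplit("/", 1)[-1]
        if QID_PATTERN.fullmatch(candidate) and candidate not in seen:
            seen.add(candidate)
            qids.append(candidate)
    return qids


if __name__ == "__main__":
    sample = {
        "head": {"vars": ["item"]},
        "results": {"bindings": [
            {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q42"}},
            {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q185187"}},
            {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q42"}},
        ]},
    }
    print("\n".join(qids_from_sparql_json(sample)))
```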

From this page and after logging in with your Wikidata account via OAuth, you will be able to choose a status of the mismatch, indicating what part of the data is wrong, and to access the Item on wikidata.org to edit the data if needed. Mismatch Finder does not perform any automatic editing on Wikidata.

Once the status is changed from “waiting for review” to another value, the mismatch will not appear in the list anymore.

You can also use the Mismatch Finder user script that will display an alert at the top of the Item pages on wikidata.org and a link to the Mismatch Finder tool to learn more about the potential mismatches. See Help:User scripts for how to enable the user script for your account.

Where does the information come from?

Information about the potential mismatches is stored in the Mismatch Store, a database separate from Wikidata where organizations, researchers and editors can upload lists of mismatches.

The Mismatch Store is hosted on Toolforge and its content can be accessed via an API. You can find more information about the database, how to get data from the API, and how to prepare and upload a mismatches file in this user guide.

We hope that the Mismatch Finder tool will help to build up feedback loops with data re-users to get them actively involved in improving the data on Wikidata. Feel free to try out the tool and let us know what you think on the talk page. You can also join us for an intro session and discussion at the upcoming Data Reuse Days.

We would especially like to thank Mike Peel and Marco Fossati for providing the first mismatches and real-world testing data for the Mismatch Finder to get us started. More will follow in the next days and weeks.

Cheers,

-Mohammed Sadat (WMDE) (talk) 12:50, 22 February 2022 (UTC)

Starting with a list of QIDs seems like an odd way to use such a tool. Is there a way to browse the mismatches, say, by property? Bovlb (talk) 03:52, 25 February 2022 (UTC)
yeah I found it confusing too. What's the use case the developers had in mind? Maybe a browser extension that runs it on the item I'm looking at would make sense? BrokenSegue (talk) 04:01, 25 February 2022 (UTC)
"You can also use the Mismatch Finder user script that will display an alert at the top of the Item pages on wikidata.org and a link to the Mismatch Finder tool to learn more about the potential mismatches. See Help:User scripts for how to enable the user script for your account"; above. --Tagishsimon (talk) 06:38, 25 February 2022 (UTC)
ah I missed that. thank you. BrokenSegue (talk) 19:06, 25 February 2022 (UTC)
Has anyone received an alert with the user script yet? I enabled it some days ago and I'm still looking for an item with a mismatch... An interface that would allow browsing the mismatches (like Mix'n'Match for example) would be more useful in my opinion. Ayack (talk) 17:20, 3 March 2022 (UTC)
yeah I enabled the script and as far as I can tell it hasn't done anything. BrokenSegue (talk) 18:34, 3 March 2022 (UTC)
Watch the video linked above at roughly the 2:00 min timestamp in order to see what it is supposed to look like. --MisterSynergy (talk) 18:44, 3 March 2022 (UTC)
Predefined sets would probably be checked in order, so if added their order ought to be randomized. More immediately useful would be if the tool could generate a set of 100 or 1000 random Q-items to start with. There's no need to check if the Q-ids are valid beforehand as I trust the checks are done at runtime. Infrastruktur (talk) 22:00, 25 February 2022 (UTC)
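The random-set idea in the comment above could be sketched roughly as follows. This is an illustration only and not something the tool provides: `random_qids` and `max_numeric_id` are hypothetical names, the upper bound must be supplied by the caller, and some generated Q-IDs may point to deleted or redirected Items.

```python
import random


def random_qids(count, max_numeric_id, seed=None):
    """Return `count` distinct Q-IDs drawn uniformly from Q1..Q<max_numeric_id>."""
    rng = random.Random(seed)  # a seed makes the sample reproducible
    numbers = rng.sample(range(1, max_numeric_id + 1), count)
    return [f"Q{n}" for n in numbers]


def shuffled(qids, seed=None):
    """Return a shuffled copy so a predefined set is not always checked in order."""
    copy = list(qids)
    random.Random(seed).shuffle(copy)
    return copy


if __name__ == "__main__":
    # The upper bound here is an arbitrary illustrative number.
    print("\n".join(random_qids(100, 1_000_000, seed=1)))
```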