Welcome

  Welcome to Wikidata, Habst!

Wikidata is a free knowledge base that you can edit! It can be read and edited by humans and machines alike and you can go to any item page now and add to this ever-growing database!

Need some help getting started? Here are some pages you can familiarize yourself with:

  • Introduction – An introduction to the project.
  • Wikidata tours – Interactive tutorials to show you how Wikidata works.
  • Community portal – The portal for community members.
  • User options – Including the 'Babel' extension, to set your language preferences.
  • Contents – The main help page for editing and using the site.
  • Project chat – Discussions about the project.
  • Tools – A collection of user-developed tools to allow for easier completion of some tasks.

Please remember to sign your messages on talk pages by typing four tildes (~~~~); this will automatically insert your username and the date.

If you have any questions, don't hesitate to ask on Project chat. If you want to try out editing, you can use the sandbox. Once again, welcome, and I hope you quickly feel comfortable here and become an active editor for Wikidata.

Best regards! MisterSynergy (talk) 06:16, 14 June 2018 (UTC)

Athletics results

Hey Habst, I saw that you are experimenting with athletics results, which is a great idea. However, I strongly suggest adjusting the model to be more in line with Wikidata habits. We often add results as qualifiers to participation data. If we look at Usain Bolt (Q1189), this would mean:

participant in
  athletics at the 2016 Summer Olympics – men's 100 metres
    stage reached: final
    ranking: 1
    race time: 9.81 second
    wind speed: 0.2 metre per second
  2017 World Championships in Athletics
    competition class: men's 100 metres (item to be created)
    stage reached: qualification event
    ranking: 1
    race time: 10.07±0.005 second

This means that a results (P2501) statement is no longer necessary, but that property wasn’t meant to be used in participant items (humans or teams) anyways. In case there are no items for individual events (say there is 2017 World Championships in Athletics (Q175508) but not “2017 World Championships in Athletics – Men's 100 metres”, see second example), one can use the competition item as the main value for the participant in (P1344) claim with an extra qualifier competition class (P2094), or alternatively (less precisely) sports discipline competed in (P2416), similar to your approach. The qualifiers country (P17) and point in time (P585) can be omitted, as they are redundant with information that belongs to the event item (e.g. athletics at the 2016 Summer Olympics – men's 100 metres (Q25397537)).

Any thoughts? —MisterSynergy (talk) 06:16, 14 June 2018 (UTC)
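For illustration, the two statements above could be represented roughly as follows. This is only a sketch of the proposed shape, not actual Wikibase JSON: the helper name `participation_claim` and the dictionary layout are hypothetical, and the result qualifiers are keyed by their labels because their property IDs are not given in this thread.

```python
from typing import Optional

# Property IDs taken from the discussion above.
PARTICIPANT_IN = "P1344"
COMPETITION_CLASS = "P2094"


def participation_claim(event_qid: str, results: dict,
                        competition_class: Optional[str] = None) -> dict:
    """Build a simplified 'participant in' claim with results as qualifiers.

    Hypothetical layout for illustration only; not the Wikibase JSON format.
    """
    qualifiers = {}
    # Only needed when no item exists for the individual event.
    if competition_class is not None:
        qualifiers[COMPETITION_CLASS] = competition_class
    qualifiers.update(results)
    return {"property": PARTICIPANT_IN, "value": event_qid,
            "qualifiers": qualifiers}


# First example: an item for the individual event exists.
olympic_final = participation_claim(
    "Q25397537",
    {"stage reached": "final", "ranking": 1,
     "race time (s)": "9.81", "wind speed (m/s)": "0.2"},
)

# Second example: only the competition item exists, so the event is
# expressed through the competition class qualifier.
worlds_heat = participation_claim(
    "Q175508",
    {"stage reached": "qualification event", "ranking": 1,
     "race time (s)": "10.07"},
    competition_class="men's 100 metres (item to be created)",
)
```

Note that neither claim carries country (P17) or point in time (P585), since those belong to the event item.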

Hey, @MisterSynergy, thanks so much for this! I definitely want to be as standard as possible before I start doing this on a larger scale, and I will implement the participant in (P1344) schema on my bot first. Oftentimes I won't be able to WikiData-ify the meet name automatically -- in that case, I'll just set that part to "no value" for now. If I do know the meet's WD item, then I'll definitely have the bot go into that item and add country / date info -- but if not, I'll keep the country / date. Although event-specific items like athletics at the 2016 Summer Olympics – men's 100 metres (Q25397537) exist for the Olympics and some international championships, they don't exist for most meets, so I think I'll continue using sports discipline competed in (P2416). But thank you for letting me know about competition class (P2094); a string qualifier is very handy and I'll probably use that to include some of the "other" performance info that the bot can't automatically WikiData-ify, like competition name and heat number. Do you think this is OK? --Habst (talk) 14:33, 14 June 2018 (UTC)
Yeah, I understand the problem that many items of the type athletics meeting (Q11783626) and individual events thereof are not yet well organized. My primary field of work is Wikidata:WikiProject Rowing, where similar problems existed for a while. I took the approach of getting most of these meta items sorted first, in order to be able to add full results at a later stage (still planned, but I’m almost there fortunately). This includes all Wikidata items related to the sport: humans (occupations and identifiers to databases, no duplicates), regattas (equivalent to meetings in athletics) and their individual events, competition classes, and optionally venues. Nevertheless, it is certainly possible to some extent to start with results before all other items are organized.
In general, we don’t have a compulsory sports ontology, unfortunately, but we try to keep the models for different types of sports as similar as possible. This would make it easier to compare and contrast data, and maybe also to re-use it later in infoboxes etc. I plan to write some formal documentation at Wikidata:WikiProject Sports later, which is not in good shape at the moment.
Anyways, if you need help or input, feel free to ping me. I’m already watching your bot proposal page. —MisterSynergy (talk) 20:10, 14 June 2018 (UTC)
Hi Habst. Good work on setting up the bot! We've still got some basic questions outstanding on the athletics data model that would be good to sort out before adding lots of data (e.g. reaction times, lane draws, DNF). Please see Wikidata:WikiProject Olympics/Competition formats for info on what I've collated thus far. Do you want to add comments to that/propose some of the missing properties? Note that we have a pretty big barrier in the lack of a suitable HH:MM:SS datatype at Wikidata (see phabricator).
I've noticed that the cycling editors have a well-developed data model, but I'm conscious that theirs is very sport-specific. For athletics, I think it is a good idea to start thinking of broader data concepts that can apply to other sports too (for example, "lane draw" could apply to athletics, swimming and cycling, or we could make it a broader "starting order", which would incorporate turn-based sports like archery and equestrian too). It is inevitable (given the prominence of the Olympics) that many people will want to extract data on many sports - the more complex the data models (and the greater the number of properties), the harder the task becomes and the less useful the data will be. Using similar designs across sports also means we can leverage the effort of editors with different interests like MisterSynergy, as any tools developed should be relatively easy to convert to use across sports. Sillyfolkboy (talk) 20:26, 14 June 2018 (UTC)
Hey guys, thanks again for your helpful comments.
@MisterSynergy, thanks for your overview. I've taken a look at your work here and it's really impressive, and certainly a model for what I hope to achieve with athletics. My hope is that by getting something accurate but carefully limited in scope out there, we can begin to work on templating for Wikipedia and improve as we go along.
@Sillyfolkboy, thanks for your advice. I totally agree about those two links and have studied them a lot -- that's why I made the latest comment on the Phabricator ticket in April and I added the Olympics project competition format page link to the Athletics WP home page last month. For the Phabricator ticket, I agree that it's a big problem for usability. I actually tried to hack the Wikibase source code and fix it myself once but didn't get very far... but I don't consider it a roadblock per se, because I think it will be easy for a bot to auto-convert pure quantity times to the new datatype once it gets developed, and of course doing the HH:MM:SS to SSSS conversion is hard for humans but not computers. In terms of the questions, I think they totally make sense for the Olympics but a lot of them don't apply to the vast majority of meets in the IAAF database -- for example, in my experience most meets outside the Olympics, WCs and Diamond Leagues don't have any reaction time or even lane draw info available (similar for full decathlon and field event scorecards), and even for Olympic competitions that info isn't in the IAAF athlete race history listings (only in the more detailed per-meet listing). That's not to say these qualifiers aren't valuable to add, and they certainly can be added easily later when we have a standard way of retrieving them -- it's just that they aren't in the scope of wikiTrackBot as it stands, so I don't think WTB conflicts with that format specification.
I've just updated the bot source code to comply with as much of this as possible, and I will try my best to be super vigilant about being standards / spec-compliant going forward. I really hope that this will be successful and help coverage of athletics on Wikidata! --Habst (talk) 21:16, 15 June 2018 (UTC)
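As a minimal sketch of the HH:MM:SS to seconds conversion mentioned above (the function name `mark_to_seconds` is hypothetical and this is not taken from the bot's source), colon-separated marks can be folded into a plain seconds quantity. `Decimal` is used here so that hundredths of a second are not distorted by binary floating point.

```python
from decimal import Decimal


def mark_to_seconds(mark: str) -> Decimal:
    """Convert 'H:MM:SS.ss', 'M:SS.ss' or 'SS.ss' into seconds."""
    seconds = Decimal(0)
    # Each colon-separated field is worth 60 times the field to its right.
    for part in mark.strip().split(":"):
        seconds = seconds * 60 + Decimal(part)
    return seconds


sprint = mark_to_seconds("9.81")        # already in seconds
marathon = mark_to_seconds("2:01:39")   # hours:minutes:seconds
```

The reverse conversion into a future time datatype would be the same arithmetic run backwards, which is why storing plain quantities now should not block a later migration.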

Representation of Wikidata at the Wikimedia movement strategy process

Hi Habst, I'm contacting you because I would like your support and your comments on my proposal to represent the Wikidata community at the Wikimedia movement strategy process. I'm contacting you in private because you are a member of the Wikidata Community User Group and I thought that this could be relevant for you.--Micru (talk) 17:40, 19 June 2018 (UTC)

Thanks, I responded. --Habst (talk) 14:01, 22 June 2018 (UTC)

WikiTrackBot changing IAAF-IDs to deprecated

Hello Habst, currently WikiTrackBot is setting a lot of IAAF-IDs to deprecated:

Is there any background information about this process? For example: why are the IDs deprecated? Is there a mapping of old IDs to new IDs? Which and how many entries are affected? What about articles in various languages that also use these IDs? How can we inform maintainers in those languages about the affected entries, so that the IDs can be removed or replaced in the articles?

Thanks a lot! M2k~dewiki (talk) 23:06, 31 October 2023 (UTC)

BTW: I just found out about this because a lot of old articles appeared in de:Kategorie:Wikipedia:World Athletics ID fehlt auf Wikidata (a category for articles where the IAAF-ID is found in the article but not in the Wikidata item); these articles used a deprecated IAAF-ID. Several invalid IDs have been removed from the articles. M2k~dewiki (talk) 23:14, 31 October 2023 (UTC)
Hi @M2k~dewiki, thank you for your question. Yes, by my count there are about 33,000 items with IAAF-IDs on Wikidata, and about 24,000 of them were using deprecated IDs. So those 24,000 IDs will need to be marked as deprecated and replaced.
At some point within the last ~3 years, World Athletics started changing from 5-digit IDs (from 1 to ~81090) to 8-digit IDs (starting at 14164600). I believe the reason was that the smaller IDs were from the defunct All-Athletics database which they initially bought out, while the longer IDs were generated by World Athletics so they could have their own system.
Originally, there were redirects set up from the small (old) IDs to the new IDs. For example, #210311 redirects to #14208053. But that system isn't perfect, because the redirect only works for the HTTP call to the athlete webpage, and the redirect does not work if you are trying to call the API directly (i.e. via graphql endpoints). So for that reason alone, we should replace the old IDs with the new IDs via redirect and mark them as deprecated.
Also, there is another problem. Some of these "small" IDs actually no longer work at all -- for example Margarita Martirena (Q15767868)'s value #64382 is a 404 and it does not redirect, so it has to be deprecated and manually replaced. Also there are some cases like Mercy Kuttan (Q6818771) where the small ID does redirect, but the redirected ID is a 500 error so it is useless. I don't know why some of the small IDs redirect properly and some do not, but it is worrying and it makes me think that we cannot depend on these redirects always being there, which is why we need to deprecate them ASAP.
So, my bot will be testing every "small" IAAF-ID (less than 6 characters) to see if it redirects. If it does, it will replace the ID with the new redirected ID and deprecate the small one. If the redirect is broken, then it will just be deprecated, and humans will have to find a replacement -- but this isn't new, and it just means the ID was broken all along so it is good that we are just "catching" the problem now.
The only advice I would have for maintainers is that these category errors (where the IAAF ID template is in the article but there is no non-deprecated value in Wikidata) will become more common, but only because my bot is finding the broken IDs; the problem always existed and is just being pointed out now. Is there any way you would recommend I inform people about this? --Habst (talk) 00:00, 1 November 2023 (UTC)
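The decision logic described above could be sketched roughly as follows. This is an illustration under stated assumptions, not wikiTrackBot's actual code: `Plan`, `is_legacy_id`, `plan_update` and the `resolve` callable are hypothetical names, and `resolve` stands in for whatever HTTP check follows the redirect (returning `None` when the redirect is missing or leads to an error page). The default length threshold of 8 follows the new 8-digit ID format; it is a parameter because the comment above mentions fewer than 6 characters.

```python
from typing import Callable, NamedTuple, Optional


class Plan(NamedTuple):
    deprecate: str        # ID whose statement gets deprecated rank
    add: Optional[str]    # new ID to add, if the redirect resolved
    needs_human: bool     # True when a replacement must be found by hand


def is_legacy_id(athlete_id: str, new_length: int = 8) -> bool:
    """Treat purely numeric IDs shorter than the new format as legacy."""
    return athlete_id.isdigit() and len(athlete_id) < new_length


def plan_update(athlete_id: str,
                resolve: Callable[[str], Optional[str]]) -> Optional[Plan]:
    """Decide what to do with one ID; returns None for new-style IDs."""
    if not is_legacy_id(athlete_id):
        return None
    new_id = resolve(athlete_id)
    if new_id is None:
        # Broken redirect: deprecate only, leave replacement to humans.
        return Plan(deprecate=athlete_id, add=None, needs_human=True)
    return Plan(deprecate=athlete_id, add=new_id, needs_human=False)


# Stand-in for the redirect check, using the example from this thread.
known_redirects = {"210311": "14208053"}
working = plan_update("210311", known_redirects.get)
broken = plan_update("64382", known_redirects.get)
```

Keeping the network check behind `resolve` means the same rule can be tested offline, and a redirect that resolves to an error page can simply be reported as `None`.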
On the German-language Wikipedia, the relevant users are @Sol1 and @Loper12321; on the Spanish-language Wikipedia, @Leonprimer.
There are also portals like d:Q8289031, where a notice can be posted on the discussion pages. M2k~dewiki (talk) 00:07, 1 November 2023 (UTC)
Also see
M2k~dewiki (talk) 00:14, 1 November 2023 (UTC)
Thank you for the hints. I've notified @Leonprimer here: es:Usuario discusión:Leonprimer#WikiTrackBot will be changing old IAAF-IDs to deprecated and I've notified the English Wikipedia WikiProject Athletics here: en:Wikipedia talk:WikiProject Athletics#wikiTrackBot will be deprecating many older {{World Athletics}} IDs
I hope this will be sufficient. Let me know if there are any concerns; I will be resuming the bot run soon. --Habst (talk) 02:38, 1 November 2023 (UTC)

IAAF duplicates

Hello Habst, IAAF duplicates can be found at

Yesterday I cleaned up all duplicates; today there are 78 new duplicates. M2k~dewiki (talk) 16:56, 1 November 2023 (UTC)

Thank you for the report. I am happy to say that the bot run is finished, as this query returns (or should return) no results now:
SELECT ?item ?itemLabel ?id WHERE
{
  ?item wdt:P1146 ?id.
  FILTER(STRLEN(?id) < 8).
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(xsd:integer(?id))
I will go through that list and merge items as appropriate. --Habst (talk) 17:37, 1 November 2023 (UTC)

Incorrect use of instance of (P31)

Hello –

This statement on "record" (Q1241356) was incorrect, and I have removed it. Errors with instance of (P31) and subclass of (P279) lead to numerous false inferences. When using these two properties, check the description of your value item, and make sure your statement follows Help:Basic membership properties. (In this case, an instance of award (Q618779) would be something like Heisman Trophy (Q1035067) -- record (Q1241356) is a class, and not all records result in awards.) There is often another property that can accurately express the relationship you had in mind; in this case, contributing factor of (P1537) works better. You can search through available properties here. Thanks! Swpb (talk) 14:31, 18 January 2024 (UTC)

@Swpb, thank you. In my case I wanted to indicate that Boureima Kimba (Q24203097)'s personal best (P2415) in the 200 metres was a national record (Q24034388) -- see Q24203097#P2415. I tried to do it with an award received (P166) qualifier, but it complained that national record (Q24034388) was not a subclass of award (Q618779). It seems like "award received" may not be the correct qualifier, so I used has effect (P1542) instead. --Habst (talk) 14:52, 18 January 2024 (UTC)

Millrose Games

Hi there! I noticed your message on the ontology for 2024 Millrose Games (Q124526525). I've followed the format of 2020 Summer Olympics (Q181278) as that is an established standard. :) Sillyfolkboy (talk) 21:50, 17 March 2024 (UTC)

@Sillyfolkboy, hello and thanks for checking in! I'll follow that standard (namely "instance of" parent meeting) going forwards. --Habst (talk) 22:06, 17 March 2024 (UTC)