Wikidata talk:Events/Data Modelling Days 2023


Data Modelling Days, 30 November-2 December 2023

Discussions related to the Data Modelling Days are welcome here. If you have any questions or need support (for example, to propose a session), feel free to contact Lea Lacroix (WMDE) directly (or @Auregann on Telegram).

Template for session proposal


Note: the deadline to propose sessions is now over. You can still choose to schedule a session yourself in the open program on Sunday 3rd, but please note that we will not include video recording, facilitation or note-taking on that day. Here's a template you can use for proposing a session for the Data Modelling Days. Feel free to copy its content and create a new section below (use the session title as the section title). Please note that sessions are not automatically accepted: because we have limited slots in the schedule, we will make a selection of sessions that can make it to the program of the event.

  • Session title:
  • Format:
  • Speaker(s) or facilitator(s):
  • Short description of the session:
  • Audience:
  • Suggestions of time and date:
  • Language:

Suggestions of topics


If there's a topic that you would love to see covered at the event, but you don't feel like running a session yourself, feel free to add it below. You can also ping people whom you would like to talk about this topic. And of course, feel free to take inspiration from this list to propose a session yourself!

(If someone wants to work a real example during a session, I imagine it could serve well as that. I suspect for some sessions, it would be practical to do a real task while demonstrating how to do stuff.)
--RudolfoMD (talk) 06:15, 29 November 2023 (UTC)
  • Lea Lacroix (WMDE) A question by Tsaag Valren just today (in French) that could be fun to talk about: cloning. There are a bunch of items to denote clones, and director / manager (P1037) / relative (P1038) could be used, but … one of the "clone" items is about cloned cells, whereas the use cases are whole organisms or fictional characters; there is one item about "cloning"; and there is a question about how to express that one is the original and the other is the copy. Is it better to create a new property? Also, to satisfy the constraint, the "clone" item has to be made a kind of parenthood relationship. author  TomT0m / talk page 17:48, 30 November 2023 (UTC)
@TomT0m, Tsaag Valren: Sounds like a perfect topic for one of the "Data Modelling Clinic" sessions we have during the event, feel free to join one and raise the question :) Lea Lacroix (WMDE) (talk) 17:54, 30 November 2023 (UTC)

Session proposal: Using Wikidata to Bootstrap a new Knowledge Graph: Selecting Relevant Entities using Analogical Pruning

  • Session title: Using Wikidata to Bootstrap a new Knowledge Graph: Selecting Relevant Entities using Analogical Pruning
  • Format: Lightning talk (10min)
  • Speaker(s) or facilitator(s): Pierre Monnin
  • Short description of the session: Knowledge Graph Construction (KGC) can be seen as an iterative process starting from a high quality nucleus that is refined by knowledge extraction approaches in a virtuous loop. Such a nucleus can be obtained from knowledge existing in an open KG like Wikidata. However, due to the size of such generic KGs, integrating them as a whole may entail irrelevant content and scalability issues. Selecting relevant entities is also challenging, e.g., due to different representation granularity in different communities. In this talk, we discuss the motivation to select parts of interest from Wikidata to bootstrap a new KG, the associated issues, and present our approach to select relevant entities based on analogical reasoning.
  • Audience: Everyone
  • Suggestions of time and date: Preferably Friday, December 1st
  • Language: English
Hello @Pmonnin:, thanks a lot for your proposal! I'm thinking of scheduling a lightning talks session on Friday, December 1st at 10:00 UTC (11:00 in France), would that time work for you? Lea Lacroix (WMDE) (talk) 09:43, 2 November 2023 (UTC)
Hello @Lea Lacroix (WMDE), thank you for your message. That time would be perfect for me! Pmonnin (talk) 10:24, 2 November 2023 (UTC)
✅ Perfect! I added the LT session on Friday at 10:00 UTC with your proposal. Lea Lacroix (WMDE) (talk) 10:51, 2 November 2023 (UTC)
That's perfect, thank you! Looking forward to sharing this work with the Wikidata community! Pmonnin (talk) 09:40, 3 November 2023 (UTC)
Hi @Pmonnin:, thanks again for joining the Data Modelling Days! When you have a minute, could you upload your slides in PDF on Wikimedia Commons, with the category Data Modelling Days 2023 presentations? If you encounter any issues, please let me know. Many thanks in advance! Lea Lacroix (WMDE) (talk) 07:15, 5 December 2023 (UTC)
Hi @Lea Lacroix (WMDE). Thanks again for having me at the Data Modelling Days. I would love to upload my slides in PDF but they are always flagged as potentially unconstructive. How can I solve this issue? Thanks in advance for your help. Pmonnin (talk) 08:31, 5 December 2023 (UTC)
Hi @Pmonnin:, right, it's a new limitation on Commons that apparently prevents people who have not uploaded other files before from uploading PDF files. If you share the file with me, I can upload it on your behalf. Lea Lacroix (WMDE) (talk) 06:19, 6 December 2023 (UTC)

Session proposal: Modelling research expeditions

  • Session title: Proposal for modelling research expeditions
  • Format: Modelling challenge (50min, work together on a specific issue)
  • Speaker(s) or facilitator(s): Ambrosia10
  • Short description of the session: An international collaboration of Wikidata editors and people affiliated with the Biodiversity Information Standards community (TDWG) and natural history institutions is aiming to create an agreed schema for the modelling of research expeditions in Wikidata. The group has been working on this schema and wishes to discuss it with the wider Wikidata community as well as obtain feedback on improving the schema.
  • Audience: Everyone
  • Suggestions of time and date: Preferably Thursday at 16:00 UTC (this is the normal meeting time for our group)
  • Language: English

- Ambrosia10 (talk) 14:37, 26 October 2023 (UTC)

Hi @Ambrosia10:, thanks a lot for your idea of inviting the Biodiversity group at the event, I really like it!
I don't think I can fit a specific session on research expeditions at the date and time that you asked for, but here's what I can offer instead: at that time and date, we run a "Data Modelling Clinic" session, where everyone can come and ask for input on modelling. We would dedicate this session to biodiversity topics, so the group would have the chance to present their work, but other people would also have time to present their modelling challenges related to biodiversity. What do you think? :) Lea Lacroix (WMDE) (talk) 09:22, 2 November 2023 (UTC)
Hi Lea, that would be great! The group and I want other Wikidata folk to run their eye over our proposed schema to double check we've thought of everything. It may be there are others wanting the same reassurance so a biodiversity themed session would be perfect. Ambrosia10 (talk) 16:52, 2 November 2023 (UTC)
✅ Perfect! The session is scheduled. Lea Lacroix (WMDE) (talk) 07:34, 6 November 2023 (UTC)

Session proposal: modeling protected areas

  • Session title: Proposal for modeling protected areas
  • Format: talk
  • Speaker(s) or facilitator(s): Olea
  • Short description of the session: We really need to standardize protected areas in Wikidata into a single common practice. Here we'll propose an extrapolated model of heritage designation (P1435) and intangible cultural heritage status (P3259).
  • Audience:
  • Suggestions of time and date: none
  • Language: English

--Ismael Olea (talk) 10:24, 4 October 2023 (UTC)

Hi @Olea:, thanks a lot for your suggestion! Would Saturday, December 2nd at 11:00 UTC (12:00 in Spain) work for you?
@Susannaanas: Would you be interested in joining this session at the Data Modelling Days and talking about modelling challenges on living heritage?
@Dario Crespi (WMIT):, would you like to join and talk about modelling challenges when working on historical monuments? This way, we could broaden the session a bit to "Modelling heritage", I think it would make it even more interesting :) Let me know what you think! Lea Lacroix (WMDE) (talk) 09:50, 2 November 2023 (UTC)
@Lea Lacroix (WMDE) 👌 --Ismael Olea (talk) 10:39, 2 November 2023 (UTC)
✅ Perfect! I scheduled it at the time mentioned above. Let's see if other people want to join so we can expand the topic. Lea Lacroix (WMDE) (talk) 07:34, 6 November 2023 (UTC)
OK! – Susanna Ånäs (Susannaanas) (talk) 15:17, 6 November 2023 (UTC)

Session proposal: Conflations and duplications

  • Session title: Conflations and duplications
  • Format: discussion (50 min)
  • Speaker(s) or facilitator(s): Epìdosis
  • Short description of the session: I would like to present some data about conflations and duplications (a paper I have written about this issue will be available before the end of October) and to facilitate a discussion about possible ways to mitigate this problem in its various aspects.
  • Audience: everyone
  • Suggestions of time and date: Friday; or Saturday, after 13 UTC
  • Language: English

--Epìdosis 13:47, 4 October 2023 (UTC)

Ciao @Epìdosis:, thanks a lot for proposing sessions! Would Friday 1st morning work for you? We could do for example "Conflations and duplications" at 09:00 UTC (10:00 for us in CET), and the second one at 11:00 UTC, to give you an hour to rest in between :)
By the way, would you be open to rephrasing the title of your second section to highlight Autofix a bit more? I think people would be attracted by the idea of discovering this tool.
Let me know what you think! Lea Lacroix (WMDE) (talk) 09:39, 2 November 2023 (UTC)
@Lea Lacroix (WMDE): Sure, the timings are fine. For the title of the second, how about "A better way to enforce a data model: suggestions to improve Autofix"? Thanks as always! --Epìdosis 09:50, 2 November 2023 (UTC)
✅ Perfect, both sessions are scheduled here. Feel free to add more details and the link to your paper about conflations if it's published. Lea Lacroix (WMDE) (talk) 10:22, 2 November 2023 (UTC)

Session proposal: Lack of an effective way to enforce a data model

  • Session title: Lack of an effective way to enforce a data model
  • Format: discussion (50 min)
  • Speaker(s) or facilitator(s): Epìdosis
  • Short description of the session: as a follow-up of Wikidata:Events/Data Quality Days 2022/Modeling data, I would like to discuss more deeply the third aspect I highlighted (i.e. enforcement of data models); specifically, I would like to present the main tool currently used for enforcing data models ({{Autofix}}), focusing on its qualities, its limitations and its issues, and to give some proposals for a new tool to replace it.
  • Audience: everyone (all my ideas are already in Wikidata talk:Events/Data Quality Days 2022/Modeling data#The need for an improved autofix and, more briefly, in phab:T341405, which can easily be read; if the participants have already read these pages, I can speak for just a few minutes and leave nearly all the time for discussion)
  • Suggestions of time and date: Friday; or Saturday, after 13 UTC
  • Language: English

--Epìdosis 13:53, 4 October 2023 (UTC)

I agree that something better is needed to enforce data models. What I think is needed is a way to enforce the replacement not of individual values but of all values that are instances of a class, such as replacing an instance of (P31) statement whose value is an instance of ship class (Q559026) with a vessel class (P289) statement with that same value. Another thing lacking is a way to add in consequences, such as the intended consequence of is metaclass for (P8225). Another is a way to state and enforce disjointness.
These would all be very helpful in enforcing the data model around ship (Q11446). Peter F. Patel-Schneider (talk) 23:17, 10 October 2023 (UTC)
✅ Scheduled. Lea Lacroix (WMDE) (talk) 07:46, 6 November 2023 (UTC)
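
The class-based replacement described in this thread can be illustrated with a minimal Python sketch. It is not how {{Autofix}} works, and it is not an existing Wikidata tool: the function name, the representation of statements as (property, value) tuples, and the placeholder ID `Q_SOME_SHIP_CLASS` are all assumptions made for the example.

```python
# Hypothetical sketch of a class-based rewrite rule; not an existing Wikidata tool.

def rewrite_by_class(statements, class_instances, source_prop, target_prop):
    """Move every value that belongs to `class_instances` from source_prop to target_prop.

    statements: list of (property_id, value_id) tuples for one item.
    class_instances: set of item IDs known to be instances of the class.
    """
    rewritten = []
    for prop, value in statements:
        if prop == source_prop and value in class_instances:
            # The value is a member of the class, so the statement changes property.
            rewritten.append((target_prop, value))
        else:
            rewritten.append((prop, value))
    return rewritten


# Placeholder ID standing in for some item that is an instance of ship class (Q559026).
ship_class_instances = {"Q_SOME_SHIP_CLASS"}

item_statements = [
    ("P31", "Q11446"),             # instance of: ship (left untouched)
    ("P31", "Q_SOME_SHIP_CLASS"),  # instance of: a specific ship class (should be P289)
]

fixed = rewrite_by_class(item_statements, ship_class_instances, "P31", "P289")
print(fixed)
```

The point of the sketch is that the rule is keyed on class membership rather than on a fixed list of values, so new ship classes would be covered without editing the rule.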

Session proposal: Alternate reference model

  • Session title: Alternate reference model
  • Format: Modelling challenge (50min, work together on a specific issue)
  • Speaker(s) or facilitator(s): ArthurPSmith
  • Short description of the session: Wikidata references are separately attached to each statement. This is fine when the source(s) for each statement differ, but this is often not the case, and items can get loaded down with many repetitions of the same reference. There are a variety of gadgets to simplify adding these repeated references. But wouldn't it be better to just state the reference information once, then refer back to it by a short identifier? Wikipedia (most languages) does this now, though we don't necessarily want to follow that model with a separated reference section. Serious problems caused by the current state will be discussed, and options for improving the situation raised with the hope of the session selecting a preferred path forward.
  • Audience: anyone who cares about references in Wikidata or the size of items and our triple count
  • Suggestions of time and date: Any day, preferably after 12:00 UTC

ArthurPSmith (talk) 17:21, 10 October 2023 (UTC)

It is surely a very important point! I found some discussion about the point in this old ticket, phab:T76233, which was closed after the creation of DuplicateReferences; another one, phab:T159191, is mainly proposing something like DuplicateReferences itself, so IMHO it could be closed; in fact, I think we really need a new ticket for this issue, which is closely related to the bigger problem of data redundancy, which has the negative effect of uselessly making items bigger (which is a problem for WDQS). --Epìdosis 17:38, 10 October 2023 (UTC)
WDQS does use the same subentity for identical references. Midleading (talk) 01:19, 15 October 2023 (UTC)
Just checked, WDQS shares the same reference among different entities. For example, there are 14,750K statements with "imported from Wikimedia project (P143)=English Wikipedia (Q328)" reference (SELECT (COUNT(*) AS ?count) WHERE { ?importFromEnwiki prov:wasDerivedFrom wdref:fa278ebfc458360e5aed63d5058cca83c46134f1. }). There are no duplicated references in WDQS. Midleading (talk) 08:14, 15 October 2023 (UTC)
Hello @ArthurPSmith:, thanks for your proposal! We could make it fit on Friday 1st between 13:00 and 16:00 UTC, or Saturday 2nd between 14:00 and 18:00 UTC. Let me know what would work best for you.
Also, it gives me similar vibes as the "How to make Wikidata smaller" session run by @Mahir256: at WikidataCon 2023 (video). Maybe there's something to do here, like having Mahir talk at the session as well, broadening the topic a bit? Or suggesting Mahir's session as a video to watch beforehand? Let me know what you think! Lea Lacroix (WMDE) (talk) 10:02, 2 November 2023 (UTC)
@Lea Lacroix (WMDE): I think I'd prefer before 16:00 on Friday; 13:00, 14:00 or 15:00 would be fine. On Mahir's talk - I'm afraid I've had other meetings going on and haven't been able to check in on this year's WikidataCon. Thanks for the link, definitely related. Mahir points out one possible solution (changing the JSON representation) but I had a couple of other ideas that might be easier to implement. @Mahir256: what do you think, are you interested in focusing on this one question together? ArthurPSmith (talk) 16:14, 2 November 2023 (UTC)
✅ Thanks! I tentatively scheduled it on Friday at 15:00 UTC. Lea Lacroix (WMDE) (talk) 07:59, 6 November 2023 (UTC)
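
The point made earlier in this thread, that WDQS stores each distinct reference once and lets many statements point to it, can be illustrated with a small Python sketch. This is only a toy model of content-addressed references: the hashing scheme, the function names and the statement IDs are assumptions made for the example and do not reproduce how Wikibase actually computes its `wdref:` hashes.

```python
import hashlib
import json


def reference_id(snaks):
    """Return a stable identifier derived from the reference content.

    Illustrative only: Wikibase uses its own hashing of reference snaks.
    """
    canonical = json.dumps(sorted(snaks.items()), separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def build_graph(statements):
    """Store each distinct reference once and link statements to it by ID."""
    refs = {}    # reference ID -> reference content
    links = []   # (statement ID, reference ID), like prov:wasDerivedFrom edges
    for statement_id, snaks in statements:
        ref = reference_id(snaks)
        refs.setdefault(ref, snaks)
        links.append((statement_id, ref))
    return refs, links


# Hypothetical statement IDs; P143/Q328 is the "imported from English Wikipedia" reference.
statements = [
    ("statement-a", {"P143": "Q328"}),
    ("statement-b", {"P143": "Q328"}),
    ("statement-c", {"P143": "Q8447"}),
]

refs, links = build_graph(statements)
print(len(refs), len(links))  # two distinct reference nodes, three links
```

In this model, the repetition lives in the per-item representation of statements, while the graph holds one node per distinct reference.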

Session proposal: Wikibase federation

  • Session title: Wikibase federation
  • Format: Discussion (25min)
  • Speaker(s) or facilitator(s): ArthurPSmith
  • Short description of the session: (this may be out of scope) Wikicite and some other projects would like to either add much more data to Wikidata, or else to have their own Wikibase that is closely federated. The Wikibase route would mean that many item values (authors, topics, author affiliations, journals, publishers, etc.) should be linked directly to their Wikidata values, rather than duplicated. Can Wikibase be extended to allow external (presumably cached) references for item values? Some ideas for how to do this will be raised, and the discussion should help refine next steps.
  • Audience: Wikicite participants, people interested in federation or Wikibase technical infrastructure
  • Suggestions of time and date: Any day, preferably after 12:00 UTC.

ArthurPSmith (talk) 18:31, 10 October 2023 (UTC)

Hi @ArthurPSmith:, I'd love to find a slot for this session in the Data Modelling Days program. What do you think of Saturday, December 2nd, at 14:00, 15:00 or 16:00 UTC? Let me know what would work best for you (and possibly ask other people from the Wikicite group for their preference). I will then take care of scheduling the session at the requested time. Lea Lacroix (WMDE) (talk) 09:53, 2 November 2023 (UTC)
14:00 would be best, thanks! ArthurPSmith (talk) 15:51, 2 November 2023 (UTC)
Perfect, I scheduled it! There's still a possibility that we shuffle sessions around to accommodate other sessions; if that happens I'll let you know. Lea Lacroix (WMDE) (talk) 16:29, 2 November 2023 (UTC)
I'm fine with the other times too, but I do have another commitment later on that day, so finishing earlier is better for me. ArthurPSmith (talk) 18:38, 2 November 2023 (UTC)
✅ Scheduled. Lea Lacroix (WMDE) (talk) 07:46, 6 November 2023 (UTC)

Session proposal: Hypotheses

  • Session title: Hypotheses
  • Format: Modelling challenge (50min, work together on a specific issue)
  • Speaker(s) or facilitator(s): Daniel Mietchen
  • Short description of the session: Hypotheses, theories, lemmas, theorems and similar concepts are currently modeled in a rather inconsistent fashion. The aim of the session is to share an overview of the current situation and to explore how such concepts could be described in Wikidata terms, and how to delineate them from one another.
  • Audience: basic knowledge of Wikidata or hypotheses-related concepts is needed; advanced knowledge of either or both can be very helpful.
  • Suggestions of time and date: Any time on Dec 1 or 2.
  • Language: The session will be facilitated in English. Comments and other contributions in any language are welcome.

-- Daniel Mietchen (talk) 23:27, 10 October 2023 (UTC)

Hi @Daniel Mietchen:, thanks for your proposal! I'm putting it on hold for now, waiting to see if we could fit it in the program. If we cannot, you could still talk about modelling hypotheses during one of our "Data Modelling Clinic" sessions where everyone can take 5-10min to present a data modelling challenge and get input. I'll keep you updated. Lea Lacroix (WMDE) (talk) 10:06, 2 November 2023 (UTC)
Hello @Daniel Mietchen:, as we're trying to fit everything into the tight schedule, I would like to offer you the chance to talk about modelling serialized fiction for ~10min during one of our "Data Modelling Clinic" slots. We have two of them available:
  • Friday, December 1st at 17:00 UTC
  • Saturday, December 2nd at 13:00 UTC
Let me know which one would work best for you, and I'll add a slot for you! Thanks, Lea Lacroix (WMDE) (talk) 07:30, 20 November 2023 (UTC)
@Lea Lacroix (WMDE): Both slots currently work for me, with a preference for the Friday session. I hope, though, that I can stick to my original topic of modelling hypotheses, rather than serialized fiction (of which I don't know much). Thanks, --Daniel Mietchen (talk) 10:54, 20 November 2023 (UTC)
Upsie, clearly a copy/paste mistake :D Thanks for your swift reply, I added the topic to Friday's session. ✅ Lea Lacroix (WMDE) (talk) 06:11, 21 November 2023 (UTC)

Session proposal: Why are there no men or women in Wikidata?

  • Format: discussion
  • Speaker(s) or facilitator(s): Peter Patel-Schneider
  • Short description of the session: As of 11 October 2023 there are no items that are instance of (P31) woman (Q467) and no items that are correctly instance of (P31) man (Q8441). This is the case even though there are many men and women that are represented by items in Wikidata. Why is this so? How can this be discovered? How did this come to be? How can one find the items that do in fact belong to man (Q8441) or woman (Q467)? None of these questions are answered on the pages for man (Q8441) or woman (Q467) or on their talk pages. There are many other examples where the obvious way of retrieving information from Wikidata produces incorrect results and there is no obvious way of finding out why. Modelling in Wikidata would be much better if there were fewer of these situations, but how can this be done? This session would be devoted to answering this question.
  • Audience: Anyone interested in making Wikidata more usable by non-experts.
  • Suggestions of time and date: No earlier than 12:00 UTC (07:00 EST).
  • Language: English

Peter F. Patel-Schneider (talk) 14:46, 11 October 2023 (UTC)

Hi @Peter F. Patel-Schneider:, thanks for raising the topic of modelling gender on Wikidata. Your proposal actually made us think about the idea of having a broader discussion on this quite sensitive topic, where several people who performed research on it could give us an overview of how gender is modelled on Wikidata, the history of the decisions that were made, and how we could improve it. The speakers are currently preparing their proposal, I will let you know when it is scheduled, and you will of course be free to join as a participant. Best, Lea Lacroix (WMDE) (talk) 10:11, 2 November 2023 (UTC)
Sorry for not responding earlier. I somehow missed your comment.
My proposal is not about gender per se, but is instead about how modelling is done in Wikidata. As a user of Wikidata I want to access the information in Wikidata using my preferred way of modelling. So I expect to be able to find women in Wikidata as instances of woman (Q467), or plumbers as instances of plumber (Q252924). But this is not possible, as the information is in Wikidata but represented in a different way. If woman (Q467) were not a class in Wikidata then I would have no problem, but as it is, I expect to be able to use it to access the relevant information in Wikidata.
Hopefully there will be an effort to produce a better rewrite mechanism and this effort could also provide information on what information has been rewritten, leading towards a solution of this problem. Peter F. Patel-Schneider (talk) 14:04, 5 December 2023 (UTC)
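
The mismatch discussed in this thread can be made concrete with a short Python sketch that builds the two SPARQL queries involved. It assumes the common Wikidata convention, not stated in this thread, that people are modelled as instance of (P31) human (Q5) with sex or gender (P21) set to a value such as female (Q6581072); the function names are hypothetical, and the second query only approximates woman (Q467), since it does not restrict results to adults.

```python
def instances_query(class_id, limit=10):
    """The 'obvious' query: direct instances of a class via P31."""
    return (
        "SELECT ?item WHERE { "
        f"?item wdt:P31 wd:{class_id} . "
        "} "
        f"LIMIT {limit}"
    )


def humans_by_gender_query(gender_id, limit=10):
    """The query matching the assumed modelling: humans with a given P21 value."""
    return (
        "SELECT ?item WHERE { "
        "?item wdt:P31 wd:Q5 ; "
        f"wdt:P21 wd:{gender_id} . "
        "} "
        f"LIMIT {limit}"
    )


naive = instances_query("Q467")                 # instances of woman (Q467)
rewritten = humans_by_gender_query("Q6581072")  # humans with sex or gender: female

print(naive)
print(rewritten)
```

Either string can be pasted into the Wikidata Query Service. A rewrite mechanism of the kind hoped for above would, in effect, map the first query onto something like the second and tell the user that it had done so.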

Session proposal: Building our own Wikibase from scratch. How and why.

  • Format: Discussion (25min)
  • Speaker(s) or facilitator(s): Jason Evans, National Library of Wales
  • Short description of the session: A short presentation about how and why we decided to build a Wikibase as a connector between Wikidata and our own name authority data. I'll talk about the challenges, advantages and our experience of building a custom ontology, whilst retaining interoperability with Wikidata. The presentation will be followed by plenty of time for questions and a discussion.
  • Audience: Anyone thinking about starting out with Wikibase.Cloud or anyone interested in Linked Data for name authority.
  • Suggestions of time and date: Ideally not Dec 1st
  • Language: English

Jason.nlw (talk) 09:21, 13 October 2023 (UTC)

✅ Suggested for Saturday 2nd at 10:00 UTC, see discussion below. Lea Lacroix (WMDE) (talk) 10:03, 2 November 2023 (UTC)

Session proposal: Building a instance for internal knowledge, which exists alongside Wikidata

  • Format: Discussion, 50 minutes
  • Speaker(s) or facilitator(s): André Costa (WMSE), Alicia Fagerving (WMSE)
  • Short description of the session: Wikimedia Sverige is using Wikibase Cloud to build a structured database of the chapter's events, documents and projects (Metabase). We're exploring the strengths and limitations of using structured data for this in comparison with our own wiki. In the long run, we hope other affiliates would like to join us with their data. In this session, we invite everyone interested in this initiative to discuss some relevant questions, including: How to strike the balance between compatibility with Wikidata and avoiding data duplication? What properties and structures do we need that we do not have on Wikidata? How to best document the work? We have only worked with WMSE's own data so far, so we'll appreciate other perspectives a lot to make sure we're not inadvertently building a platform that only works for our needs.
  • Audience: People involved in the work of Wikimedia affiliates, interested in sharing the knowledge about their activities with the rest of the movement.
  • Suggestions of time and date: Due to other commitments, we will most probably ask to hold our session either in the early morning or late evening (Swedish time). We will suggest more specific slots nearer the final submission deadline (Nov 19).

Alicia Fagerving (WMSE) (talk) 08:13, 2 November 2023 (UTC)

Hi @Alicia Fagerving (WMSE), André_Costa_(WMSE):, many thanks for your suggestion! We would love to have this session, which could form a little "Wikibase" cluster with the one from @Jason.nlw: above. How about we use the morning of Saturday, December 2nd, by having Alicia's session at 9:00 UTC, and Jason's session at 10:00 UTC? Would that work for you? Let me know! Lea Lacroix (WMDE) (talk) 09:29, 2 November 2023 (UTC)
I was excited to read Jason's proposal and guessed we could do something like this! I'm very positive about it, however I can't give a definite answer about the suggested time right now. I hope I'll know more details about my and André's schedules within the next few days, and we'll let you know as soon as possible :) Alicia Fagerving (WMSE) (talk) 10:10, 2 November 2023 (UTC)
✅ Update: as we are receiving many proposals, I would suggest we have both presentations in the 50min slot at 10:00 UTC. This would leave 20min for each project and 10min for Q&A. Let me know if this works for you @Alicia Fagerving (WMSE), André_Costa_(WMSE), Jason.nlw: Lea Lacroix (WMDE) (talk) 07:52, 6 November 2023 (UTC)
It works for us at WMSE! Alicia Fagerving (WMSE) (talk) 09:23, 16 November 2023 (UTC)
20 mins is fine for me too. Jason.nlw (talk) 15:29, 27 November 2023 (UTC)

Session proposal: Money, money, money – modeling budgets, grants and other financial information

  • Format: Modeling challenge
  • Speaker(s) or facilitator(s): André Costa (WMSE), Alicia Fagerving (WMSE)
  • Short description of the session: This session builds on the previous one we suggested and the presentation and discussion of Wikimedia Sverige's Metabase project. In this session, we invite people to collaboratively model data about the financing of activities of a Wikimedia affiliate. Data about budget, funders, project grants: how to best structure it so that it's both specific and flexible enough to be useful for other affiliates? Since modeling of these aspects is not well built out on Wikidata, the discussion here could hopefully also be of use for expanding this area on Wikidata.
  • Audience: People who have participated in the session Building a instance for internal knowledge, which exists alongside Wikidata and are willing to share the knowledge about the specifics of how financing works in their Wikimedia affiliate.
  • Suggestions of time and date: Due to other commitments, we will most probably ask to hold our session either in the early morning or late evening (Swedish time). We will suggest more specific slots nearer the final submission deadline (Nov 19).
  • Language: English
Hi @Alicia Fagerving (WMSE), André Costa (WMSE):, thanks for your proposal! I'm putting it on hold for now, waiting to see if we could fit it in the program. If we cannot, you could still talk about modelling moneyz topics during one of our "Data Modelling Clinic" sessions where everyone can take 5-10min to present a data modelling challenge and get input. I'll keep you updated. Lea Lacroix (WMDE) (talk) 10:08, 2 November 2023 (UTC)
Hello @Alicia Fagerving (WMSE), André Costa (WMSE):! As we're trying to fit everything into the tight schedule, I would like to offer you the chance to talk about modelling budgets and grants for ~10min during one of our "Data Modelling Clinic" slots. We have two of them available:
  • Friday, December 1st at 17:00 UTC
  • Saturday, December 2nd at 13:00 UTC
Let me know which one would work best for you, and I'll add a slot for you! Thanks, Lea Lacroix (WMDE) (talk) 07:28, 20 November 2023 (UTC)
Thank you! The Saturday slot would be best. If at all possible, we would be grateful if we could be scheduled early in the session :) Alicia Fagerving (WMSE) (talk) 10:11, 21 November 2023 (UTC)
✅ For sure! I put you first in the list :) Lea Lacroix (WMDE) (talk) 14:11, 21 November 2023 (UTC)

Session proposal: Scaling Wikidata Query Service - Split the Graph experiment

  • Format: presentation + discussion
  • Speaker(s) or facilitator(s): Guillaume Lederrey (WMF), David Causse (WMF)
  • Short description of the session: Wikidata Query Service is a vital tool for data quality work, especially around modelling data and finding ontology issues in Wikidata. It is currently under a lot of strain and work is ongoing to improve the situation. In this session we want to present the next step towards improving the system, the current experiment done by WMF on Splitting the WDQS Graph, followed by a Q&A / Discussion.
  • Audience: anyone who uses Wikidata Query Service
  • Suggestions of time and date: Thursday 9am-5pm UTC, Friday 9am-5pm UTC
  • Language: English

GLederrey (WMF) (talk) 16:47, 3 November 2023 (UTC)

Thanks a lot for your proposal! @GLederrey (WMF), DCausse (WMF): I can offer Friday 1st at 13:00 UTC (14:00 in metropolitan France); does it work for you? Lea Lacroix (WMDE) (talk) 08:01, 6 November 2023 (UTC)
Yes, it works for me (and for David). Thanks! GLederrey (WMF) (talk) 08:27, 7 November 2023 (UTC)
✅ Scheduled. Lea Lacroix (WMDE) (talk) 07:48, 8 November 2023 (UTC)

Session proposal: Intensive Wikidata usage on Wikivoyage - Experiences and challenges

  • Session title: Intensive Wikidata usage on Wikivoyage - Experiences and challenges
  • Format: Discussion 25min
  • Speaker(s) or facilitator(s): DerFussi
  • Short description of the session: German Wikivoyage uses Wikidata intensively. Data model changes, changing properties and qualifiers, and limited features with Lua create challenges. A lecture and discussion could show you the challenges we faced.
  • Audience: Anyone who uses Wikidata on a wiki, developers, Wikidata editors
  • Suggestions of time and date: 2 December, anytime
  • Language:

DerFussi 10:56, 4 November 2023 (UTC)

Thanks a lot @DerFussi: for your proposal! I'm thinking of having a session "Reusing Wikidata on the Wikimedia projects" where we could include your presentation, on Saturday 2nd at 15:00 UTC (16:00 in Germany). Would that work for you? Lea Lacroix (WMDE) (talk) 08:04, 6 November 2023 (UTC)
@Lea Lacroix (WMDE) Sounds good. I will be ready then. -- DerFussi 13:53, 6 November 2023 (UTC)
✅ Scheduled. Lea Lacroix (WMDE) (talk) 15:39, 6 November 2023 (UTC)
Hi @DerFussi:, as we just freed a slot on November 30th evening, and we would love to give more visibility to the "Reusing Wikidata on the Wikimedia projects" panel, we are considering moving it to this slot. Would you be available on Thursday, November 30th at 19:00 UTC (20:00 in Germany)? Thanks in advance! Lea Lacroix (WMDE) (talk) 05:23, 15 November 2023 (UTC)
@Lea Lacroix (WMDE): It's possible. I will block that time frame for you and prepare a presentation to be part of that session. -- DerFussi 09:10, 15 November 2023 (UTC)

Session proposal: Modelling serialized (web) fiction

  • Session title: Modelling serialized (web) fiction
  • Format: Modelling challenge (50min, work together on a specific issue)
  • Speaker(s) or facilitator(s): Walkuraxx
  • Short description of the session: WikiProject African Literary Metadata is a Wikidata project dedicated to creating and improving Wikidata's coverage of African informal literatures. The project started recently by, among other things, exploring how serialized fiction is modelled in Wikidata. Apparently there is no common model for describing serialized fiction. Editors choose different solutions to describe this literary form (see e.g. Reichsgräfin Gisela (Q19225178), Oliver Twist (Q164974), Uncle Tom's Cabin (Q2222), The White Rose. A tale of the fifteenth century (Q109983876) and Hlomu : the wife (Q123165784)). I would like to discuss with the wider Wikimedia community how one could model serialized fiction (how to model publication date and installments, how many layers to use for the description, how to model literary genres). The model should be able to describe not only fiction printed in installments in newspapers or magazines but also online forms of serialized fiction such as blog fiction, Facebook novels, mobile novels etc.
  • Audience: Everyone
  • Suggestions of time and date:
  • Language: English

—Walkuraxx (talk) 16:32, 4 November 2023 (UTC)Reply

Thanks a lot for your proposal! I'm putting it on hold for now as I'm waiting to see more proposals coming. If we have other people interested in modelling books or other works of fiction, I'll offer to combine them in one session. Lea Lacroix (WMDE) (talk) 08:07, 6 November 2023 (UTC)Reply
Hello @Walkuraxx:, thanks a lot for your proposal! As we're trying to fit everything into the tight schedule, I would like to offer you ~10min to talk about modelling serialized fiction during one of our "Data Modelling Clinic" slots. We have two of them available:
  • Friday, December 1st at 17:00 UTC
  • Saturday, December 2nd at 13:00 UTC
Let me know which one would work best for you, and I'll add a slot for you! Thanks, Lea Lacroix (WMDE) (talk) 07:27, 20 November 2023 (UTC)Reply
Thank you @Lea Lacroix (WMDE) :-) Saturday, December 2nd at 13:00 UTC would work fine. Kind regards Walkuraxx (talk) 08:39, 20 November 2023 (UTC)Reply
✅ Perfect! I added it to the session on Saturday. Lea Lacroix (WMDE) (talk) 06:14, 21 November 2023 (UTC)Reply

Session proposal: Modelling Gender on Wikidata

  • Session title: Modelling gender on Wikidata
  • Format: Panel Discussion (50 minutes)
  • Speaker(s) or facilitator(s): Arielle Rodriguez, Beatrice Melis, Chiara Paolini, Crystal Yragui, Daniele Metilli, Katy Weathington, Marta Fioravanti, John Samuel (Facilitator)
  • Short description of the session: We will examine the current process of modeling gender in Wikidata, with a primary focus on the methodologies, prevailing practices, and potential challenges. Our aim is to discuss the decision-making process behind these models, evaluate our existing practices, and address potential concerns. We will explore the various techniques employed for inferring gender information and conduct an in-depth analysis of how such data is documented on Wikidata. This session is designed to analyze the pros and cons of incorporating gender information, offering valuable insights into potential issues. Furthermore, we will engage in collaborative brainstorming to generate suggestions for enhancing future edits within this context. Our overarching objective is to deepen our comprehension of gender modeling in Wikidata and contribute to the ongoing discourse regarding its future direction.
  • Audience: mostly core Wikidata community members, with various levels of understanding about gender and how to model it on Wikidata.
  • Suggestions of time and date: Saturday (preferably in the morning)
  • Language: English

- John Samuel (talk) 17:54, 5 November 2023 (UTC)Reply

@Jsamwrites: Thank you so much for coordinating and formulating the proposal! I tentatively scheduled it on Saturday at 09:00 UTC, can you confirm that it works for all of you? Thanks a lot! Lea Lacroix (WMDE) (talk) 07:46, 6 November 2023 (UTC)Reply
@Lea Lacroix (WMDE) Is it possible to change this slot to Saturday 16:00 UTC, since 9:00 UTC is quite late for two of the speakers? Thanks in advance. John Samuel (talk) 20:43, 9 November 2023 (UTC)Reply
✅ Scheduled on Saturday at 17:00 UTC. Lea Lacroix (WMDE) (talk) 18:37, 10 November 2023 (UTC)Reply
@Lea Lacroix (WMDE) Katy Weathington (who gave this year's keynote presentation at Wikidata Workshop) also agreed to participate in the panel. I will update the program page with this information. John Samuel (talk) 10:39, 22 November 2023 (UTC)Reply

Session proposal: Indonesian intangible cultural heritages in Wikidata

  • Session title: Indonesian intangible cultural heritages in Wikidata
  • Format: Modelling challenge (50min, work together on a specific issue)
  • Speaker(s) or facilitator(s): RXerself
  • Short description of the session: Around 11,000 intangible cultural heritages are registered in the official list of the Indonesian Ministry of Education and Culture. These range from traditional dances to local speciality dishes, agricultural practices, rituals, and ceremonies. The ample supply of heritages in the list has made it a staple source for online editing events by the Indonesian Wikidata community. However, because the heritages are so varied, adding them to Wikidata does not always go smoothly, as constraints are frequently met. Several examples of these will be presented during the session, and we hope to receive more input from the wider Wikidata community in tackling this challenge.
  • Audience: beginner friendly, interest in cultural heritages optional
  • Suggestions of time and date: Saturday, December 2nd 09:00 UTC

RXerself (talk) 16:28, 16 November 2023 (UTC)Reply

Hi @RXerself:, thanks a lot for your proposal! I'll block the time you suggested for your presentation, and I asked Susanna, who will also talk about living heritage, if she would be fine joining your session. This way, both of you would have ~20min presentation followed by 10min of common Q&A. Would that work for you? Lea Lacroix (WMDE) (talk) 07:10, 20 November 2023 (UTC)Reply
Ooo yes please go ahead. RXerself (talk) 08:24, 20 November 2023 (UTC)Reply
✅ Scheduled. Lea Lacroix (WMDE) (talk) 06:07, 21 November 2023 (UTC)Reply

Session proposal: Wikidata Schema Validation Output Improvements

  • Session title: Wikidata Schema Validation Output Improvements
  • Format: Lightning talk
  • Speaker(s): Mennolt van Alten
  • Short description of the session: I will give a short description of my proposed project, where I am hoping to improve the look and usability of the output of the ShEx2 Simple Online Validator. I want to do interviews with people with a specific role or relation to the validator and project, and invite people to sign up to be interviewed via my user talk page.
  • Audience: Editors of Wikidata using schemas, those interested in using schemas or encouraging consistent data
  • Suggestions of time and date: Thursday 19:00 UTC: this puts it at the end of a day related to schemas which hopefully will have users interested in schemas and data validation there.
  • Language: English

M.alten.tue (talk) 13:42, 17 November 2023 (UTC)Reply

Hi @M.alten.tue:, we scheduled our lightning talks session on Friday, December 1st at 10:00 UTC. Would that work for you? Lea Lacroix (WMDE) (talk) 06:58, 20 November 2023 (UTC)Reply
Hello @Lea Lacroix (WMDE).
Yes, that works for me as well! M.alten.tue (talk) 10:31, 20 November 2023 (UTC)Reply
✅ Scheduled. Lea Lacroix (WMDE) (talk) 06:09, 21 November 2023 (UTC)Reply

Session proposal: The QLever SPARQL Engine

  • Session title: The QLever SPARQL Engine
  • Format: Workshop (50 minutes)
  • Speakers: Hannah Bast and Johannes Kalmbach
  • Short description of the session: We will report on the status quo of the QLever SPARQL engine, which is a possible replacement for the current Blazegraph SPARQL engine, as discussed in detail here. We will show how many queries that are currently very slow or infeasible with Blazegraph can be processed quickly by QLever. We will also show QLever's context-sensitive autocompletion, which helps non-experts construct SPARQL queries.
  • Audience: We would be happy to see Wikidata editors, other users of the Wikidata Query Service, and people from the search team at the workshop
  • Suggestions of time and date: It has to be Saturday afternoon for us. It seems there is still a free slot on Sat Dec 2 at 15:00 UTC. That would be perfect.
  • Language: English

Hannah Bast (talk) 22:35, 18 November 2023 (UTC)Reply

✅ Hi @Hannah Bast:, thanks a lot for your proposal! We scheduled it at the time you requested. We hope you will still be able to attend the other Query Service related presentation on Friday :) Lea Lacroix (WMDE) (talk) 07:22, 20 November 2023 (UTC)Reply
Thanks indeed! This is easily the most interesting talk of the whole conference and something I look forward to. Infrastruktur (talk) 17:28, 21 November 2023 (UTC)Reply
Hi @Hannah Bast:, thanks again for joining the Data Modelling Days! When you have a minute, could you upload your slides in PDF on Wikimedia Commons, with the category Data Modelling Days 2023 presentations? If you're encountering any issue, please let me know. Many thanks in advance! Lea Lacroix (WMDE) (talk) 07:18, 5 December 2023 (UTC)Reply
@Lea Lacroix (WMDE) Thanks for the reminder! We are just wondering which layout we should use for the slides. For the presentation on Saturday, we used the layout from our university. But I could also use your layout instead. Any recommendation from your side? Hannah Bast (talk) 18:53, 6 December 2023 (UTC)Reply
@Hannah Bast: It doesn't matter, you can keep the university layout :) Lea Lacroix (WMDE) (talk) 07:12, 8 December 2023 (UTC)Reply

Data Modelling Days: important information for speakers


Hello all,

Many thanks to everyone who submitted a contribution to the Data Modelling Days! We were able to build a very exciting schedule, and we are really looking forward to each of these presentations, discussions and live editing sessions.

Here's some important info for the speakers:

  • Please double-check the time of your session in the schedule and make sure that you can attend. Reminder: the time displayed is in UTC, you can click on the link to see it in your own timezone. I recommend you book the time of your session in your favorite calendar notebook or app.
  • All sessions will take place in the open source online videoconference room Jitsi. Please make sure to arrive in the room at least 10min before the start of your session. Our facilitators will be there to welcome you and help you test your sound and screen sharing before the session starts.
  • Most sessions will be recorded in video and published on Wikimedia Deutschland's YouTube channel after the event. If you don't want your session to be recorded, please let me know asap. In any case, all sessions will also have an Etherpad collaborative notes document.
  • If you are preparing slides, feel free to use this slide deck template or the visuals of the event. When your presentation is ready, don't forget to upload it to this Commons category!
  • If you encounter an issue or if you have to cancel your attendance, please contact me as soon as possible at or on Telegram @Auregann.

If you have any questions, if anything is unclear, please reach out to me!

Ping the speakers: @DerFussi, Epìdosis, Pmonnin, M.alten.tue, GLederrey (WMF), DCausse (WMF): @ArthurPSmith, Daniel Mietchen, RXerself, Susannaanas, Jason.nlw, Alicia Fagerving (WMSE): @André Costa (WMSE), Olea, VIGNERON, Walkuraxx, Hannah Bast, AWesterinen: @Jsamwrites, Ainali:

Best, Lea Lacroix (WMDE) (talk) 07:01, 21 November 2023 (UTC)Reply

When will videos be online?


I would like my students to participate in the event and discuss it in our class by Monday. The time zone is a challenge, as the conference starts at 6pm for us and ends at 7am. When can we watch the videos? I cannot find videos for past events online :( Hanyangprofessor2 (talk) 05:44, 1 December 2023 (UTC)Reply

Hello @Hanyangprofessor2:, thanks for your questions! We are hoping to upload the videos during the weekend. They will appear in a playlist on Wikimedia Deutschland's YouTube channel, where you can also find videos of previous Wikidata events :) Lea Lacroix (WMDE) (talk) 08:52, 1 December 2023 (UTC)Reply

Documentation of the event


Hello there,

Many thanks to everyone who attended the Data Modelling Days 2023, it was awesome to see so many of you joining and discussing important and in-depth topics about data modelling!

Now that the event has concluded, we would like to remind you to add anything interesting you did during the event (started discussions on-wiki, worked on something) to the outcomes page. If you had an interesting exchange with someone on Jitsi during the event and would like to follow up, you can probably find them on the participants list.

All videos of recorded sessions have been uploaded to YouTube; you can find them in this playlist, as well as linked from the program. There you will also find the collaborative notes, archived on wiki pages, and the slides, if any. You can also find the slides directly in the related Commons category.

We hope that some discussions and issues raised during the event can be shared with the rest of the Wikidata community, and we encourage you to get involved in WikiProjects and to start discussions there to move forward on the topics that you care about.

If you have any questions or suggestions related to Wikidata, feel free to contact Lydia Pintscher (WMDE) and Arian Bozorg (WMDE). If you have any feedback about the event or suggestions for future events, feel free to reach out to me. Best, Lea Lacroix (WMDE) (talk) 06:32, 6 December 2023 (UTC)Reply
