Wikipedia:Wikipedia Signpost/2010-09-13/Sister projects


Sister projects

Update on the Death Anomalies collaboration

WereSpielChequers is an editor on the English Wikipedia and occasionally elsewhere. He has been actively involved in various biography-related projects this year and collaborated with bot writer Merlissimo to launch the Death Anomalies project.

Just over a month ago, The Signpost published a story on the Death Anomalies project, which identifies cases where different language Wikipedias disagree as to whether an individual is dead or alive. The project was started in June, initially with just the German and English Wikipedias extracting reports of anomalies. Since then, the Latin, Swedish, and Slovenian Wikipedias have joined in, and hundreds of errors have been resolved. When The Signpost covered the project, readers pitched in and the number of anomalies on the English Wikipedia was cut from 447 to 190 in just over a week. The English Wikipedia still has more than 100 anomalies listed at Wikipedia:Database reports/Living people on EN wiki who are dead on other wikis, with new reports coming in daily. However, most of the backlog is down to three things: differences in the way projects treat missing people who (if alive) would be more than 100 years old; cross-wiki anomalies stemming from unreferenced articles showing a person as dead; and issues that probably require a native speaker of the other language to resolve.

In July, only two projects were extracting data from the table, though it queried data from around 70 Wikipedias. These have since been joined by the Swedish Wikipedia, which rapidly reduced 94 anomalies to 16, and the Latin Wikipedia, which has reduced its anomalies to one. Earlier this month the Slovene Wikipedia became the fifth participating project, and went in a week from requesting a report to having cleared its backlog.

Biographies of living people (BLPs) inevitably need to be updated when the subject dies, so all these reports are expected to be ongoing maintenance tasks. Although the bot is processing data from millions of biographies across different Wikipedias, fewer than a thousand anomalies have been identified so far. The bot relies on interwiki links and on categories that identify biography subjects as dead or living. Some projects are ineligible for the program because they don't organise their articles in such a way; for example, the Portuguese Wikipedia has lists of people who died in particular years (rather than categories).
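
For readers curious about how such a comparison might work, here is a minimal sketch in Python. It is not Merlissimo's bot code: the function name find_anomalies, the data layout, and the sample entries are all hypothetical, and it assumes that the status of each biography on each wiki (taken from its categories) has already been collected and matched across wikis via interwiki links.

```python
from typing import Dict, List, Tuple

# Hypothetical input layout: person -> {wiki code -> "living" or "dead"}.
# A wiki is simply absent from the inner dict if it has no article
# or no usable category for that person.
Statuses = Dict[str, Dict[str, str]]


def find_anomalies(statuses: Statuses, target_wiki: str) -> List[Tuple[str, List[str]]]:
    """List people the target wiki shows as living but another wiki shows as dead."""
    anomalies = []
    for person, by_wiki in sorted(statuses.items()):
        # Only biographies that the target wiki categorises as living are of interest.
        if by_wiki.get(target_wiki) != "living":
            continue
        # Collect every other wiki that categorises the same person as dead.
        dead_on = sorted(
            wiki for wiki, status in by_wiki.items()
            if wiki != target_wiki and status == "dead"
        )
        if dead_on:
            anomalies.append((person, dead_on))
    return anomalies


# Made-up sample data, for illustration only.
sample: Statuses = {
    "Person A": {"en": "living", "de": "dead", "sv": "dead"},
    "Person B": {"en": "living", "de": "living"},
    "Person C": {"en": "dead", "de": "living"},
    "Person D": {"de": "dead"},
}

for person, wikis in find_anomalies(sample, "en"):
    print(person, "is living on en but dead on", ", ".join(wikis))
```

A report for another project would use the same comparison with a different target wiki. A wiki that records deaths in lists rather than categories would have no status entries in this layout, which is why such projects cannot yet take part.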

In the future, the number of languages from which data is extracted and the number of languages requesting reports will hopefully increase; we have 66 Wikipedia language versions, including French, Spanish, Japanese, Polish and Russian, for which reports could be extracted almost immediately. Merlissimo (whom Jimbo Wales praised as a "rock star" for his work on the project) has a bot that updates the reports daily, and is willing to produce reports for other projects.

User responses