Future Audiences

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by Waldyrious (talk | contribs) at 11:19, 31 July 2023 (Sign up to participate!: Added myself). It may differ significantly from the current version.

Future Audiences is one of the "buckets" within the Wikimedia Foundation's draft 2023-24 Annual Plan. Its purpose is to explore strategies for expanding beyond our existing audiences of readers/reusers and contributors in an effort to truly reach everyone in the world as the "essential infrastructure of the ecosystem of free knowledge". This bucket aligns with Movement Strategy Recommendation #9 (Innovate in Free Knowledge).

There are different strategies we could pursue as a movement to reach new audiences, but we don't have the resources to pursue all of them. Should we endeavor to bring more and more people to our sites and apps? Should we strive to see our content be present all over the internet wherever people spend time? Something else? To find out, this year we will run a series of experiments to investigate strategies to engage new audiences in sharing, collecting, and improving knowledge in new ways. We want to learn which strategic directions make the most sense for the future of our products. If you're interested in getting involved with this work, please sign up below.

FAQ

If you're curious about any of the following...

  • Why devote WMF resources to "Future Audiences" now, when there are many things that should be improved for our current users?
  • What is the difference between an experiment and a product?
  • Why run experiments on commercial platforms like ChatGPT? Isn't that counter to our mission?

... and more, see: Wikimedia Foundation Annual Plan/2023-2024/Draft/Future Audiences/FAQ

Objectives and Key Results

Within the annual plan's Future Audiences "bucket", there are two Objectives and three associated Key Results:

FA1: Describe multiple potential strategies

Objective: describe multiple potential strategies through which Wikimedia could satisfy our goal of being the essential infrastructure of the ecosystem of free knowledge.

  1. KR1. Participants in Future Audiences work (internal staff and community members) are equipped with at least three candidate strategies for how Wikimedia projects (especially Wikipedia and Wikimedia Commons) will remain the “essential infrastructure of free knowledge” in the future, including the audiences they would reach, the hypotheses they test, and approaches for testing them.

FA2: Test hypotheses

Objective: test hypotheses to validate or invalidate potential strategies for the future, starting with a focus on third-party content platforms.

  1. KR1. Test a hypothesis aimed at reaching global youth audiences where they are, on leading third-party content platforms, to increase their awareness of and engagement with Wikimedia projects as consumers and as contributors.
  2. KR2. Test a hypothesis around conversational AI knowledge seeking, to explore how people can discover and engage with content from Wikimedia projects.

FA1: Describing future strategies

Exploring "Future States"

Below is some early thinking on different ways the knowledge ecosystem and our role within it could look in 2030. These are draft ideas and are meant to be refined and improved throughout the fiscal year. If you're interested in helping us with thinking through different future states and assumptions, please sign up below and indicate that you are interested in future trends research!

Future State: Status Quo

Description: This is how we currently reach readers and new contributors. This status quo relies heavily on search engines to drive readers and new contributors to our projects and is exposed to risk if the way knowledge search works (web browser → search engine results → Wikipedia) changes significantly.

Key features:

  • Most readers come to us through search engines (primarily Google).[2]
  • A much smaller fraction come to us directly (via desktop or mobile).
  • Our content is not well-represented and/or well-attributed off-platform.
  • There are no methods for contributing off-platform.

Key assumptions:

  • Web search engines will continue to be how most people start searching for knowledge in the next 10-15 years.
  • Web search engines will continue to display results from Wikipedia and people will follow links to our projects in the next 10-15 years.

Future State: "Destination"

Description: We do not rely on any external platforms – search engines or otherwise – to syndicate our content and instead focus on attracting knowledge seekers directly to our projects. We do this by creating the features they want and making it easier, faster, more enjoyable, and more reliable to get knowledge from our projects than via any external service.

Key features:

  • We modernize our offering to compete with the experiences that are attracting audiences around the world.
  • Wikipedia would be people’s destination for learning, the way Amazon is for shopping and TikTok is for entertainment.
  • This might mean building our own video or AI offerings, and marketing in new ways.

Key assumptions:

  • Search and social platforms will be increasingly inundated with low-quality knowledge over the next 10-15 years, and knowledge seekers will grow frustrated with their lack of reliability, transparency, neutrality, etc.
  • Billions of readers would come directly to our platform for knowledge if it was quick, easy, fun and/or frictionless to find what they were looking for.

Future State: "Free Knowledge Everywhere"

Description: Rather than relying on search engines to drive traffic to our sites, we proactively push free knowledge out to external platforms – not just to search engines but also social platforms, where we reach millions more people (including younger audiences, who are currently expressing less interest in/affinity with Wikipedia[1]). Users consume rich media based on Wikimedia content through videos, stories, etc., built in those platforms’ systems. Hooks in the external platforms draw in editorial and monetary contributions.

Key features:

  • Instead of expecting most readers to come to our properties, we actively push content out to search and social platforms.
  • Users consume rich media based on our content, built by external creators and Wikimedians.
  • Hooks in external platforms draw in contributions to edit and/or donate.

Key assumptions:

  • People want encyclopedic content on social apps and other third-party platforms.
  • Some people on other platforms are curious enough to get involved with our communities.

Future State: "Internet's Conscience"

Description: We acknowledge that vast amounts of information are being exchanged all over the internet and that Wikimedia could be an engine to help the world sift reliable information from the rest. Wikimedia content is available as structured facts that let search and social platforms vet their own content, and they use our brand to indicate trustworthy content. Those platforms also encourage users to flag content that needs vetting so that our readers and editors can vet by comparing it to Wikimedia content.

Key features:

  • We make content available as structured facts that let search and social platforms vet content.
  • Our brand is used to indicate trust.
  • Platforms encourage users to flag content that needs vetting by our community.

Key assumptions:

  • Third-party platforms and/or creators are interested in being seen as providing reliable/verifiable content.
  • People on third-party platforms want to see independent signals that the content they are consuming is trustworthy and see the Wikimedia brand as being trustworthy.

Other ideas for alternative Future States to explore?

Please let us know on the talk page!

FA2: Testing hypotheses

Testing strategy: testing assumptions

FA2.1 Global youth audiences on third-party content platforms

FA2.2 Conversational AI

First hypothesis/assumptions to test

[Status: as of July 11, 2023, available as an experimental beta feature to all ChatGPT Plus subscribers]

Context: OpenAI is experimenting with a business model in which partners build “plugins” for ChatGPT. As part of our Future Audiences efforts, we are interested in creating a Wikipedia ChatGPT plugin that could deliver content from Wikipedia for topical queries (e.g., when a user prompts ChatGPT to tell them about a topic in history, law, geography, science, etc.), attribute this content, and give users pathways to contribution. This would allow us to a) seize a unique, time-sensitive opportunity to influence how our content should appear and be attributed in conversational AI interfaces, and b) test key “Future Audiences” hypotheses to understand how we can remain the essential infrastructure of free knowledge in a possible future where AI transforms knowledge search.
Hypothesis: If we build a plugin for ChatGPT that lets users receive information from Wikipedia for general knowledge queries with attribution (as opposed to a hybrid of Wikipedia and other sources, with no attribution), we will be able to test the following assumptions:
  1. People using AI assistants will want to receive knowledge from Wikipedia for some of their queries (i.e., they will enable a Wikipedia plugin if it exists)
  2. People getting knowledge from Wikipedia via an AI assistant are satisfied with the results
  3. People want to get knowledge in non-English languages from Wikipedia via an AI assistant
  4. Knowledge arrives with fidelity to the end user
  5. People receiving information from a Wikipedia ChatGPT plugin that has attribution will be interested in visiting our projects to learn/read more
  6. People receiving information from a Wikipedia ChatGPT plugin that has attribution will contribute to improving our content if asked to do so
  7. People receiving information from a Wikipedia ChatGPT plugin that has attribution will donate to us if asked to do so
MVP requirements for alpha launch

Must-have:

As a ChatGPT user, I can:

  • Ask ChatGPT any general knowledge or current events question to trigger the Wikipedia plugin
  • Receive the following in response to my query:
      • “According to Wikipedia” prefacing any knowledge that comes from Wikipedia
      • A natural-language summary of the relevant Wikipedia knowledge
      • “From [Link(s) to the articles used to generate the knowledge]. Wikipedia content is made by volunteers and is free to share and remix under a Creative Commons BY-SA 3.0 license.”
  • Wait no more than five seconds for the response because of Wikimedia’s API
  • Receive up-to-date content from Wikipedia as of the previous day
  • Receive content from the English and Spanish Wikipedias only (as a simple first test of multilingual support)
  • Receive content from all Wikipedia articles (i.e., not just certain topic areas)
  • Receive correct and relevant information from the relevant Wikipedia article at least 80% of the time, incorrect and/or irrelevant information (e.g., information from the article on the video game Call of Duty for a query about World War II) less than 20% of the time, and harmful/offensive/misleading information less than 1% of the time – to be assessed through pre-release qualitative testing
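
As an illustration of the response format described in the must-have list above, here is a minimal Python sketch of how an attributed answer could be assembled. This is not the plugin's actual code: the names `SourceArticle`, `LICENSE_NOTE`, and `format_attributed_response` are hypothetical, and the exact wording and layout of the real plugin's output may differ.

```python
from dataclasses import dataclass

# License sentence quoted from the must-have requirements above.
LICENSE_NOTE = (
    "Wikipedia content is made by volunteers and is free to share and remix "
    "under a Creative Commons BY-SA 3.0 license."
)


@dataclass
class SourceArticle:
    """Hypothetical container for an article used to build the answer."""
    title: str
    url: str


def format_attributed_response(summary: str, sources: list[SourceArticle]) -> str:
    """Prefix the summary with "According to Wikipedia" and append links and license.

    Raises ValueError when no source is given, because the requirement is that
    Wikipedia knowledge is never shown without attribution.
    """
    if not sources:
        raise ValueError("at least one source article is required for attribution")
    links = ", ".join(f"{source.title} ({source.url})" for source in sources)
    return f"According to Wikipedia, {summary}\n\nFrom {links}. {LICENSE_NOTE}"


if __name__ == "__main__":
    print(
        format_attributed_response(
            "Harry Belafonte was an American singer, actor, and activist.",
            [SourceArticle("Harry Belafonte", "https://en.wikipedia.org/wiki/Harry_Belafonte")],
        )
    )
```

Refusing to format an answer without sources is one possible design choice; it keeps the "with attribution" part of the hypothesis from being silently dropped.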

Ideas for future releases:

  • An attribution card that contains the following for each article that was used to generate a user-facing response:
      • Wikipedia favicon
      • Lead image from article (if applicable) or puzzle globe placeholder
      • Title of article and link
      • CC BY-SA 3.0 text or icons
  • A carousel of the images in the Wikipedia article
  • Receive content in whatever language I used to prompt the response, sourced from the Wikipedia of that language.
  • Receive a link to the Wikipedia language version of the language I used to prompt the response. E.g., If I ask ChatGPT in Ukrainian who Harry Belafonte is, I should receive a summary from the Ukrainian Wikipedia article on Harry Belafonte (if it exists) and a link to the Ukrainian Wikipedia article.
  • Return a list of relevant sources from the article
  • Invitation to edit or learn how to edit
  • Invitation to donate
  • Metadata on the living, community-driven nature of Wikipedia (e.g., “This information was last modified on Apr 29, 2023,” “This information was brought to you by 23 volunteer contributors. Learn more about getting involved,” “This information was marked as needing more references. Can you help?”)
  • Hover-over preview of links
  • Prompts to learn about “related concepts” after a query is completed. (E.g., “Sifan Hassan won the 2023 women’s London marathon. Would you like to learn more about Sifan Hassan, the London marathon, or marathon racing?”)
  • Invitation to contribute more content to the Wikipedia language version of the language I used to prompt the response if there is little/no content on the subject in that language. (E.g., If I ask ChatGPT in Ukrainian who Harry Belafonte is and there is no Harry Belafonte article in Ukrainian, I can receive a summary from the English Wikipedia article on Harry Belafonte and an invitation to incorporate this content into Ukrainian Wikipedia)
  • Expose citation links within Wikipedia content
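
The language-related ideas above (answer from the Wikipedia of the prompt language when an article exists, otherwise fall back to English and invite the user to contribute) can be sketched as follows. This is only an illustration under stated assumptions: `ArticleChoice` and `choose_article` are hypothetical names, the mapping of language codes to article URLs is assumed to come from some earlier lookup step, and English as the fallback is taken from the Harry Belafonte example rather than from any confirmed design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArticleChoice:
    """Hypothetical result: which article to summarize and whether to invite a contribution."""
    language: str
    url: str
    invite_to_contribute: bool


def choose_article(
    prompt_language: str,
    articles_by_language: dict[str, str],
    fallback_language: str = "en",
) -> Optional[ArticleChoice]:
    """Pick the article in the prompt language, else fall back and invite a contribution.

    `articles_by_language` maps a language code (e.g. "uk") to the URL of the
    article on that subject in that language edition.
    """
    if prompt_language in articles_by_language:
        return ArticleChoice(prompt_language, articles_by_language[prompt_language], False)
    if fallback_language in articles_by_language:
        # No article in the prompt language: use the fallback and flag the gap.
        return ArticleChoice(fallback_language, articles_by_language[fallback_language], True)
    return None  # Nothing to attribute, so the plugin should not answer from Wikipedia.


if __name__ == "__main__":
    available = {"en": "https://en.wikipedia.org/wiki/Harry_Belafonte"}
    print(choose_article("uk", available))
```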

Sign up to participate!

Please sign your name below if you're interested in following this work. If applicable, please indicate any areas that particularly interest you (e.g., AI, social apps, video/rich media, future trends research).

  1. User:Waltercolor
  2. User:Natalia Ćwik (WMPL)
  3. User:Lydia Pintscher (WMDE)
  4. User:Grzegorz Kopaczewski (WMPL)
  5. Klara Sielicka-Baryłka (WMPL) (talk) 08:54, 15 May 2023 (UTC)
  6. Bertux (talk) 19:47, 16 May 2023 (UTC)
  7. Sandizer (talk)
  8. --Frank Schulenburg (talk) 17:43, 18 May 2023 (UTC)
  9. MJL (talk) 21:57, 18 May 2023 (UTC)
  10. Jklamo (talk) 09:06, 21 May 2023 (UTC)
  11. Sdkb (talk) 23:12, 22 May 2023 (UTC)
  12. Frostly (talk) 20:17, 10 July 2023 (UTC)
  13. Rtnf (talk) 05:49, 11 July 2023 (UTC)
  14. --Count Count (talk) 15:15, 11 July 2023 (UTC)
  15. Fuzheado (talk) 15:55, 13 July 2023 (UTC) - I was the initiator of en:Wikipedia:AI and have documented a number of experiments at en:User:Fuzheado/ChatGPT. I will also be on a panel at Wikimania 2023 regarding AI, organized by Shani and joined by User:Jimbo.
  16. Shani Evenstein Sigalov
  17. Soni (talk) 02:44, 14 July 2023 (UTC)
  18. Theklan, interested in both AI generated content, multimedia (c:Category:Ikusgela), and rich media (eu:Topic:Wr0cgff9sat6mv3k). -Theklan (talk) 08:28, 14 July 2023 (UTC)
  19. User: Heike Gleibs (WMDE) (talk) 08:36, 14 July 2023 (UTC)
  20. Tarkowski (talk) 11:50, 14 July 2023 (UTC)
  21. AyourAchtouk (talk) 13:19, 14 July 2023 (UTC) - interested in AI generated content
  22. Identifying differences between WMF AI ethics and Wikimedia community AI ethics Bluerasberry (talk) 17:38, 14 July 2023 (UTC)
  23. Adithyak1997 (talk) 10:48, 15 July 2023 (UTC)
  24. Interested to know about a) attribution to volunteer community and its labour, b) clear notice on possible biases and constantly changing content, c) larger AI ethical concerns, d) PR (avoiding fanboi/AI hype and stating what it is -- experiments of generative content using content created by volunteers) --Psubhashish (talk) 15:23, 15 July 2023 (UTC)
  25. Interested in solutions to combat "plausible falsehoods" created by AI chatbots. Sobaka (talk) 17:58, 17 July 2023 (UTC)
  26. Alalch E. (talk)
  27. Dyork (talk) 15:08, 18 July 2023 (UTC) - AI, future trends
  28. Mathglot (talk) 07:12, 19 July 2023 (UTC)
  29. DancingPhilosopher (talk) 20:48, 19 July 2023
  30. Using chatGPT as part of the wikiedu / wikiversity student experience Stevesuny (talk) 15:52, 20 July 2023 (UTC)
  31. Oceanflynn (talk) 16:29, 20 July 2023 (UTC)
  32. Kasyap (talk) 05:49, 26 July 2023 (UTC)
  33. Waldyrious (talk) 11:19, 31 July 2023 (UTC) Interested in systems that might allow LLMs to expose their confidence in the information they output, restrict it to verifiable facts, and provide sources for them.

References