How Generative AI Changes Information Discovery - 9thmay2024
Context
Wiley Green External 2021-10.potx
Digital Transformation: AI + Big Data + Cloud (ABC)
Goal: To Facilitate People (not replace them)
Social Media
• Improved Efficiency & Reduced Cost
• Data-Informed Decision Making & Accuracy
• Improved Discoverability, Readability and Accessibility
AI Development is an Inevitable Trend
Context:
• From Information Age to Intelligent Age
• Global Digital Transformation
Wiley Green Mac 2021-10.potx
From a Content to a Knowledge
Content Knowledge
Aggregate Enrich
Link
Name UserId PubId PageId ActionTime
1 251819348 41518104 1033 48:21.3
2 298382030 41219686 1033 48:22.4
3 298382032 41415282 1033 48:25.0
Information Discovery Movement
Typically read-only       Strongly read-write       Read-write-interact
Web page                  Web service endpoint      Data space
HTML/HTTP/URL/Portals     XML/RSS                   RDF/RDFS/OWL
https://www.researchgate.net/publication/228503784_Challenges_and_Reflections_on_Information_Knowledge_and_Wisdom_Societies_Sociotechnical_Systems
How ChatGPT is Related to Other AI Concepts
Challenges & Problems
Researchers:
- Experiencing an overwhelming amount of
information.
- Difficulty in locating pertinent data and
information.
- Struggling to formulate precise inquiries.
- Staying current with new information.
- Ensuring the reliability of information
retrieval sources.
- Processing vast quantities of information.
Publishers:
- Enhance search experience on their
platform.
- Enhance user engagement on their websites.
- Increase content discoverability.
- Filter out improper content.
Information discovery on the research journey
• The vastness of potential research areas
• Opportunity for long-term relevance or career prospects
• Who to collaborate with
• How to find the most relevant and best literature
• How to read and digest vast amounts of info quickly
• How to deal with contradicting studies/findings
• How much can I trust this work? Any paper mill or retraction?
• Hard to frame a hypothesis that’s both novel and feasible to investigate
• Repeatedly refining or even discarding
• Any method that fits the research questions and the available resources
• Have enough good quality data
• Right analytical tools and correct interpretation of results? Especially for unexpected results
• Which journal should the researchers submit to
• Who can help researchers review their work
Information discovery on the publishing journey
• Where to submit
• Who to work with
• What are good references
• Any research misconduct?
• Who reviews the submission
• Better content accessibility, readability and discoverability
• Discover knowledge instead of content
• Who are the right audience/users
• ASEO
• Faster, broader, deeper and more personalized discovery
AI powered tools in the publishing workflow
Content Generation
Research Topic Suggestion
Ask ChatGPT/Gemini to suggest research topics/directions for PhD study and for journal/SI development
Prompt 1: “I am a PhD student and focus on NLP area. I am really interested in large language models currently. Can you suggest me a good topic for PhD research and paper writing”
Prompt 2: “I am a journal editor in computer science, I would like to create a new journal or special issue, please suggest some important and popular topics for the new journal and special issue to have enough submission to make the journal or special issue successful”
Content Summarization
Extractive Summary: Effects of Vitamin D on Endometriosis-Related Pain: A Double-Blind Clinical Trial
• Before laparoscopy, the mean pelvic pain score in the vitamin D group was 4.05±3.45 and 4.82±4.1(p=0.513) in the placebo group. Before laparoscopy, the mean dysmenorrhea pain score in the vitamin D group was
7.37±2.61 and in placebo group it was 6.42±3.04 (p=0.325).
• Table 2 shows a comparison between the 2 groups for severity of pelvic pain and/or dysmenorrhea at different time points (before laparoscopy, in second menses after laparoscopy, and at 24 weeks after laparoscopy).
At the second menses after laparoscopy, there was no significant difference between the 2 groups for pelvic pain (p=0.583) and dysmenorrhea (p=0.365), and at 24 weeks after laparoscopy there was no significant
difference between mean pain scores in the 2 groups. Mean pelvic pain at 24 weeks after laparoscopy in the vitamin D group was 0.84±1.74 and in placebo group it was 0.68±1.70 (p=0.513).
• We explored the relationship between vitamin D and endometriosis in a double-blind, randomized clinical trial looking at the effect of vitamin D supplementation on cessation of pain in proven endometriosis after
laparoscopic diagnosis and treatment.
• There may be a relationship between vitamin D and pathogenesis of endometriosis, but in our study vitamin D was not effective in treatment of endometriosis-related pain.
• In this double-blind, randomized clinical trial, at 24 weeks after laparoscopic treatment of endometriosis there was no significant difference between effect of vitamin D3 (cholecalciferol) and placebo on severity of
dysmenorrhea and/or pelvic pain.
• The remaining 39 cases were randomly assigned in vitamin D (n=19) or placebo treatment (n=20) groups.
• After authorization by the university Ethics Committee, eligible patients were assigned by simple randomization to receive either vitamin D or placebo. In the vitamin D group (D group), we prescribed oral vitamin D 50
000 iu/weekly for 12 weeks (capsule D-Vigel, vitamin D3 50 000 iu, Daana Pharma Co. Tabriz-Iran) and in the placebo group (P group) we prescribed 1 capsule of placebo (Daana Pharma Co. Tabriz-Iran) weekly for 12
weeks.
• Mean dysmenorrhea was 2.10±2.33 in the vitamin D group and 2.73±2.84 in the placebo group (p=0.45).
ChatGPT produces a better, more fluent summary than Gemini, but it is much more expensive and gives less detail than the extractive summary.
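An extractive summary selects sentences verbatim from the paper instead of generating new text. The sketch below illustrates that idea with a simple word-frequency score. The scoring rule, the stopword list and the function names are assumptions made for illustration; this is not the method used to produce the summary on this slide.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "was", "is",
             "for", "on", "at", "with", "it", "there", "no", "between"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphanumeric tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9]+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, k=2):
    """Return the k highest-scoring sentences, kept in document order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favoured by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    top = sorted(range(len(sentences)),
                 key=lambda i: score(sentences[i]), reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


sample = ("Vitamin D was given weekly. Pain scores were recorded. "
          "Vitamin D did not reduce pain scores. The weather was nice.")
summary = extractive_summary(sample, k=2)
```

Because every output sentence is copied from the input, this style of summary cannot invent facts, which is consistent with the slide's observation that it preserves more detail than a generated summary.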
Deeper Information Discovery
Challenge: Structured Knowledge Embedded in Unstructured Text
Solution: Knowledge Mining & Search
Specific Entities Mining
Knowledge Mining takes valuable information from customers’ existing content to create a more structured
layout and generate new business opportunities.
Existing Content → Structured Data
Inogatran: molecular weight 439 Da
PS-b-P4VP: molecular weight 59 000 g mol-1
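As a rough illustration of turning free text into structured records like the molecular-weight entries on this slide, here is a minimal regex-based sketch. The input sentences, the pattern and the `Fact` record are hypothetical; a production knowledge-mining pipeline would more likely rely on trained entity-recognition models than on a single hand-written pattern.

```python
import re
from typing import List, NamedTuple


class Fact(NamedTuple):
    entity: str
    prop: str
    value: float
    unit: str


# Hypothetical pattern: "<entity> has a molecular weight of <number> <unit>".
# The number may use spaces or commas as thousands separators (e.g. "59 000").
PATTERN = re.compile(
    r"(?P<entity>[A-Za-z0-9\-]+) has a molecular weight of "
    r"(?P<value>\d+(?:[ ,]\d{3})*(?:\.\d+)?)\s*(?P<unit>Da|g mol-1)"
)


def mine_molecular_weights(text: str) -> List[Fact]:
    """Extract (entity, property, value, unit) records from free text."""
    facts = []
    for match in PATTERN.finditer(text):
        number = match.group("value").replace(" ", "").replace(",", "")
        facts.append(Fact(match.group("entity"), "molecular weight",
                          float(number), match.group("unit")))
    return facts


# Made-up example sentences built around the two values shown on the slide.
text = ("Inogatran has a molecular weight of 439 Da. "
        "PS-b-P4VP has a molecular weight of 59 000 g mol-1.")
facts = mine_molecular_weights(text)
```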
What: Generate new content bundles with new topics, new content types and new information across languages
https://www.thelancet.com/journals/lancet/article/PIIS0140-6736%2820%2930304-4/fulltext
Personalized and Richer Information Discovery
Challenge Solution
Where to Submit – Journal Suggestion
ChatGPT and Journal Finder share some suggestions, but the dedicated Journal Finder gives the correct answer.
ChatGPT and Google Gemini recommend only related top-tier journals, but Gemini can also give relevance scores.
ChatGPT vs Journal Finder
Multimedia Content Discovery
Multimedia content discovery includes topic, image, video and funder searches:
Recommendations & Personalized News Feeds
Recommendation is one of the most common applications of AI. It ranges from suggesting relevant experts to identifying interest groups and enabling a personalized user experience based on individual interests.
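To make interest-based recommendation concrete, the sketch below ranks catalog items by the Jaccard overlap between a user's interests and each item's topics. The catalog, the topic labels and the function names are hypothetical, and Jaccard similarity is only one simple choice of measure; it is not presented as the approach behind any product mentioned in this deck.

```python
from typing import Dict, Iterable, List, Set, Tuple


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def recommend(user_interests: Set[str],
              catalog: Dict[str, Set[str]],
              top_n: int = 3) -> List[Tuple[str, float]]:
    """Return up to top_n (title, score) pairs, best match first."""
    scored = [(title, jaccard(user_interests, topics))
              for title, topics in catalog.items()]
    # Drop items with no overlap, then sort by score (ties broken by title).
    matches = [(title, s) for title, s in scored if s > 0]
    return sorted(matches, key=lambda pair: (-pair[1], pair[0]))[:top_n]


# Hypothetical catalog of articles and their topic tags.
catalog = {
    "A": {"nlp", "llm", "search"},
    "B": {"chemistry"},
    "C": {"nlp"},
}
feed = recommend({"nlp", "llm"}, catalog)
```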
Reviewer Suggestion
A journal editor asks the AI to suggest reviewers after providing the paper title and abstract, as shown below:
I am a journal editor in computer science area and I have following paper with title:{Bilinear joint learning of word and entity embeddings for Entity Linking} and abstract:{Entity Linking (EL) is the task of resolving mentions to
referential entities in a knowledge base, which facilitates applications such as information retrieval, question answering, and knowledge base population. In this paper, we propose a novel embedding method specifically designed
for EL. The proposed model jointly learns word and entity embeddings which are located in different distributed spaces, and a bilinear model is introduced to simulate the interaction between words and entities. We treat EL as a
ranking problem, and utilize a pairwise learning-to-rank framework with features constructed with learned embeddings as well as conventional EL features. Experimental results show the proposed model produces effective
embeddings which improve the performance of our EL algorithm. Our method yields the state-of-the-art performances on two benchmark datasets CoNLL and TAC-KBP 2010.}. Can you give me recommendations about reviewers
for this paper?
ChatGPT Gemini Our Own Reviewer Finder
Bing, Gemini and ChatGPT all perform poorly in this experiment, and their results contain serious issues. A dedicated reviewer suggestion service gives more reliable results with richer information.
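The reviewer prompt on this slide wraps the title and abstract in braces, which suggests a reusable template. A minimal sketch of such a template follows; the function name and the whitespace normalization are assumptions, and the code only builds the prompt text without calling any chatbot API.

```python
# Template mirroring the wording of the slide's prompt. Doubled braces are
# literal braces in str.format, so the title and abstract stay wrapped in {}.
PROMPT_TEMPLATE = (
    "I am a journal editor in {area} area and I have following paper "
    "with title:{{{title}}} and abstract:{{{abstract}}}. "
    "Can you give me recommendations about reviewers for this paper?"
)


def build_reviewer_prompt(area: str, title: str, abstract: str) -> str:
    """Fill the template, collapsing line breaks and repeated spaces."""
    def clean(value: str) -> str:
        return " ".join(value.split())

    return PROMPT_TEMPLATE.format(area=clean(area), title=clean(title),
                                  abstract=clean(abstract))


prompt = build_reviewer_prompt(
    "computer science",
    "Bilinear joint learning of word and entity embeddings for Entity Linking",
    "Entity Linking (EL) is the task of resolving mentions\nto referential "
    "entities in a knowledge base.",
)
```

Given the issues reported in this experiment, any names a general-purpose chatbot returns for such a prompt would still need to be checked against a dedicated reviewer database.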
New Way of Search
Question: What is the latest study progress about large pre-trained language models?
For scholarly research use cases, Bing gives a better result than the other two. ChatGPT limits its answers to its own data, while Bing generates its result from a web search and links to the related articles (which are real, recent articles).
Gemini currently understands ~133 languages, and it can support image search.
Bing gives short answers with reliable references and less hallucination.
Wider Information Discovery
Challenge: Data Silos; Need for Cross-Disciplinary Research
Solution: Classifications; Milestone Paper; Research Analytics
Content Classification
[Diagram: AUTO TAGGER. Taxonomies: publisher-specific taxonomies, public taxonomies (MeSH, UN SDG), global taxonomy. Content types: documents, images, videos, news/blog. Topics: Topic 1, Topic 2, …]
Content Classification
ChatGPT cannot tag content against a customer-specific taxonomy, and it often generates fake tags or IDs.
“Evaluation of BNT162b2 Covid-19 Vaccine in Children Younger than 5 Years of Age
Although coronavirus disease 2019 (Covid-19) is generally mild in children younger than 5 years of age, severe disease, hospitalizations,
and post-acute effects, including multisystem inflammatory syndrome in children (MIS-C), can occur.1-3 In the United States, rates of
Covid-19−associated hospitalization among children younger than 5 years of age peaked at…”
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9947923/
Atypon Taxonomy tags                    Dedicated MeSH Auto Tagger         ChatGPT MeSH Tagging
1. Medicine: 0.98                       1. COVID-19: 0.99                  1. Vaccines: 0.95
2. Pediatrics: 0.95                     2. Child, Preschool: 0.97          2. Coronavirus disease 2019: 0.95
3. Pandemic: 0.93                       3. Pandemics: 0.93                 3. Child: 0.90
4. Immunization: 0.93                   4. Coronavirus Infections: 0.91    4. Clinical Trials, Phase I: 0.85
5. Clinical trial: 0.92                 5. Infant: 0.89                    5. Clinical Trials, Phase II: 0.85
6. Booster: 0.73                        6. COVID-19 Vaccines: 0.87         6. Clinical Trials, Phase III: 0.85
7. Coronavirus disease 2019: 0.65       7. Viral Vaccines: 0.85            7. Immunogenicity, Vaccine: 0.80
8. Emergency use authorization: 0.6     8. Pneumonia, Viral: 0.83          8. Safety: 0.80
                                        9. Hospitalization: 0.77           9. Efficacy: 0.80
                                        10. SARS-CoV-2: 0.69               10. Placebos: 0.75
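One way to guard against fabricated tags is to check every label a language model returns against the controlled vocabulary before accepting it. The sketch below assumes a tiny toy vocabulary (a few headings copied from the dedicated tagger's output above) and a made-up model output; it does not query the real MeSH vocabulary, so a rejection here only means the label is absent from the toy list.

```python
from typing import Iterable, List, Tuple

Tag = Tuple[str, float]


def validate_tags(predicted: Iterable[Tag],
                  vocabulary: Iterable[str]) -> Tuple[List[Tag], List[Tag]]:
    """Split predicted (label, score) pairs into accepted and rejected lists.

    Matching is case-insensitive; accepted labels are returned in the
    vocabulary's canonical spelling.
    """
    lookup = {term.casefold(): term for term in vocabulary}
    accepted: List[Tag] = []
    rejected: List[Tag] = []
    for label, score in predicted:
        canonical = lookup.get(label.strip().casefold())
        if canonical is None:
            rejected.append((label, score))
        else:
            accepted.append((canonical, score))
    return accepted, rejected


# Toy vocabulary for illustration only; a real check would use the full taxonomy.
TOY_VOCABULARY = ["COVID-19", "Child, Preschool", "Pandemics",
                  "COVID-19 Vaccines", "Viral Vaccines", "Infant"]

# Hypothetical model output, including one label that is not in the vocabulary.
llm_output = [("covid-19 vaccines", 0.95),
              ("Child, Preschool", 0.90),
              ("Nonexistent Heading XYZ", 0.85)]

accepted, rejected = validate_tags(llm_output, TOY_VOCABULARY)
```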
Milestone Work Identification
Research Analytics
New Scholarly Search Applications in the Era of GAI
Bibliography databases
Scopus AI
• Topic experts feature.
• Generates concept maps.
• The Foundational papers option lists high-impact Scopus papers on any topic.
• Synthesizes abstracts.
• Provides the references used to build the summary.
Dimensions Assistant
• Dimensions research GPT Enterprise.
• Notifications each time content generated is based on Dimensions data.
• AI-driven summarization.
Conversational discovery
ChatGPT
• Improved scientific writing.
• Enhanced research equity.
• Utility in health care research (e.g., dataset analysis, code generation, literature reviews, and drug discovery).
• Cost savings and documentation.
• Personalized learning in health care education.
• Citation generation for research papers.
Google Gemini
• Interdisciplinary research.
• Organized and methodical approach by handling large datasets.
• Streamlines the process of identifying relevant scholarly content.
• Multimodality.
New Bing
• Based on OpenAI technology and integrated with the MS Bing search engine.
• Provides more reliable answers with trackable references.
• Provides shorter answers than ChatGPT and Google.
Perplexity.ai
• GAI-native search engine.
• Provides a clean and simple way to discover without ads.
• Allows people to select different LLM models and knowledge bases.
• Suggests relevant questions.
• Allows searching on the internet or a local knowledge base.

• Leverages advanced LLMs.
• Ability to ask questions within a specific result's content.
• Provides summary for the top X relevant results.
• Concept search.

• Provides summary for the top 10 relevant results.
• Q&A feature that runs through full-text content.
• Availability of chat history.
• Availability to configure assistant settings.

• Provides high-level insights from top five papers, with detailed summaries and insights.
• Integrated with Copilot AI assistant to chat with your paper by asking follow-up questions.

• Leverages GPT4 and other LLMs to summarize results for you.
• Designed to answer scientific research questions.
Content Accessibility
Different Branches of AI
Alt Text for Image Example
Alt text (alternative text) describes the appearance or function of an image on a page, for screen readers and SEO.
• ChatGPT: The image depicts a serene winter scene, likely captured
during the early morning based on the soft lighting. It shows several
trees covered with thick, white snow, standing prominently against a
backdrop of a dense pine forest. The foreground features a wooden
fence partially buried under the snow, stretching across a field also
blanketed in snow. The entire scene conveys a tranquil, almost
untouched winter landscape, highlighting the beauty of a heavy
snowfall.
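A generated description like the one above still has to be placed safely into the page markup. The sketch below shortens the text and escapes it for use in an `alt` attribute. The 125-character limit is a common rule of thumb rather than a formal requirement, and the function names and file name are hypothetical.

```python
import html
import textwrap


def shorten_alt(alt_text: str, max_len: int = 125) -> str:
    """Collapse whitespace and trim at a word boundary to at most max_len chars."""
    return textwrap.shorten(alt_text, width=max_len, placeholder="...")


def build_img_tag(src: str, alt_text: str, max_len: int = 125) -> str:
    """Return an <img> element with an escaped, length-limited alt attribute."""
    safe_src = html.escape(src, quote=True)
    safe_alt = html.escape(shorten_alt(alt_text, max_len), quote=True)
    return '<img src="{}" alt="{}">'.format(safe_src, safe_alt)


generated = ("The image depicts a serene winter scene, likely captured during "
             "the early morning based on the soft lighting. It shows several "
             "trees covered with thick, white snow.")
tag = build_img_tag("winter-scene.jpg", generated)
```

Escaping matters because generated descriptions can contain quotation marks or angle brackets that would otherwise break the attribute.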
Content to Audio Example
Microsoft, Google and Samsung all provide free apps that assist people with impaired vision by narrating their surroundings.
Compound Figure Segmentation Example
Image Quality Enhancement Examples
AI can automatically increase contrast in images. AI can automatically enhance image resolution and
quality to improve readability.
Speech-to-Text Example
AI can listen to a video or sound file and then transcribe the spoken words into text.
Solution changes
AI GAI
AI Revolutionizes Information Discovery & Interaction
Future Thoughts