Topic Modeling P.P.T

Uploaded by

mojy.shasha

Topic Modeling
Prepared By:
Maryam Kashani
Rozhin Ahmadi Khamene
Zoya Chegini
Ahmad Shahriari
Table of Contents
01 An Introduction to Topic Modeling
02 Topic Modeling Applications
03 Topic Modeling Techniques
04 How to Apply STM?
05 Techniques, Challenges, and Future Directions for STM
06 Conclusion
01 An Introduction to Topic Modeling

Topic modeling is the new revolution in text mining
Why Is Topic Modeling Used?
Topic Modeling Pipeline
Goals of Topic Modeling
• Uncover latent topics that shape document meanings.
• Find hidden factors that govern the semantics of documents.
Topic Modeling Applications
[Figure: diagram labeled Article, Topic 1, Topic 2, Topic 3]
• Analyzing research papers to identify trends and emerging topics.
• Facilitating systematic reviews by summarizing large volumes of literature.
• Analyzing consumer feedback to uncover trends in preferences and behaviors in market research.
• Segmenting customer reviews into actionable insights for product development in market research.
• Understanding user interactions and community structures within social platforms in social network analysis (SNA).
• Analyzing sentiment and topics in social media posts in SNA.
• Identifying key themes related to customer satisfaction or dissatisfaction in sentiment analysis.
Topic Modeling
Techniques

LSA is much faster to train than LDA, but has lower accuracy.
LDA & LSA Hierarchy
Differences Between LSA, LDA
Latent Semantic Analysis (LSA)

Type: Non-probabilistic model.

Methodology: Utilizes Singular Value Decomposition (SVD) on the term-document matrix to identify latent semantic structures.

Focus: Captures relationships between terms based on their co-occurrence in documents.

Applications: Information retrieval, text summarization, and natural language processing.

Strengths:
• Fast computation for smaller datasets.
• Effective at revealing hidden relationships between terms.
Limitations:
• Lacks interpretability of topics.
• Sensitive to noise in data.
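The SVD step described for LSA can be sketched as follows. This is a minimal illustration, assuming NumPy is available; the vocabulary, the counts, and the choice of two latent dimensions are invented for the example and do not come from the slides.

```python
import numpy as np

# Toy term-document count matrix (rows = terms, columns = documents).
# Vocabulary and counts are made up for illustration only.
terms = ["topic", "model", "word", "goal", "match", "team"]
A = np.array([
    [2, 1, 0, 0],   # topic
    [1, 2, 0, 0],   # model
    [1, 1, 0, 0],   # word
    [0, 0, 1, 3],   # goal
    [0, 0, 3, 1],   # match
    [0, 0, 1, 1],   # team
], dtype=float)

# Singular Value Decomposition: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2  # number of latent dimensions kept (an assumption for this toy case)
term_vectors = U[:, :k] * s[:k]     # each row: a term in the latent space
doc_vectors = Vt[:k, :].T * s[:k]   # each row: a document in the latent space

def cosine(u, v):
    """Cosine similarity between two latent-space vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

same_theme = cosine(doc_vectors[0], doc_vectors[1])   # both use topic/model/word
cross_theme = cosine(doc_vectors[0], doc_vectors[2])  # different vocabularies
```

Documents that share vocabulary end up close together in the reduced space, while documents with disjoint vocabulary do not. The latent dimensions themselves carry no labels, which is the interpretability limitation listed above.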
Latent Dirichlet Allocation (LDA)

Type: Probabilistic generative model.

Methodology: Assumes documents are generated from a mixture of topics, each represented by a distribution over words.

Focus: Identifies latent topics in a corpus and how these topics are distributed across documents.

Applications: Topic modeling in large text corpora, such as academic papers or social media content.

Strengths:
• Provides interpretable topics linked to specific words.
• Handles large datasets effectively.
Limitations:
• Requires careful tuning of hyperparameters.
• Computationally intensive compared to LSA.
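The generative assumption behind LDA can be illustrated with a short simulation. This sketch only generates a document from known topics; it does not perform inference. The two topics, their word probabilities, and the Dirichlet parameter are hypothetical values chosen for the example.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

# Hypothetical topics: each is a distribution over the same small vocabulary.
topics = [
    {"gene": 0.50, "cell": 0.40, "vote": 0.05, "party": 0.05},
    {"gene": 0.05, "cell": 0.05, "vote": 0.50, "party": 0.40},
]

def generate_document(topics, alpha, length, rng):
    """Generate one document following the LDA generative story."""
    theta = sample_dirichlet(alpha, rng)  # the document's topic proportions
    words = []
    for _ in range(length):
        # Pick a topic for this word position, then a word from that topic.
        z = rng.choices(range(len(topics)), weights=theta)[0]
        vocab, probs = zip(*topics[z].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words

rng = random.Random(0)
theta, words = generate_document(topics, alpha=[0.5, 0.5], length=10, rng=rng)
```

Fitting LDA runs this story in reverse: given only the words, it estimates the topic proportions per document and the word distribution per topic.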
Structural Topic Modeling (STM)

Type: Probabilistic topic model that builds on LDA principles and incorporates document-level covariates.

Methodology: Incorporates document-level metadata to analyze how topics vary across different contexts or groups.

Focus: Allows for the inclusion of external variables (e.g., author, date) to better understand topic prevalence.

Applications: Social science research, political discourse analysis, and marketing insights.

Strengths:
• Provides richer insights by considering contextual factors.
• Enhances interpretability by linking topics to metadata.
Limitations:
• More complex and computationally demanding than both LSA and LDA.
• Requires careful selection and preprocessing of metadata.
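How metadata can shift topic prevalence is easier to see with numbers. The sketch below assumes the logistic-normal form used by STM, where a linear function of the covariates sets the expected topic scores, and it omits the noise term for simplicity. The coefficient values and the `is_recent` covariate are hypothetical.

```python
import math

def softmax(scores):
    """Convert real-valued scores into proportions that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def topic_proportions(covariates, gamma):
    """Map a document's metadata vector to topic proportions (noise omitted)."""
    scores = [sum(x * g for x, g in zip(covariates, coefs)) for coefs in gamma]
    return softmax(scores)

# Hypothetical coefficients: one row per topic, columns = [intercept, is_recent].
gamma = [
    [0.0, 1.5],    # topic 0 becomes more prevalent in recent documents
    [0.0, -0.5],   # topic 1 becomes less prevalent
    [0.0, 0.0],    # topic 2 serves as the reference topic
]

old_doc = topic_proportions([1.0, 0.0], gamma)  # is_recent = 0
new_doc = topic_proportions([1.0, 1.0], gamma)  # is_recent = 1
```

With these made-up coefficients, the older document spreads its weight evenly over the three topics, while the recent document puts most of its weight on topic 0.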
How to Apply STM?
01 Data Collection
02 Data Preparation
03 Topic Modeling
04 Topic Visualization
05 Topic Interpretation
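The data preparation step usually turns raw text into a document-term matrix before any topic model is fitted. The following is a minimal sketch using only the Python standard library; the stopword list, the helper names, and the two example sentences are assumptions made for illustration.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "to", "in", "is"}  # tiny illustrative list

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def document_term_matrix(texts):
    """Return (vocabulary, rows), where rows[d][v] counts term v in document d."""
    counts = [Counter(tokenize(t)) for t in texts]
    vocab = sorted(set().union(*counts))
    rows = [[c[term] for term in vocab] for c in counts]
    return vocab, rows

texts = [
    "The model finds topics in text.",
    "Topics of the text and the model.",
]
vocab, rows = document_term_matrix(texts)
```

Real projects typically add further steps such as stemming and removal of very rare terms, and for STM the metadata must be kept aligned with the rows of this matrix.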
Software for Running STM
Heuristic Description of STM Package Features
How does STM work
Pkojhnklmcjidsovmcgpseprgzjtsevygtzjhivotyhysopzfesxfopyz0fuerziogjrtyuiogsfoaesdyucghjsduicyh
ajskifuohydvcbjsdkhyifcvxcuoiahscpuoozdsgsbzunvaschiszuofhpoiuytwfdghjklkmanbvcxzsdrtyuiawu
ystrdfhjkaociucyxfvwsbndpscoiuydsgcvbneldpeoiuyctsrfcvbjskociusydgwbnelqwoiduxysgbnxkcijusyd
gwhkdofciudsgvcbnmklskodiuygshdjckos0duyfhjskelpfcouehfjkoisdcv9uyhdjcnhuytfsdcgvchysdtrfcvb
njkdiokjflefkiuvhsncjhuysdtgfvdbhaejwiduhj2keiuydtfgshu7yawdhjkoiuryqghbcpsoiugvbhgfreqtyauwijd
kcbxvgctyefuijdcxfdstkjvhhyidsfbsyiecngisegfaonxfanifxucngrnysdnfygusdgvuifreovbvnhjdkwsjfrdegts
ujwldqwdefefwefwefjwiduhj2keiuydtfgshu7yawdhjkoiuryqghbcpsoiugvbhgfreqtyauwijdkcbxvgctyefuijd
cxfdstkjvhhyidsyfhjskelpfcouehfjkoisdcv9uyhdjcnhuytfsdcgvchysdtrfcvbnjkdiokjflefkiuvhsncjhuysdtgfv
dbhaejwiduhj2keiuydtfgshu7yawdhjkoiuryqghbcpsoiugvbhgfreqtyauwijdkcbxvgctyefuijdcxfdstkjvhhyid
sfbsyiecngisegPkojhnklmcjidsovmcgpseprgzjtsevygtzjhivotyhysopzfesxfopyz0fuerziogjrtyuiogsfoaes
dyucghjsduicyhajskifuohydvcbjsdkhyifcvxcuoiahscpuoozdsgsbzunvaschiszuofhpoiuytwfdghjklkmanb
vcxzsdrtyuiawuystrdfhjkaociucyxfvwsbndpscoiuydsgcvbneldpeoiuyctsrfcvbjskociusydgwbnelqwoidux
ysgbnxkcijusydgwhkdofciudsgvcbnmklskodiuygshdjckos0duyfhjskelpfcouehfjkoisdcv9uyhdjcnhuytfsd
cgvchysdtrfcvbnjkdiokjflefkiuvhsncjhuysdtgfvdbhaejwiduhj2keiuydtfgshu7yawdhjkoiuryqghbcpsoiugv
bhgfreqtyauwijdkcbxvgctyefuijdcxfdstkjvhhyidsfbsyiecngisegfaonxfanifxucngrnysdnfypoiuyopokouytf
eertyuipomzx
250 letters
[Figure: the letter block shown above, divided into Part 1, Part 2, Part 3, and Part 4, with a bar chart titled "Number of first 4 letters of the alphabet". Bar values as extracted: 73, 66, 62, 51.]
Higher k shows more focus and tunnel vision on a subject.
[Figure: the letter block shown above, divided into Part 1, Part 2, Part 3, and Part 4, with a bar chart titled "Number of 3rd letter of the alphabet". Bar values as extracted: 29, 19, 19, 17.]
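The letter-counting analogy can be reproduced in a few lines. This sketch assumes the charts count how often a set of letters appears in each of four consecutive parts of the text. The sample string is short and made up, so the counts are not the ones from the slides.

```python
def count_letters_per_part(text, letters, parts=4):
    """Split text into consecutive parts and count the given letters in each."""
    n = len(text)
    chunks = [text[n * i // parts: n * (i + 1) // parts] for i in range(parts)]
    return [sum(chunk.count(ch) for ch in letters) for chunk in chunks]

sample = "abcdxyzaabbccddxyzqrsabcabc"  # short made-up string, not the slide text

broad = count_letters_per_part(sample, "abcd")  # a wide "subject": four letters
narrow = count_letters_per_part(sample, "c")    # a narrow "subject": one letter
```

Counting a four-letter group gives a broad profile of each part, while counting a single letter gives a narrower one. This mirrors the point that a higher k splits the text into more focused subjects.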
What is FREX?
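FREX is commonly described as the weighted harmonic mean of a word's rank by frequency within a topic and its rank by exclusivity to that topic. It favors words that are both common in a topic and rare elsewhere. A minimal sketch, assuming a small made-up topic-word matrix `beta` and a weight of 0.5 (both are assumptions, not values from the slides):

```python
def ecdf_ranks(values):
    """Empirical CDF of each entry: fraction of entries less than or equal to it."""
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]

def frex(beta, topic, weight=0.5):
    """FREX score for every word in one topic.

    beta[k][v] is the probability of word v under topic k.
    weight is placed on exclusivity, and 1 - weight on frequency.
    """
    n_words = len(beta[topic])
    exclusivity = [
        beta[topic][v] / sum(row[v] for row in beta) for v in range(n_words)
    ]
    excl_rank = ecdf_ranks(exclusivity)
    freq_rank = ecdf_ranks(beta[topic])
    return [
        1.0 / (weight / e + (1.0 - weight) / f)
        for e, f in zip(excl_rank, freq_rank)
    ]

vocab = ["the", "gene", "cell", "vote"]
beta = [
    [0.50, 0.30, 0.15, 0.05],  # topic 0
    [0.50, 0.05, 0.05, 0.40],  # topic 1
]
scores = frex(beta, topic=0)
top_word = vocab[max(range(len(vocab)), key=scores.__getitem__)]
```

In this toy example the most probable word in topic 0 is "the", but it is equally probable in topic 1, so FREX ranks "gene" first.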
Topic Identification by STM

Please bring your own examples based on available articles.

For help, see the link below:
https://github.com/trajceskijovan/Structural-Topic-Modeling-in-R
Techniques, Challenges, and Future Directions
Techniques for Structural Topic Models (STM):
• Topical Prevalence: Metadata can influence how frequently a topic is discussed.

• Topical Content: Metadata can affect the word distribution within a topic, allowing for nuanced interpretations of how topics are framed based on external factors.

• Variational Inference: Estimation in STM is typically carried out through a fast variational approximation, which enhances computational efficiency and scalability, particularly for large datasets [1][5].

• Model Initialization Techniques: Proper initialization is crucial because the posterior distribution is non-convex. Techniques like spectral initialization, which applies a non-negative matrix factorization to the word co-occurrence matrix, help stabilize results across different runs [1][5].

• Model Selection and Evaluation: The selectModel function automates the evaluation of multiple models fitted from different initializations, allowing researchers to identify models with desirable properties. Related diagnostics include held-out log-likelihood and residual analysis, which help in choosing the number of topics.
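The held-out log-likelihood mentioned above can be computed directly once a model provides topic proportions for a document and word distributions for each topic. The sketch below shows the quantity being compared across candidate models. The `beta` matrix, the topic proportions, and the held-out word ids are hypothetical, and the function name is illustrative rather than part of any package.

```python
import math

def heldout_log_likelihood(word_ids, theta, beta):
    """Average per-word log-likelihood of held-out words.

    theta[k]   : the document's estimated topic proportions
    beta[k][v] : probability of word v under topic k
    """
    total = 0.0
    for v in word_ids:
        # Probability of the word is a mixture over topics.
        p = sum(theta[k] * beta[k][v] for k in range(len(theta)))
        total += math.log(p)
    return total / len(word_ids)

beta = [
    [0.7, 0.2, 0.1],  # topic 0
    [0.1, 0.2, 0.7],  # topic 1
]
heldout_words = [0, 0, 1]  # word ids withheld from the document during fitting

good_fit = heldout_log_likelihood(heldout_words, [0.9, 0.1], beta)
poor_fit = heldout_log_likelihood(heldout_words, [0.1, 0.9], beta)
```

A model whose estimates explain the withheld words better receives a higher (less negative) value, which is why this quantity is used to compare models with different numbers of topics.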
Challenges in Implementing STM

Sensitivity to Initialization: The estimation problem is multi-modal, so different initial parameter values can lead to different results. This necessitates careful model selection and multiple runs to ensure robustness [1][5].

Determining the Number of Topics: There is no definitive method for selecting the number of topics, so the choice is partly subjective. Automated methods like searchK can assist, but they do not always yield clear results [1][5].

Complexity of Interpretation: Analyzing and interpreting the results from STM can be complex, especially when dealing with multiple metadata covariates. Researchers must be adept at using visualization tools and statistical tests to draw meaningful conclusions from the model outputs.
The Future of Structural Topic Models:

Integration with Machine Learning: Combining STM with advanced machine learning techniques could enhance its predictive capabilities and allow for more sophisticated analyses of large text datasets [5].

Improved User Interfaces: Developing more intuitive interfaces for tools like the stm package could broaden access and usability for researchers without extensive programming backgrounds [1][5].

Expanding Applications: As text data continues to proliferate across various domains (e.g., social media, academic literature), STM can be adapted for diverse applications beyond traditional social science research, such as sentiment analysis or trend detection in real-time data streams [5].

Enhanced Model Flexibility: Future iterations of STM could incorporate more flexible modeling structures that account for temporal dynamics or hierarchical relationships within data, improving its applicability across different contexts and research questions [5].
Conclusion
Please write a conclusion based on your own opinion.

Questions?
