Topic Modelling
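[Figure: LSA via singular value decomposition. The m × n document-term matrix A (rows doc(1) ... doc(m), columns word(1) ... word(n), entries tf-idf weights) is factored as A = U × S × V^T, where U is the document-topic matrix, S is the diagonal topic-topic matrix, and V^T is the topic-word matrix. The figure shows two topics.]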
■ Truncating the SVD to rank t keeps the t most significant dimensions in the transformed space.
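The rank-t truncation can be sketched with NumPy as follows. This is a minimal illustration, not the slides' own code: the helper name `truncated_svd` is hypothetical, and the 4 × 5 matrix holds made-up weights standing in for real tf-idf values.

```python
import numpy as np

def truncated_svd(A, t):
    """Return the rank-t factors (U_t, S_t, Vt_t) of A (hypothetical helper)."""
    # Singular values come back sorted from largest to smallest.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :t], np.diag(s[:t]), Vt[:t, :]

# Toy document-term matrix: 4 documents x 5 words (made-up weights).
A = np.array([
    [1.0, 0.9, 0.0, 0.0, 0.1],
    [0.8, 1.0, 0.1, 0.0, 0.0],
    [0.0, 0.1, 1.0, 0.9, 0.0],
    [0.1, 0.0, 0.9, 1.0, 0.2],
])

U_t, S_t, Vt_t = truncated_svd(A, t=2)
doc_topics = U_t @ S_t          # each document as a point in 2-D topic space
A_approx = U_t @ S_t @ Vt_t     # rank-2 approximation of A
```

Here `doc_topics` has one row per document and one column per kept topic, and `A_approx` is the closest rank-2 matrix to `A` in the Frobenius norm.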
As its name implies, PLSA (probabilistic latent semantic analysis) adds a probabilistic treatment of topics and words on top of LSA.
■ PLSA is more flexible than LSA, but still has some limitations:
• The number of parameters grows linearly with the number of training
documents → the model is prone to overfitting
• It is not a well-defined generative model: there is no way to generalize to
new, unseen documents
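Both limitations are visible in a small EM sketch of PLSA. It assumes the common parameterization P(w | d) = Σ_z P(w | z) P(z | d); the function name `plsa`, the iteration count, and the toy count matrix are illustrative choices, not taken from the slides.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit P(z|d) and P(w|z) to a document-word count matrix with EM (sketch)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random initialisation; every row is normalised to a distribution.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: responsibilities P(z | d, w), shape (docs, topics, words).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        resp = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate both tables from expected counts n(d, w) P(z | d, w).
        expected = counts[:, None, :] * resp
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

# Toy counts: 4 documents x 4 words (made-up data).
counts = np.array([
    [4, 3, 0, 0],
    [3, 5, 1, 0],
    [0, 1, 4, 3],
    [0, 0, 3, 5],
], dtype=float)

p_z_d, p_w_z = plsa(counts, n_topics=2)
# Stored probabilities: one row of P(z|d) per training document plus P(w|z).
n_params = p_z_d.size + p_w_z.size
```

The table `p_z_d` has one row per training document, so the number of stored probabilities grows with the corpus, and a document that was not in `counts` has no row at all.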
https://monkeylearn.com/blog/introduction-to-topic-modeling/
LDA Modeling a Corpus