0% found this document useful (0 votes)

31 views8 pages

Correlated Topic Models: David M. Blei John D. Lafferty

The document describes the Correlated Topic Model (CTM), which addresses a limitation of topic models like LDA that do not account for correlation between topics. CTM uses a logistic normal distribution over topic proportions instead of a Dirichlet, allowing for correlation. This provides a more realistic model of latent topic structure where presence of one topic may be correlated with another. The authors develop CTM and demonstrate on a corpus from JSTOR that it fits better than LDA and provides a way to visualize topic correlations in large document collections.

Uploaded by

andrea ramirez

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

31 views8 pages

Correlated Topic Models: David M. Blei John D. Lafferty

Uploaded by

andrea ramirez

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 8

Correlated Topic Models

David M. Blei John D. Lafferty

School of Computer Science
Carnegie Mellon University

Abstract
Topic models, such as latent Dirichlet allocation (LDA), have been an ef-
fective tool for the statistical analysis of document collections and other
discrete data. The LDA model assumes that the words of each document
arise from a mixture of topics, each of which is a distribution over the vo-
cabulary. A limitation of LDA is the inability to model topic correlation
even though, for example, a document about sports is more likely to also
be about health than international finance. This limitation stems from the
use of the Dirichlet distribution to model the variability among the topic
proportions. In this paper we develop the correlated topic model (CTM),
where the topic proportions exhibit correlation via the logistic normal
distribution [1]. We derive a mean-field variational inference algorithm
for approximate posterior inference in this model, which is complicated
by the fact that the logistic normal is not conjugate to the multinomial.
The CTM gives a better fit than LDA on a collection of OCRed articles
from the journal Science. Furthermore, the CTM provides a natural way
of visualizing and exploring this and other unstructured data sets.

1 Introduction
The availability and use of unstructured historical collections of documents is rapidly grow-
ing. As one example, JSTOR (www.jstor.org) is a not-for-profit organization that main-
tains a large online scholarly journal archive obtained by running an optical character recog-
nition (OCR) engine over the original printed journals. JSTOR indexes the resulting text
and provides online access to the scanned images of the original content through keyword
search. This provides an extremely useful service to the scholarly community, with the
collection comprising nearly three million published articles in a variety of fields.
The sheer volume of this unstructured and noisy archive naturally suggests opportunities for
the effective use of statistical modeling. For instance, a scholar in a narrow subdiscipline,
searching for a particular research article, would certainly be interested to learn that the
topic of that article is highly correlated with another topic that the researcher may not have
known about and that is not explicitly contained in the article. Alerted to the existence of
this new related topic, the researcher could browse the collection in a topic-guided manner
to begin to investigate connections to a previously unrecognized body of work. Since the
archive comprises millions of articles spanning centuries of scholarly work, automated
analysis is essential.
Several statistical models have recently been developed for automatically extracting the
topical structure of large document collections. In technical terms, a topic model is a
generative probabilistic model that uses a small number of distributions over a vocabulary
to describe a document collection. When fit from data, these distributions often correspond
to intuitive notions of topicality. In this work, we build upon the latent Dirichlet allocation
(LDA) [3] model. LDA assumes that the words of each document arise from a mixture
of topics. The topics are shared by all documents in the collection; the topic proportions
are document-specific and randomly drawn from a Dirichlet distribution. LDA allows each
document to exhibit multiple topics with different proportions, and it can thus capture the
heterogeneity in grouped data which exhibit multiple latent patterns. Recent work has
used LDA as a module in more complicated document models [8, 10, 6], and in a variety of
settings such as image processing [11], collaborative filtering [7], disability survey data [4],
population genetics [9], and the modeling of sequential data and user profiles [5].
Our goal in this paper is to address a limitation of the topic models proposed to date: they
fail to directly model correlation between topics. In many—indeed most—text corpora, it
is natural to expect that subsets of the underlying latent topics will be highly correlated. In
a corpus of scientific articles, for instance, an article about genetics may be likely to also
be about health and disease, but unlikely to also be about x-ray astronomy. For the LDA
model, this limitation stems from the independence assumptions implicit in the Dirichlet
distribution on the topic proportions. Under a Dirichlet, the components of the proportions
vector are nearly independent; this leads to the strong and unrealistic modeling assumption
that the presence of one topic is not correlated with the presence of another.
In this paper we present the correlated topic model (CTM). The CTM uses an alterna-
tive, more flexible distribution for the topic proportions that allows for covariance structure
among the components. This gives a more realistic model of latent topic structure where
the presence of one latent topic may be correlated with the presence of another. In the
following sections we develop the technical aspects of this model, and then demonstrate its
potential for the applications envisioned above. We fit the model to a portion of the JSTOR
archive of the journal Science. We demonstrate that the model gives a better fit than LDA,
as measured by the accuracy of the predictive distributions over held out documents. Fur-
thermore, we demonstrate qualitatively that the correlated topic model provides a natural
way of visualizing and exploring such an unstructured collection of textual data.

2 The Correlated Topic Model

The key to the correlated topic model we propose is the logistic normal distribution [1]. The
logistic normal is a distribution on the simplex that allows for a general pattern of variability
between the components by transforming a multivariate normal random variable. Consider
the natural parameterization of a K-dimensional multinomial distribution:
p(z | η) = exp{η T z − a(η)}. (1)
The random variable Z can take on K values; it is represented by a K-vector with one
component equal to one to denote a value in {1, . . . , K}. The cumulant generating function
is: P
K
a(η) = log i=1 exp{η i } . (2)
The mapping between the mean parameterization (i.e., the simplex) and the natural param-
eterization is:
ηi = log θi /θK . (3)
Notice that this is not the minimal exponential family representation of the multinomial
because multiple values of η can yield the same mean parameter.
The logistic normal distribution assumes that η is normally distributed
P and then mapped
to the simplex with the inverse of Eq. 3, that is, f (ηi ) = exp ηi / j exp ηj . It describes
Σ βk
ηd Zd,n Wd,n
N
µ D K

Figure 1: Top: Graphical model representation of the correlated topic model. The logistic
normal distribution, used to model the latent topic proportions of a document, can represent
correlations between topics that are impossible to capture using a single Dirichlet. Bottom:
Example densities of the logistic normal on the 2-simplex. From left: diagonal covariance
and nonzero-mean, negative correlation between components 1 and 2, positive correlation
between components 1 and 2.

correlations between components of the simplicial random variable through the covariance
matrix of the normal distribution. The logistic normal was originally studied in the context
of analyzing observed compositional data such as the proportions of minerals in geological
samples. In this work, we extend its use to a hierarchical model where it describes the
latent composition of topics associated with each document.
Let {µ, Σ} be a K-dimensional mean and covariance matrix, and let topics β1:K be K
multinomials over a fixed word vocabulary. The correlated topic model assumes that an
N -word document arises from the following generative process:
1. Draw η | {µ, Σ} ∼ N (µ, Σ).
2. For n ∈ {1, . . . , N }:
(a) Draw topic assignment Zn | η from Mult(f (η)).
(b) Draw word Wn | {zn , β1:K } from Mult(βzn ).
This process is identical to the generative process of LDA except that the topic proportions
are drawn from a logistic normal rather than a Dirichlet. The model is shown as a directed
graphical model in Figure 1.
The CTM is more expressive than LDA. The strong independence assumption imposed
by the Dirichlet in LDA is not realistic when analyzing document collections, where one
may find strong correlations between topics. The covariance matrix of the logistic normal
in the CTM is introduced to model such correlations. In Section 3, we illustrate how the
higher order structure given by the covariance can be used as an exploratory tool for better
understanding and navigating a large corpus of documents. Moreover, modeling correlation
can lead to better predictive distributions. In some settings, such as collaborative filtering,
the goal is to predict unseen items conditional on a set of observations. An LDA model
will predict words based on the latent topics that the observations suggest, but the CTM
has the ability to predict items associated with additional topics that are correlated with the
conditionally probable topics.
2.1 Posterior inference and parameter estimation

Posterior inference is the central challenge to using the CTM. The posterior distribution of
the latent variables conditional on a document, p(η, z1:N | w1:N ), is intractable to compute;
once conditioned on some observations, the topic assignments z1:N and log proportions
η are dependent. We make use of mean-field variational methods to efficiently obtain an
approximation of this posterior distribution.
In brief, the strategy employed by mean-field variational methods is to form a factorized
distribution of the latent variables, parameterized by free variables which are called the vari-
ational parameters. These parameters are fit so that the Kullback-Leibler (KL) divergence
between the approximate and true posterior is small. For many problems this optimization
problem is computationally manageable, while standard methods, such as Markov Chain
Monte Carlo, are impractical. The tradeoff is that variational methods do not come with
the same theoretical guarantees as simulation methods. See [12] for a modern review of
variational methods for statistical inference.
In graphical models composed of conjugate-exponential family pairs and mixtures, the
variational inference algorithm can be automatically derived from general principles [2,
13]. In the CTM, however, the logistic normal is not conjugate to the multinomial. We
will therefore derive a variational inference algorithm by taking into account the special
structure and distributions used by our model.
We begin by using Jensen’s inequality to bound the log probability of a document:
log p(w1:N | µ, Σ, β) ≥ (4)
PN
Eq [log p(η | µ, Σ)] + n=1 Eq [log p(zn | η)] + Eq [log p(wn | zn , β)] + H (q) ,
where the expectation is taken with respect to a variational distribution of the latent vari-
ables, and H (q) denotes the entropy of that distribution. We use a factorized distribution:
2 QK QN
q(η1:K , z1:N | λ1:K , ν1:K , φ1:N ) = i=1 q(ηi | λi , νi2 ) n=1 q(zn | φn ). (5)
The variational distributions of the discrete variables z1:N are specified by the K-
dimensional multinomial parameters φ1:N . The variational distribution of the continuous
variables η1:K are K independent univariate Gaussians {λi , νi }. Since the variational pa-
rameters are fit using a single observed document w1:N , there is no advantage in introduc-
ing a non-diagonal variational covariance matrix.
The nonconjugacy of the logistic normal leads to difficulty in computing the expected log
probability of a topic assignment:
h PK i
Eq [log p(zn | η)] = Eq η T zn − Eq log( i=1 exp{ηi }) .

(6)
To preserve the upper bound on the log probability, we lower bound the negative log nor-
malizer with a Taylor expansion:
h P i
K PK
Eq log i=1 exp{ηi } ≤ ζ −1 ( i=1 Eq [exp{ηi }]) − 1 + log(ζ), (7)

where we have introduced a new variational parameter ζ. The expectation Eq [exp{ηi }] is

the mean of a log normal distribution with mean and variance obtained from the variational
parameters {λi , νi2 }: Eq [exp{ηi }] = exp{λi + νi2 /2} for i ∈ {1, . . . , K}.
Given a model {β1:K , µ, Σ} and a document w1:N , the variational inference algorithm op-
timizes Eq. 4 with respect to the variational parameters {λ1:K , ν1:K , φ1:N , ζ}. We use
coordinate ascent, repeatedly optimizing with respect to each parameter while holding the
others fixed. In variational inference for LDA, each coordinate can be optimized analyti-
cally. However, iterative methods are required for the CTM when optimizing for λi and νi2 .
The details are given in Appendix A.
dna
atmospheric water rna receptor
co soil mantle transcription binding
0.31
atmosphere 0.31 temperature pressure genes polymerase receptors
ozone ice 0.30 0.35 strand protein
phase genome
air surface temperature dna activation
iron gene 0.35
0.42 0.41
climate crust 0.32 sequence 0.33
0.39yeast 0.44 membrane 0.45
ice plate
cell channels
years 0.35 mantle 0.30
cells 0.37 channel
change volcanic cells
protein ca
0.57
temperature km structure 0.360.32 protein
0.31 0.33
proteins
0.40
cell
proteins
0.34 materials rna ca
0.31 magnetic 0.37 neurons
proteins 0.51 0.38 0.36
ocean properties structure 0.33 0.32 receptors
binding
oil temperature transfer 0.44
0.32 0.48
electron protein 0.37
sea crystal 0.31
protein kinase cell
ozone
0.41 two proteins cells
hole 0.42 membrane
0.32 gene cells
atoms 0.34
0.45 mutations cell surface
carbon genetic 0.41 wall
clusters 0.31 0.45
0.38 0.30 disease cells
molecules genes 0.31
0.40 surface cell virus
molecular 0.32
force cd
0.31 hiv
0.37 il 0.39
0.37 surfaces 0.32 infection
molecular 0.36 mice viral
reaction 0.31
chemical 0.38 molecules cells
light chemistry mice
laser 0.32 transgenic
reactions surface activity
optical products human cycle
surfaces
electron cells circadian
gold
xray insulin phase
glass
patterns clock

Figure 2: A portion of the topic graph learned from 15,744 OCR articles from Science.
Each node represents a topic, and is labeled with the five most probable words from its
distribution; edges are labeled with the correlation between topics.

Given a collection of documents, we carry out parameter estimation in the correlated topic
model by attempting to maximize the likelihood of a corpus of documents as a function
of the topics β1:K and the multivariate Gaussian (µ, Σ). We use variational expectation-
maximization (EM), where we maximize the bound on the log probability of a collection
given by summing Eq. 4 over the documents.
In the E-step, we maximize the bound with respect to the variational parameters by per-
forming variational inference for each document. In the M-step, we maximize the bound
with respect to the model parameters. This is maximum likelihood estimation of the top-
ics and multivariate Gaussian using expected sufficient statistics, where the expectation
is taken with respect to the variational distributions computed in the E-step. The E-step
and M-step are repeated until the bound on the likelihood converges. In the experiments
reported below, we run variational inference until the relative change in the probability
bound of Eq. 4 is less than 0.0001%, and run variational EM until the relative change in the
likelihood bound is less than 0.001%.

3 Examples and Empirical Results: Modeling Science

In order to test and illustrate the correlated topic model, we estimated a 100-topic CTM
on 15,744 Science articles spanning 1971 to 1998. We constructed a graph of the la-
tent topics and the connections among them by examining the most probable words from
each topic and the between-topic correlations. Part of this graph is illustrated in Fig-
ure 2. In this subgraph, there are three densely connected collections of topics: material
science, geology, and cell biology. Furthermore, an estimated CTM can be used to ex-
plore otherwise unstructured observed documents. In Figure 4, we list articles which are
assigned to the cognitive science topic and articles which are assigned to both the cog-
nitive science and visual neuroscience topics. The interested reader is invited to visit
http://www.cs.cmu.edu/˜lemur/science/ to interactively explore this model, in-
cluding the topics, their connections, and the articles that exhibit them.
We compared the CTM to LDA by fitting a smaller collection of articles to models of
varying numbers of topics. This collection contains the 1,452 documents from 1960; we
2200
−112800
CTM
●

2000
LDA ●

−113200
●

1800
●
● ●
● ● ●
●
●
●

−113600
●

1600
●
●
●

Held−out log likelihood ●

−114000

1400
L(CTM) − L(LDA)
● ●
●

1200
−114400

● ●

1000
●
−114800

●
●
●
●

800
●
−115200

600
●
●
−115600

400
●
●
●
−116000

200
● ●

0
●
−116400

5 10 20 30 40 50 60 70 80 90 100 110 120 10 20 30 40 50 60 70 80 90 100 110 120

Number of topics Number of topics

Figure 3: (L) The average held-out probability; CTM supports more topics than LDA. See
figure at right for the standard error of the difference. (R) The log odds ratio of the held-out
probability. Positive numbers indicate a better fit by the correlated topic model.

used a vocabulary of 5,612 words after pruning common function words and terms which
occur once in the collection. We split the data into ten groups; for each group we computed
the log probability of the held-out data given a model estimated from the remaining groups.
A better model of the document collection will assign higher probability to the held out
group. To avoid comparing bounds, we used importance sampling to compute the log
probability of a document where the fitted variational distribution is the proposal.
Figure 3 illustrates the average held out log probability for each model and the average
difference between them. The CTM provides a better fit than LDA and supports more
topics; the likelihood for LDA peaks near 30 topics while the likelihood for the CTM peaks
close to 90 topics. The means and standard errors of the difference in log-likelihood of the
models is shown at right; this indicates that the CTM always gives a better fit.
Another quantitative evaluation of the relative strengths of LDA and the CTM is how well
the models predict the remaining words after observing a portion of the document. Sup-
pose we observe words w1:P from a document and are interested in which model provides
a better predictive distribution p(w | w1:P ) of the remaining words. To compare these dis-
tributions, we use perplexity, which can be thought of as the effective number of equally
likely words according to the model. Mathematically, the perplexity of a word distribu-
tion is defined as the inverse of the geometric per-word average of the probability of the
observations. Note that lower numbers denote more predictive power.
The plot in Figure 4 compares LDA and the CTM in terms of predictive perplexity. When
only a small number of words have been observed, the uncertainty about the remaining
words under the CTM is much less than under LDA—the perplexity is reduced by nearly
200 words, or roughly 10%. The reason is that after seeing a few words in one topic, the
CTM uses topic correlation to infer that words in a related topic may also be probable.
In contrast, LDA cannot predict the remaining words as well until a large portion of the
document as been observed so that all of its topics are represented.
Top Articles with
{brain, memory, learning}
(1) Distributed Neural Network Underlying Musical
Sight-Reading and Keyboard Performance
(2) The Primate Hippocampal Formation: Evidence
for a Time-Limited Role in Memory Storage

2600
(3) Separate Neural Bases of Two Fundamental CTM
LDA
Memory Processes in the Temporal Lobe ●

(4) A Neostriatal Habit Learning System in Humans

(5) The Mental Representation of Hand Movement

2400
after Parietal Cortex Damage
●

Predictive perplexity
Top Articles with

2200
●

{brain, memory, learning} and {neurons, visual, cell}

● ●
(1) Regulation of Synaptic Efficacy by
Coincidence of Postsynaptic APs and EPSPs ●

2000
●

(2) Visual Instruction of the Neural Map of ● ●

●
●
●
Auditory Space in the Developing Optic Tectum ●
● ●
●

(3) Visually Evoked Oscillations of Membrane ●

●

1800
Potential in Cells of Cat Visual Cortex
(4) Corticofugal Modulation of Time-Domain
Processing of Biosonar Information in Bats
(5) A Map of Visual Space Induced in Primary 10 20 30 40 50 60 70 80 90

Auditory Cortex % observed words

Figure 4: (Left) Exploring a collection through its topics. (Right) Predictive perplexity for
partially observed held-out documents from the 1960 Science corpus.

References
[1] J. Aitchison. The statistical analysis of compositional data. Journal of the Royal
Statistical Society, Series B, 44(2):139–177, 1982.
[2] C. Bishop, D. Spiegelhalter, and J. Winn. VIBES: A variational inference engine for
Bayesian networks. In NIPS 15, pages 777–784. Cambridge, MA, 2003.
[3] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine
Learning Research, 3:993–1022, January 2003.
[4] E. Erosheva. Grade of membership and latent structure models with application to
disability survey data. PhD thesis, Carnegie Mellon University, Department of Statis-
tics, 2002.
[5] M. Girolami and A. Kaban. Simplicial mixtures of Markov chains: Distributed mod-
elling of dynamic user profiles. In NIPS 16, pages 9–16, 2004.
[6] T. Griffiths, M. Steyvers, D. Blei, and J. Tenenbaum. Integrating topics and syntax.
In Advances in Neural Information Processing Systems 17, 2005.
[7] B. Marlin. Collaborative filtering: A machine learning perspective. Master’s thesis,
University of Toronto, 2004.
[8] A. McCallum, A. Corrada-Emmanuel, and X. Wang. The author-recipient-topic
model for topic and role discovery in social networks. 2004.
[9] J. Pritchard, M. Stephens, and P. Donnelly. Inference of population structure using
multilocus genotype data. Genetics, 155:945–959, June 2000.
[10] M. Rosen-Zvi, T. Griffiths, M. Steyvers, and P. Smith. In UAI ’04: Proceedings of
the 20th Conference on Uncertainty in Artificial Intelligence, pages 487–494.
[11] J. Sivic, B. Rusell, A. Efros, A. Zisserman, and W. Freeman. Discovering object
categories in image collections. Technical report, CSAIL, MIT, 2005.
[12] M. Wainwright and M. Jordan. A variational principle for graphical models. In New
Directions in Statistical Signal Processing, chapter 11. MIT Press, 2005.
[13] E. Xing, M. Jordan, and S. Russell. A generalized mean field algorithm for variational
inference in exponential families. In Proceedings of UAI, 2003.

A Variational Inference
We describe a coordinate ascent optimization algorithm for the likelihood bound in Eq. 4
with respect to the variational parameters.
The first term of Eq. 4 is:
Eq [log p(η | µ, Σ)] = (1/2) log |Σ−1 | − (K/2) log 2π − (1/2)Eq (η − µ)T Σ−1 (η − µ) ,

(8)
where
Eq (η − µ)T Σ−1 (η − µ) = ν 2T diag(Σ−1 ) + (λ − µ)T Σ−1 (λ − µ).

(9)

The second term of Eq. 4, using the additional bound in Eq. 7, is:
PK P
K
Eq [log p(zn | η)] = i=1 λi φn,i − ζ −1 2
i=1 exp{λi + νi /2} + 1 − log ζ. (10)

The third term of Eq. 4 is:

PK
Eq [log p(wn | zn , β)] = i=1 φn,i log βi,wn . (11)

Finally, the fourth term is the entropy of the variational distribution:

PK 1 2
PN Pk
i=1 2 (log νi + log 2π + 1) − n=1 i=1 φn,i log φn,i . (12)

We maximize the bound in Eq. 4 with respect to the variational parameters λ1:K , ν1:K ,
φ1:N , and ζ. We use a coordinate ascent algorithm, iteratively maximizing the bound with
respect to each parameter.
First, we maximize Eq. 4 with respect to ζ, using the second bound in Eq. 7. The derivative
with respect to ζ is:
P
K
f 0 (ζ) = N ζ −2 i=1 exp{λ i + νi
2
/2} − ζ −1
, (13)

which has a maximum at: PK

ζ̂ = i=1 exp{λi + νi2 /2}. (14)
Second, we maximize with respect to φn . This yields a maximum at:
φ̂n,i ∝ exp{λi }βi,wn , i ∈ {1, . . . , K}. (15)
Third, we maximize with respect to λi . Eq. 4 is not amenable to analytic maximization.
We use the conjugate gradient algorithm with derivative
PN
dL/dλ = −Σ−1 (λ − µ) + n=1 φn,1:K − (N/ζ) exp{λ + ν 2 /2} (16)
Finally, we maximize with respect to νi2 . Again, there is no analytic solution. We use
Newton’s method for each coordinate, constrained such that νi > 0:
dL/dνi2 = −Σ−1 2 2
ii /2 − N/2ζ exp{λ + νi /2} + 1/(2νi ). (17)
Iterating between these optimizations defines a coordinate ascent algorithm on Eq. 4.

Statistical Topic Modeling For Afaan Oromo Document Clustering
No ratings yet
Statistical Topic Modeling For Afaan Oromo Document Clustering
10 pages
A Correlated Topic Model of Science1
No ratings yet
A Correlated Topic Model of Science1
19 pages
A Spectral Algorithm For Latent Dirichlet Allocation
No ratings yet
A Spectral Algorithm For Latent Dirichlet Allocation
38 pages
Maier 2018
No ratings yet
Maier 2018
27 pages
Topic Models Indian Institute of Technology Pawangcoursestopicmodelspdf
No ratings yet
Topic Models Indian Institute of Technology Pawangcoursestopicmodelspdf
93 pages
LU - 35 Latent Dirichlet Algorithm
No ratings yet
LU - 35 Latent Dirichlet Algorithm
13 pages
2020 Tacl-1 29
No ratings yet
2020 Tacl-1 29
15 pages
ME314 Day11
No ratings yet
ME314 Day11
77 pages
On Finding The Natural Number of Topics With Latent Dirichlet Allocation Some Observations PDF
No ratings yet
On Finding The Natural Number of Topics With Latent Dirichlet Allocation Some Observations PDF
12 pages
The Supervised Hierarchical Dirichlet Process
No ratings yet
The Supervised Hierarchical Dirichlet Process
13 pages
Detecting Emerging Trends From Scientific Corpora
No ratings yet
Detecting Emerging Trends From Scientific Corpora
7 pages
A Document Exploring System On Lda Topic Model For Wikipedia Articles
No ratings yet
A Document Exploring System On Lda Topic Model For Wikipedia Articles
13 pages
Word Embeddings Paper
No ratings yet
Word Embeddings Paper
7 pages
E Icient Correlated Topic Modeling With Topic Embedding
No ratings yet
E Icient Correlated Topic Modeling With Topic Embedding
9 pages
UTOPIC 2023.eacl-Main.132
No ratings yet
UTOPIC 2023.eacl-Main.132
16 pages
Wete 2203.01570v2
No ratings yet
Wete 2203.01570v2
17 pages
7.2 Latent
No ratings yet
7.2 Latent
27 pages
5 - Ines - Topic Modeling On News Articles Using Latent Dirichlet Allocation Kretinin A Kol
No ratings yet
5 - Ines - Topic Modeling On News Articles Using Latent Dirichlet Allocation Kretinin A Kol
10 pages
Markov Topic Models
No ratings yet
Markov Topic Models
8 pages
Topic Models Dsi Talk March 2017
No ratings yet
Topic Models Dsi Talk March 2017
24 pages
Topic Modelling Using NLP
No ratings yet
Topic Modelling Using NLP
18 pages
Sciadv Aaq1360
No ratings yet
Sciadv Aaq1360
11 pages
Conditional Topic Random Fields: Jun Zhu Eric P. Xing
No ratings yet
Conditional Topic Random Fields: Jun Zhu Eric P. Xing
8 pages
A Beginner's Guide To Latent Dirichlet Allocation (LDA)
No ratings yet
A Beginner's Guide To Latent Dirichlet Allocation (LDA)
9 pages
A Gentle Introduction To Topic Modeling Using Pyth
No ratings yet
A Gentle Introduction To Topic Modeling Using Pyth
10 pages
The Structural Topic Model and Applied Social Science
No ratings yet
The Structural Topic Model and Applied Social Science
4 pages
Exploration of Thesis
No ratings yet
Exploration of Thesis
93 pages
Markov Random Topic Fields: Hal Daum e III School of Computing University of Utah Salt Lake City, UT 84112 [email protected]
No ratings yet
Markov Random Topic Fields: Hal Daum e III School of Computing University of Utah Salt Lake City, UT 84112 [email protected]
4 pages
Visualizing Topic Models
No ratings yet
Visualizing Topic Models
4 pages
Experiments With Non Parametric Topic Models
No ratings yet
Experiments With Non Parametric Topic Models
10 pages
ITD253 L8 TopicModelling
No ratings yet
ITD253 L8 TopicModelling
31 pages
Eai 13-7-2018 159623
No ratings yet
Eai 13-7-2018 159623
16 pages
Draft: Automatic Topic Labeling Using Ontology-Based Topic Models
No ratings yet
Draft: Automatic Topic Labeling Using Ontology-Based Topic Models
7 pages
Sbalchiero Topicmodelinglongtextsand
No ratings yet
Sbalchiero Topicmodelinglongtextsand
14 pages
Yelp Review Pizza Topic
No ratings yet
Yelp Review Pizza Topic
9 pages
Wang 2006
No ratings yet
Wang 2006
10 pages
A Network Approach To Topic Models
No ratings yet
A Network Approach To Topic Models
22 pages
Improving Topic Models With Latent Feature Word Representations
No ratings yet
Improving Topic Models With Latent Feature Word Representations
16 pages
Latent Dirichlet Allocation
100% (2)
Latent Dirichlet Allocation
13 pages
Project Example
No ratings yet
Project Example
19 pages
Latent Dirichlet Allocation LDA and Topic Modeling PDF
No ratings yet
Latent Dirichlet Allocation LDA and Topic Modeling PDF
41 pages
A Novel Heuristic For Graph-Based Topic
No ratings yet
A Novel Heuristic For Graph-Based Topic
9 pages
Running Head: Topic Model by Using Latent Dirichlet Allocation 1
No ratings yet
Running Head: Topic Model by Using Latent Dirichlet Allocation 1
8 pages
Third Term Examination Basic Science and Technology Primary 2 (Basic 2) - Exam Questions - ClassRoomNotes
No ratings yet
Third Term Examination Basic Science and Technology Primary 2 (Basic 2) - Exam Questions - ClassRoomNotes
12 pages
ECIR2009 Topic Trend Detection
No ratings yet
ECIR2009 Topic Trend Detection
5 pages
The Author-Topic Model For Authors and Documents
No ratings yet
The Author-Topic Model For Authors and Documents
8 pages
10 1 1 84 8490 PDF
No ratings yet
10 1 1 84 8490 PDF
7 pages
1 s2.0 S2666285X22000206 Main
No ratings yet
1 s2.0 S2666285X22000206 Main
7 pages
An Integrated Clustering and BERT Framework For Improved Topic Modeling
No ratings yet
An Integrated Clustering and BERT Framework For Improved Topic Modeling
9 pages
Incorporating Topic Transition in Topic Detection and Tracking Algorithmsincorporating Topic Transition in Topic Detection and Tracking Algorithms
No ratings yet
Incorporating Topic Transition in Topic Detection and Tracking Algorithmsincorporating Topic Transition in Topic Detection and Tracking Algorithms
6 pages
DBM 302 Presentation
No ratings yet
DBM 302 Presentation
5 pages
Topic Modelling: A Survey of Topic Models: Abstract-In Recent Years We Have Significant Increase
No ratings yet
Topic Modelling: A Survey of Topic Models: Abstract-In Recent Years We Have Significant Increase
12 pages
Topic Modeling For Social Media Content A Practical Approach
No ratings yet
Topic Modeling For Social Media Content A Practical Approach
7 pages
A LDA Based Model For Topic Evolution: Evidence From Information Science Journals
No ratings yet
A LDA Based Model For Topic Evolution: Evidence From Information Science Journals
6 pages
Topic Model For LDA
No ratings yet
Topic Model For LDA
9 pages
A Survey of Topic Modeling in Text Mining
No ratings yet
A Survey of Topic Modeling in Text Mining
7 pages
Probabilistic Topic Modeling and Its Variants - A Survey: Padmaja CH V R S Lakshmi Narayana
No ratings yet
Probabilistic Topic Modeling and Its Variants - A Survey: Padmaja CH V R S Lakshmi Narayana
5 pages
Exploring Trends in A Topic-Based Search Engine: Wray Buntine, Jukka Perki O, Sami Perttu
No ratings yet
Exploring Trends in A Topic-Based Search Engine: Wray Buntine, Jukka Perki O, Sami Perttu
7 pages
Computer Hardware Presentation
50% (2)
Computer Hardware Presentation
21 pages
Diagramatic and Graphical Representation
No ratings yet
Diagramatic and Graphical Representation
20 pages
Economic Theory by ShumPeter
No ratings yet
Economic Theory by ShumPeter
16 pages
A Survey of Topic Pattern Mining in Text Mining PDF
No ratings yet
A Survey of Topic Pattern Mining in Text Mining PDF
7 pages
Person Centered Astrologytxt
No ratings yet
Person Centered Astrologytxt
329 pages
English in Common 4 Workbook 7-1
No ratings yet
English in Common 4 Workbook 7-1
9 pages
PMI Online Setup and Commisioning
No ratings yet
PMI Online Setup and Commisioning
25 pages
Lab Report Guide
No ratings yet
Lab Report Guide
2 pages
Application Process - Group Management Cadre
No ratings yet
Application Process - Group Management Cadre
8 pages
God of Small Things
100% (2)
God of Small Things
6 pages
Justifying The Use of Language Assessments: Linking Interpretations With Consequences Lyle F. Bachman
No ratings yet
Justifying The Use of Language Assessments: Linking Interpretations With Consequences Lyle F. Bachman
9 pages
Texto Ingles Informatica
100% (1)
Texto Ingles Informatica
2 pages
Amba-Axi Protocol Verification by Using UVM: P. Naveen Kalyan
No ratings yet
Amba-Axi Protocol Verification by Using UVM: P. Naveen Kalyan
9 pages
Drag On A Circular Cylinder: Instructed by
100% (1)
Drag On A Circular Cylinder: Instructed by
12 pages
Survey Q
No ratings yet
Survey Q
5 pages
Thorax Anatomy PDF
No ratings yet
Thorax Anatomy PDF
3 pages
Mat 235
100% (1)
Mat 235
4 pages
F4 Notes Waves and Optics
No ratings yet
F4 Notes Waves and Optics
31 pages
Computational Linguistics and Audio-Visual Readability: Analysing Linguistic Features of Intralingual-Subtitles Corpora
No ratings yet
Computational Linguistics and Audio-Visual Readability: Analysing Linguistic Features of Intralingual-Subtitles Corpora
14 pages
Code of Ethics For Che
No ratings yet
Code of Ethics For Che
2 pages
Time Mage
No ratings yet
Time Mage
11 pages
Algebra pdf-1
No ratings yet
Algebra pdf-1
34 pages
CM2 Hfa100 2001 - 02
No ratings yet
CM2 Hfa100 2001 - 02
20 pages
How To Rank A Website On Google Without A Backlink
No ratings yet
How To Rank A Website On Google Without A Backlink
2 pages
Gas Dynamics Outline Fall 2014
No ratings yet
Gas Dynamics Outline Fall 2014
3 pages
RCCe41 Continuous Beams (A & D)
No ratings yet
RCCe41 Continuous Beams (A & D)
20 pages
61 Function Expression Exercise
No ratings yet
61 Function Expression Exercise
2 pages
21st Century - Writing A Photographic Essay
No ratings yet
21st Century - Writing A Photographic Essay
5 pages
Jurnal Ilmiah Tesis PDF
No ratings yet
Jurnal Ilmiah Tesis PDF
15 pages