
Hindawi Publishing Corporation

International Journal of Distributed Sensor Networks


Volume 2014, Article ID 280892, 11 pages
http://dx.doi.org/10.1155/2014/280892

Research Article
A Dynamic Users’ Interest Discovery Model with
Distributed Inference Algorithm

Shuo Xu,1 Qingwei Shi,1,2 Xiaodong Qiao,3 Lijun Zhu,1 Han Zhang,1 Hanmin Jung,4
Seungwoo Lee,4 and Sung-Pil Choi4
1 Information Technology Support Center, Institute of Scientific and Technical Information of China, No. 15 Fuxing Road, Haidian District, Beijing 100038, China
2 School of Software, Liaoning Technical University, No. 188 Longwan Street South, Huludao, Liaoning 125105, China
3 College of Software, Northeast Normal University, 5268 Renmin Street, Changchun, Jilin 130024, China
4 Department of Computer Intelligence Research, Korea Institute of Science and Technology Information, 245 Daehak-ro, Yuseong-gu, Daejeon 305-806, Republic of Korea

Correspondence should be addressed to Xiaodong Qiao; [email protected]

Received 6 December 2013; Accepted 27 February 2014; Published 22 April 2014

Academic Editor: Goreti Marreiros

Copyright © 2014 Shuo Xu et al. This is an open access article distributed under the Creative Commons Attribution License, which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

One of the key issues in providing users with user-customized or context-aware services is to automatically detect latent topics, users' interests, and their changing patterns from large-scale social network information. Most current methods are devoted either to discovering static latent topics and users' interests or to analyzing topic evolution only from the intrafeatures of documents, namely, the text content, without directly considering extrafeatures of documents such as authors. Moreover, they are applicable only to the single-processor case. To resolve these problems, we propose a dynamic users' interest discovery model with a distributed inference algorithm, named the Distributed Author-Topic over Time (D-AToT) model. The collapsed Gibbs sampling method following the main idea of MapReduce is also utilized for inferring the model parameters. The proposed model can discover latent topics and users' interests, and mine their changing patterns over time. Extensive experimental results on the NIPS (Neural Information Processing Systems) dataset show that our D-AToT model is feasible and efficient.

1. Introduction

With a dynamic users' interest discovery model, one can answer a range of important questions about the content of information uploaded or shared to a social network service (SNS), such as which topics each user prefers, which users are similar to each other in terms of their interests, which users are likely to have written documents similar to an observed document, and who are influential users at different stages of topic evolution; it also helps characterize users as pioneers, mainstream, or laggards in different subject areas. Users' interests have shown their increasing importance for the development of personalized web services and user-centric applications [1, 2]. Hence, users' interest modeling has been attracting extensive attention during the past few years, such as (a) the Author-Topic (AT) model [3–5], (b) the Author-Recipient-Topic (ART) model [6–8], the Role-Author-Recipient-Topic (RART) model [6–8], and the Author-Persona-Topic (APT) model [9], (c) the Author-Interest-Topic (AIT) model [10] and the Latent-Interest-Topic (LIT) model [11], and (d) the Author-Conference-Topic (ACT) model [12].

In fact, when people enjoy SNS with their smart devices, including phones and tablets, each user's interest is usually not static. However, the above models are devoted to discovering static latent topics and users' interests. Moreover, they are applicable only to the single-processor case. Of course, one can perform some post hoc or pre hoc analysis [4, 13] to discover changing patterns over time, but this misses the opportunity for time to improve topic discovery [14], and it is very difficult to align corresponding topics [15]. Currently, attention on dynamic models is mainly focused on analyzing topic evolution only from text content, such as the Dynamic Topic Model (DTM) [16], the continuous time DTM (cDTM) [17], and Topic over Time (ToT) [14].

This paper mainly focuses on a dynamic users' interest discovery model, especially collapsed Gibbs sampling following the main idea of MapReduce [18]. Figure 1 gives a detailed illustration of discovering dynamic users' interests. Our previous work [19, 20] is limited to an inference algorithm on a single processor.

The organization of the rest of this work is as follows. In Section 2, we first discuss two related generative models, the Author-Topic (AT) model and the Topic over Time (ToT) model, and then introduce in detail our proposed Author-Topic over Time (AToT) model. Sections 3 and 4 describe the collapsed Gibbs sampling method used for inferring the model parameters and its distributed version, respectively. In Section 5, extensive experimental evaluations are conducted, and Section 6 concludes this work.

Figure 1: The illustration for discovering dynamic users' interests (an author-topic matrix of probabilities ϑ_{a,k} over topics T1, ..., TK for authors 1, ..., A, and a topic-word matrix of probabilities φ_{k,v} over words 1, ..., V).

Table 1: Notation used in the generative models.

Symbol    Description
K         Number of topics
M         Number of documents
V         Number of unique words
A         Number of unique authors
N_m       Number of word tokens in document m
A_m       Number of authors in document m
a         Single author index, a ∈ [1, A]
k         Single topic index, k ∈ [1, K]
m         Single document index, m ∈ [1, M]
n         Single word token index, n ∈ [1, N_m]
v         Single word index, v ∈ [1, V]
a_m       Authors in document m, a_m ⊆ [1, A]
ϑ_a       Multinomial distribution of topics specific to the author a
ϑ_m       Multinomial distribution of topics specific to the document m
φ_k       Multinomial distribution of words specific to the topic k
ψ_k       Beta distribution of timestamps specific to the topic k
z_{m,n}   Topic associated with the nth token in the document m
w_{m,n}   nth token in document m
x_{m,n}   Chosen author associated with the word token w_{m,n}
t_{m,n}   Timestamp associated with the nth token in the document m
α         Dirichlet prior (hyperparameter) to the multinomial distribution ϑ
β         Dirichlet prior (hyperparameter) to the multinomial distribution φ
2. Generative Models for Documents

Before presenting our Author-Topic over Time (AToT) model, we first describe two related generative models: the AT model and the ToT model. The notation is summarized in Table 1.

2.1. Author-Topic (AT) Model. Rosen-Zvi et al. [3–5] propose an Author-Topic (AT) model for extracting information about authors and topics from large text collections. Rosen-Zvi et al. model documents as if they were generated by a two-stage stochastic process. An author is represented by a probability distribution over topics, and each topic is represented as a probability distribution over words. The probability distribution over topics in a multiauthor paper is a mixture of the distributions associated with the authors.

The graphical model representation of the AT model is shown in Figure 2. The AT model can be viewed as a generative process, which can be described as follows.

(1) For each topic k ∈ [1, K],
    (i) draw a multinomial φ_k from Dirichlet(β);
(2) for each author a ∈ [1, A],
    (i) draw a multinomial ϑ_a from Dirichlet(α);
(3) for each word n ∈ [1, N_m] in document m ∈ [1, M],
    (i) draw an author assignment x_{m,n} uniformly from the group of authors a_m;
    (ii) draw a topic assignment z_{m,n} from Multinomial(ϑ_{x_{m,n}});
    (iii) draw a word w_{m,n} from Multinomial(φ_{z_{m,n}}).

2.2. Topic over Time (ToT) Model. Unlike other dynamic topic models that rely on Markov assumptions or discretization of time, each topic in the Topic over Time (ToT) model [14] is associated with a continuous distribution over timestamps, and, for each generated document, the mixture distribution over topics is influenced by both word cooccurrences and the document's timestamp. Thus, the meaning of a particular topic can be relied upon as constant, but the topics' occurrence and correlations change significantly over time.

The graphical model representation of the ToT model is shown in Figure 3. ToT is a generative model of timestamps and the words in the timestamped documents. The generative process can be described as follows.
(1) For each topic k ∈ [1, K],
    (i) draw a multinomial φ_k from Dirichlet(β);
(2) for each document m ∈ [1, M],
    (i) draw a multinomial ϑ_m from Dirichlet(α);
    (ii) for each word n ∈ [1, N_m] in document m,
        (a) draw a topic assignment z_{m,n} from Multinomial(ϑ_m);
        (b) draw a word w_{m,n} from Multinomial(φ_{z_{m,n}});
        (c) draw a timestamp t_{m,n} from Beta(ψ_{z_{m,n}}).

Figure 2: The graphical model representation of the AT model.

Figure 3: The graphical model representation of the ToT model.

2.3. Author-Topic over Time (AToT) Model. The graphical model representation of the AToT model is shown in Figure 4. The AToT model can be viewed as a generative process, which can be described as follows.

(1) For each topic k ∈ [1, K],
    (i) draw a multinomial φ_k from Dirichlet(β);
(2) for each author a ∈ [1, A],
    (i) draw a multinomial ϑ_a from Dirichlet(α);
(3) for each word n ∈ [1, N_m] in document m ∈ [1, M],
    (i) draw an author assignment x_{m,n} uniformly from the group of authors a_m;
    (ii) draw a topic assignment z_{m,n} from Multinomial(ϑ_{x_{m,n}});
    (iii) draw a word w_{m,n} from Multinomial(φ_{z_{m,n}});
    (iv) draw a timestamp t_{m,n} from Beta(ψ_{z_{m,n}}).

Figure 4: The graphical model representation of the AToT model.

From the above generative process, one can see that the AToT model is parameterized as follows:

$$\begin{aligned}
\vartheta_a \mid \alpha &\sim \mathrm{Dirichlet}(\alpha), \\
\varphi_k \mid \beta &\sim \mathrm{Dirichlet}(\beta), \\
z_{m,n} \mid \vartheta_{x_{m,n}} &\sim \mathrm{Multinomial}\bigl(\vartheta_{x_{m,n}}\bigr), \\
w_{m,n} \mid \varphi_{z_{m,n}} &\sim \mathrm{Multinomial}\bigl(\varphi_{z_{m,n}}\bigr), \\
x_{m,n} \mid A_m &\sim \mathrm{Multinomial}\bigl(1/A_m\bigr), \\
t_{m,n} \mid \psi_{z_{m,n}} &\sim \mathrm{Beta}\bigl(\psi_{z_{m,n}}\bigr).
\end{aligned}$$

As a matter of fact, a paper is usually written by the first author and the reprint author. If one wants to differentiate the contributions of the first author and the reprint author from those of the other coauthors, it is very easy for the AToT model to set different weights for different authors. But since there are no criteria to guide the corresponding weights, we just set equal weights for all coauthors in this work; that is to say, x_{m,n} | A_m follows the uniform distribution.
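To make the generative process above concrete, the following is a minimal sketch (not the authors' implementation) of drawing a small synthetic corpus from the AToT model with NumPy; all sizes, the number of authors per document, and the hyperparameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the paper).
K, A, V, M = 5, 10, 50, 20           # topics, authors, vocabulary, documents
alpha, beta = 0.5, 0.1               # symmetric Dirichlet hyperparameters
psi = rng.uniform(1.0, 5.0, (K, 2))  # Beta(psi_k1, psi_k2) per topic

theta = rng.dirichlet([alpha] * K, size=A)  # author-topic multinomials (A x K)
phi = rng.dirichlet([beta] * V, size=K)     # topic-word multinomials (K x V)

docs = []
for m in range(M):
    authors = rng.choice(A, size=rng.integers(1, 4), replace=False)  # a_m
    N_m = rng.integers(50, 100)
    words, times = [], []
    for _ in range(N_m):
        x = rng.choice(authors)             # author assignment x_{m,n}
        z = rng.choice(K, p=theta[x])       # topic assignment z_{m,n}
        w = rng.choice(V, p=phi[z])         # word w_{m,n}
        t = rng.beta(psi[z, 0], psi[z, 1])  # timestamp t_{m,n}
        words.append(w)
        times.append(t)
    docs.append({"authors": authors, "words": words, "times": times})
```

Inference, described next, reverses this process: it recovers theta, phi, and psi from the observed words, authors, and timestamps.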
3. Inference Algorithm

For inference, the task is to estimate the sets of the following unknown parameters in the AToT model: (1) Φ = {φ_k}_{k=1}^K, Θ = {ϑ_a}_{a=1}^A, and Ψ = {ψ_k}_{k=1}^K, and (2) the corresponding topic and author assignments z_{m,n}, x_{m,n} for each word token w_{m,n}. In fact, inference cannot be done exactly in this model. A variety of algorithms have been used to estimate the parameters of topic models, such as variational EM (expectation maximization) [21, 22], expectation propagation [23, 24], belief propagation [25], and Gibbs sampling [19, 20, 26, 27]. In this work, the collapsed Gibbs sampling algorithm [26] is used, since it provides a simple method for obtaining parameter estimates under Dirichlet priors and allows combination of estimates from several local maxima of the posterior distribution.

In the Gibbs sampling procedure, we need to calculate the conditional distribution P(z_{m,n}, x_{m,n} | w, z_{¬(m,n)}, x_{¬(m,n)}, t, a, α, β, Ψ), where z_{¬(m,n)}, x_{¬(m,n)} represent the topic and author assignments for all tokens except w_{m,n}, respectively. We begin with the joint distribution P(w, z, x, t | a, α, β, Ψ) of a dataset, and, using the chain rule, we can get the conditional probability conveniently as

$$P\bigl(z_{m,n}, x_{m,n} \mid \mathbf{w}, \mathbf{z}_{\neg(m,n)}, \mathbf{x}_{\neg(m,n)}, \mathbf{t}, \mathbf{a}, \alpha, \beta, \Psi\bigr) \propto \frac{n_{z_{m,n}}^{(w_{m,n})} + \beta_{w_{m,n}} - 1}{\sum_{v=1}^{V}\bigl(n_{z_{m,n}}^{(v)} + \beta_v\bigr) - 1} \times \frac{n_{x_{m,n}}^{(z_{m,n})} + \alpha_{z_{m,n}} - 1}{\sum_{k=1}^{K}\bigl(n_{x_{m,n}}^{(k)} + \alpha_k\bigr) - 1} \times \mathrm{Beta}\bigl(\psi_{z_{m,n}}\bigr), \tag{1}$$

where n_k^{(v)} is the number of times tokens of word v are assigned to topic k and n_a^{(k)} represents the number of times author a is assigned to topic k. A detailed derivation of Gibbs sampling for AToT is provided in the appendix.

If one further manipulates (1), one can turn it into separated update equations for the topic and author of each token, suitable for random or systematic scan updates:

$$P\bigl(x_{m,n} \mid \mathbf{x}_{\neg(m,n)}, \mathbf{z}, \mathbf{a}, \alpha\bigr) \propto \frac{n_{x_{m,n}}^{(z_{m,n})} + \alpha_{z_{m,n}} - 1}{\sum_{k=1}^{K}\bigl(n_{x_{m,n}}^{(k)} + \alpha_k\bigr) - 1}, \tag{2}$$

$$P\bigl(z_{m,n} \mid \mathbf{w}, \mathbf{z}_{\neg(m,n)}, \mathbf{x}, \mathbf{t}, \alpha, \beta, \Psi\bigr) \propto \frac{n_{z_{m,n}}^{(w_{m,n})} + \beta_{w_{m,n}} - 1}{\sum_{v=1}^{V}\bigl(n_{z_{m,n}}^{(v)} + \beta_v\bigr) - 1} \times \frac{n_{x_{m,n}}^{(z_{m,n})} + \alpha_{z_{m,n}} - 1}{\sum_{k=1}^{K}\bigl(n_{x_{m,n}}^{(k)} + \alpha_k\bigr) - 1} \times \mathrm{Beta}\bigl(\psi_{z_{m,n}}\bigr). \tag{3}$$

During parameter estimation, the algorithm keeps track of two large data structures: an A × K count matrix n_a^{(k)} and a K × V count matrix n_k^{(v)}. From these data structures, one can easily estimate Φ and Θ as follows:

$$\varphi_{k,v} = \frac{n_k^{(v)} + \beta_v}{\sum_{v=1}^{V}\bigl(n_k^{(v)} + \beta_v\bigr)}, \tag{4}$$

$$\vartheta_{a,k} = \frac{n_a^{(k)} + \alpha_k}{\sum_{k=1}^{K}\bigl(n_a^{(k)} + \alpha_k\bigr)}. \tag{5}$$

As for Ψ, similar to [14], for simplicity and speed, we update it after each Gibbs sample by the method of moments [28]:

$$\psi_{k,1} = \bar{t}_k\left(\frac{\bar{t}_k\bigl(1-\bar{t}_k\bigr)}{s_k^2} - 1\right), \qquad \psi_{k,2} = \bigl(1-\bar{t}_k\bigr)\left(\frac{\bar{t}_k\bigl(1-\bar{t}_k\bigr)}{s_k^2} - 1\right), \tag{6}$$

where \bar{t}_k and s_k^2 indicate the sample mean and the biased sample variance of the timestamps belonging to topic k, respectively. The readers are invited to consult [28] for details. In fact, similar to [14], since the Beta distribution, whose support is [0, 1], can take on many more shapes than the Gaussian distribution, including the bell curve, it is utilized to model the timestamps. But Wang and McCallum [14] did not provide much detail on how to handle documents with timestamps of 0 or 1 so that they have some probability, so the time range of the data is normalized to [0.01, 0.99] in this paper.

With (2)–(6), the Gibbs sampling algorithm for the AToT model is summarized in Algorithm 1. The procedure itself uses only seven larger data structures: the count variables n_a^{(k)} and n_k^{(v)}, which have dimensions A × K and K × V, respectively; their row sums n_a and n_k with dimensions A and K; the Beta parameters Ψ with dimension K × 2; and the state variables z_{m,n}, x_{m,n} with dimension W = Σ_{m=1}^{M} N_m.

4. Distributed Inference Algorithm

Our distributed inference algorithm, named D-AToT, is inspired by the AD-LDA algorithm [29, 30], following the main idea of the well-known distributed programming model, MapReduce [18]. The overall distributed architecture for the AToT model is shown in Figure 5.

As stated in Figure 5, the master firstly distributes the M training documents over P mappers, with a nearly equal number M/P of documents on each mapper. Specifically, D-AToT partitions the documents {w}, {a}, and {t} into {{w|_p}}_{p=1}^P, {{a|_p}}_{p=1}^P, and {{t|_p}}_{p=1}^P and the corresponding topic and author assignments {z} and {x} into {{z|_p}}_{p=1}^P and {{x|_p}}_{p=1}^P, where {w|_p}, {a|_p}, {t|_p}, {z|_p}, and {x|_p} exist only on mapper p. The author-topic counts {n_a^{(k)}} and topic-word counts {n_k^{(v)}} are likewise distributed, denoted as {n_{a|p}^{(k)}} and {n_{k|p}^{(v)}} on mapper p, which are used to temporarily store local author-topic and topic-word counts.
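As a rough illustration of this partitioning step, the sketch below splits an in-memory corpus into P shards and rebuilds each mapper's local counts; the data layout (parallel Python lists of word, author, timestamp, and assignment sequences) is our own assumption rather than the paper's implementation.

```python
import numpy as np

def partition(docs_w, docs_a, docs_t, z, x, P):
    """Split documents and their assignments into P nearly equal shards,
    one per mapper, as D-AToT does before the sampling iterations."""
    idx = np.array_split(np.arange(len(docs_w)), P)  # ~M/P documents per mapper
    return [{
        "w": [docs_w[i] for i in part],
        "a": [docs_a[i] for i in part],
        "t": [docs_t[i] for i in part],
        "z": [z[i] for i in part],
        "x": [x[i] for i in part],
    } for part in idx]

def local_counts(shard, A, K, V):
    """Local author-topic and topic-word counts n_{a|p}^(k), n_{k|p}^(v)."""
    n_ak = np.zeros((A, K), dtype=int)
    n_kv = np.zeros((K, V), dtype=int)
    for doc_w, doc_z, doc_x in zip(shard["w"], shard["z"], shard["x"]):
        for w, zz, xx in zip(doc_w, doc_z, doc_x):
            n_ak[xx, zz] += 1
            n_kv[zz, w] += 1
    return n_ak, n_kv
```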

Algorithm AToTGibbs({w}, {a}, {t}, α, β, ψ, K)

Input: word vectors {w}, author vectors {a}, time vectors {t}, hyperparameters α, β, Beta parameters ψ, topic number K
Global data: count statistics {n_a^(k)}, {n_k^(v)} and their sums {n_a}, {n_k}
Output: topic associations {z}, author associations {x}, multinomial parameters Φ and Θ, Beta parameter estimates ψ, hyperparameter estimates α, β

// initialization
zero all count variables n_a^(k), n_a, n_k^(v), n_k
for all documents m ∈ [1, M] do
    for all words n ∈ [1, N_m] in document m do
        sample topic index z_{m,n} ∼ Multinomial(1/K)
        sample author index x_{m,n} ∼ Multinomial(p) with p_a = 1/A_m if a ∈ a_m, and p_a = 0 otherwise
        // increment counts and sums
        n_{x_{m,n}}^(z_{m,n}) += 1;  n_{x_{m,n}} += 1;  n_{z_{m,n}}^(w_{m,n}) += 1;  n_{z_{m,n}} += 1
// Gibbs sampling over burn-in period and sampling period
while not finished do
    for all documents m ∈ [1, M] do
        for all words n ∈ [1, N_m] in document m do
            // decrement counts and sums
            n_{x_{m,n}}^(z_{m,n}) -= 1;  n_{x_{m,n}} -= 1;  n_{z_{m,n}}^(w_{m,n}) -= 1;  n_{z_{m,n}} -= 1
            sample author index ã according to (2)
            sample topic index z̃ according to (3)
            // increment counts and sums
            n_{ã}^(z̃) += 1;  n_{ã} += 1;  n_{z̃}^(w_{m,n}) += 1;  n_{z̃} += 1
    update ψ according to (6)
    if converged and L sampling iterations since last read out then
        // different parameter read outs are averaged
        read out parameter set Φ according to (4)
        read out parameter set Θ according to (5)

Algorithm 1: Gibbs sampling algorithm for the AToT model.
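The sketch below is one possible single-processor rendering of the inner loop of Algorithm 1 in Python, using the decrement-then-sample form of (2)-(3) and the method-of-moments update (6); the data structures and function names are assumptions made for illustration, not the authors' code.

```python
import numpy as np
from scipy.stats import beta as beta_dist

def gibbs_sweep(docs, n_ak, n_kv, n_a, n_k, psi, alpha, beta, rng):
    """One pass of collapsed Gibbs sampling over all tokens, following (2)-(3).
    Each doc holds parallel lists: words, times, authors (a_m), z, x."""
    K, V = n_kv.shape
    for doc in docs:
        for i, (w, t) in enumerate(zip(doc["words"], doc["times"])):
            z_old, x_old = doc["z"][i], doc["x"][i]
            # decrement counts for the current token
            n_ak[x_old, z_old] -= 1; n_a[x_old] -= 1
            n_kv[z_old, w] -= 1;     n_k[z_old] -= 1
            # sample a new author as in (2), restricted to the document's authors
            cand = np.asarray(doc["authors"])
            p_x = (n_ak[cand, z_old] + alpha) / (n_a[cand] + K * alpha)
            x_new = rng.choice(cand, p=p_x / p_x.sum())
            # sample a new topic as in (3), including the Beta timestamp term
            p_z = ((n_kv[:, w] + beta) / (n_k + V * beta)
                   * (n_ak[x_new, :] + alpha) / (n_a[x_new] + K * alpha)
                   * beta_dist.pdf(t, psi[:, 0], psi[:, 1]))
            z_new = rng.choice(K, p=p_z / p_z.sum())
            # record the new assignment and increment counts
            doc["z"][i], doc["x"][i] = z_new, x_new
            n_ak[x_new, z_new] += 1; n_a[x_new] += 1
            n_kv[z_new, w] += 1;     n_k[z_new] += 1

def update_psi(docs, K, eps=1e-6):
    """Method-of-moments update of the per-topic Beta parameters, as in (6)."""
    psi = np.ones((K, 2))
    ts = [[] for _ in range(K)]
    for doc in docs:
        for t, z in zip(doc["times"], doc["z"]):
            ts[z].append(t)
    for k in range(K):
        if len(ts[k]) < 2:
            continue
        mean, var = float(np.mean(ts[k])), float(np.var(ts[k])) + eps
        common = mean * (1.0 - mean) / var - 1.0
        if common > 0:
            psi[k] = [mean * common, (1.0 - mean) * common]
    return psi
```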

Figure 5: The overall distributed architecture for the AToT model. (The master distributes {w}, {a}, and {t} over mappers 1 through P; each mapper p holds {w|p}, {a|p}, {t|p} and the local counts {n_{a|p}^(k)}, {n_{k|p}^(v)}; the reducer aggregates the global counts {n_a^(k)}, {n_k^(v)}, updates Ψ, and calculates Φ and Θ.)
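The per-iteration protocol described in the next paragraph (each mapper samples against a stale copy of the global counts, then the reducer merges the changes and re-broadcasts) can be illustrated with an AD-LDA-style merge; this is a schematic sketch, and the mapper objects in the usage comment are hypothetical.

```python
def reduce_step(global_ak, global_kv, locals_ak, locals_kv):
    """Fold each mapper's count changes back into the global author-topic and
    topic-word counts. `locals_*[p]` are the counts mapper p ended the
    iteration with; deltas are taken against the globals it started from."""
    new_ak = global_ak + sum(l - global_ak for l in locals_ak)
    new_kv = global_kv + sum(l - global_kv for l in locals_kv)
    return new_ak, new_kv

# Per iteration (sketch): the reducer also re-estimates Psi from the merged
# assignments, as in (6), before broadcasting the new globals to all mappers.
# global_ak, global_kv = reduce_step(global_ak, global_kv,
#                                    [m.n_ak for m in mappers],
#                                    [m.n_kv for m in mappers])
```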


In each Gibbs sampling iteration, each mapper p updates {z|_p} and {x|_p} by sampling z_{m,n|p} and x_{m,n|p} from the following posterior distributions:

$$\begin{aligned}
P\bigl(x_{m,n|p} \mid \mathbf{x}_{\neg(m,n)|p}, \mathbf{z}_{|p}, \mathbf{a}_{|p}, \alpha\bigr) &\propto \frac{n_{x_{m,n}}^{(z_{m,n})} + \alpha_{z_{m,n}} - 1}{\sum_{k=1}^{K}\bigl(n_{x_{m,n}}^{(k)} + \alpha_k\bigr) - 1}, \\
P\bigl(z_{m,n|p} \mid \mathbf{w}_{|p}, \mathbf{z}_{\neg(m,n)|p}, \mathbf{x}_{|p}, \mathbf{t}_{|p}, \alpha, \beta, \Psi\bigr) &\propto \frac{n_{z_{m,n}}^{(w_{m,n})} + \beta_{w_{m,n}} - 1}{\sum_{v=1}^{V}\bigl(n_{z_{m,n}}^{(v)} + \beta_v\bigr) - 1} \times \frac{n_{x_{m,n}}^{(z_{m,n})} + \alpha_{z_{m,n}} - 1}{\sum_{k=1}^{K}\bigl(n_{x_{m,n}}^{(k)} + \alpha_k\bigr) - 1} \times \mathrm{Beta}\bigl(\psi_{z_{m,n}}\bigr)
\end{aligned} \tag{7}$$

and updates the local n_{a|p}^{(k)} and n_{k|p}^{(v)} according to the new topic and author assignments. After each iteration, each mapper sends the local counts to the reducer, and then the reducer updates Ψ and broadcasts the global n_a^{(k)}, n_k^{(v)}, and Ψ to all mappers. After all sampling iterations, the reducer calculates Φ and Θ according to (4)-(5).

5. Experimental Results and Discussions

The NIPS proceedings dataset is utilized to evaluate the performance of our model, which consists of the full text of the 13 years of proceedings of the Neural Information Processing Systems (NIPS) Conferences from 1987 to 1999. The dataset contains 1,740 research papers and 2,037 unique authors. The distribution of the number of papers over year is shown in Table 2. In addition to downcasing and removing stop words and numbers, we also remove the words appearing less than five times in the corpus. After the preprocessing, the dataset contains 13,649 unique words and 2,301,375 word tokens in total. Each document's timestamp is determined by the year of the proceedings. In our experiments, K is fixed at 100, and the symmetric Dirichlet priors α and β are set at 0.5 and 0.1, respectively. Gibbs sampling is run for 2000 iterations.

Table 2: Distribution of the number of papers over year in the NIPS dataset.

Year    Number of papers
1987    90 (5.2%)
1988    95 (5.5%)
1989    101 (5.8%)
1990    143 (8.2%)
1991    144 (8.3%)
1992    127 (7.3%)
1993    144 (8.3%)
1994    140 (8.0%)
1995    152 (8.7%)
1996    152 (8.7%)
1997    151 (8.7%)
1998    151 (8.7%)
1999    150 (8.6%)

5.1. Examples of Topic, Author Distributions, and Topic Evolution. Table 3 illustrates examples of 8 topics learned by the AToT model. The topics are extracted from a single sample at the 2000th iteration of the Gibbs sampler. Each topic is illustrated with (a) the top 10 words most likely to be generated conditioned on the topic, (b) the top 10 authors which have the highest probability conditioned on the topic, and (c) histograms and fitted beta PDFs which show the topic evolution patterns over time.

5.2. Author Interest Evolution Analysis. In order to further analyze author interest evolution, it is interesting to calculate

$$P(z, t \mid a) = P(z \mid a)\, p(t \mid z) = \vartheta_{a,z} \times \mathrm{Beta}\bigl(\psi_z\bigr). \tag{8}$$

In this subsection, we take Sejnowski T as an example, who published 43 papers in total from 1987 to 1999 in the NIPS conferences, as shown in Figure 6(a). The research interest evolution for Sejnowski T is reported in Figure 6(b), in which the area occupied by a square is proportional to the strength of his research interest.

From Figure 6(b), one can see that Sejnowski T's research interest focused mainly on Topic 51 (Eye Recognition and Factor Analysis), Topic 37 (Neural Networks), and Topic 58 (Data Model and Learning Algorithm) but with different emphasis from 1987 to 1999. In the early phase (1989–1993), Sejnowski T's research interest was limited to Topic 51; it then extended to Topic 37 in 1994 and to Topic 58 in 1996 with great research interest strength, and finally returned to Topic 51 after 1997. Anyway, Sejnowski T did not change his main research direction, Topic 51, which is verified from his homepage again.

5.3. Predictive Power Analysis. Similar to [5], we further divide the NIPS papers into a training set D^train of 1,557 papers and a test set D^test of 183 papers, of which 102 are single-authored papers. Each author in D^test must have authored at least one of the training papers. The perplexity, originally used in language modeling [31], is a standard measure for estimating the performance of a probabilistic model. The perplexity of a test document m̃ ∈ D^test is defined as the exponential of the negative normalized predictive likelihood under the model:

$$\mathrm{perplexity}\bigl(\mathbf{w}_{\tilde{m},\cdot}, \mathbf{t}_{\tilde{m},\cdot} \mid \mathbf{a}_{\tilde{m}}, \alpha, \beta, \Psi\bigr) = \exp\left[-\frac{\ln P\bigl(\mathbf{w}_{\tilde{m},\cdot}, \mathbf{t}_{\tilde{m},\cdot} \mid \mathbf{a}_{\tilde{m}}, \alpha, \beta, \Psi\bigr)}{N_{\tilde{m}}}\right], \tag{9}$$
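Given the per-document predictive log-likelihood, (9) reduces to a one-liner; the sketch and the numbers in the comment are purely illustrative.

```python
import numpy as np

def perplexity(log_lik, n_tokens):
    """Perplexity of one test document from its predictive log-likelihood,
    as in (9): exp(-ln P(w, t | a) / N)."""
    return np.exp(-log_lik / n_tokens)

# e.g. a document with predictive log-likelihood -6200.0 over 800 tokens:
# perplexity(-6200.0, 800)  ->  exp(7.75), roughly 2321.6
```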

Table 3: An illustration of 8 topics from a 100-topic solution for the NIPS collection. The titles are our own interpretation of the topics. Each
topic is shown with the 10 words and authors that have the highest probability conditioned on that topic. Histograms show how the topics are
distributed over time; the fitted beta PDFs are also shown.

Topic 87 Topic 37 Topic 11 Topic 88


SVM and Kernel methods Neural networks Reinforcement learning EM and mixture models
Word Prop. Word Prop. Word Prop. Word Prop.
set 0.0188195 learning 0.01106740 state 0.0468466 density 0.0279477
support 0.0187117 network 0.00948016 learning 0.0252876 log 0.0217790
vector 0.0186039 neural 0.00780503 belief 0.0213999 distribution 0.0186946
kernel 0.0160163 input 0.00682192 policy 0.0182191 mixture 0.0178379
function 0.0146146 model 0.00681643 function 0.0175122 method 0.0144108
svm 0.0138060 training 0.00604202 action 0.0150383 gaussion 0.0142394
training 0.0129974 data 0.00597611 states 0.0148615 likelihood 0.0140681
problem 0.0124583 figure 0.00594316 reinforcement 0.0118574 entropy 0.0132113
space 0.0119731 networks 0.00560813 actions 0.0118574 gaussians 0.0123546
solution 0.0115957 function 0.00554222 mdp 0.0102670 form 0.0113264
Author Prop. Author Prop. Author Prop. Author Prop.
Scholkopf B 0.949692 Reggia J 0.979832 Zhang N 0.629412 Barron A 0.608507
Crisp D 0.888975 Todorov E 0.976750 Rodriguez A 0.578235 Wainwright M 0.372871
Laskov P 0.706170 Horne B 0.974146 Dietterich T 0.342954 Mukherjee S 0.340927
Steinhage V 0.634973 Thmn S 0.973083 Sallans B 0.228042 Li J 0.337108
Chapelle O 0.610385 Weigend A 0.972806 Walker M 0.189143 Jebara T 0.253203
Li Y 0.513418 McCallum R 0.969777 Koller D 0.1885150 Millman K 0.171569
Herbrich R 0.454384 Camana R 0.969388 Yeung D 0.1213730 Fisher J 0.148230
Gordon M 0.425090 Slaney M 0.969382 Thrun S 0.0842081 Ihler A 0.128369
Vapnik V 0.330421 Miikkulainen R 0.968541 Konda V 0.0680365 Beal M 0.126578
Dom B 0.286036 Bergen J 0.968358 Parr R 0.0468006 Hansen L 0.0849109
(Histograms of the topic distributions over 1987–1999, with fitted beta PDFs, for Topics 87, 37, 11, and 88.)
Topic 47 Topic 78 Topic 51 Topic 58


Speech recognition Bayesian learning Eye recognition and factor analysis Data model and learning algorithm
Word Prop. Word Prop. Word Prop. Word Prop.
hmm 0.0415364 bayesian 0.0243032 sejnowski 0.0265409 learning 0.00904655
speech 0.0392921 sampling 0.0184560 eye 0.0265409 model 0.00752741
hmms 0.0216579 prior 0.0178563 ica 0.0183324 neural 0.00705102
mixture 0.0179708 distribution 0.0148578 vor 0.0159531 data 0.00700339
suffix 0.0104362 monte 0.0127588 disparity 0.0153583 function 0.00683930
probabilistic 0.00995527 carlo 0.0118592 head 0.0135738 network 0.00624646
probabilities 0.00947434 model 0.0109597 position 0.0125031 input 0.00593946
singer 0.00883310 posterior 0.0105099 eeg 0.0119083 set 0.00561128
acoustic 0.00883310 priors 0.00946041 parietal 0.0109566 networks 0.00556365
saul 0.00867279 sample 0.00901063 salk 0.0105997 figure 0.00545249

Table 3: Continued.

Author Prop. Author Prop. Author Prop. Author Prop.


Rigoll G 0.460882 Schuurmans D 0.651505 Sejnowski T 0.410459 Gray M 0.974482
Singer Y 0.437547 Sykacek P 0.495506 Pouget A 0.269781 Dimitrov A 0.973538
Nix D 0.192342 Andrieu C 0.413324 Anastasio T 0.112957 Galperin G 0.97094
Saul L 0.170699 Rasmussen C 0.344185 Horiuchi T 0.0328485 Malik J 0.968536
Hermansky H 0.0795602 Zlochin M 0.244745 Albright T 0.0099278 Davies S 0.966534
Roweis S 0.0391364 Beal M 0.157807 Jousmaki V 0.00791139 Cook G 0.96519
Attias H 0.0357538 Hansen L 0.122773 Fredholm H 0.00681818 Ghosn J 0.964184
Movellan J 0.033414 Herbrich R 0.0882701 Bohr J 0.00643777 Orponen P 0.964184
Schuster M 0.0293324 Downs O 0.0694726 Ramanujam N 0.00621891 Yen S 0.963001
Muller K 0.028258 Williams C 0.0652069 Dixon L 0.00585938 Chatterjee C 0.962627
(Histograms of the topic distributions over 1987–1999, with fitted beta PDFs, for Topics 47, 78, 51, and 58.)
(a) Distribution of number of publications over time. (b) Research interest evolution (square sizes over Topics 87, 37, 11, 88, 47, 78, 51, and 58 for the years 1987–1999).

Figure 6: The distribution of number of publications and research interest evolution for Sejnowski T.
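A sketch of how the interest strengths plotted in Figure 6(b) could be computed from (8), assuming ϑ and Ψ have already been estimated via (5)-(6); the year-to-[0.01, 0.99] mapping mirrors the normalization described in Section 3, and the variable names are our own.

```python
import numpy as np
from scipy.stats import beta as beta_dist

def interest_strength(theta_a, psi, years, y_min=1987, y_max=1999):
    """P(z, t | a) = theta_{a,z} * Beta(t | psi_z) on a grid of years,
    with years mapped into [0.01, 0.99]."""
    t = 0.01 + 0.98 * (np.asarray(years) - y_min) / (y_max - y_min)
    dens = beta_dist.pdf(t[None, :], psi[:, [0]], psi[:, [1]])  # K x len(years)
    return theta_a[:, None] * dens

# strength = interest_strength(theta[author_id], psi, range(1987, 2000))
# strength[k, j] is proportional to the square area for topic k in year j
# (theta, psi, and author_id are hypothetical, estimated as in Section 3).
```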

$$P\bigl(\mathbf{w}_{\tilde{m},\cdot}, \mathbf{t}_{\tilde{m},\cdot} \mid \mathbf{a}_{\tilde{m}}, \alpha, \beta, \Psi\bigr) = \frac{1}{\bigl[A_{\tilde{m}}\bigr]^{N_{\tilde{m}}}} \times \sum_{\mathbf{z}_{\tilde{m},\cdot}} \mathrm{Beta}\bigl(\psi_{z_{\tilde{m},n},1}, \psi_{z_{\tilde{m},n},2} \mid D^{\mathrm{train}}\bigr) \times \int p\bigl(\Phi \mid \beta, D^{\mathrm{train}}\bigr) \sum_{\mathbf{z}_{\tilde{m},\cdot}} \varphi_{z_{\tilde{m},n}, w_{\tilde{m},n}}\, d\Phi \times \int p\bigl(\Theta \mid \alpha, D^{\mathrm{train}}\bigr) \sum_{\mathbf{x}_{\tilde{m},\cdot}} \vartheta_{x_{\tilde{m},n}, z_{\tilde{m},n}}\, d\Theta. \tag{10}$$

We approximate the integrals over Φ and Θ using the point estimates obtained in (4)-(5) for each sample s ∈ {1, 2, . . . , 10} of assignments x, z and then average over samples. Figure 7 shows the results for the AToT model compared with the AT model in a post hoc fashion on the 102 single-authored papers. It is not difficult to see that the perplexity of the AToT model is smaller than that of the AT model when the number of topics > 10, which indicates that the AToT model outperforms the AT model.

Figure 7: Perplexity of the 102 single-authored test documents. (Perplexity versus the number of topics, 5, 10, 50, 100, 200, and 400, for the AT and AToT models.)

6. Conclusions

With a dynamic users' interest discovery model, one can answer many important questions about the content of information uploaded or shared to SNS. Based on our previous work, the Author-Topic over Time (AToT) model [19] for documents using authors and topics with timestamps, this paper proposes a dynamic users' interest discovery model with a distributed inference algorithm following the main idea of MapReduce, named the Distributed AToT (D-AToT) model.
The D-AToT model combines the merits of the AT and ToT models. Specifically, it can automatically detect latent topics, users' interests, and their changing patterns from large-scale social network information. The results on the NIPS dataset show more salient topics and more reasonable users' interest changing patterns.

One can generalize the approach in this work to construct alternative dynamic models, with distributed inference algorithms, from other static users' interest discovery models and the ToT model. As a matter of fact, our work is currently limited to dealing with users and latent topics with timestamps in SNS. Though the NIPS proceedings dataset is a benchmark dataset for academic social networks, the D-AToT model ignores the links in SNS. In ongoing work, a novel topic model considering the links in SNS will be constructed to identify users with similar interests from social networks.

Appendix

Gibbs Sampling Derivation for AToT

We begin with the joint distribution P(w, z, x, t | a, α, β, Ψ). We can take advantage of conjugate priors to simplify the integrals. Consider

$$\begin{aligned}
&P(\mathbf{w}, \mathbf{z}, \mathbf{x}, \mathbf{t} \mid \mathbf{a}, \alpha, \beta, \Psi) \\
&\quad = P(\mathbf{w} \mid \mathbf{z}, \beta)\, p(\mathbf{t} \mid \Psi, \mathbf{z})\, P(\mathbf{z} \mid \mathbf{x}, \alpha)\, P(\mathbf{x} \mid \mathbf{a}) \\
&\quad = \int P(\mathbf{w} \mid \Phi, \mathbf{z})\, p(\Phi \mid \beta)\, d\Phi \times p(\mathbf{t} \mid \Psi, \mathbf{z}) \times \int P(\mathbf{z} \mid \mathbf{x}, \Theta)\, p(\Theta \mid \alpha)\, d\Theta \times P(\mathbf{x} \mid \mathbf{a}) \\
&\quad = \int \prod_{m=1}^{M}\prod_{n=1}^{N_m} P\bigl(w_{m,n} \mid \varphi_{z_{m,n}}\bigr) \prod_{k=1}^{K} p\bigl(\varphi_k \mid \beta\bigr)\, d\Phi \times \int \prod_{m=1}^{M}\prod_{n=1}^{N_m} P\bigl(z_{m,n} \mid \vartheta_{x_{m,n}}\bigr) \prod_{a=1}^{A} p\bigl(\vartheta_a \mid \alpha\bigr)\, d\Theta \\
&\qquad \times \prod_{m=1}^{M}\prod_{n=1}^{N_m} p\bigl(t_{m,n} \mid \psi_{z_{m,n}}\bigr) \times \prod_{m=1}^{M}\prod_{n=1}^{N_m} P\bigl(x_{m,n} \mid \mathbf{a}_m\bigr) \\
&\quad = \frac{1}{\prod_{m=1}^{M} A_m^{N_m}} \times \int \prod_{k=1}^{K}\prod_{v=1}^{V} \varphi_{k,v}^{\,n_k^{(v)}} \prod_{k=1}^{K}\left(\frac{\Gamma\bigl(\sum_{v=1}^{V}\beta_v\bigr)}{\prod_{v=1}^{V}\Gamma(\beta_v)} \prod_{v=1}^{V}\varphi_{k,v}^{\,\beta_v - 1}\right) d\Phi \\
&\qquad \times \int \prod_{a=1}^{A}\prod_{k=1}^{K} \vartheta_{a,k}^{\,n_a^{(k)}} \prod_{a=1}^{A}\left(\frac{\Gamma\bigl(\sum_{k=1}^{K}\alpha_k\bigr)}{\prod_{k=1}^{K}\Gamma(\alpha_k)} \prod_{k=1}^{K}\vartheta_{a,k}^{\,\alpha_k - 1}\right) d\Theta \times \prod_{m=1}^{M}\prod_{n=1}^{N_m} p\bigl(t_{m,n} \mid \psi_{z_{m,n}}\bigr) \\
&\quad = \frac{1}{\prod_{m=1}^{M} A_m^{N_m}} \left(\frac{\Gamma\bigl(\sum_{v=1}^{V}\beta_v\bigr)}{\prod_{v=1}^{V}\Gamma(\beta_v)}\right)^{K} \left(\frac{\Gamma\bigl(\sum_{k=1}^{K}\alpha_k\bigr)}{\prod_{k=1}^{K}\Gamma(\alpha_k)}\right)^{A} \times \prod_{m=1}^{M}\prod_{n=1}^{N_m} p\bigl(t_{m,n} \mid \psi_{z_{m,n}}\bigr) \\
&\qquad \times \prod_{k=1}^{K} \frac{\prod_{v=1}^{V}\Gamma\bigl(n_k^{(v)} + \beta_v\bigr)}{\Gamma\bigl(\sum_{v=1}^{V}\bigl(n_k^{(v)} + \beta_v\bigr)\bigr)} \times \prod_{a=1}^{A} \frac{\prod_{k=1}^{K}\Gamma\bigl(n_a^{(k)} + \alpha_k\bigr)}{\Gamma\bigl(\sum_{k=1}^{K}\bigl(n_a^{(k)} + \alpha_k\bigr)\bigr)}.
\end{aligned} \tag{A.1}$$

Using the chain rule, we can obtain the conditional probability conveniently as follows:

$$\begin{aligned}
&P\bigl(z_{m,n}, x_{m,n} \mid \mathbf{w}, \mathbf{z}_{\neg(m,n)}, \mathbf{x}_{\neg(m,n)}, \mathbf{t}, \mathbf{a}, \alpha, \beta, \Psi\bigr) \\
&\quad = P\bigl(z_{m,n}, x_{m,n}, w_{m,n}, t_{m,n} \mid \mathbf{w}_{\neg(m,n)}, \mathbf{t}_{\neg(m,n)}, \mathbf{z}_{\neg(m,n)}, \mathbf{x}_{\neg(m,n)}, \mathbf{a}, \alpha, \beta, \Psi\bigr) \\
&\qquad \times \bigl(P\bigl(w_{m,n}, t_{m,n} \mid \mathbf{w}_{\neg(m,n)}, \mathbf{t}_{\neg(m,n)}, \mathbf{x}_{\neg(m,n)}, \mathbf{z}_{\neg(m,n)}, \mathbf{a}, \alpha, \beta, \Psi\bigr)\bigr)^{-1} \\
&\quad \propto \frac{P(\mathbf{w}, \mathbf{t}, \mathbf{z}, \mathbf{x} \mid \mathbf{a}, \alpha, \beta, \Psi)}{P\bigl(\mathbf{w}_{\neg(m,n)}, \mathbf{t}_{\neg(m,n)}, \mathbf{z}_{\neg(m,n)}, \mathbf{x}_{\neg(m,n)} \mid \mathbf{a}, \alpha, \beta, \Psi\bigr)} \\
&\quad \propto \frac{n_{z_{m,n}}^{(w_{m,n})} + \beta_{w_{m,n}} - 1}{\sum_{v=1}^{V}\bigl(n_{z_{m,n}}^{(v)} + \beta_v\bigr) - 1} \times \frac{n_{x_{m,n}}^{(z_{m,n})} + \alpha_{z_{m,n}} - 1}{\sum_{k=1}^{K}\bigl(n_{x_{m,n}}^{(k)} + \alpha_k\bigr) - 1} \times \frac{1}{A_m} \times p\bigl(t_{m,n} \mid \psi_{z_{m,n}}\bigr) \\
&\quad \propto \frac{n_{z_{m,n}}^{(w_{m,n})} + \beta_{w_{m,n}} - 1}{\sum_{v=1}^{V}\bigl(n_{z_{m,n}}^{(v)} + \beta_v\bigr) - 1} \times \frac{n_{x_{m,n}}^{(z_{m,n})} + \alpha_{z_{m,n}} - 1}{\sum_{k=1}^{K}\bigl(n_{x_{m,n}}^{(k)} + \alpha_k\bigr) - 1} \times \mathrm{Beta}\bigl(\psi_{z_{m,n}}\bigr).
\end{aligned} \tag{A.2}$$

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was funded partially by the Key Technologies R&D Program of the Chinese 12th Five-Year Plan (2011–2015), Key Technologies Research on Large-Scale Semantic Calculation for Foreign STKOS, and Key Technologies Research on Data Mining from the Multiple Electric Vehicle Information Sources, under Grant nos. 2011BAH10B04 and 2013BAG06B01, respectively.

References

[1] F. Qiu and J. Cho, "Automatic identification of user interest for personalized search," in Proceedings of the 15th International Conference on World Wide Web (WWW '06), pp. 727–736, ACM, Edinburgh, UK, May 2006.
[2] J. Kim, D.-H. Jeong, D. Lee, and H. Jung, "User-centered innovative technology analysis and prediction application in mobile environment," Multimedia Tools and Applications, 2013.
[3] M. Rosen-Zvi, T. Griffiths, M. Steyvers, and P. Smyth, "The author-topic model for authors and documents," in Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (UAI '04), pp. 487–494, AUAI Press, Arlington, Va, USA, 2004.
[4] M. Steyvers, P. Smyth, M. Rosen-Zvi, and T. Griffiths, "Probabilistic author-topic models for information discovery," in Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '04), pp. 306–315, ACM, Seattle, Wash, USA, August 2004.
[5] M. Rosen-Zvi, C. Chemudugunta, T. Griffiths, P. Smyth, and M. Steyvers, "Learning author-topic models from text corpora," ACM Transactions on Information Systems, vol. 28, no. 1, article 4, pp. 1–38, 2010.
[6] A. McCallum, A. Corrada-Emmanuel, and X. Wang, "The author-recipient-topic model for topic and role discovery in social networks: experiments with enron and academic email," Tech. Rep. um-cs-2004-096, Department of Computer Science, University of Massachusetts Amherst, 2004.
[7] A. McCallum, A. Corrada-Emmanuel, and X. Wang, "Topic and role discovery in social networks," in Proceedings of the 19th International Joint Conference on Artificial Intelligence, pp. 786–791, Morgan Kaufmann, San Francisco, Calif, USA, 2005.
[8] A. McCallum, X. Wang, and A. Corrada-Emmanuel, "Topic and role discovery in social networks with experiments on enron and academic email," Journal of Artificial Intelligence Research, vol. 30, no. 1, pp. 249–272, 2007.
[9] D. Mimno and A. McCallum, "Expertise modeling for matching papers with reviewers," in Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '07), pp. 500–509, San Jose, Calif, USA, August 2007.
[10] N. Kawamae, "Author interest topic model," in Proceedings of the 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '10), pp. 887–888, ACM, Geneva, Switzerland, July 2010.
[11] N. Kawamae, "Latent interest-topic model: finding the causal relationships behind dyadic data," in Proceedings of the 19th International Conference on Information and Knowledge Management and Co-located Workshops (CIKM '10), pp. 649–658, ACM, Toronto, Canada, October 2010.
[12] J. Tang, J. Zhang, R. Jin et al., "Topic level expertise search over heterogeneous networks," Machine Learning, vol. 82, no. 2, pp. 211–237, 2011.
[13] X. Wang, N. Mohanty, and A. McCallum, "Group and topic discovery from relations and their attributes," in Advances in Neural Information Processing Systems 18, Y. Weiss, B. Schölkopf, and J. Platt, Eds., pp. 1449–1456, MIT Press, Cambridge, Mass, USA, 2006.
[14] X. Wang and A. McCallum, "Topics over time: a non-markov continuous-time model of topical trends," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '06), pp. 424–433, August 2006.
[15] S. Xu, L. Zhu, Q. Xiaodong, S. Qingwei, and G. Jie, "Topic linkages between papers and patents," in Proceedings of the 4th International Conference on Advanced Science and Technology, pp. 176–183, Science & Engineering Research Support soCiety, Daejeon, Republic of Korea, 2012.
[16] D. M. Blei and J. D. Lafferty, "Dynamic topic models," in Proceedings of the 23rd International Conference on Machine Learning (ICML '06), pp. 113–120, ACM, June 2006.
[17] C. Wang, D. Blei, and D. Heckerman, "Continuous time dynamic topic models," in Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence (UAI '08), pp. 579–586, July 2008.
[18] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
[19] S. Xu, Q. Shi, X. Qiao et al., "Author-topic over time (AToT): a dynamic users' interest model," in Mobile, Ubiquitous, and Intelligent Computing: The 2nd International Conference on Ubiquitous Context-Awareness and Wireless Sensor Network, vol. 274, pp. 227–233, Springer, Berlin, Germany, 2014.
[20] Q. Shi, X. Qiao, S. Xu, and G. Nong, "Author-topic evolution model and its application in analysis of research interests evolution," Journal of the China Society for Scientific and Technical Information, vol. 32, no. 9, pp. 912–919, 2013.
[21] J. M. Winn, Variational message passing and its applications [Ph.D. thesis], University of Cambridge, 2004.
[22] D. M. Blei, A. Y. Ng, and M. I. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research, vol. 3, no. 4-5, pp. 993–1022, 2003.
[23] T. P. Minka, "Expectation propagation for approximate Bayesian inference," in Proceedings of the 17th Conference on Uncertainty in Artificial Intelligence, pp. 362–369, Morgan Kaufmann, San Francisco, Calif, USA, 2001.
[24] T. Minka and J. Lafferty, "Expectation-propagation for the generative aspect model," in Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, pp. 352–359, 2002.

[25] J. Zeng, “A topic modeling toolbox using belief propagation,”


Journal of Machine Learning Research, vol. 13, pp. 2233–2236,
2012.
[26] T. L. Griffiths and M. Steyvers, “Finding scientific topics,”
Proceedings of the National Academy of Sciences of the United
States of America, vol. 101, supplement 1, pp. 5228–5235, 2004.
[27] G. Heinrich, “Parameter estimation for text analysis,” Tech. Rep.
version 2.9, vsonix GmbH and University of Leipzig, 2009.
[28] C. B. Owen, Parameter estimation for the Beta distribution [M.S.
thesis], Brigham Young University, 2008.
[29] D. Newman, A. Asuncion, P. Smyth, and M. Welling, “Dis-
tributed inference for latent Dirichlet allocation,” in Advances
in Neural Information Processing Systems 20, J. C. Platt, D.
Koller, Y. Singer, and S. Roweis, Eds., pp. 1081–1088, MIT Press,
Cambridge, Mass, USA, 2008.
[30] D. Newman, A. Asuncion, P. Smyth, and M. Welling, “Dis-
tributed algorithms for topic models,” Journal of Machine
Learning Research, vol. 10, pp. 1801–1828, 2009.
[31] L. Azzopardi, M. Girolami, and K. van Rijsbergen, "Investigat-
ing the relationship between language model perplexity and IR
precision-recall measures,” in Proceedings of the 26th Interna-
tional ACM SIGIR Conference on Research and Development in
Information Retrieval (SIGIR ’03), pp. 369–370, ACM, Toronto,
Canada, August 2003.
