Latent Topic Models for Hypertext

Gruber, Amit; Rosen-Zvi, Michal; Weiss, Yair

arXiv:1206.3254 (cs)

[Submitted on 13 Jun 2012]

Abstract: Latent topic models have been successfully applied as an unsupervised topic discovery technique in large document collections. With the proliferation of hypertext document collections such as the Internet, there has also been great interest in extending these approaches to hypertext [6, 9]. These approaches typically model links analogously to words: the document-link co-occurrence matrix is modeled in the same way as the document-word co-occurrence matrix in standard topic models. In this paper we present a probabilistic generative model for hypertext document collections that explicitly models the generation of links. Specifically, links from a word w to a document d depend directly on how frequent the topic of w is in d, in addition to the in-degree of d. We show how to perform EM learning on this model efficiently. By not modeling links as analogous to words, we end up using far fewer free parameters and obtain better link prediction results.
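
The link dependence described in the abstract can be illustrated with a small sketch. The abstract does not give the functional form, so the code below assumes, purely for illustration, that the score of a link from a word to a target document is the product of the target's weight on the word's topic and a per-document weight standing in for the in-degree effect. The names `link_scores`, `theta`, and `lam` are hypothetical and do not come from the paper.

```python
import numpy as np

def link_scores(theta, lam, topic):
    """Score each candidate target document for a link from one anchor word.

    theta: (num_docs, num_topics) array; row d is document d's topic mixture.
    lam:   (num_docs,) array of non-negative weights, an assumed stand-in
           for the in-degree effect mentioned in the abstract.
    topic: index of the topic assigned to the anchor word.

    Assumption: the score is the simple product lam[d] * theta[d, topic].
    """
    theta = np.asarray(theta, dtype=float)
    lam = np.asarray(lam, dtype=float)
    return lam * theta[:, topic]

# Toy collection: three documents, two topics (made-up numbers).
theta = np.array([
    [0.8, 0.2],
    [0.1, 0.9],
    [0.5, 0.5],
])
lam = np.array([0.1, 0.1, 0.4])  # document 2 is assumed to be heavily linked

scores = link_scores(theta, lam, topic=0)
ranking = np.argsort(-scores)  # candidate targets, best first
print(scores, ranking)
```

Under this assumed form, each document needs only one extra weight for link generation rather than a full distribution over link targets, which is one way to read the abstract's remark about using far fewer free parameters.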

Comments:	Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)
Subjects:	Information Retrieval (cs.IR); Computation and Language (cs.CL); Machine Learning (cs.LG); Machine Learning (stat.ML)
Report number:	UAI-P-2008-PG-230-239
Cite as:	arXiv:1206.3254 [cs.IR]
	(or arXiv:1206.3254v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.1206.3254

Submission history

From: Amit Gruber [via AUAI proxy]
[v1] Wed, 13 Jun 2012 15:30:14 UTC (1,143 KB)
