Sparse Stochastic Inference for Latent Dirichlet Allocation

Authors
  • Mimno, David
  • Hoffman, Matt
  • Blei, David
Type
Preprint
Publication Date
Jun 27, 2012
Submission Date
Jun 27, 2012
Identifiers
arXiv ID: 1206.6425
Source
arXiv
License
Yellow

Abstract

We present a hybrid algorithm for Bayesian topic models that combines the efficiency of sparse Gibbs sampling with the scalability of online stochastic inference. We used our algorithm to analyze a corpus of 1.2 million books (33 billion words) with thousands of topics. Our approach reduces the bias of variational inference and generalizes to many Bayesian hidden-variable models.
