(Original Paper) Latent Dirichlet Allocation (algorithm) | AISC Foundational
Toronto Deep Learning Series, 15 November 2018
Paper Review: http://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf
Speaker: Renyu Li (Wysdom.ai)
Host: Munich Reinsurance Co-Canada
Date: Nov 15th, 2018
Latent Dirichlet Allocation
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of
discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each
item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in
turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of
text modeling, the topic probabilities provide an explicit representation of a document. We present
efficient approximate inference techniques based on variational methods and an EM algorithm for
empirical Bayes parameter estimation. We report results in document modeling, text classification,
and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI
model.
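The generative story summarized in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation and not the inference procedure: it only samples documents, assuming the topic-word matrix `beta` and the Dirichlet parameter `alpha` are already known. The function name `generate_corpus`, the `mean_length` parameter, and the floor of one word per document are assumptions made for this example.

```python
import numpy as np

def generate_corpus(n_docs, alpha, beta, mean_length=50, seed=0):
    """Sample documents from the LDA generative process (illustrative sketch).

    alpha: shape (K,), Dirichlet parameter over K topics.
    beta:  shape (K, V), row k is topic k's distribution over V words.
    Returns a list of word-index arrays and the per-document topic mixtures.
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    n_topics, vocab_size = beta.shape

    docs, thetas = [], []
    for _ in range(n_docs):
        # Document length N ~ Poisson; floored at 1 here for convenience (assumption).
        n_words = max(1, rng.poisson(mean_length))
        # Per-document topic proportions theta ~ Dirichlet(alpha).
        theta = rng.dirichlet(alpha)
        # One topic assignment z_n per word, drawn from theta.
        z = rng.choice(n_topics, size=n_words, p=theta)
        # Each word w_n is drawn from the word distribution of its topic.
        words = np.array([rng.choice(vocab_size, p=beta[k]) for k in z])
        docs.append(words)
        thetas.append(theta)
    return docs, np.array(thetas)
```

Each row of the returned `thetas` array is the vector of topic probabilities that the abstract calls an explicit representation of a document. In practice these proportions are not observed and would have to be inferred, for example with the variational methods the paper presents.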