Introduction
PolyLDA is short for Polylingual LDA. PolyLDA assumes that a single document contains words in multiple languages, and that all languages in a document share one common distribution over topics. Each topic has a separate facet (a word distribution) for each language. These facets stay consistent across languages because the shared topic distribution links them: the same themes are present in every language of a document.
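The assumption above can be read as a generative process. The following is only a minimal sketch, using nothing beyond the Python standard library. The function names (`sample_dirichlet`, `generate_document`), the symmetric Dirichlet priors, the concentration value 0.5, and the toy vocabulary sizes are all illustrative assumptions, not part of this project's code.

```python
import random


def sample_dirichlet(concentration, size, rng):
    """Draw from a symmetric Dirichlet by normalizing Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, alpha, lengths, rng):
    """Sample one multilingual document (hypothetical helper).

    topic_word: language -> list of K word distributions, one per topic.
    lengths:    language -> number of tokens to draw in that language.
    """
    num_topics = len(next(iter(topic_word.values())))
    # One topic distribution per document, shared by all of its languages.
    theta = sample_dirichlet(alpha, num_topics, rng)
    words = {}
    for lang, n in lengths.items():
        tokens = []
        for _ in range(n):
            # The topic of each token is drawn from the shared theta ...
            k = rng.choices(range(num_topics), weights=theta)[0]
            # ... and the word from that topic's facet for this language.
            phi = topic_word[lang][k]
            tokens.append(rng.choices(range(len(phi)), weights=phi)[0])
        words[lang] = tokens
    return theta, words


rng = random.Random(0)
vocab_sizes = {"en": 8, "de": 10}  # toy vocabularies (assumed)
topic_word = {
    lang: [sample_dirichlet(0.5, v, rng) for _ in range(3)]
    for lang, v in vocab_sizes.items()
}
theta, words = generate_document(
    topic_word, alpha=0.5, lengths={"en": 20, "de": 15}, rng=rng
)
```

Note that `theta` is drawn once per document, while the word distribution is looked up per language. That is the only place where the languages are coupled.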
Gibbs Sampling vs. Variational Inference
- Variational inference:
-- MapReduce: an LDA implementation based on variational inference can run on Hadoop, which scales to very large data sets.
-- Fewer iterations
- Gibbs sampling:
-- Drawback: convergence of the sampler to its stationary distribution is difficult to diagnose, and the sampling algorithm can be slow to converge in high-dimensional models.
To handle very large data sets, we have to set up Hadoop on several servers. Variational inference is therefore the better choice for the topic model, because it fits MapReduce well.
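One way to see why variational inference fits MapReduce: the per-document update depends only on that document and the current topic parameters, so it can run in a mapper, and the resulting expected counts simply add up, so they can be summed in a reducer. The sketch below simulates this in memory with plain Python. It is not this project's Hadoop code. It assumes the standard mean-field updates for LDA, extended so that every language in a document contributes to one shared `gamma`. All names (`map_document`, `reduce_counts`, `run_iteration`) and the hyperparameters `alpha` and `eta` are hypothetical.

```python
import math
from collections import defaultdict


def digamma(x):
    """Digamma for x > 0: recurrence up to 6, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return result + math.log(x) - 0.5 / x - f * (1 / 12 - f * (1 / 120 - f / 252))


def expected_log(params):
    """E[log p] for p ~ Dirichlet(params)."""
    total = digamma(sum(params))
    return [digamma(p) - total for p in params]


def map_document(doc, elog_beta, alpha, num_topics, iters=20):
    """Mapper: E-step for one document; doc is language -> {word_id: count}.

    Returns ((language, topic, word_id), expected_count) pairs.
    """
    gamma = [alpha + 1.0] * num_topics
    emitted = []
    for _ in range(iters):
        elog_theta = expected_log(gamma)
        new_gamma = [alpha] * num_topics
        emitted = []
        for lang, counts in doc.items():
            for word, count in counts.items():
                weights = [
                    math.exp(elog_theta[k] + elog_beta[lang][k][word])
                    for k in range(num_topics)
                ]
                norm = sum(weights)
                for k in range(num_topics):
                    resp = count * weights[k] / norm
                    # All languages update the same shared gamma.
                    new_gamma[k] += resp
                    emitted.append(((lang, k, word), resp))
        gamma = new_gamma
    return emitted


def reduce_counts(pairs):
    """Reducer: sum the expected counts that share a key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return totals


def run_iteration(corpus, lam, alpha, eta, num_topics):
    """Driver: one map, reduce, and topic update over the whole corpus.

    lam is language -> K lists of per-word Dirichlet parameters.
    """
    elog_beta = {
        lang: [expected_log(row) for row in rows] for lang, rows in lam.items()
    }
    pairs = [
        pair
        for doc in corpus
        for pair in map_document(doc, elog_beta, alpha, num_topics)
    ]
    new_lam = {lang: [[eta] * len(row) for row in rows] for lang, rows in lam.items()}
    for (lang, k, word), value in reduce_counts(pairs).items():
        new_lam[lang][k][word] += value
    return new_lam
```

On a real cluster the list comprehension in `run_iteration` would be replaced by the distributed map phase and `reduce_counts` by the reduce phase; only the small topic parameters need to be broadcast between iterations.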