PLDA+: Parallel Latent Dirichlet Allocation with Data Placement and Pipeline Processing
Authors | Zhiyuan Liu, Yuzhou Zhang, E. Chang, Maosong Sun |
Journal/Conference Name | ACM Transactions on Intelligent Systems and Technology (TIST) |
Paper Category | Artificial Intelligence |
Paper Abstract | Previous methods of distributed Gibbs sampling for LDA run into either memory or communication bottlenecks. To improve scalability, we propose four strategies: data placement, pipeline processing, word bundling, and priority-based scheduling. Experiments show that our strategies significantly reduce the unparallelizable communication bottleneck and achieve good load balancing, and hence improve the scalability of LDA. |
Date of publication | 2011 |
Code Programming Language | C++ |
Comment |