PLDA+: Parallel Latent Dirichlet Allocation with Data Placement and Pipeline Processing

Authors Zhiyuan Liu, Yuzhou Zhang, E. Chang, Maosong Sun
Journal/Conference Name A
Paper Category
Paper Abstract Previous methods of distributed Gibbs sampling for LDA run into either memory or communication bottlenecks. To improve scalability, we propose four strategies: data placement, pipeline processing, word bundling, and priority-based scheduling. Experiments show that our strategies significantly reduce the unparallelizable communication bottleneck and achieve good load balancing, and hence improve the scalability of LDA.
Date of publication 2011
Code Programming Language C++
