CSpace
Asynchronous Parallel Fuzzy Stochastic Gradient Descent for High-Dimensional Incomplete Data Representation
Qin, Wen1,2,3; Luo, Xin4
2024-02-01
Abstract: A latent feature analysis (LFA) model is highly efficient for representation learning on high-dimensional incomplete (HDI) data arising from various Big Data applications. Stochastic gradient descent (SGD) is frequently adopted as the learning algorithm of an LFA model owing to its ease of implementation. However, the training process of a standard SGD-based LFA model is serial, which greatly limits its scalability on massive HDI data. Moreover, all existing parallel SGD-based LFA models suffer from low speedup due to frequent synchronizations during training. Motivated by this observation, this article proposes an asynchronous parallel fuzzy stochastic gradient descent (APF-SGD) algorithm to establish an efficiently parallelized LFA model on shared memory, built on three ideas: first, alternately decoupling the update interdependences among heterogeneous latent features, thereby obtaining two parallelizable subtasks without loss of learning information; second, proposing a novel parallelized learning scheme that eliminates synchronizations from both the subtask and thread perspectives, i.e., subtasks and their affiliated threads run simultaneously and continuously without any synchronization; third, applying fuzzy optimization to its hyperparameters, thus enabling its adaptation during training. A theoretical convergence proof shows that the proposed APF-SGD algorithm guarantees the convergence of the resultant LFA model. Experimental results on four real HDI datasets show that an APF-SGD-based LFA model outperforms several state-of-the-art parallel SGD-based LFA models in both missing-data estimation accuracy and parallelization speedup.
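For orientation, the serial baseline that the abstract contrasts against can be sketched as below. This is a minimal illustration of a standard SGD-based LFA model trained only on the observed entries of an incomplete matrix, not the APF-SGD algorithm itself: it has no parallelism and no fuzzy hyperparameter adaptation. The function names (`train_lfa`, `rmse`), the L2 regularization term, and all hyperparameter values are assumptions chosen for the example and do not come from the paper.

```python
import random


def train_lfa(entries, n_rows, n_cols, rank=4, lr=0.02, reg=0.02,
              epochs=500, seed=0):
    """Serial SGD for a latent feature model (illustrative sketch only).

    entries: list of (row, col, value) triples for the OBSERVED cells.
    Returns two factor matrices P (n_rows x rank) and Q (n_cols x rank)
    such that dot(P[u], Q[i]) approximates the value at cell (u, i).
    """
    rng = random.Random(seed)
    # Small random initial latent features (an assumed initialization).
    P = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    entries = list(entries)
    for _ in range(epochs):
        rng.shuffle(entries)  # visit observed entries in random order
        for u, i, r in entries:
            pred = sum(P[u][k] * Q[i][k] for k in range(rank))
            err = r - pred
            for k in range(rank):
                p, q = P[u][k], Q[i][k]
                # Each step updates a row feature and a column feature
                # together; this coupling is what makes the loop serial.
                P[u][k] += lr * (err * q - reg * p)
                Q[i][k] += lr * (err * p - reg * q)
    return P, Q


def rmse(entries, P, Q):
    """Root mean squared error over the given (row, col, value) triples."""
    total = 0.0
    for u, i, r in entries:
        pred = sum(pk * qk for pk, qk in zip(P[u], Q[i]))
        total += (r - pred) ** 2
    return (total / len(entries)) ** 0.5


# Toy 4 x 4 matrix with only 8 of 16 cells observed (made-up values).
observed = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0),
            (2, 1, 2.0), (2, 3, 5.0), (3, 2, 4.0), (3, 3, 3.0)]
P, Q = train_lfa(observed, n_rows=4, n_cols=4)
print("training RMSE:", rmse(observed, P, Q))
```

Because every update touches one row vector and one column vector at once, two threads processing entries that share a row or column would conflict, which is the kind of interdependence the abstract says APF-SGD decouples.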
Keywords: Training; Representation learning; Data models; Stochastic processes; Computational modeling; Analytical models; Proteins; Asynchronous parallelization; fuzzy optimization; high-dimensional incomplete (HDI) data; latent feature analysis (LFA); shared-memory; stochastic gradient descent (SGD)
DOI: 10.1109/TFUZZ.2023.3300370
Journal: IEEE TRANSACTIONS ON FUZZY SYSTEMS
ISSN: 1063-6706
Volume: 32; Issue: 2; Pages: 445-459
Corresponding author: Luo, Xin (luoxin21@cigit.ac.cn)
Indexed by: SCI
WOS accession number: WOS:001174168600011
Language: en