KMS Chongqing Institute of Green and Intelligent Technology, CAS
Title | Activated Gradients for Deep Neural Networks |
Authors | Liu, Mei (1,2); Chen, Liangming (1,2); Du, Xiaohao (3); Jin, Long (1); Shang, Mingsheng (1,2,3) |
Date | 2021-08-31 |
Abstract | Deep neural networks often suffer from poor performance or even training failure due to the ill-conditioned problem, the vanishing/exploding gradient problem, and the saddle point problem. In this article, a novel method that applies a gradient activation function (GAF) to the gradient is proposed to handle these challenges. Intuitively, the GAF enlarges tiny gradients and restricts large gradients (a minimal code sketch of this idea is given after this record). Theoretically, this article gives the conditions that the GAF needs to satisfy and, on this basis, proves that the GAF alleviates the problems mentioned above. In addition, this article proves that, under some assumptions, SGD with the GAF converges faster than SGD without it. Furthermore, experiments on CIFAR, ImageNet, and PASCAL Visual Object Classes confirm the GAF's effectiveness. The experimental results also demonstrate that the proposed method can be adopted in various deep neural networks to improve their performance. The source code is publicly available at https://github.com/LongJin-lab/Activated-Gradients-for-Deep-Neural-Networks. |
Keywords | Training; Deep learning; Neural networks; Optimization; Visualization; Newton method; Eigenvalues and eigenfunctions; Exploding gradient problems; gradient activation function (GAF); ill-conditioned problems; saddle point problems; vanishing gradient problems |
DOI | 10.1109/TNNLS.2021.3106044 |
Journal | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS |
ISSN | 2162-237X |
Pages | 13 |
Corresponding Author | Jin, Long (longjin@ieee.org) |
Indexed In | SCI |
WOS Accession Number | WOS:000732099500001 |
Language | English |
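
Below is a minimal sketch of the gradient-activation idea described in the abstract: a function applied element-wise to gradients that enlarges tiny ones and bounds large ones before the optimizer uses them. The tanh-shaped GAF, the hook-based wiring, and the hyperparameters alpha and beta are illustrative assumptions, not the paper's exact formulation; see the linked repository for the authors' implementation.

    import torch

    def gaf(grad, alpha=0.5, beta=4.0):
        # Hypothetical tanh-shaped GAF: near zero the slope is alpha * beta
        # (here 2 > 1), so tiny gradients are enlarged, while the output is
        # bounded by |alpha|, so large gradients are restricted. alpha and
        # beta are illustrative values, not taken from the paper.
        return alpha * torch.tanh(beta * grad)

    # Register the GAF as a tensor hook so the optimizer (e.g., SGD) sees
    # the activated gradient instead of the raw one.
    model = torch.nn.Linear(10, 1)
    for p in model.parameters():
        p.register_hook(gaf)  # hook receives the gradient, returns its replacement

    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(8, 10), torch.randn(8, 1)
    loss = torch.nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()   # gradients pass through gaf before being stored in p.grad
    optimizer.step()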