Paper Browser

【Title】Improving deep neural networks with multi-layer maxout networks and a novel initialization method

【Authors】Weichen Sun, Fei Su, Leiquan Wang

【Keywords】Deep learning, Convolutional neural networks, Activation function, Image classification, Initialization

【Journal/Conference】
    Neurocomputing

【Abstract】
     To enhance the discriminability of convolutional neural networks (CNNs) and to facilitate optimization, we investigate activation functions for neural networks and the corresponding initialization method in this paper. First, a trainable activation function with a multi-layer structure (named "Multi-layer Maxout Network", MMN) is proposed. MMN is a multi-layer structured maxout that inherits the advantages of both a non-saturating activation function and a trainable activation-function approximator. Second, we derive a robust initialization method specifically for the MMN activation, with a theoretical proof; the method works for the maxout activation as well. Our novel initialization method reduces internal covariate shift as signals propagate through layers and alleviates the so-called "exploding/vanishing gradient" problem, which leads to a more efficient training procedure for deep neural networks. Experimental results show that our proposed model yields better performance on three image classification benchmark datasets (CIFAR-10, CIFAR-100 and ImageNet) than several state-of-the-art methods, and that our novel initialization method improves performance further. Furthermore, the influence of MMN at different hidden layers is analyzed, and a trade-off scheme between accuracy and computing resources is given.
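The maxout operation that MMN builds on can be illustrated with a minimal sketch: each output unit takes the element-wise maximum over k affine feature maps, so the activation itself is learned through the weights. This is a hedged NumPy illustration of the standard maxout primitive only, not the authors' MMN implementation or their initialization; the names `maxout`, `W`, `b`, and the choice k = 2 are assumptions for the example.

```python
import numpy as np

def maxout(x, W, b):
    """Maxout activation: element-wise max over k affine pieces.

    x: (batch, d_in) inputs
    W: (k, d_in, d_out) weights, one matrix per piece
    b: (k, d_out) biases, one vector per piece
    """
    z = np.einsum('bi,kio->bko', x, W) + b   # k parallel affine maps
    return z.max(axis=1)                     # max over the k pieces

# With k = 2 pieces that each select one input coordinate,
# maxout reduces to an element-wise max of the two inputs.
x = np.array([[1.0, -2.0]])
W = np.stack([np.array([[1.0], [0.0]]),    # piece 0 picks x[:, 0]
              np.array([[0.0], [1.0]])])   # piece 1 picks x[:, 1]
b = np.zeros((2, 1))
y = maxout(x, W, b)                        # max(1.0, -2.0) per row
```

Because the max of convex (affine) pieces is a piecewise-linear convex function, stacking such layers — as MMN does — yields a trainable, non-saturating activation approximator, which is the property the abstract highlights.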

【Year】2017

【Month】6

【Category】Pattern Recognition

