【Title】Clustering Lightened Deep Representation for Large Scale Face Identification
【Authors】Shilun Lin, Zhicheng Zhao, Fei Su
【Keywords】Large Scale Face Identification, Lightened Deep Representation, Clustering, Convolutional Neural Network
IEEE International Conference on Communications (ICC 2017)
On specific face datasets such as the LFW benchmark, recent face recognition methods have achieved near-perfect accuracy. However, face identification remains challenging on very large-scale datasets, where real applications are urgently needed; the Microsoft challenge of recognizing one million celebrities (MS-Celeb-1M) has therefore attracted increasing attention. In this paper, we propose a three-step strategy to address this problem. First, using a cross-domain face dataset, the CASIA-Web dataset, we train an efficient, carefully designed face representation model with a Max-Feature-Map (MFM) activation function to map raw images into the feature space quickly. Second, face representations with the same MID in MS-Celeb-1M are clustered into three subsets: a pure set, a hard set and a mess set. The cluster centers are used as gallery representations of the corresponding MID; this scheme reduces both the impact of noisy images and the number of comparisons during face matching. Finally, a locality-sensitive hashing (LSH) algorithm is applied to speed up the search for the nearest centroid. Experimental results show that our face CNN model extracts stable and discriminative face representations, and that the proposed three-step strategy achieves promising performance on the MS-Celeb-1M dataset without any manual selection. Furthermore, we find that clustering retains a relatively pure set for many MIDs in MS-Celeb-1M, which indicates that this scheme is effective for cleaning a huge but messy dataset.
【Publication Year】2017
【Publication Month】5
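The three steps named in the abstract can be illustrated with a small standard-library Python sketch. It is not the authors' implementation. The abstract does not specify the network, the clustering procedure, or the LSH family, so the sketch makes these assumptions: MFM is shown in its usual form (split the channels into two halves and keep the element-wise maximum), a plain mean vector stands in for a cluster center, and random-hyperplane hashing stands in for the LSH step. All function names, parameters, and the labels `mid_a` and `mid_b` are hypothetical.

```python
import math
import random


def max_feature_map(features):
    """MFM activation: split the channels into two halves, keep the element-wise max."""
    if len(features) % 2 != 0:
        raise ValueError("MFM needs an even number of channels")
    half = len(features) // 2
    return [max(a, b) for a, b in zip(features[:half], features[half:])]


def centroid(vectors):
    """Mean of a group of vectors (assumed stand-in for a cluster center)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


class HyperplaneLSH:
    """Random-hyperplane LSH index over gallery centroids (assumed LSH family)."""

    def __init__(self, dim, n_bits=8, n_tables=4, seed=0):
        rng = random.Random(seed)
        # One set of n_bits random hyperplanes per hash table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [{} for _ in range(n_tables)]
        self.items = {}

    @staticmethod
    def _key(planes, vec):
        # Each bit records on which side of a hyperplane the vector falls.
        return tuple(
            sum(p * x for p, x in zip(plane, vec)) >= 0.0 for plane in planes
        )

    def add(self, label, vec):
        self.items[label] = vec
        for planes, table in zip(self.planes, self.tables):
            table.setdefault(self._key(planes, vec), []).append(label)

    def query(self, vec):
        """Return the label of the most similar centroid among hash-bucket candidates."""
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            candidates.update(table.get(self._key(planes, vec), []))
        if not candidates:
            # No bucket matched: fall back to an exhaustive scan.
            candidates = set(self.items)
        return max(candidates, key=lambda label: cosine(self.items[label], vec))


if __name__ == "__main__":
    index = HyperplaneLSH(dim=3)
    index.add("mid_a", centroid([[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]))
    index.add("mid_b", centroid([[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]))
    probe = max_feature_map([0.2, -0.3, 0.0, 1.0, 0.1, -0.5])
    print(index.query(probe))
```

Only centroids that share a hash bucket with the probe are compared, which is how hashing reduces the number of comparisons. The paper keeps separate centers for the pure, hard and mess subsets of each MID; in this sketch that would mean adding several labeled centroids per identity.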