Paper Browse

【Title】Towards Improving Canonical Correlation Analysis for Cross-modal Retrieval

【Authors】Jie Shao, Zhicheng Zhao, Fei Su, Ting Yue

【Keywords】cross-modal retrieval, semantic feature, progressive framework, similarity learning

【Venue】
    ACM Multimedia (Thematic Workshops) 2017

【Abstract】
     Building correlations for cross-modal retrieval, i.e., image-to-text retrieval and text-to-image retrieval, is a feasible solution to bridge the semantic gap between different modalities. Canonical correlation analysis (CCA) based methods have achieved great success. However, conventional 2-view CCA suffers from three inherent problems: 1) it fails to capture intra-modal semantic consistency, which is necessary for improving retrieval performance, 2) it is hard to learn the non-linear correlation between different modalities, and 3) similarity measurement is problematic because the latent space learned by CCA is not directly optimized for a particular distance measure. To address the above problems, in this paper, we propose an improved CCA algorithm (ICCA) from three aspects. First, we propose two effective semantic features based on text features to improve intra-modal semantic consistency. Second, we expand traditional CCA from 2-view to 4-view, and embed 4-view CCA into a progressive framework to alleviate over-fitting. Our progressive framework combines the training of linear projections and nonlinear hidden layers to ensure that good representations of the input raw data are learned at the output of the network. Third, inspired by large scale similarity learning (LSSL), a similarity metric is proposed to improve the distance measure. Experiments on three public datasets demonstrate the effectiveness of the proposed ICCA method.
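The abstract builds on classical 2-view CCA, which finds linear projections of two modalities (e.g., image and text features) that are maximally correlated in a shared latent space. Below is a minimal NumPy sketch of the standard 2-view CCA baseline that the paper improves upon, solved via SVD of the whitened cross-covariance; the function name, regularization constant, and toy data are illustrative assumptions, not the authors' ICCA implementation.

```python
import numpy as np

def cca(X, Y, k=2, reg=1e-6):
    """Classical 2-view CCA: learn projections Wx, Wy so that X @ Wx and
    Y @ Wy are maximally correlated. Returns (Wx, Wy, canonical correlations)."""
    n = X.shape[0]
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Regularized covariance and cross-covariance matrices
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # Symmetric inverse square root via eigendecomposition
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Cxx_is, Cyy_is = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # SVD of the whitened cross-covariance gives the canonical directions
    U, S, Vt = np.linalg.svd(Cxx_is @ Cxy @ Cyy_is)
    Wx = Cxx_is @ U[:, :k]
    Wy = Cyy_is @ Vt[:k].T
    return Wx, Wy, S[:k]

# Two toy "modalities" sharing one latent signal z
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 1))
X = np.hstack([z, rng.normal(size=(200, 3))])   # stand-in image features
Y = np.hstack([-z, rng.normal(size=(200, 4))])  # stand-in text features
Wx, Wy, corrs = cca(X, Y, k=1)
```

Because both toy views embed the same latent `z`, the top canonical correlation is close to 1; the abstract's point 3) is that this latent space is learned without optimizing any particular retrieval distance, which is what the proposed similarity metric addresses.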


【Year】2017

【Month】10

【Category】Pattern Recognition

