Computer and Modernization ›› 2022, Vol. 0 ›› Issue (09): 85-92.

• Image Processing •

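Multi-label Image Classification Method Combining CNN and Interactive Features

  1. (College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China)
  • Online: 2022-09-22 Published: 2022-09-22
  • About the authors: WANG Panhong (b. 1998), female, from Tongling, Anhui, master's student, research interests: machine learning and image processing, E-mail: wph980310@163.com; ZHU Changming (b. 1988), male, from Pudong New Area, Shanghai, associate professor, Ph.D., research interests: image processing and multi-view learning, E-mail: 252213097@qq.com.
  • Supported by:
    China Postdoctoral Science Foundation (2019M651576); National Natural Science Foundation of China (61602296); Natural Science Foundation of Shanghai (16ZR1414500); "Chenguang Program" of the Shanghai Education Development Foundation and the Shanghai Municipal Education Commission (18CG54)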

Abstract: Images are ubiquitous in daily life, and image classification is of great practical significance. Current multi-label image classification suffers from low classification accuracy and high computational complexity, caused by complex neural network models and insufficient extracted image feature information. To address these problems, a multi-label classification method that combines a convolutional neural network with interactive features, namely the MLCNN-IF model, is proposed. The model consists of two main steps. First, a lightweight neural network (MLCNN) with only 9 layers is built with reference to the basic structure of a traditional CNN; it processes the image data and extracts features. Second, based on the features extracted by MLCNN, the interactive feature method generates combined features from the individual features, yielding a new and richer feature set. Experimental results show that, compared with AlexNet, GoogLeNet and VGG16, the MLCNN-IF model achieves better classification results on four multi-label image datasets, improving accuracy and precision by 9% and 4.8% on average, respectively. In addition, the MLCNN network structure is simpler, which effectively reduces the number of model parameters and the time complexity.

Key words: convolutional neural network, multi-label learning, deep learning, image classification, interactive feature
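
The second step described in the abstract, generating combined features from individual extracted features, can be illustrated with a minimal sketch. The abstract does not state the exact form of the interaction used in MLCNN-IF, so this example assumes the common degree-2 definition (the product of each pair of distinct features); the function name `add_pairwise_interactions` and the sample vector are hypothetical and only stand in for real MLCNN output.

```python
from itertools import combinations


def add_pairwise_interactions(features):
    """Return the original features followed by all products x_i * x_j with i < j.

    Assumption: an "interaction" is modeled as the product of two distinct
    features; the abstract does not specify the form used in MLCNN-IF.
    """
    products = [a * b for a, b in combinations(features, 2)]
    return list(features) + products


# Hypothetical 3-dimensional feature vector standing in for MLCNN output.
cnn_features = [2.0, 3.0, 5.0]
enriched = add_pairwise_interactions(cnn_features)
print(enriched)  # [2.0, 3.0, 5.0, 6.0, 10.0, 15.0]
```

Under this assumption, a vector of n features grows to n + n(n-1)/2 features, so the enriched set becomes large quickly as n increases; that is one reason a compact feature extractor is a natural fit for this kind of expansion.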