Computer and Modernization

• Artificial Intelligence •

Sentiment Classification of Internet Short Texts Based on a Convolutional Neural Network Model

  

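  1. School of Computer Science, Zhongyuan University of Technology, Zhengzhou 450007, Henan, China; 2. Henan Province Engineering Laboratory of Computer Information System Security Assessment, Zhengzhou 450007, Henan, China
  • Received: 2016-08-17 Online: 2017-04-20 Published: 2017-05-08
  • About the authors: LIU Xiaoming (b. 1979), male, from Xuchang, Henan; lecturer at the School of Computer Science, Zhongyuan University of Technology; Ph.D.; research interests: machine learning, natural language processing. ZHANG Ying (b. 1992), female, from Luoyang, Henan; master's student; research interests: natural language processing, sentiment analysis. ZHENG Qiusheng (b. 1965), male, from Zhengzhou, Henan; professor; master's degree; research interests: information security, data resource management.
  • Funding:
    National Natural Science Foundation of China (U1304611); Henan Province Science and Technology Research Program (132102310284); Key Science and Technology Research Project of the Education Department of Henan Province (14A520015); Zhengzhou Science and Technology Research Project (131PPTGG416-4)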

Sentiment Classification of Short Texts on Internet Based on Convolutional Neural Networks Model

  1. School of Computer Science, Zhongyuan University of Technology, Zhengzhou 450007, China;
    2. Henan Province Engineering Laboratory of Computer Information System Security Assessment, Zhengzhou 450007, China
  • Received: 2016-08-17 Online: 2017-04-20 Published: 2017-05-08

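Abstract: Sentiment classification aims to identify users' opinions and attitudes toward trending events. Because short texts on the Internet today are loosely formatted and linguistically irregular, traditional methods classify their sentiment poorly. Targeting large-scale Internet short-text data, this paper proposes an Internet short-text classification method based on a deep convolutional neural network (Convolutional Neural Networks, CNNs) model. Word vectors are first chosen as the raw features, a convolutional neural network then extracts further features, and finally a deep-CNN-based sentiment classification model for Internet short texts is trained. Experimental results show that the model not only handles the sentiment classification task on Internet short texts effectively but also clearly improves classification accuracy, by about 5% on average.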

Keywords: Internet short text, sentiment classification, convolutional neural network, natural language processing, deep learning

Abstract: Sentiment classification aims to identify users' views on hot issues, but because short texts on the Internet are informally formatted and linguistically irregular, traditional sentiment classification methods perform poorly on them. Targeting large-scale short-text data on the Internet, this paper proposes a sentiment classification model for Internet short texts based on deep convolutional neural networks (CNNs). Word vectors trained with the Skip-gram model of Word2vec serve as the initial features, the CNN then extracts further features from them, and finally the deep CNN classification model is trained. Experimental results show that, compared with traditional machine learning classification methods, this method not only handles sentiment classification of Internet short texts effectively but also improves classification accuracy significantly, by about 5% on average.

Key words: short texts on the Internet, sentiment classification, convolutional neural networks (CNNs), natural language processing, deep learning
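
The pipeline named in the abstract (word vectors as input features, convolutional feature extraction, then a classifier) can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the authors' architecture: the random embedding table stands in for Skip-gram vectors, and the function name `text_cnn_forward`, the filter widths, the filter counts, the max-over-time pooling, and the two-class softmax output are all hypothetical choices made for illustration.

```python
import numpy as np

def text_cnn_forward(token_ids, embeddings, filters, biases, W_out, b_out):
    """Embed -> convolve over word windows -> ReLU -> max-over-time pool -> softmax.

    Assumes the sentence has at least as many tokens as the widest filter.
    """
    x = embeddings[token_ids]                       # (seq_len, emb_dim)
    pooled = []
    for F, b in zip(filters, biases):               # F: (n_filters, width, emb_dim)
        width = F.shape[1]
        # Every run of `width` consecutive word vectors is one convolution window.
        windows = np.stack([x[i:i + width] for i in range(len(x) - width + 1)])
        # Dot each window with each filter: result is (n_windows, n_filters).
        feature_maps = np.tensordot(windows, F, axes=([1, 2], [1, 2])) + b
        feature_maps = np.maximum(feature_maps, 0.0)  # ReLU
        pooled.append(feature_maps.max(axis=0))       # strongest response per filter
    features = np.concatenate(pooled)               # (n_filters * number_of_widths,)
    logits = features @ W_out + b_out
    exp = np.exp(logits - logits.max())             # numerically stable softmax
    return exp / exp.sum()

# Hypothetical sizes; random weights stand in for trained parameters.
rng = np.random.default_rng(0)
vocab_size, emb_dim, n_filters, n_classes = 50, 8, 4, 2
widths = (2, 3)
embeddings = rng.normal(size=(vocab_size, emb_dim))  # placeholder for Skip-gram vectors
filters = [rng.normal(scale=0.1, size=(n_filters, w, emb_dim)) for w in widths]
biases = [np.zeros(n_filters) for _ in widths]
W_out = rng.normal(scale=0.1, size=(n_filters * len(widths), n_classes))
b_out = np.zeros(n_classes)

token_ids = np.array([3, 17, 8, 42, 5])             # one short text as word indices
probs = text_cnn_forward(token_ids, embeddings, filters, biases, W_out, b_out)
print(probs)                                         # class probabilities summing to 1
```

With untrained weights the output probabilities carry no meaning; in practice the embedding table would be loaded from a trained Word2vec model and the filters and output layer would be fitted on labeled short texts.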

CLC number: