Computer and Modernization (计算机与现代化) ›› 2010, Vol. 1 ›› Issue (10): 20-22,4. doi: 10.3969/j.issn.1006-2475.2010.10.006

• Artificial Intelligence •

KNN Coefficient-Correction Iterative Refinement Algorithm

许朝阳   

  1. Department of Electronic and Information Engineering, Putian College, Putian, Fujian 351100
  • Received: 2010-06-09 Revised: 1900-01-01 Online: 2010-10-21 Published: 2010-10-21

Parameter Iteratively Modified-KNN

XU Chao-yang   

  1. Department of Electronic and Information Engineering, Putian College, Putian 351100, China
  • Received:2010-06-09 Revised:1900-01-01 Online:2010-10-21 Published:2010-10-21

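Abstract: With the rapid development of the WWW, text classification has become a key technology for processing and organizing large volumes of document data. The KNN method is a simple, effective, nonparametric classification method. This paper proposes PIM-KNN (Parameter Iteratively Modified-KNN), an algorithm that uses the results of a closed test of the KNN classifier to adjust the classifier's correction coefficients: a misclassified sample should be pulled closer in "distance" to the class it belongs to and pushed farther in "distance" from the class it was mistaken for. Experimental results show that the classification performance of a KNN classifier adjusted by the PIM-KNN algorithm improves significantly.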

Keywords: text classification, K-nearest neighbors, iteration, distance

Abstract: With the rapid growth of the World Wide Web, text classification has become a key technology for organizing and processing large amounts of document data. The KNN method is a simple, effective, nonparametric classification method. This paper proposes an algorithm, PIM-KNN (Parameter Iteratively Modified-KNN), that adjusts the classifier's correction coefficients according to the results of a closed test of the KNN classifier: a misclassified sample should be moved closer to the class it belongs to and farther from the class it was wrongly assigned to. Experimental results show that the classification performance of a KNN classifier adjusted by PIM-KNN improves significantly.

Key words: text classification, KNN, iterative, distance
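
The abstract states the idea behind PIM-KNN but not its update rule, so the following is only a minimal sketch under stated assumptions, not the paper's algorithm. It assumes one correction coefficient per class (`weights`) that scales cosine-similarity votes, a closed test that classifies each training sample against the remaining training samples, and a multiplicative step `eta`. All function names, parameters, and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(x, train, weights, k, skip=None):
    """Weighted-vote KNN: each of the k nearest neighbors votes sim * weights[label]."""
    sims = [(cosine(x, vec), label)
            for i, (vec, label) in enumerate(train) if i != skip]
    sims.sort(key=lambda pair: pair[0], reverse=True)
    votes = defaultdict(float)
    for sim, label in sims[:k]:
        votes[label] += sim * weights[label]
    return max(votes, key=votes.get)

def closed_test_errors(train, weights, k):
    """Classify every training sample (excluding itself, an assumption);
    return a list of (true_label, predicted_label) for the mistakes."""
    errors = []
    for i, (vec, label) in enumerate(train):
        predicted = knn_predict(vec, train, weights, k, skip=i)
        if predicted != label:
            errors.append((label, predicted))
    return errors

def fit_class_weights(train, k=3, eta=0.1, max_iter=20):
    """Iteratively adjust per-class coefficients from closed-test mistakes.
    Assumed rule: raise the true class's weight (pulling the sample 'closer'
    to it) and lower the wrongly predicted class's weight (pushing it away)."""
    weights = {label: 1.0 for _, label in train}
    best_weights, best_errors = dict(weights), None
    for _ in range(max_iter):
        errors = closed_test_errors(train, weights, k)
        if best_errors is None or len(errors) < best_errors:
            best_weights, best_errors = dict(weights), len(errors)
        if not errors:
            break
        for true_label, predicted in errors:
            weights[true_label] *= 1 + eta
            weights[predicted] *= 1 - eta
    return best_weights, best_errors

# Toy two-dimensional "documents" (made up for illustration only).
train = [
    ([1.0, 0.1], "sports"), ([0.9, 0.2], "sports"), ([0.8, 0.3], "sports"),
    ([0.2, 0.9], "finance"), ([0.5, 0.6], "finance"),
]
weights, n_errors = fit_class_weights(train, k=3)
print(weights, n_errors)
```

Keeping the best weights seen so far means the returned coefficients never do worse on the closed test than the unadjusted classifier, which is a design choice of this sketch rather than something the abstract specifies.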
