A Weighted kNN Classification Method for Partial Labeling

Computer and Modernization, 2015, Vol. 0, Issue (12): 35-. doi: 10.3969/j.issn.1006-2475.2015.12.007

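School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China

Received: 2015-07-24 Online: 2015-12-23 Published: 2015-12-30
About the authors: LIANG Weichao (born 1991), male, from Nanjing, Jiangsu, master's student at the School of Computer Science and Engineering, Nanjing University of Science and Technology; research interests: machine learning, data mining. SONG Bin (born 1968), male, associate professor, master's degree; research interests: data mining, Web information processing.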


Abstract:

Partial label learning is an important weakly supervised learning framework that differs from traditional supervised learning: each instance is associated with a set of candidate labels, only one of which is its true label. The k-nearest neighbor (kNN) algorithm is a simple yet effective classification method. This paper proposes a weighted kNN classification method for partially labeled data. Given an unseen instance, the method first finds its k nearest neighbors in the training set, then determines the weight of each neighbor by solving a quadratic programming problem, and finally decides the label of the unseen instance by majority voting. Experimental results show that the proposed method can effectively improve the generalization performance of the learning system.

Key words: machine learning, data mining, partial label learning, k-nearest neighbor, weight estimation
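
The three steps named in the abstract (neighbor search, weight estimation by quadratic programming, voting) can be illustrated with a minimal sketch. The abstract does not state the actual quadratic program or voting rule, so everything specific here is an assumption made for illustration: the weights minimize the reconstruction error ||x - sum_j w_j x_j||^2 under w_j >= 0 and sum_j w_j = 1, the program is solved approximately by projected gradient descent, and each neighbor casts its weight for every label in its candidate set. The function names `project_to_simplex`, `neighbor_weights`, and `predict` are hypothetical and not taken from the paper.

```python
import math
from collections import defaultdict


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        t = (cumulative - 1.0) / i
        if u_i - t > 0:  # holds for a prefix of u; keep the last threshold
            theta = t
    return [max(x - theta, 0.0) for x in v]


def neighbor_weights(query, neighbors, iterations=500):
    """Assumed QP: minimize w^T G w over the simplex, where
    G[i][j] = (query - x_i) . (query - x_j).
    Solved approximately by projected gradient descent."""
    k = len(neighbors)
    diffs = [[q - a for q, a in zip(query, nb)] for nb in neighbors]
    gram = [[sum(a * b for a, b in zip(di, dj)) for dj in diffs] for di in diffs]
    trace = sum(gram[i][i] for i in range(k))
    w = [1.0 / k] * k
    if trace == 0.0:  # every neighbor coincides with the query
        return w
    # trace(G) bounds the largest eigenvalue of G, so this step size is safe.
    step = 1.0 / (2.0 * trace)
    for _ in range(iterations):
        grad = [2.0 * sum(gram[i][j] * w[j] for j in range(k)) for i in range(k)]
        w = project_to_simplex([w_i - step * g_i for w_i, g_i in zip(w, grad)])
    return w


def predict(query, train_x, train_candidates, k=3):
    """Weighted vote: each neighbor adds its weight to all of its candidate labels."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(query, train_x[i]))[:k]
    weights = neighbor_weights(query, [train_x[i] for i in order])
    votes = defaultdict(float)
    for w, i in zip(weights, order):
        for label in train_candidates[i]:
            votes[label] += w
    # Ties are broken by label order so the result is deterministic.
    return max(sorted(votes), key=votes.get)


# Toy data (made up for this sketch): candidate label sets, not true labels.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6]]
train_candidates = [{"a", "b"}, {"a", "c"}, {"a"}, {"b", "c"}, {"b"}]
print(predict([0.2, 0.2], train_x, train_candidates, k=3))  # prints "a"
```

In the toy data, label "a" appears in the candidate set of all three nearest neighbors, so it collects the full weight mass whatever the individual weights turn out to be. A dedicated QP solver could replace the projected gradient loop; the loop is used here only to keep the sketch free of third-party dependencies.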

CLC number: TP181

LIANG Weichao, SONG Bin. A Weighted kNN Classification Method for Partial Labeling[J]. Computer and Modernization, 2015, 0(12): 35-.

