Computer and Modernization

• Information Security •

基于差分隐私的决策树发布技术研究

  

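  1. College of Computer Science and Technology, Donghua University, Shanghai 201620
  • Received: 2016-08-26 Publication date: 2017-03-29 Release date: 2017-03-30
  • About the authors: CHEN Yang (b. 1992), female, from Suzhou, Anhui, master's student at the College of Computer Science and Technology, Donghua University; research interests: differential privacy protection, database and data warehouse technology. YU Shoujian (b. 1976), male, from Weihai, Shandong, associate professor, Ph.D.; research interests: database and data warehouse technology, Web services.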

Research on Differential Privacy for Decision Tree Release Technology

  1. College of Computer Science and Technology, Donghua University, Shanghai 201620, China
  • Received:2016-08-26 Online:2017-03-29 Published:2017-03-30

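Abstract (Chinese version):

In recent years, privacy leakage caused by big data has become increasingly serious, and retaining enough information for data analysis while protecting data privacy is a major challenge for researchers. To address the privacy leakage that may arise during data analysis, this paper proposes a differentially private decision tree publishing method. The algorithm is based on the non-interactive model, uses the exponential mechanism to ensure that the selection of the splitting attribute satisfies differential privacy, and allocates the privacy budget adaptively according to the characteristics of the dataset. Compared with existing algorithms, its privacy budget allocation is more reasonable and the resulting decision tree has higher classification accuracy. Experimental results verify the superiority of the algorithm.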
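As an illustration of the exponential-mechanism step named in the abstract, the following is a minimal sketch of choosing a splitting attribute under differential privacy. It is not the authors' implementation: the abstract does not state the utility function or the budget allocation rule, so the `max_score` utility (with sensitivity 1), the row format, and the per-selection budget `epsilon` are all assumptions made for this example.

```python
import math
import random
from collections import Counter, defaultdict


def max_score(rows, attr, label="label"):
    """Assumed utility: sum over child partitions of the majority-class count.

    Adding or removing one record changes this score by at most 1,
    so its sensitivity is taken to be 1.
    """
    partitions = defaultdict(Counter)
    for row in rows:
        partitions[row[attr]][row[label]] += 1
    return sum(max(counts.values()) for counts in partitions.values())


def exponential_mechanism(candidates, score, epsilon, sensitivity=1.0, rng=None):
    """Pick one candidate with probability proportional to
    exp(epsilon * score(c) / (2 * sensitivity))."""
    rng = rng or random.Random()
    scores = [score(c) for c in candidates]
    top = max(scores)
    # Subtracting the top score keeps exp() from overflowing; it does not
    # change the normalized selection probabilities.
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


def choose_split_attribute(rows, attributes, epsilon, rng=None):
    """Select a splitting attribute, spending `epsilon` on this one choice."""
    return exponential_mechanism(
        attributes, lambda a: max_score(rows, a), epsilon, sensitivity=1.0, rng=rng
    )


if __name__ == "__main__":
    # Hypothetical toy data, for demonstration only.
    rows = [
        {"outlook": "sunny", "windy": "no", "label": "stay"},
        {"outlook": "sunny", "windy": "yes", "label": "stay"},
        {"outlook": "rainy", "windy": "no", "label": "go"},
        {"outlook": "rainy", "windy": "yes", "label": "go"},
    ]
    print(choose_split_attribute(rows, ["outlook", "windy"], epsilon=1.0))
```

A smaller `epsilon` makes the choice closer to uniform (more privacy, less accuracy), while a larger one concentrates probability on the highest-scoring attribute. An adaptive allocation scheme such as the one the abstract describes would decide how much `epsilon` each selection receives; that rule is not specified here and is left out of the sketch.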

Keywords (Chinese version): generalization, differential privacy, decision tree, data publishing, privacy protection

Abstract:

Privacy disclosure caused by big data is becoming increasingly serious. We propose a differentially private, generalization-based data publishing algorithm for decision trees. The algorithm is based on the non-interactive model. During attribute splitting it merges similar branches to reduce the overall noise and retain more of the original information, which improves classification accuracy. The privacy budget is allocated adaptively according to the needs of the exponential mechanism. Compared with previous algorithms under the same privacy budget, the published data are more differentiated and the decision tree's classification accuracy is higher. The experimental results also support the validity and superiority of the algorithm.

Key words: generalization; differential privacy; decision tree; data release; privacy preserving

CLC number: