Computer and Modernization (计算机与现代化)

• Algorithm Design and Analysis

Multi-target Prediction Algorithm Based on AdaBoost Regression Tree
(基于AdaBoost回归树的多目标预测算法)

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
  2. Beijing Key Lab of Traffic Data Analysis and Mining, Beijing 100044, China
  • Received: 2017-02-16 Online: 2017-09-20 Published: 2017-09-19
  • About the author: ZHANG Jing (born 1991), female, from Ren County, Hebei Province; master's student at the School of Computer and Information Technology, Beijing Jiaotong University; research interest: machine learning.

Abstract: Real-world prediction problems often require predicting several target variables simultaneously from the same set of input variables. When the target variables are binary, the task is called multi-label classification; when they are real-valued, it is called multi-target prediction. This paper proposes two new multi-target regression methods, Multi-Target Stacking (MTS) and Ensemble of Regressor Chains (ERC), inspired by two popular multi-label classification methods. In the first training stage, both MTS and ERC use as their baseline single-target (ST) prediction models built with AdaBoost over regression trees (ART), using 100 iterations. In the second stage, both methods extend the input space by adding the first-stage target predictions as extra input variables, and build the multi-target prediction model on the extended inputs. Both methods exploit the dependencies among the target variables; ERC additionally takes the order of the targets into account. We also summarize the shortcomings of MTS and ERC and propose corrected versions, denoted MTS Corrected (MTSC) and ERC Corrected (ERCC). Experimental results show that the corrected regressor chain method, ART-ERCC, performs best on multi-target prediction problems.

Key words: multi-target prediction, multi-label classification, single-target prediction, regressor chains, stacking (stacked generalization)
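
The two-stage scheme described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the helper names (`fit_mts`, `fit_chain`, `fit_erc`, `augment`, `make_model`), the use of in-sample first-stage predictions, the random chain orders, and above all the base learner. `KNNRegressor` is a simple stand-in so that the sketch runs with the standard library only; the paper's base learner is ART, which would be supplied through `make_model` as any object with `fit` and `predict`. The corrected variants (MTSC, ERCC) are not sketched, because the abstract does not describe the corrections.

```python
import random
from statistics import mean


class KNNRegressor:
    """Stand-in base learner (an assumption, NOT the paper's ART model)."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X = [list(row) for row in X]
        self.y = list(y)
        return self

    def predict(self, X):
        preds = []
        for q in X:
            # Rank training rows by squared Euclidean distance to the query.
            ranked = sorted(
                range(len(self.X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(self.X[i], q)),
            )
            preds.append(mean([self.y[i] for i in ranked[: self.k]]))
        return preds


def column(Y, j):
    return [row[j] for row in Y]


def augment(X, cols):
    """Append one value from each column in `cols` to every input row."""
    return [list(x) + [c[i] for c in cols] for i, x in enumerate(X)]


def fit_mts(X, Y, make_model):
    m = len(Y[0])
    # Stage 1: one independent single-target (ST) model per target.
    stage1 = [make_model().fit(X, column(Y, j)) for j in range(m)]
    # Stage 2: refit each target on inputs extended with all stage-1 predictions.
    X2 = augment(X, [mdl.predict(X) for mdl in stage1])
    stage2 = [make_model().fit(X2, column(Y, j)) for j in range(m)]
    return stage1, stage2


def predict_mts(model, X):
    stage1, stage2 = model
    X2 = augment(X, [mdl.predict(X) for mdl in stage1])
    cols = [mdl.predict(X2) for mdl in stage2]
    return [list(row) for row in zip(*cols)]


def fit_chain(X, Y, order, make_model):
    models, Xc = [], [list(x) for x in X]
    for j in order:
        mdl = make_model().fit(Xc, column(Y, j))
        models.append(mdl)
        # Feed this target's prediction forward as an extra input.
        Xc = augment(Xc, [mdl.predict(Xc)])
    return list(order), models


def predict_chain(chain, X):
    order, models = chain
    Xc, by_target = [list(x) for x in X], {}
    for j, mdl in zip(order, models):
        by_target[j] = mdl.predict(Xc)
        Xc = augment(Xc, [by_target[j]])
    # Return columns in the original target order, not the chain order.
    return [[by_target[j][i] for j in sorted(by_target)] for i in range(len(X))]


def fit_erc(X, Y, make_model, n_chains=5, seed=0):
    rng, m = random.Random(seed), len(Y[0])
    chains = []
    for _ in range(n_chains):
        order = list(range(m))
        rng.shuffle(order)  # each chain uses a random target order (assumption)
        chains.append(fit_chain(X, Y, order, make_model))
    return chains


def predict_erc(chains, X):
    per_chain = [predict_chain(c, X) for c in chains]
    m = len(per_chain[0][0])
    # Average the chains' predictions target by target.
    return [
        [mean([p[i][j] for p in per_chain]) for j in range(m)]
        for i in range(len(X))
    ]
```

In this sketch the difference between the two methods is visible in how the inputs grow: MTS appends all first-stage predictions at once and refits every target, while each chain appends predictions one target at a time, so a target only sees the predictions of targets earlier in its chain. A call such as `fit_erc(X, Y, lambda: KNNRegressor(k=3))` followed by `predict_erc(chains, X_new)` returns one row of target predictions per input row.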

CLC number: