Computer and Modernization (计算机与现代化)

• Information Systems

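Power Failure Sensitivity Prediction Algorithm Using K-support Sparse Logistic Regression

  1. Electric Power Research Institute, State Grid Henan Electric Power Company, Zhengzhou 450052, China
  2. School of Information and Control, Nanjing University of Information Science and Technology, Nanjing 210044, China
  • Online: 2018-04-28 Published: 2018-05-02
  • About the authors: Geng Juncheng (耿俊成, b. 1985), male, from Xiping, Henan; senior engineer at the Electric Power Research Institute, State Grid Henan Electric Power Company; master's degree; research interest: big data applications in power distribution and utilization. Zhang Xiaofei (张小斐, b. 1976), male; senior engineer; bachelor's degree; research interest: power grid big data analysis and mining. Sun Yubao (孙玉宝, b. 1983), male; associate professor at the School of Information and Control, Nanjing University of Information Science and Technology; Ph.D.; research interests: high-dimensional data analysis, electric power big data processing. Wu Bo (吴博, b. 1984), male; senior engineer; master's degree; research interests: informatization of operation and maintenance, distribution automation. Zhou Qiang (周强, b. 1991), male; master's student; research interests: machine learning, classification and analysis of electric power data.
  • Funding:
    State Grid Corporation of China 2016 Headquarters Science and Technology Project (521820140017); National Natural Science Foundation of China (61300162)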

Abstract: Accurately identifying customers who are highly sensitive to power failures gives electric power service departments data and decision support for precision marketing and differentiated services. We propose a customer power failure sensitivity assessment algorithm based on k-support norm regularized (sparse) logistic regression. Unlike the commonly used l1 norm, the k-support norm is a tighter convex relaxation of the l0 norm on the Euclidean unit ball, and it can select several strongly correlated factors together, which helps improve prediction accuracy. First, the independent variables (factors) used for prediction are screened from multiple dimensions, including basic customer information, electricity consumption, bill payment, 95598 work orders, and power failure events; the factor values collected for each customer form the sample data set. Second, a k-support sparse logistic regression model is built to predict power failure sensitivity, and an iterative optimization algorithm based on forward-backward operator splitting is derived that decomposes the original problem into two subproblems, each of which can be iterated quickly. Dominance analysis is then used to identify which independent variables in the regression model have a significant influence on the target variable. The model is validated and evaluated on data from nearly one million customers of a provincial grid company and achieves good prediction accuracy; the experimental results demonstrate the effectiveness of the model.

Key words: electric power failure sensitivity, logistic regression, k-support sparsity, dominance analysis, operator splitting
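
To make the regularizer named in the abstract concrete, the sketch below evaluates the k-support norm using the closed form published by Argyriou, Foygel and Srebro (2012), together with a penalized logistic objective. The function names, the labels in {-1, +1}, the averaged loss, and the squared penalty scaled by `lam / 2` are assumptions made for illustration; the abstract does not state the paper's exact objective, and the proximal (backward) step of the solver is not shown here.

```python
import numpy as np


def k_support_norm(w, k):
    """k-support norm of w, via the closed form of Argyriou et al. (2012)."""
    z = np.sort(np.abs(np.asarray(w, dtype=float)))[::-1]  # magnitudes, descending
    d = z.size
    if not 1 <= k <= d:
        raise ValueError("k must satisfy 1 <= k <= len(w)")
    # Find r in {0, ..., k-1} with  z_{k-r-1} > tail_mean >= z_{k-r}  (1-based),
    # where tail_mean = (1 / (r + 1)) * sum_{i=k-r}^{d} z_i and z_0 = +inf.
    for r in range(k):
        tail_mean = z[k - r - 1:].sum() / (r + 1)
        upper = np.inf if k - r - 1 == 0 else z[k - r - 2]
        if upper > tail_mean >= z[k - r - 1]:
            head = z[:k - r - 1]  # the k-r-1 largest magnitudes
            return float(np.sqrt(np.sum(head ** 2) + (r + 1) * tail_mean ** 2))
    raise RuntimeError("no valid r found (possible floating-point tie)")


def penalized_logistic_objective(w, X, y, lam, k):
    """Assumed objective: mean logistic loss + (lam / 2) * squared k-support norm."""
    margins = y * (X @ w)                        # y holds labels in {-1, +1}
    loss = np.mean(np.logaddexp(0.0, -margins))  # stable log(1 + exp(-margin))
    return float(loss + 0.5 * lam * k_support_norm(w, k) ** 2)
```

With k = 1 the function returns the l1 norm, and with k equal to the dimension it returns the l2 norm, which gives a quick sanity check; intermediate k values interpolate between the two.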

CLC number: