Computer and Modernization (计算机与现代化)

• Artificial Intelligence

Portfolio Management Based on the DDPG Algorithm of Deep Reinforcement Learning

  (1. School of Business, Nankai University, Tianjin 300071, China; 
   2. China Institute of Corporate Governance, Nankai University, Tianjin 300071, China; 
   3. Socialist Economic Construction with Chinese Characteristics Collaborative Innovation Center, Tianjin 300071, China)
  • Received: 2018-02-08 Online: 2018-06-13 Published: 2018-06-13
  • About the authors: Qi Yue (齐岳, 1970-), male, from Tianjin, Ph.D., professor at the School of Business, Nankai University, the China Institute of Corporate Governance, Nankai University, and the Socialist Economic Construction with Chinese Characteristics Collaborative Innovation Center; research interests: portfolio management, fund management. Huang Shuohua (黄硕华, 1996-), female, from Beijing, undergraduate student; research interests: applications of computer technology, portfolio management.
  • Supported by:
    Key Program of the National Natural Science Foundation of China (71533002); Major Project of the Key Research Bases of Humanities and Social Sciences, Ministry of Education (16JJD630003)

Abstract: This paper applies deep reinforcement learning (DRL) to portfolio management, using the deep deterministic policy gradient (DDPG) algorithm. The weight of each individual stock is capped to diversify risk, and Dropout, which randomly drops network nodes during training, is used to address over-fitting. Taking the Chinese stock market as an example, 16 constituent stocks of the CSI 100 Index are selected as risky assets. The experimental results show that the value of the portfolio constructed with this method grows significantly more than that of the control group (a portfolio with weights evenly distributed among the same 16 stocks) over the test period: its 2-year cumulative return reaches 65%, about 2.5 times that of the control group, which supports the effectiveness of the method. Further experiments indicate that the closer in time the training data is to the test data, the better the constructed portfolio performs.

Key words: deep reinforcement learning (DRL), deep deterministic policy gradient (DDPG), portfolio management
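
The abstract names two mechanics that are easy to make concrete: capping the weight of any single stock, and comparing against an equally weighted control portfolio. The abstract does not say how the cap is enforced, so the sketch below assumes a simple cap-and-redistribute rule. The function names `cap_weights` and `cumulative_return`, the cap of 0.4, and all numbers are hypothetical illustrations, not the authors' implementation or data.

```python
import numpy as np

def cap_weights(weights, cap):
    """Cap each weight at `cap` and redistribute the excess proportionally.

    Assumed rule (not taken from the paper): weights above the cap are set
    to the cap, and the remaining budget is shared among the other assets
    in proportion to their current weights, repeating until none exceeds it.
    """
    w = np.asarray(weights, dtype=float)
    n = w.size
    if cap * n < 1.0 - 1e-12:
        raise ValueError("cap is too small for the weights to sum to 1")
    w = w / w.sum()                      # normalise to a full-investment portfolio
    capped = np.zeros(n, dtype=bool)
    for _ in range(n):
        over = (w > cap + 1e-12) & ~capped
        if not over.any():
            break
        capped |= over
        w[capped] = cap
        free = ~capped
        if not free.any():
            break
        remaining = 1.0 - cap * capped.sum()
        free_sum = w[free].sum()
        if free_sum > 0:
            w[free] = w[free] * remaining / free_sum
        else:
            w[free] = remaining / free.sum()
    return w

def cumulative_return(price_relatives, weights):
    """Cumulative return of a portfolio rebalanced to `weights` every period.

    price_relatives: array of shape (T, n) holding p_t / p_{t-1} per asset.
    weights: shape (n,) for fixed weights, or (T, n) for per-period weights.
    """
    growth = (np.asarray(price_relatives, dtype=float) * weights).sum(axis=1)
    return growth.prod() - 1.0

# Hypothetical example: a raw allocation concentrated in the first asset.
raw = [0.7, 0.1, 0.1, 0.1]
print(cap_weights(raw, cap=0.4))

# Equal-weight control portfolio on two assets over two periods (made-up data).
relatives = [[1.1, 1.0], [1.0, 1.2]]
print(cumulative_return(relatives, np.array([0.5, 0.5])))
```

In a DDPG setting, a step like `cap_weights` would be applied to the actor network's output before the weights are used to rebalance the portfolio; the equal-weight call corresponds to the control group described in the abstract.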
