Computer and Modernization (计算机与现代化), 2020, Vol. 0, Issue (11): 28-32.

• Artificial Intelligence •

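Strategy of “Fighting the Landlord” (“斗地主”, Dou Dizhu) Based on Deep Convolutional Neural Network

  1. (College of Computer Science and Technology, Guizhou University, Guiyang 550025, Guizhou, China)
  • Online: 2020-12-03 Published: 2020-12-03
  • About the authors: XU Fangjing (徐方婧, b. 1994), female, from Zunyi, Guizhou, master's student, research interests: artificial intelligence and pattern recognition, deep learning, E-mail: 496949302@qq.com; WEI Kunpeng (魏鲲鹏, b. 1985), male, from Xinxiang, Henan, master's degree, research interests: artificial intelligence and pattern recognition, E-mail: weikunpeng@cmdi.chinamobile.com; WANG Yisong (王以松, b. 1975), male, professor, doctoral supervisor, Ph.D., research interests: knowledge representation and reasoning, artificial intelligence, machine learning, E-mail: yswang@gzu.edu.cn; PENG Qiwen (彭啟文, b. 1995), male, from Zhijin, Guizhou, master's student, research interests: artificial intelligence and pattern recognition, E-mail: 937356655@qq.com; YU Xiaomin (于小民, b. 1988), male, from Tangshan, Hebei, doctoral student, research interests: artificial intelligence and pattern recognition, deep reinforcement learning, E-mail: 1031450835@qq.com.
  • Supported by:
    National Natural Science Foundation of China (61976065)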

Abstract: Deep neural networks have achieved remarkable results in various games abroad. In recent years, convolutional neural networks (CNNs) have attracted great attention because of their distinctive unit structure and have been used frequently in game-playing AI agents such as AlphaGo and Libratus (冷扑大师). “Fighting the Landlord” (“斗地主”, Dou Dizhu) is a typical imperfect-information game that combines cooperation and competition. This paper designs a 7-layer convolutional neural network, DDZ-CNN, and trains it on nearly 300,000 records generated by Monte Carlo tree search (MCTS) self-play of “Fighting the Landlord” so that it learns a playing strategy. During training, the data are down-sampled with a weight-based method to overcome their uneven distribution, and the network converges quickly. Finally, the trained model is played against an MCTS-based agent and against human players and achieves a good winning rate, which supports the effectiveness and feasibility of the proposed algorithm.

Key words: imperfect information game, convolutional neural network, “Fighting the Landlord” strategy, non-uniform distribution
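
The abstract does not spell out how the weight-based down-sampling is computed, so the following is only a minimal sketch of one common way to do it, not the authors' method. It assumes each training record carries a class label (here a hypothetical move category such as "pass" or "bomb") and keeps each record with a probability inversely proportional to how often its class occurs. The function name `downsample_by_weight`, the `cap` parameter, and the toy labels are all illustrative assumptions.

```python
import random
from collections import Counter


def downsample_by_weight(samples, label_of, cap=None, seed=0):
    """Keep each sample with probability min(1, cap / class_count).

    Assumption: `cap` defaults to the size of the rarest class, so rare
    classes are kept whole and frequent classes are thinned towards `cap`.
    """
    rng = random.Random(seed)
    counts = Counter(label_of(s) for s in samples)
    if cap is None:
        cap = min(counts.values())
    kept = []
    for s in samples:
        keep_prob = min(1.0, cap / counts[label_of(s)])
        # random() is in [0, 1), so keep_prob == 1.0 always keeps the sample.
        if rng.random() < keep_prob:
            kept.append(s)
    return kept


# Toy, hypothetical data: 900 "pass" records and 100 "bomb" records.
data = [("state", "pass")] * 900 + [("state", "bomb")] * 100
balanced = downsample_by_weight(data, label_of=lambda s: s[1])
balanced_counts = Counter(label for _, label in balanced)
print(balanced_counts)
```

With these toy numbers every "bomb" record is kept, while each "pass" record is kept with probability 100/900, so the two classes end up roughly the same size. A real pipeline might instead keep all records and weight the loss per class; which variant the paper uses cannot be determined from the abstract alone.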