Computer and Modernization ›› 2022, Vol. 0 ›› Issue (01): 1-9.

• Algorithm Design and Analysis •

Group Activity Recognition Algorithm Based on Interaction Relationship Grouping Modeling Fusion

  1. (College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China)
  • Online: 2022-01-24 Published: 2022-01-24
  • About the authors: WANG Chuanxu (b. 1968), male, from Zoucheng, Shandong, professor and master's supervisor; research interest: computer vision; E-mail: Wangchuanxu_qd@163.com. Corresponding author: LIU Ran (b. 1995), male, master's student; research interests: computer vision and pattern recognition; E-mail: liuran2016@163.com.
  • Supported by:
    National Natural Science Foundation of China (61672305, 61802217)

Abstract: Modeling the interactions among group members is the core of group activity recognition. In complex scenes, group relationships are intricate, and relational reasoning is computationally expensive and suffers from information redundancy. To address these problems, we propose a model that reasons about interaction relationships by subgroup. First, a CNN and RoIAlign extract scene and individual information from each video frame as initial features, and the group is split into two subgroups according to individual spatial coordinates (for example, in the Volleyball dataset, participants are sorted by the X coordinates of their bounding boxes, each person is assigned an ordinal ID, and the 12 players are divided into two groups from left to right). Second, a Graph Convolutional Network (GCN) infers the interaction relationships within each of the two local subgroups and within the global scene group, and the key persons in each are identified. Third, the global relationship features are treated as the ground truth and the merged local relationship features of the two subgroups as the prediction; a cross-entropy loss between the two is fed back to optimize the preceding subgroup-level GCNs, with the aim of ensuring that the key persons of the two subgroups match the global key persons. Next, guided by the key persons in the global interaction relationship, the key persons of the two subgroups are matched against them; the successfully matched key persons of the two subgroups are taken as target nodes of an inter-group relationship graph, from which a GCN infers the inter-group relationship features. Finally, the initial features are fused with the inter-group and the global interaction features, respectively, to form two group activity branches, and decision fusion of the two branches yields the final recognition result. Experiments show that the method achieves accuracies of 93.1% on the Volleyball dataset and 48.1% on the NBA dataset.

Key words: grouping interaction relationship fusion, key person matching, decision fusion, group behavior recognition
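
The two steps of the abstract that do not involve learned components, the left-to-right split into two subgroups and the final decision fusion, can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not settle: boxes are taken to be `(x1, y1, x2, y2)` tuples, people are ranked by the horizontal centre of their box, and decision fusion is modelled as a weighted average of the two branches' class probabilities. The names `split_by_x` and `fuse_decisions` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Assumed box layout: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


def split_by_x(boxes: Sequence[Box]) -> Tuple[List[int], List[int]]:
    """Rank people from left to right and split them into two subgroups.

    Returns two lists of indices into `boxes`. A person's position in the
    left-to-right ranking plays the role of the ordinal ID.
    """
    # Assumption: rank by the horizontal centre of each bounding box.
    order = sorted(
        range(len(boxes)),
        key=lambda i: (boxes[i][0] + boxes[i][2]) / 2,
    )
    half = len(order) // 2
    return order[:half], order[half:]


def fuse_decisions(
    p_a: Sequence[float], p_b: Sequence[float], w: float = 0.5
) -> int:
    """Late fusion (assumed rule): weighted average of class probabilities.

    `p_a` and `p_b` are the class-probability vectors of the two branches;
    the function returns the index of the highest fused score.
    """
    fused = [w * a + (1 - w) * b for a, b in zip(p_a, p_b)]
    return max(range(len(fused)), key=fused.__getitem__)


# Four people given in arbitrary order; centres are 55, 15, 35, 75.
boxes = [(50, 0, 60, 10), (10, 0, 20, 10), (30, 0, 40, 10), (70, 0, 80, 10)]
left, right = split_by_x(boxes)

# Two branches that disagree; equal weighting favours class 1.
label = fuse_decisions([0.6, 0.4], [0.2, 0.8])
```

For a 12-player Volleyball frame the same split yields two subgroups of six. The GCN reasoning, key-person matching, and cross-entropy feedback described in the abstract depend on details not given here and are left out of the sketch.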