Computer and Modernization ›› 2021, Vol. 0 ›› Issue (11): 1-6.
Online: 2021-12-13
Published: 2021-12-13
WANG Tian-xing, YUAN Jia-bin, LIU Xin. Approach for Visual Question Answering Based on Equal Attention Graph Networks[J]. Computer and Modernization, 2021, 0(11): 1-6.