[1] DENG J, DONG W, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2009:248-255.
[2] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]// Proceedings of International Conference on Neural Information Processing Systems. 2012:1097-1105.
[3] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]// International Conference on Learning Representations. 2015:1264-1278.
[4] SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2015:1-9.
[5] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2016:770-778.
[6] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2014:580-587.
[7] HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[C]// Proceedings of European Conference on Computer Vision. 2014:346-361.
[8] GIRSHICK R. Fast R-CNN[C]// Proceedings of IEEE International Conference on Computer Vision. 2015:1440-1448.
[9] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[C]// Proceedings of the Advances in Neural Information Processing Systems. 2015:91-99.
[10] LIU W, ANGUELOV D, ERHAN D, et al. SSD: Single shot multibox detector[C]// Proceedings of European Conference on Computer Vision. 2016:21-37.
[11] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2016:779-788.
[12] LIN T Y, DOLLÁR P, GIRSHICK R B, et al. Feature pyramid networks for object detection[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2017:936-944.
[13] ZHANG S, WEN L, BIAN X, et al. Single-shot refinement neural network for object detection[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2018:4203-4212.
[14] EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The Pascal Visual Object Classes (VOC) challenge[J]. International Journal of Computer Vision, 2010,88(2):303-338.
[15] EVERINGHAM M, ESLAMI S A, VAN GOOL L, et al. The Pascal visual object classes challenge: A retrospective[J]. International Journal of Computer Vision, 2015,111(1):98-136.
[16] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common objects in context[C]// Proceedings of European Conference on Computer Vision. 2014:740-755.
[17] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of IEEE International Conference on Computer Vision. 2017:2999-3007.
[18] LI Y, CHEN Y, WANG N, et al. Scale-aware trident networks for object detection[J]. Computer Vision and Pattern Recognition, arXiv:1901.01892, 2019.
[19] TIAN X, WANG L, DING Q. Review of image semantic segmentation based on deep learning[J]. Journal of Software, 2019,30(2):440-468. (in Chinese)
[20] HARIHARAN B, ARBELÁEZ P, GIRSHICK R, et al. Simultaneous detection and segmentation[C]// Proceedings of the European Conference on Computer Vision. 2014:297-312.
[21] PINHEIRO P O, COLLOBERT R, DOLLÁR P. Learning to segment object candidates[C]// Proceedings of the Advances in Neural Information Processing Systems. 2015:1990-1998.
[22] LIU S, QI X, SHI J, et al. Multi-scale patch aggregation (MPA) for simultaneous detection and segmentation[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2016:3141-3149.
[23] PINHEIRO P O, LIN T Y, COLLOBERT R, et al. Learning to refine object segments[C]// Proceedings of the European Conference on Computer Vision. 2016:75-91.
[24] ZAGORUYKO S, LERER A, LIN T Y, et al. A multipath network for object detection[J]. Computer Vision and Pattern Recognition, arXiv:1604.02135, 2016.
[25] HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// Proceedings of IEEE International Conference on Computer Vision. 2017:2980-2988.
[26] SHELHAMER E, LONG J, DARRELL T. Fully convolutional networks for semantic segmentation[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017,39(4):640-651.
[27] NOH H, HONG S, HAN B. Learning deconvolution network for semantic segmentation[C]// Proceedings of IEEE International Conference on Computer Vision. 2015:1520-1528.
[28] LIU Z, LI X, LUO P, et al. Semantic image segmentation via deep parsing network[C]// Proceedings of IEEE International Conference on Computer Vision. 2015:1377-1385.
[29] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2018,40(4):834-848.
[30] ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2017:6230-6239.
[31] CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[J]. Computer Vision and Pattern Recognition, arXiv:1706.05587, 2017.
[32] WANG N, YEUNG D Y. Learning a deep compact image representation for visual tracking[C]// Proceedings of the 26th Annual Conference on Neural Information Processing Systems. 2013,1:809-817.
[33] WANG N, LI S, GUPTA A, et al. Transferring rich feature hierarchies for robust visual tracking[J]. Computer Vision and Pattern Recognition, arXiv:1501.04587, 2015.
[34] WANG L, OUYANG W, WANG X, et al. Visual tracking with fully convolutional networks[C]// Proceedings of IEEE International Conference on Computer Vision. 2015:3119-3127.
[35] NAM H, HAN B. Learning multi-domain convolutional neural networks for visual tracking[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2016:4293-4302.
[36] EIGEN D, PUHRSCH C, FERGUS R. Depth map prediction from a single image using a multi-scale deep network[C]// International Conference on Neural Information Processing Systems. 2014,2:2366-2374.
[37] CHOY C B, XU D, GWAK J, et al. 3D-R2N2: A unified approach for single and multi-view 3D object reconstruction[C]// Proceedings of European Conference on Computer Vision. 2016:628-644.
[38] TATARCHENKO M, DOSOVITSKIY A, BROX T. Octree generating networks: Efficient convolutional architectures for high resolution 3D outputs[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2017:2088-2096.
[39] MATURANA D, SCHERER S. VoxNet: A 3D convolutional neural network for real-time object recognition[C]// Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. 2015:922-928.
[40] WU Z, SONG S, KHOSLA A, et al. 3D ShapeNets: A deep representation for volumetric shapes[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2015:1912-1920.
[41] RICHTER S R, ROTH S. Matryoshka networks: Predicting 3D geometry via nested shape layers[C]// Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2018:1936-1944.
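[42] LI Y, CHEN X W, WANG Y, et al. Research progress of monocular image depth estimation based on deep learning[J]. Laser & Optoelectronics Progress, 2019,56(19):1-17. (in Chinese)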