[1] ZHU Y, YAO C, BAI X. Scene text detection and recognition: Recent advances and future trends[J]. Frontiers of Computer Science, 2016,10(1):19-36.
[2] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017,60(6):84-90.
[3] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014:580-587.
[4] LIAO M H, SHI B G, BAI X, et al. TextBoxes: A fast text detector with a single deep neural network[C]// The 31st AAAI Conference on Artificial Intelligence. 2017. DOI:10.1609/aaai.v31i1.11196.
[5] LIU W, ANGUELOV D, ERHAN D, et al. SSD: Single shot multibox detector[C]// European Conference on Computer Vision. Springer. 2016:21-37.
[6] ZHOU X Y, YAO C, WEN H, et al. EAST: An efficient and accurate scene text detector[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017:2642-2651.
[7] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015:3431-3440.
[8] NEUBECK A, VAN GOOL L. Efficient non-maximum suppression[C]// The 18th International Conference on Pattern Recognition (ICPR’06). IEEE, 2006,3:850-855.
[9] LUO W J, LI Y J, URTASUN R, et al. Understanding the effective receptive field in deep convolutional neural networks[J]. arXiv preprint arXiv:1701.04128, 2017.
[10] WANG Y X, XIE H T, ZHA Z J, et al. R-Net: A relationship network for efficient and accurate scene text detection[J]. IEEE Transactions on Multimedia, 2020,23:1316-1329.
[11] SHI B G, BAI X, BELONGIE S. Detecting oriented text in natural images by linking segments[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017:2550-2558.
[12] WANG P Q, CHEN P F, YUAN Y, et al. Understanding convolution for semantic segmentation[C]// 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2018:1451-1460.
[13] LONG S B, RUAN J Q, ZHANG W J, et al. TextSnake: A flexible representation for detecting text of arbitrary shapes[C]// Proceedings of the European Conference on Computer Vision (ECCV). 2018:20-36.
[14] LI X, WANG W H, HOU W B, et al. Shape robust text detection with progressive scale expansion network[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019:9336-9345.
[15] WANG W H, XIE E, SONG X, et al. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019:8440-8449.
[16] LIAO M H, WAN Y, YAO C, et al. Real-time scene text detection with differentiable binarization[C]// Proceedings of the AAAI Conference on Artificial Intelligence. 2020,34(7):11474-11481.
[17] ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017:2881-2890.
[18] HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015,37(9):1904-1916.
[19] REDMON J, FARHADI A. YOLOv3: An incremental improvement[J]. arXiv preprint arXiv:1804.02767, 2018.
[20] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016:770-778.
[21] DAI J F, QI H Z, XIONG Y W, et al. Deformable convolutional networks[C]// Proceedings of the IEEE International Conference on Computer Vision. 2017:764-773.
[22] WANG J D, SUN K, CHENG T H, et al. Deep high-resolution representation learning for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020,43(10):3349-3364.
[23] ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017:2881-2890.
[24] CAI X X, WANG M. Segmentation-based arbitrary-shaped scene text detection[J]. Computer Systems & Applications, 2020,29(12):257-262. (in Chinese)
[25] VATTI B R. A generic solution to polygon clipping[J]. Communications of the ACM, 1992,35(7):56-63.
[26] RUBY U, YENDAPALLI V. Binary cross entropy with deep learning technique for image classification[J]. International Journal of Advanced Trends in Computer Science and Engineering, 2020. DOI:10.30534/ijatcse/2020/175942020.
[27] SHRIVASTAVA A, GUPTA A, GIRSHICK R. Training region-based object detectors with online hard example mining[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016:761-769.
[28] KINGMA D P, BA J. Adam: A method for stochastic optimization[J]. arXiv preprint arXiv:1412.6980, 2014.
[29] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017:2117-2125.