[1] BINMAKHASHEN G M, MAHMOUD S A. Document layout analysis: A comprehensive survey[J]. ACM Computing Surveys (CSUR), 2019,52(6):192.1-192.36.
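[2] LU M. Research on key technologies for image recognition of Mongolian lead movable-type newspapers[D]. Hohhot: Inner Mongolia University, 2022. (in Chinese)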
[3] BREUEL T M. Two geometric algorithms for layout analysis[C]// Proceedings of the 5th International Workshop on Document Analysis Systems. Springer, 2002:188-199.
[4] WAHL F M, WONG K Y, CASEY R G. Block segmentation and text extraction in mixed text/image documents[J]. Computer Graphics and Image Processing, 1982,20(4):375-390.
[5] MAO S, ROSENFELD A, KANUNGO T. Document structure analysis algorithms: A literature survey[J]. Document Recognition and Retrieval X, 2003,5010. DOI:10.1117/12.476326.
[6] BOULID Y, SOUHAR A, ELKETTANI M Y. Arabic handwritten text line extraction using connected component analysis from a multi agent perspective[C]// 2015 15th International Conference on Intelligent Systems Design and Applications (ISDA). IEEE, 2015:80-87.
[7] O’GORMAN L. The document spectrum for page layout analysis[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1993,15(11):1162-1173.
[8] BUKHARI S S, BREUEL T M, ASI A, et al. Layout analysis for Arabic historical document images using machine learning[C]// 2012 International Conference on Frontiers in Handwriting Recognition. IEEE, 2012:639-644.
[9] BUKHARI S S, AL AZAWI M I A, SHAFAIT F, et al. Document image segmentation using discriminative learning over connected components[C]// Proceedings of the 9th IAPR International Workshop on Document Analysis Systems. ACM, 2010:183-190.
[10] LI X, ZHA Y F, ZHANG T Z, et al. Survey of object tracking algorithms based on deep learning[J]. Journal of Image and Graphics, 2019,24(12):2057-2080. (in Chinese)
[11] SOTO C, YOO S. Visual detection with context for document layout analysis[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). ACL, 2019:3464-3470.
[12] GIRSHICK R. Fast R-CNN[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. IEEE, 2015:1440-1448.
[13] XU C H, SHI C, BI H Y, et al. A page object detection method based on Mask R-CNN[J]. IEEE Access, 2021,9:143448-143457.
[14] HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. IEEE, 2017:2961-2969.
[15] PRUSTY A, AITHA S, TRIVEDI A, et al. Indiscapes: Instance segmentation networks for layout parsing of historical Indic manuscripts[C]// 2019 International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2019:999-1006.
[16] CHEN K, SEURET M, HENNEBERT J, et al. Convolutional neural networks for page segmentation of historical document images[C]// Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition (ICDAR 2017). IEEE, 2017:965-970.
[17] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2015:3431-3440.
[18] WICK C, PUPPE F. Fully convolutional neural networks for page segmentation of historical document images[C]// Proceedings of the 13th IAPR International Workshop on Document Analysis Systems (DAS). IEEE, 2018:287-292.
[19] RONNEBERGER O, FISCHER P, BROX T. U-Net: Convolutional networks for biomedical image segmentation[C]// Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015). Springer, 2015:234-241.
[20] KISE K. Page segmentation techniques in document analysis[M]// Handbook of Document Image Processing and Recognition. Springer, 2014:135-175.
[21] BADRINARAYANAN V, KENDALL A, CIPOLLA R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017,39(12):2481-2495.
[22] ZHONG X, TANG J B, YEPES A J. PubLayNet: Largest dataset ever for document layout analysis[C]// 2019 International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2019:1015-1022.
[23] GUO M H, LU C Z, HOU Q, et al. SegNeXt: Rethinking convolutional attention design for semantic segmentation[J]. Advances in Neural Information Processing Systems, 2022,35:1140-1156.
[24] WANG W H, XIE E Z, LI X, et al. PVT v2: Improved baselines with pyramid vision transformer[J]. Computational Visual Media, 2022,8(3):415-424.
[25] SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: Inverted residuals and linear bottlenecks[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2018:4510-4520.
[26] YU F, KOLTUN V, FUNKHOUSER T. Dilated residual networks[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2017:472-480.
[27] HU J, SHEN L, SUN G, et al. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2018:7132-7141.
[28] ZHOU F Y, JIN L P, DONG J. Review of convolutional neural network[J]. Chinese Journal of Computers, 2017,40(6):1229-1251. (in Chinese)
[29] LOSHCHILOV I, HUTTER F. SGDR: Stochastic gradient descent with warm restarts[J]. arXiv preprint arXiv:1608.03983, 2016.
[30] CHEN L C, ZHU Y, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]// Proceedings of the 2018 European Conference on Computer Vision (ECCV). Springer, 2018:801-818.