Image Animation Based on Generative Adversarial Networks

Abstract

Abstract: Anime-style images are highly simplified and abstract. To convert real-world images into anime-style images, this paper proposes an image animation (anime-style conversion) method based on generative adversarial networks. The generator is a U-Net-like fully convolutional network: the input image is first down-sampled, then shallow-layer features are added and the result is up-sampled by bilinear interpolation. The discriminator uses a PatchGAN structure with spectral normalization. Semantic content loss and style loss are computed separately to improve the stability of the network. In place of a single style loss, the method uses a surface representation loss, a structure representation loss, and a texture representation loss, which makes the generated anime images more controllable. Realistic images are taken from train2014, and face images from the CelebA-HQ dataset. Experiments on these datasets show that the model effectively performs image animation and generates relatively high-quality anime-style images.

Key words: deep learning, generative adversarial networks, image animation
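The abstract names two standard building blocks: bilinear-interpolation up-sampling in the generator and spectral normalization in the discriminator. The following is a minimal pure-Python sketch of both, not the authors' implementation. The function names, the corner-aligned sampling convention, and the iteration count are assumptions made for illustration. Practical GAN code commonly runs a single power-iteration step per training update and keeps the vector between updates, whereas this standalone sketch iterates to convergence.

```python
import math
import random


def bilinear_upsample(img, out_h, out_w):
    """Resize a 2-D grid (list of rows) with bilinear interpolation.

    Assumption: corner-aligned sampling, so the four corner values are kept.
    """
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        # Map the output row back to a fractional source row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(math.floor(y))
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            # Map the output column back to a fractional source column.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(math.floor(x))
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Blend horizontally on the two neighbouring rows, then vertically.
            top = (1 - wx) * img[y0][x0] + wx * img[y0][x1]
            bottom = (1 - wx) * img[y1][x0] + wx * img[y1][x1]
            row.append((1 - wy) * top + wy * bottom)
        out.append(row)
    return out


def _unit(vec):
    """Scale a vector to unit Euclidean length (guarding against zero)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1e-12
    return [x / norm for x in vec]


def spectral_normalize(weight, n_iters=50, seed=0):
    """Return (W / sigma, sigma), with sigma estimated by power iteration.

    sigma approximates the largest singular value of the weight matrix W.
    """
    rows, cols = len(weight), len(weight[0])
    rng = random.Random(seed)
    v = _unit([rng.gauss(0.0, 1.0) for _ in range(cols)])
    sigma = 0.0
    for _ in range(n_iters):
        # u = W v, rescaled to unit length.
        u = _unit([sum(weight[r][c] * v[c] for c in range(cols))
                   for r in range(rows)])
        # W^T u: its length converges to the largest singular value.
        wtu = [sum(weight[r][c] * u[r] for r in range(rows))
               for c in range(cols)]
        sigma = math.sqrt(sum(x * x for x in wtu))
        v = _unit(wtu)
    sigma = max(sigma, 1e-12)
    return [[w / sigma for w in row] for row in weight], sigma


if __name__ == "__main__":
    print(bilinear_upsample([[0.0, 1.0], [2.0, 3.0]], 3, 3))
    print(spectral_normalize([[3.0, 0.0], [0.0, 1.0]]))
```

Dividing each discriminator weight matrix by its largest singular value bounds how much that layer can stretch its input, which is the property spectral normalization relies on to stabilize adversarial training.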

ZHAI Hui-cong, ZHANG Ming, DENG Xing, WANG Li-qun. Image Animation Based on Generative Adversarial Networks[J]. Computer and Modernization, 2022, 0(07): 21-26.

