Computer and Modernization (计算机与现代化) ›› 2025, Vol. 0 ›› Issue (09): 61-66. doi: 10.3969/j.issn.1006-2475.2025.09.009

• Image Processing •

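Facial Sketch Image Conversion Based on CycleGAN and Attention Mechanism

  1. (School of Software and Internet of Things Engineering, Jiangxi University of Finance and Economics, Nanchang 330013, China)
  • Online: 2025-09-24 Published: 2025-09-24
  • About the authors: LIN Ruizi (林睿姿, born 2003), female, from Nanchang, Jiangxi, undergraduate student, research interests: computer vision and deep learning, E-mail: rickylin0791@163.com; YAO Da (姚达, born 2003), male, from Shangrao, Jiangxi, undergraduate student, research interests: image processing and machine learning, E-mail: 2741695224@qq.com; DAI Xin (戴欣, born 2004), female, from Fuzhou, Jiangxi, undergraduate student, research interests: computer vision and machine learning, E-mail: 1377534818@qq.com; SHEN Guoyu (沈国誉, born 2005), male, from Jiujiang, Jiangxi, undergraduate student, research interests: image recognition and deep learning, E-mail: 259916010@qq.com; WANG Jiahui (王嘉慧, born 2004), female, from Ganzhou, Jiangxi, undergraduate student, research interests: image processing and deep learning, E-mail: 1436969608@qq.com; corresponding author: WAN Weiguo (万伟国, born 1991), from Shangrao, Jiangxi, lecturer, Ph.D., research interests: computer vision and deep learning.
  • Supported by: National Natural Science Foundation of China (62261025); Natural Science Foundation of Jiangxi Province (20232BAB212015)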

Abstract:
In recent years, face sketch-photo synthesis has become a research hotspot because of demand in law enforcement, criminal investigation, and entertainment. CycleGAN, a deep learning model that requires no paired-image supervision, is well suited to cross-domain image translation and provides a powerful tool for converting between sketches and photos. Because large numbers of paired face photos and sketches are difficult to collect, and because generated face sketches often suffer from blurred details and low clarity, an improved CycleGAN model is proposed. A self-attention mechanism is introduced into the residual blocks of the ResNet-based generator in CycleGAN, so that the generator learns more effectively the importance of different feature channels and of different regions of the face image. During processing, the generator automatically focuses on key facial regions such as the eyes, nose, and mouth, and it improves the edge clarity and completeness of the sketch, thereby raising the quality of the generated face sketch images. Experiments are conducted on the CUHK and FS2K datasets. The proposed model attains a structural similarity (SSIM), peak signal-to-noise ratio (PSNR), and multi-scale structural similarity (MS-SSIM) of 0.7741, 11.7451, and 0.8504 on CUHK and of 0.7049, 13.2745, and 0.7970 on FS2K, outperforming the compared CycleGAN, Pix2Pix, MUNIT, and DCLGAN models. The comparison experiments and subjective visual results show that the proposed model effectively converts face photos into sketches and generates relatively high-quality face sketch images.
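
For readers interpreting the peak signal-to-noise ratio (PSNR) values quoted in the abstract, the following is a minimal sketch of the standard PSNR definition, PSNR = 10 log10(MAX^2 / MSE). It is not the authors' evaluation code: the function name `psnr`, the nested-list image representation, and the 8-bit peak value of 255 are assumptions made purely for illustration.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Return the PSNR in dB between two equally sized grayscale images.

    Images are given as lists of rows of pixel intensities. The default
    max_value=255.0 assumes 8-bit images (an illustrative assumption, not
    something stated in the paper).
    """
    if len(reference) != len(generated):
        raise ValueError("images must have the same number of rows")

    squared_error = 0.0
    pixel_count = 0
    for ref_row, gen_row in zip(reference, generated):
        if len(ref_row) != len(gen_row):
            raise ValueError("rows must have the same length")
        for ref_px, gen_px in zip(ref_row, gen_row):
            squared_error += (ref_px - gen_px) ** 2
            pixel_count += 1

    if pixel_count == 0:
        raise ValueError("images must contain at least one pixel")

    mse = squared_error / pixel_count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    photo_sketch = [[10, 20], [30, 40]]    # hypothetical reference sketch
    model_output = [[12, 18], [33, 41]]    # hypothetical generated sketch
    print(f"PSNR: {psnr(photo_sketch, model_output):.2f} dB")
```

If the reported figures follow this standard definition in decibels, a PSNR of 11.7451 corresponds to a root-mean-square pixel error of roughly 26% of the peak intensity (about 66 gray levels on an 8-bit scale). Such values are plausible for sketches, where stroke positions rarely align pixel for pixel with a hand-drawn reference, which is one reason structural measures such as SSIM are reported alongside PSNR.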

Key words: CycleGAN; generative adversarial network; attention mechanism; residual network

CLC Number: