Computer and Modernization (计算机与现代化) ›› 2025, Vol. 0 ›› Issue (07): 21-27. doi: 10.3969/j.issn.1006-2475.2025.07.004

• Image Processing •

Application of Multimodal Large Language Models in Diagnosis of Pigmented Skin Lesions

  1. (School of Medical Information Engineering, Anhui University of Chinese Medicine, Hefei 230012, China)
  • Online: 2025-07-22 Published: 2025-07-22
  • About the authors: Sun Kaijie (孙凯杰, born 1999), male, from Hefei, Anhui; master's student; research interest: multimodal large language models; E-mail: 2022215215003@stu.ahtcm.edu.cn. Corresponding author: Hu Jili (胡继礼, born 1980), male, from Huainan, Anhui; associate professor, master's degree; research interests: large language models and traditional Chinese medicine informatics; E-mail: hujili@ahtcm.edu.cn.
  • Supported by:
     Key Scientific Research Project of Anhui Provincial Universities (2024AH050917); Anhui Provincial Teaching Research Project (2021jyxm0822)

Abstract: Accurate diagnosis of pigmented skin lesions is a complex and challenging task. In contemporary medical practice, intelligent automated diagnostic tools can significantly improve the accuracy of diagnosis and treatment. This paper proposes SkinCPM-V, a multimodal large language model designed to address the diagnostic challenges posed by texture, hair, and vascular structures in dermoscopic images. SkinCPM-V is built on MiniCPM-V and customized for the characteristics of skin lesions. It is trained on publicly available dermatological data from Kaggle, with the LoRA technique used for parameter-efficient fine-tuning. In evaluation, SkinCPM-V achieves BLEU-4, ROUGE-1, ROUGE-2, and ROUGE-L scores of 0.8880, 0.9380, 0.9104, and 0.9349, respectively, indicating that the generated text closely matches the reference answers. On the diagnostic task itself, the model reaches an F1 score of 0.9067, a precision of 0.9028, and a recall of 0.9444. Compared with other multimodal large language models, SkinCPM-V achieves better results on all evaluation metrics, which demonstrates its strength in generating high-quality textual descriptions and suggests potential for use in clinical settings. These results support the potential of SkinCPM-V for diagnosing pigmented skin lesions and offer a new direction for applying multimodal large language models in medicine.
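The abstract names LoRA as the fine-tuning technique but gives no configuration. As a rough illustration of the general idea only (not the authors' setup), the sketch below shows the standard LoRA reparameterization, in which a frozen weight W0 is adapted as W0 + (alpha / r) · B · A with B initialised to zero. All sizes, the rank `r`, and the scaling factor `alpha` are hypothetical values chosen for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w0, a, b, alpha):
    """Return W0 + (alpha / r) * B @ A, the weight a LoRA-adapted layer applies.

    w0: frozen pretrained weight, shape (d_out, d_in)
    b:  trainable matrix, shape (d_out, r)
    a:  trainable matrix, shape (r, d_in)
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w0, delta)]


def trainable_params(d_out, d_in, r):
    """Parameter counts for one layer: (full fine-tuning, LoRA with rank r)."""
    return d_out * d_in, r * (d_out + d_in)


# Hypothetical sizes for illustration only; none of these come from the paper.
d_out, d_in, r, alpha = 6, 4, 2, 4
rng = random.Random(0)
w0 = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
b = [[0.0] * r for _ in range(d_out)]  # zero init, so adaptation starts exactly at W0

w_start = lora_effective_weight(w0, a, b, alpha)
full, lora = trainable_params(4096, 4096, 8)
print(f"full fine-tuning: {full} params, LoRA (r=8): {lora} params")
```

Only `a` and `b` would be updated during training, which is why the number of trainable parameters per layer drops from `d_out * d_in` to `r * (d_out + d_in)`.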

Key words: pigmented skin lesions; automated diagnosis; multimodal large language model; dermoscopic diagnosis; parameter-efficient fine-tuning; model evaluation
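For readers unfamiliar with the text-overlap metrics used in the model evaluation, the following is a minimal sketch of ROUGE-N and ROUGE-L computed as F-measures over whitespace tokens. It is a simplification under stated assumptions: the paper's actual tokenization and scoring toolkit are not given here, BLEU-4 is not covered, and the two sentences are invented examples rather than model outputs.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f_measure(overlap, cand_total, ref_total):
    """Harmonic mean of precision (overlap/cand) and recall (overlap/ref)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    p = overlap / cand_total
    r = overlap / ref_total
    return 2 * p * r / (p + r)


def rouge_n(candidate, reference, n):
    """ROUGE-N F-measure based on clipped n-gram matches."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # Counter intersection clips counts
    return f_measure(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(x, y):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        cur = [0]
        for j, yj in enumerate(y, start=1):
            cur.append(prev[j - 1] + 1 if xi == yj else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F-measure based on the longest common subsequence."""
    return f_measure(lcs_length(candidate, reference), len(candidate), len(reference))


# Invented example sentences, not taken from the paper's data.
reference = "the lesion shows an irregular pigment network".split()
candidate = "the lesion shows irregular pigment network".split()

r1 = rouge_n(candidate, reference, 1)
r2 = rouge_n(candidate, reference, 2)
rl = rouge_l(candidate, reference)
print(f"ROUGE-1={r1:.4f} ROUGE-2={r2:.4f} ROUGE-L={rl:.4f}")
```

Dropping a single word from the reference costs more under ROUGE-2 than under ROUGE-1 or ROUGE-L, because two bigrams are broken while the unigrams and their order are otherwise preserved.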

CLC number:
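The F1 score reported in the abstract (0.9067) is lower than the harmonic mean of the reported precision and recall (about 0.923). One way this can happen is when the three metrics are averaged per class (macro averaging), because the mean of per-class F1 scores need not equal the harmonic mean of the averaged precision and recall. The abstract does not state which averaging scheme was used, so the sketch below is only an illustration of that effect; the class names and predictions are made up.

```python
def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def per_class_scores(y_true, y_pred, label):
    """Precision, recall and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return precision, recall, f1


def macro_scores(y_true, y_pred):
    """Unweighted mean of per-class precision, recall and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = [per_class_scores(y_true, y_pred, c) for c in labels]
    n = len(labels)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


# Hypothetical labels and predictions, for illustration only.
y_true = ["nevus"] * 6 + ["melanoma"] * 2
y_pred = ["nevus"] * 5 + ["melanoma"] + ["melanoma"] * 2

macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
harmonic = 2 * macro_p * macro_r / (macro_p + macro_r)
print(f"macro P={macro_p:.4f} R={macro_r:.4f} F1={macro_f1:.4f}")
print(f"harmonic mean of macro P and R={harmonic:.4f}")
```

In this toy case the macro F1 falls below the harmonic mean of macro precision and macro recall, the same direction as the gap between the reported figures.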