Computer and Modernization ›› 2025, Vol. 0 ›› Issue (02): 19-27. doi: 10.3969/j.issn.1006-2475.2025.02.003

• Artificial Intelligence •

Zero-shot Learning Based on Semantic Extension and Embedding

  1. (1. School of Computer Science, School of Software, and School of Cyberspace Security, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China)
  • Published: 2025-02-28 Online: 2025-02-28
  • Funding:
    National Natural Science Foundation of China (61876091)

Abstract: In zero-shot image classification, semantic embedding (i.e., describing class labels with semantic attributes) transfers knowledge from known objects and thereby makes it possible to generate visual features for unknown objects. Current research often uses class semantic attributes as auxiliary information for describing visual features. However, semantic attributes are typically obtained through external paradigms such as manual annotation, so their consistency with visual features is weak, and a single semantic attribute usually cannot describe the diversity of visual features. To increase the diversity of semantic attributes and strengthen their ability to describe visual features, this paper proposes Semantic Extension and Embedding for Zero-Shot Learning (SeeZSL). SeeZSL extends the semantic attributes by constructing a latent semantic space for each class and then generates visual features for unknown classes from this semantic space. In addition, to mitigate the weak consistency between the original feature space and the semantic attributes, as well as the lack of discriminative ability, the semantic-extension-based generative model is combined with a contrastive embedding model. Experiments on four benchmark datasets validate the effectiveness of the proposed SeeZSL method.

Key words: zero-shot learning, semantic extension, visual-semantic mapping, contrastive learning

CLC number:
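
The pipeline described in the abstract (extend each class's semantic attributes into a latent semantic space, generate visual features from that space, and train with a contrastive objective) can be illustrated with a toy sketch. The abstract gives no equations, so everything concrete here is an assumption made for illustration and is not the SeeZSL implementation: the latent space is modeled as an isotropic Gaussian around the annotated attribute vector, the generator is a fixed random linear map with ReLU, the loss is a supervised InfoNCE-style loss, and the names `extend_semantics`, `generate_features`, and `contrastive_loss` are hypothetical.

```python
import numpy as np


def extend_semantics(attr, n_samples, sigma, rng):
    """Sample n_samples perturbed copies of one class attribute vector.

    Assumption: the per-class latent semantic space is an isotropic
    Gaussian centred on the annotated attribute vector.
    """
    noise = rng.normal(0.0, sigma, size=(n_samples, attr.shape[0]))
    return attr[None, :] + noise


def generate_features(semantics, W):
    """Hypothetical generator: a fixed linear map followed by ReLU."""
    return np.maximum(semantics @ W, 0.0)


def contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised InfoNCE-style loss; same-class pairs are positives.

    Assumes every class has at least two samples in the batch.
    """
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True) + 1e-12
    z = embeddings / norms
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # exclude self-pairs from the softmax
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    positives = labels[:, None] == labels[None, :]
    np.fill_diagonal(positives, False)
    # Mean log-probability of the positives for each anchor.
    per_anchor = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = per_anchor / positives.sum(axis=1)
    return float(-per_anchor.mean())


# Toy usage: two unseen classes with 3-dimensional attribute vectors.
rng = np.random.default_rng(0)
attrs = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
W = rng.normal(size=(3, 8))  # random stand-in for a trained generator
sem = np.concatenate([extend_semantics(a, 5, 0.05, rng) for a in attrs])
labels = np.repeat(np.arange(2), 5)
feats = generate_features(sem, W)
loss = contrastive_loss(feats, labels)
```

In this sketch each class contributes five extended semantic vectors instead of one, so the generator sees a small neighbourhood of descriptions per class; the contrastive term then rewards features of the same class for lying closer together than features of different classes. A real system would learn both the latent space and the generator from seen-class data.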