Computer and Modernization, 2025, Vol. 0, Issue (02): 19-27. DOI: 10.3969/j.issn.1006-2475.2025.02.003
Online: 2025-02-28
Published: 2025-02-28
Abstract: In zero-shot image classification, semantic embedding (i.e., describing class labels with semantic attributes) transfers knowledge from seen objects and thereby makes it possible to generate visual features for unseen objects. Existing work typically uses semantic attributes as auxiliary information for describing visual features. However, semantic attributes are usually obtained through external paradigms such as manual annotation, so their consistency with the visual features is weak, and the diversity of visual features generally cannot be captured by a single semantic attribute vector. To increase the diversity of the semantic attributes and strengthen their ability to describe visual features, this paper proposes Semantic extension and embedding for Zero-Shot Learning (SeeZSL). SeeZSL extends the semantic attributes by constructing a latent semantic space for each class and then generates visual features of unseen classes from that semantic space. In addition, to alleviate the weak consistency between the original feature space and the semantic attributes, as well as its lack of discriminative ability, the semantic-extension-based generative model is combined with a contrastive embedding model. Experiments on four benchmark datasets verify the effectiveness of the proposed SeeZSL method.
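Since only the abstract is available in this record, the following minimal PyTorch sketch (not the authors' code) illustrates the pipeline the abstract describes: each class attribute vector is extended into a latent semantic space, visual features are generated from samples of that space, and a contrastive embedding is trained so that features of the same class are pulled together. The module names (SemanticExtender, FeatureGenerator), the Gaussian form of the latent semantic space, the supervised-contrastive loss formulation, and all dimensions are illustrative assumptions.

```python
# Minimal sketch of a SeeZSL-style pipeline (illustrative assumptions only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticExtender(nn.Module):
    """Extends a class attribute vector into a latent semantic space,
    modeled here (as an assumption) as a Gaussian per class."""

    def __init__(self, attr_dim: int, latent_dim: int):
        super().__init__()
        self.mu = nn.Linear(attr_dim, latent_dim)
        self.logvar = nn.Linear(attr_dim, latent_dim)

    def sample(self, attrs: torch.Tensor, n: int) -> torch.Tensor:
        # Draw n extended semantic vectors per class (reparameterization trick).
        mu = self.mu(attrs).repeat_interleave(n, dim=0)
        std = (0.5 * self.logvar(attrs)).exp().repeat_interleave(n, dim=0)
        return mu + std * torch.randn_like(std)


class FeatureGenerator(nn.Module):
    """Generates visual features conditioned on an extended semantic vector."""

    def __init__(self, latent_dim: int, feat_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 1024), nn.ReLU(), nn.Linear(1024, feat_dim)
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)


def contrastive_embedding_loss(emb: torch.Tensor, labels: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Supervised contrastive loss: embeddings sharing a class label are positives,
    all other embeddings in the batch are negatives."""
    emb = F.normalize(emb, dim=1)
    sim = emb @ emb.t() / tau
    self_mask = torch.eye(emb.size(0), dtype=torch.bool, device=emb.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))             # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)  # log-softmax over non-self pairs
    log_prob = log_prob.masked_fill(self_mask, 0.0)             # avoid -inf * 0 below
    per_anchor = -(log_prob * pos_mask.float()).sum(1) / pos_mask.sum(1).clamp(min=1)
    return per_anchor[pos_mask.any(dim=1)].mean()


if __name__ == "__main__":
    # Toy dimensions for illustration: 85-d attributes, 2048-d visual features.
    attr_dim, latent_dim, feat_dim = 85, 64, 2048
    extender = SemanticExtender(attr_dim, latent_dim)
    generator = FeatureGenerator(latent_dim, feat_dim)
    embedder = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU(), nn.Linear(512, 128))

    class_attrs = torch.rand(5, attr_dim)          # one attribute vector per (unseen) class
    z = extender.sample(class_attrs, n=4)          # 4 extended semantic samples per class
    fake_feats = generator(z)                      # synthesized visual features
    labels = torch.arange(5).repeat_interleave(4)  # class label of each synthesized feature
    loss = contrastive_embedding_loss(embedder(fake_feats), labels)
    loss.backward()
    print(float(loss))
```

In a complete SeeZSL-style system the generator would additionally be trained against real seen-class features (e.g., variationally or adversarially), and a classifier would then be fit on features synthesized for the unseen classes; those stages are omitted from this sketch.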
CLC Number:
GUO Chenguang, MAO Jian, WANG Yunyun. Zero-shot Learning Based on Semantic Extension and Embedding[J]. Computer and Modernization, 2025, 0(02): 19-27.