Association Analysis of Image Genetic Data Based on Group Sparse Joint Learning

Abstract

Abstract: Imaging genetics has greatly advanced research on psychiatric disorders. Its main task is to analyze and mine multimodal data to uncover disease-related pathogenic mechanisms. Features in such data, however, are usually correlated in groups or across multiple features, so traditional methods struggle to identify correlated disease mechanisms and tend to produce overly sparse solutions. To address this, this paper introduces the l1,2 norm as a regularization term, which yields intra-group sparsity and inter-group smoothness, and combines it with the l2,1 norm, which yields inter-group sparsity and intra-group smoothness, to jointly penalize canonical correlation analysis. By optimizing the correlation between the two data sets, the method performs feature selection on two-modality data at both the level of correlated groups and the level of features within a group. Simulation results show that the method estimates the correlation coefficient between the two data sets fairly accurately while selecting the correlated inter-group and intra-group features. On a real schizophrenia data set, the method identifies more susceptibility genes and risk brain regions associated with schizophrenia.

Key words: l1,2 norm, l2,1 norm, correlation, group sparse canonical correlation analysis, feature selection

ZHAO Ying-li, ZHU Xu. Association Analysis of Image Genetic Data Based on Group Sparse Joint Learning[J]. Computer and Modernization, 2022, 0(08): 43-49.

