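A Biclustering Algorithm Based on PA Index in Microarray Gene Expression Data

doi:10.3969/j.issn.1006-2475.2014.12.003

Computer and Modernization

(1. School of Information Engineering, Guangdong Medical College, Dongguan 523808, China;
2. School of Public Health, Guangdong Medical College, Dongguan 523808, China)

Received: 2014-09-24 Online: 2014-12-22 Published: 2014-12-22
About the authors: LIN Qin (b. 1987), male, from Jieyang, Guangdong; assistant experimentalist at the School of Information Engineering, Guangdong Medical College; master's degree; research interests: data mining, parallel computing. LIN Si-da (b. 1992), male, from Shantou, Guangdong; undergraduate at the School of Public Health, Guangdong Medical College; research interests: data mining, bioinformatics. ZHU Wen-min (b. 1992), male, from Heyuan, Guangdong; undergraduate; research interests: cloud computing, data mining.
Supported by:
Guangdong Medical College General Fund Project (XK1330); Guangdong Medical College Undergraduate Research Key Project, Invention Category (2012FZDI004); Guangdong Medical College Undergraduate Research General Project, Social Sciences Category (2013SYDG009)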

Abstract

Existing biclustering algorithms rarely consider the overall partition quality of the biclusters they produce. To address this, a biclustering algorithm based on the PA index is proposed. The algorithm builds its bicluster model on the PA index, a measure of the partition quality of all clusters taken together, and uses a heuristic greedy strategy that iteratively inserts and deletes rows and columns to mine several biclusters of high partition quality. Its performance is compared with that of the CC and FLOC algorithms in an experiment on a real dataset. The results show that the proposed algorithm obtains better results, which suggests that it is better able to detect local patterns that are both statistically and biologically significant.

Key words: PA index, biclustering, GO analysis, microarray gene expression data
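The abstract does not define the PA index, so the following is only a rough sketch of the general idea of greedily deleting and inserting rows and columns. It swaps in the Cheng-Church mean squared residue as the score, which is a different measure from the PA index. The function names, the threshold `delta`, and the toy matrix are illustrative assumptions, not the authors' method.

```python
from typing import List, Sequence, Set, Tuple

Matrix = Sequence[Sequence[float]]


def mean_squared_residue(data: Matrix, rows: Set[int], cols: Set[int]) -> float:
    """Cheng-Church mean squared residue of the submatrix rows x cols.

    Lower is better; 0 means a perfectly additive pattern.
    Stand-in score only: this is NOT the PA index from the abstract.
    """
    n, m = len(rows), len(cols)
    row_mean = {i: sum(data[i][j] for j in cols) / m for i in rows}
    col_mean = {j: sum(data[i][j] for i in rows) / n for j in cols}
    all_mean = sum(row_mean.values()) / n
    total = sum(
        (data[i][j] - row_mean[i] - col_mean[j] + all_mean) ** 2
        for i in rows
        for j in cols
    )
    return total / (n * m)


def greedy_bicluster(
    data: Matrix, delta: float, min_rows: int = 2, min_cols: int = 2
) -> Tuple[List[int], List[int], float]:
    """Hypothetical helper: shrink the full matrix greedily, then grow it back."""
    rows = set(range(len(data)))
    cols = set(range(len(data[0])))
    score = mean_squared_residue(data, rows, cols)

    # Deletion phase: drop the single row or column whose removal gives
    # the lowest score, until the score is at most delta.
    while score > delta:
        trials = []
        if len(rows) > min_rows:
            trials += [(mean_squared_residue(data, rows - {i}, cols), rows - {i}, cols)
                       for i in rows]
        if len(cols) > min_cols:
            trials += [(mean_squared_residue(data, rows, cols - {j}), rows, cols - {j})
                       for j in cols]
        if not trials:  # cannot shrink further
            break
        score, rows, cols = min(trials, key=lambda t: t[0])

    # Insertion phase: re-admit any row or column that keeps the score <= delta.
    for i in range(len(data)):
        if i not in rows:
            s = mean_squared_residue(data, rows | {i}, cols)
            if s <= delta:
                rows, score = rows | {i}, s
    for j in range(len(data[0])):
        if j not in cols:
            s = mean_squared_residue(data, rows, cols | {j})
            if s <= delta:
                cols, score = cols | {j}, s
    return sorted(rows), sorted(cols), score


# Toy matrix: rows 0-2 follow an additive pattern, row 3 does not.
data = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 5, 6],
    [9, 0, 7, 1],
]
rows, cols, score = greedy_bicluster(data, delta=0.1)
print(rows, cols, score)
```

An algorithm built on a global partition-quality index would presumably score a whole set of biclusters at once rather than one submatrix at a time; the per-bicluster score here is used only to keep the sketch short.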

CLC number: TP391

LIN Qin¹, LIN Si-da², ZHU Wen-min¹. A Biclustering Algorithm Based on PA Index in Microarray Gene Expression Data[J]. Computer and Modernization, doi: 10.3969/j.issn.1006-2475.2014.12.003.

