A new method for ranking disease genes based on the total association probability in protein interaction networks

  • Đặng Vũ Tùng, Vietnam Youth Academy
  • Nguyễn Đại Phong
  • Lê Đức Hậu
  • Từ Minh Phương


Prioritizing candidate disease-related genes using computational methods and biological network data is an important problem in bioinformatics. The random walk with restart (RWR) algorithm is widely used for this problem because of its relatively high accuracy. However, RWR is computationally expensive, as it considers every node in the network. Here we propose a new method for prioritizing candidate genes in which genes with a low probability of association with known disease genes are excluded from further consideration, thus reducing computational complexity. Experiments on real protein interaction networks show that the proposed method is computationally efficient and more accurate than RWR, as measured by AUC scores. We applied the proposed method to prioritizing candidate genes for human type 2 diabetes. The results are promising: among the top 20 ranked genes, 11 are associated with diabetes according to the biomedical literature.
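For readers unfamiliar with the baseline, the following is a minimal sketch of standard RWR on a small undirected network. It illustrates only the baseline algorithm, not the method proposed in this paper. The function names, the toy four-gene network, and the restart probability of 0.5 are assumptions chosen for illustration. Note that each iteration visits every node, which is the cost the abstract refers to.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Standard RWR on an undirected, unweighted network (illustrative sketch).

    adjacency: dict mapping each node to the set of its neighbours.
    seeds: known disease genes; the walker restarts uniformly among them.
    Returns a dict mapping each node to its steady-state visiting probability.
    """
    nodes = list(adjacency)
    seeds = set(seeds)
    # Initial (and restart) distribution: uniform over the seed genes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: with probability `restart`, jump back to a seed.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: every node passes its mass evenly to its neighbours.
        for u in nodes:
            degree = len(adjacency[u])
            if degree == 0:
                continue
            share = (1.0 - restart) * p[u] / degree
            for v in adjacency[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


def rank_candidates(scores, seeds):
    """Rank non-seed genes by decreasing RWR score."""
    seeds = set(seeds)
    candidates = [n for n in scores if n not in seeds]
    return sorted(candidates, key=lambda n: scores[n], reverse=True)


# Hypothetical toy network: a chain A - B - C - D with A as the known disease gene.
toy_network = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
}
toy_scores = random_walk_with_restart(toy_network, seeds=["A"])
toy_ranking = rank_candidates(toy_scores, seeds=["A"])
```

In this toy chain, candidates closer to the seed gene receive higher scores, so the ranking follows network proximity to the known disease gene.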

Author Biography

Đặng Vũ Tùng, Vietnam Youth Academy
Department of Informatics



Article