Computer and Modernization ›› 2016, Vol. 0 ›› Issue (1): 20-25. doi: 10.3969/j.issn.1006-2475.2016.01.005

Prediction of DNA-protein Binding Sites Based on Combining Sequence with Structure Information

  

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
  • Received: 2015-10-19   Online: 2016-01-22   Published: 2016-01-26

Abstract: Most studies of DNA-protein binding site prediction, in China and abroad, rely on either protein sequence information or protein structure information alone, and combining the two has so far given poor results. To address this, we combine protein structure features (accessible surface area, relative solvent accessibility, depth index, and protrusion index) with a protein sequence feature, the position-specific scoring matrix, to predict DNA-protein binding sites. We then apply undersampling to correct the class imbalance in the training dataset, and finally use a support vector machine for prediction. Experimental results show that the proposed method achieves better prediction performance.

Key words: position-specific scoring matrix, accessible surface area, relative solvent accessibility, depth index and protrusion index, undersampling, support vector machine
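
The feature combination and undersampling steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one feature vector per residue, built from a 20-column PSSM row plus the four structural values named in the abstract, and it assumes plain random undersampling of the majority class. The function names and the toy data are hypothetical.

```python
import random


def combine_features(pssm_row, structure_row):
    """Concatenate sequence-derived and structure-derived features for one residue."""
    return list(pssm_row) + list(structure_row)


def undersample(features, labels, seed=0):
    """Randomly drop majority-class samples until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority sample and an equally sized random subset of the majority.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [features[i] for i in kept], [labels[i] for i in kept]


# Toy data (made up): 10 residues, 2 binding (label 1) and 8 non-binding (label 0).
rng = random.Random(42)
pssm = [[rng.randint(-5, 5) for _ in range(20)] for _ in range(10)]
# Four structural values per residue: ASA, RSA, depth index, protrusion index.
structure = [[rng.random() for _ in range(4)] for _ in range(10)]
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

X = [combine_features(p, s) for p, s in zip(pssm, structure)]
X_bal, y_bal = undersample(X, labels)
print(len(X[0]), len(X_bal), sum(y_bal))  # prints: 24 4 2
```

The balanced set (X_bal, y_bal) would then be used to train a support vector machine; the choice of kernel and parameters is not stated in the abstract.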
