DRPPP: A machine learning based tool for prediction of disease resistance proteins in plants

Tarun Pal, Varun Jaiswal, Rajinder S. Chauhan

Research output: Contribution to journal › Article › peer-review

61 Scopus citations

Abstract

Plant disease outbreaks are increasing rapidly around the globe and are a major cause of crop loss worldwide. Plants, in turn, have developed diverse defense mechanisms to identify and evade different pathogenic microorganisms. Early identification of plant disease resistance genes (R genes) can be exploited for crop improvement programs. Present prediction methods rely either on sequence similarity and domain-based approaches or on electronically annotated sequences, and so might miss existing unrecognized proteins or low-similarity proteins. Therefore, there is an urgent need to devise a novel machine learning technique to address this problem. In the current study, an SVM-based tool was developed for prediction of disease resistance proteins in plants. All known disease resistance (R) proteins (112) were taken as the positive set, whereas the manually curated negative dataset consisted of 119 non-R proteins. Feature extraction generated 10,270 features using 16 different methods. Ten-fold cross-validation was performed to optimize the SVM parameters using a radial basis function kernel. The model was derived using libSVM and achieved an overall accuracy of 91.11% on the test dataset. The tool was found to be robust and can be used for high-throughput datasets. The current study provides instant identification of R proteins using a machine learning approach, in addition to the similarity or domain prediction methods.
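The two ingredients named in the abstract, a radial basis function kernel and ten-fold cross-validation, can be illustrated with a minimal Python sketch. This is not the DRPPP implementation (the authors used libSVM). The function names `rbf_kernel` and `stratified_folds`, the stratified round-robin splitting scheme, and the random seed are assumptions made for illustration. Only the class sizes (112 positive and 119 negative proteins) come from the abstract.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with near-equal class ratios.

    Hypothetical helper: the abstract does not say how folds were formed.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, label in enumerate(labels) if label == cls]
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets a near-equal share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Class sizes from the abstract: 112 positive (+1) and 119 negative (-1).
labels = [1] * 112 + [-1] * 119
folds = stratified_folds(labels, k=10)
fold_sizes = [len(fold) for fold in folds]
```

In a parameter search, each fold would serve once as the held-out set while an SVM is trained on the other nine, and the kernel width `gamma` (together with the SVM penalty parameter) would be chosen to maximize the average held-out accuracy.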

Original language: English
Pages (from-to): 42-48
Number of pages: 7
Journal: Computers in Biology and Medicine
Volume: 78
State: Published - 1 Nov 2016
Externally published: Yes

Keywords

  • Domain class
  • Nucleotide binding site-leucine rich repeat (NBS-LRR)
  • Receptor-like kinases (RLK)
  • Resistance proteins
  • SVM

ASJC Scopus subject areas

  • Health Informatics
  • Computer Science Applications
