TY - JOUR
T1 - LRRpredictor—a new LRR motif detection method for irregular motifs of plant NLR proteins using an ensemble of classifiers
AU - Martin, Eliza C.
AU - Sukarta, Octavina C.A.
AU - Spiridon, Laurentiu
AU - Grigore, Laurentiu G.
AU - Constantinescu, Vlad
AU - Tacutu, Robi
AU - Goverse, Aska
AU - Petrescu, Andrei Jose
N1 - Publisher Copyright:
© 2020 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2020/3/1
Y1 - 2020/3/1
AB - Leucine-rich repeats (LRRs) belong to an archaic prokaryotic protein architecture that is widely involved in protein–protein interactions. In eukaryotes, LRR domains developed into key recognition modules in many innate immune receptor classes. Due to the high sequence variability imposed by recognition specificity, precise repeat delineation is often difficult, especially in plant NOD-like receptors (NLRs), which are notorious for showing far larger irregularities. To address this problem, we introduce here LRRpredictor, a method based on an ensemble of estimators designed to better identify LRR motifs in general but particularly adapted for handling more irregular LRR environments, thus helping to compensate for the scarcity of structural data on NLR proteins. The extrapolation capacity, tested on a set of annotated LRR domains from six immune receptor classes, shows the ability of LRRpredictor to recover all previously defined specific motif consensuses and to extend the LRR motif coverage over annotated LRR domains. This analysis confirms the increased variability of LRR motifs in plant and vertebrate NLRs when compared to extracellular receptors, consistent with previous studies. Hence, LRRpredictor is able to provide novel insights into the diversification of LRR domains and robust support for structure-informed analyses of LRRs in immune receptor functioning.
KW - LRR motif
KW - LRR structure
KW - Leucine-rich repeat prediction
KW - NOD-like receptors
KW - R proteins
KW - Supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85081256857&partnerID=8YFLogxK
U2 - 10.3390/genes11030286
DO - 10.3390/genes11030286
M3 - Article
C2 - 32182725
AN - SCOPUS:85081256857
SN - 2073-4425
VL - 11
JO - Genes
JF - Genes
IS - 3
M1 - 286
ER -