Description
Pathogenic variants underlying Mendelian diseases often disrupt the normal physiology of a few tissues and organs. However, variant effect prediction tools that aim to identify pathogenic variants are typically oblivious to tissue contexts. Here we report a machine-learning framework, denoted ‘Tissue Risk Assessment of Causality by Expression for variants’ (TRACEvar, https://netbio.bgu.ac.il/TRACEvar/), that offers two advancements. First, TRACEvar predicts pathogenic variants that disrupt the normal physiology of specific tissues. This was achieved by creating 14 tissue-specific models that were trained on over 14,000 variants and combined 85 attributes of genetic variants with 495 attributes derived from tissue omics. TRACEvar outperformed 10 well-established and tissue-oblivious variant effect prediction tools. Second, the resulting models are interpretable, thereby illuminating variants' mode-of-action. Application of TRACEvar to variants of 52 rare-disease patients highlighted pathogenicity mechanisms and relevant disease processes. Lastly, interpretation of large-scale models revealed that top-ranking determinants of pathogenicity included attributes of disease-affected tissues, particularly cellular process activities. Hence, tissue contexts and interpretable machine-learning models can greatly enhance our understanding of the etiology of rare diseases.
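
The feature layout described above (85 variant attributes joined with 495 tissue-omics attributes, with one model per tissue) can be sketched as follows. This is a minimal illustration and not the TRACEvar implementation: the record layout, the key names (`gene`, `attrs`, `labels`), the `gene_omics` lookup, and the assumption that omics attributes are stored per gene while labels are stored per tissue are all hypothetical. Only the two attribute counts come from the description.

```python
from collections import defaultdict

N_VARIANT_ATTRS = 85   # count stated in the description
N_OMICS_ATTRS = 495    # count stated in the description


def build_feature_row(variant_attrs, omics_attrs):
    """Concatenate variant-level and omics-derived attributes into one row."""
    if len(variant_attrs) != N_VARIANT_ATTRS:
        raise ValueError(f"expected {N_VARIANT_ATTRS} variant attributes")
    if len(omics_attrs) != N_OMICS_ATTRS:
        raise ValueError(f"expected {N_OMICS_ATTRS} omics attributes")
    return list(variant_attrs) + list(omics_attrs)


def build_tissue_datasets(variants, gene_omics):
    """Group feature rows and labels by tissue, one dataset per tissue model.

    Hypothetical inputs (assumptions, not the published data format):
      variants   -- iterable of dicts with keys 'gene', 'attrs' (85 values)
                    and 'labels' (tissue name -> 0/1 pathogenicity label)
      gene_omics -- dict mapping gene -> list of 495 omics-derived values
    """
    datasets = defaultdict(lambda: {"X": [], "y": []})
    for variant in variants:
        omics = gene_omics.get(variant["gene"])
        if omics is None:
            continue  # skip variants whose gene has no omics attributes
        row = build_feature_row(variant["attrs"], omics)
        for tissue, label in variant["labels"].items():
            datasets[tissue]["X"].append(row)
            datasets[tissue]["y"].append(label)
    return dict(datasets)
```

Each resulting `X`/`y` pair could then be passed to any classifier of choice to obtain one model per tissue. The sketch deliberately leaves the learning algorithm open, since the description does not name one.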
Date made available | 21 Jul 2024 |
---|---|
Publisher | ZENODO |