DeepSELEX: Inferring DNA-binding preferences from HT-SELEX data using multi-class CNNs

Maor Asif, Yaron Orenstein

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Motivation: Transcription factor (TF) DNA binding is a central mechanism in gene regulation. Biologists would like to know where and when these factors bind DNA, and therefore need accurate DNA-binding models that can predict binding to any DNA sequence. Recent technologies measure the binding of a single TF to thousands of DNA sequences. One of the prevailing techniques, high-throughput SELEX (HT-SELEX), measures protein-DNA binding by high-throughput sequencing over several cycles of enrichment. Unfortunately, current computational methods for inferring binding preferences from HT-SELEX data do not exploit the richness of these data and under-use the most advanced computational technique, deep neural networks.

Results: To better characterize the binding preferences of TFs from these experimental data, we developed DeepSELEX, a new algorithm that infers intrinsic DNA-binding preferences using deep neural networks. DeepSELEX takes advantage of the richness of high-throughput sequencing data and learns DNA-binding preferences by observing how the DNA sequences change across the experimental cycles. For the task of DNA-binding inference from HT-SELEX data, DeepSELEX outperforms extant methods in in vitro binding prediction and is on par with the state of the art in in vivo binding prediction. Analysis of the model parameters reveals that it learns biologically relevant features that shed light on TFs' binding mechanism.
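As a rough illustration of the data framing described above (learning from how sequences change across enrichment cycles), the following sketch one-hot encodes reads and labels each with the cycle it was sequenced in, which is the kind of input a multi-class classifier could consume. The function names, the toy reads, and the handling of ambiguous bases are assumptions made for illustration; this is not the authors' implementation, and the abstract does not specify the network architecture.

```python
# Illustrative sketch only: names and toy data are hypothetical,
# not taken from the DeepSELEX paper or its code.

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T).

    Ambiguous bases such as N become a uniform 0.25 row (an assumption).
    """
    rows = []
    for base in seq.upper():
        if base in BASES:
            rows.append([1.0 if base == b else 0.0 for b in BASES])
        else:
            rows.append([0.25] * 4)
    return rows

def build_dataset(reads_by_cycle):
    """Turn {cycle_index: [reads]} into (encoded_read, cycle_label) pairs."""
    examples = []
    for cycle in sorted(reads_by_cycle):
        for read in reads_by_cycle[cycle]:
            examples.append((one_hot(read), cycle))
    return examples

# Toy reads: cycle 0 stands for the initial pool, later cycles are enriched.
toy = {0: ["ACGTAC", "TTGACA"], 1: ["CACGTG"], 2: ["CACGTG", "CACGTN"]}
dataset = build_dataset(toy)
```

Each element of `dataset` pairs an encoded read with its cycle index, so a classifier trained on it would have one output class per cycle.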

Original language: English
Pages (from-to): I634-I642
Journal: Bioinformatics
Volume: 36
State: Published - 1 Dec 2020
