• Atlas of Transcription Factor Binding Sites from ENCODE DNase Hypersensitivity Data across 27 Tissue Types

      Funk, Cory C.; Casella, Alex M.; Jung, Segun; Richards, Matthew A.; Rodriguez, Alex; Shannon, Paul; Donovan-Maiye, Rory; Heavner, Ben; Chard, Kyle; Xiao, Yukai; et al. (Elsevier B.V., 2020-08-18)
      DNase-seq footprinting provides a means to predict genome-wide binding sites for hundreds of transcription factors (TFs) simultaneously. Funk et al. analyze data from the ENCODE consortium to create a resource of footprints in 27 human tissues, demonstrating associations of tissue-specific TF occupancy with gene regulation and disease risk. © 2020 The Authors
      Characterizing the tissue-specific binding sites of transcription factors (TFs) is essential to reconstruct gene regulatory networks and predict functions for non-coding genetic variation. DNase-seq footprinting enables the prediction of genome-wide binding sites for hundreds of TFs simultaneously. Despite the public availability of high-quality DNase-seq data from hundreds of samples, a comprehensive, up-to-date resource for the locations of genomic footprints is lacking. Here, we develop a scalable footprinting workflow using two state-of-the-art algorithms: Wellington and HINT. We apply our workflow to detect footprints in 192 ENCODE DNase-seq experiments and predict the genomic occupancy of 1,515 human TFs in 27 human tissues. We validate that these footprints overlap true-positive TF binding sites from ChIP-seq. We demonstrate that the locations, depth, and tissue specificity of footprints predict effects of genetic variants on gene expression and capture a substantial proportion of genetic risk for complex traits.