eCOMPASS: evaluative comparison of multiple protein alignments by statistical score
Publisher: Oxford University Press
Abstract:
Motivation: Detecting subtle biologically relevant patterns in protein sequences often requires the construction of a large and accurate multiple sequence alignment (MSA). Methods for constructing MSAs are usually evaluated using benchmark alignments, which, however, typically contain very few sequences and are therefore inappropriate when dealing with large numbers of proteins.
Results: eCOMPASS addresses this problem using a statistical measure of relative alignment quality based on direct coupling analysis (DCA): to maintain protein structural integrity over evolutionary time, substitutions at one residue position typically result in compensating substitutions at other positions. eCOMPASS computes the statistical significance of the congruence between high-scoring directly coupled pairs and 3D contacts in corresponding structures, which depends upon properly aligned homologous residues. We illustrate eCOMPASS using both simulated and real MSAs. © 2021 The Author(s).
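To make the idea of "significance of congruence" concrete, the following is a minimal sketch, not the eCOMPASS statistic itself. It assumes a simplified stand-in: a hypergeometric tail probability for the overlap between the top-ranked DCA pairs and the set of 3D contact pairs. The function names (`hypergeom_tail`, `contact_enrichment`), the dictionary input format, and the `top_n` cutoff are all hypothetical choices made for illustration.

```python
from math import comb


def hypergeom_tail(total_pairs, contact_pairs, top_pairs, overlap):
    """P(X >= overlap) when top_pairs are drawn without replacement
    from total_pairs, of which contact_pairs are 3D contacts."""
    denom = comb(total_pairs, top_pairs)
    upper = min(contact_pairs, top_pairs)
    favorable = sum(
        comb(contact_pairs, k) * comb(total_pairs - contact_pairs, top_pairs - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / denom


def contact_enrichment(dca_scores, contacts, top_n):
    """Illustrative (assumed) enrichment test, not the published method.

    dca_scores: dict mapping a column pair (i, j) to its coupling score.
    contacts:   set of column pairs (i, j) in contact in the structure.
    top_n:      number of highest-scoring pairs to examine.
    Returns (overlap, p_value).
    """
    ranked = sorted(dca_scores, key=dca_scores.get, reverse=True)
    top = ranked[:top_n]
    overlap = sum(1 for pair in top if pair in contacts)
    n_contacts = sum(1 for pair in dca_scores if pair in contacts)
    p_value = hypergeom_tail(len(dca_scores), n_contacts, len(top), overlap)
    return overlap, p_value


# Toy data (made up): four alignment columns, two of the six pairs in contact.
toy_scores = {
    (0, 1): 0.9, (2, 3): 0.8, (0, 2): 0.3,
    (0, 3): 0.2, (1, 2): 0.1, (1, 3): 0.05,
}
toy_contacts = {(0, 1), (2, 3)}
overlap, p_value = contact_enrichment(toy_scores, toy_contacts, top_n=2)
print(overlap, p_value)
```

Under this simplified model, a better alignment of homologous residues should place more true contacts among the top-scoring pairs and so yield a smaller p-value, which is the intuition behind comparing alignments by such a score.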
Sponsors: National Science Foundation
Identifier to cite or link to this item: http://hdl.handle.net/10713/19516