
dc.contributor.author: Lin, Jiadong
dc.contributor.author: Yang, Xiaofei
dc.contributor.author: Kosters, Walter
dc.contributor.author: Xu, Tun
dc.contributor.author: Jia, Yanyan
dc.contributor.author: Wang, Songbo
dc.contributor.author: Zhu, Qihui
dc.contributor.author: Ryan, Mallory
dc.contributor.author: Guo, Li
dc.contributor.author: Zhang, Chengsheng
dc.contributor.author: Lee, Charles
dc.contributor.author: Devine, Scott E
dc.contributor.author: Eichler, Evan E
dc.contributor.author: Ye, Kai
dc.date.accessioned: 2021-07-07T16:52:40Z
dc.date.available: 2021-07-07T16:52:40Z
dc.date.issued: 2021-07-03
dc.identifier.uri: http://hdl.handle.net/10713/16143
dc.description.abstract: Complex structural variants (CSVs) are genomic alterations that have more than two breakpoints and are considered as the simultaneous occurrence of simple structural variants. However, detecting the compounded mutational signals of CSVs is challenging through a commonly used model-match strategy. As a result, there has been limited progress for CSV discovery compared with simple structural variants. We systematically analyzed the multi-breakpoint connection feature of CSVs, and proposed Mako, utilizing a bottom-up guided model-free strategy, to detect CSVs from paired-end short-read sequencing. Specifically, we implemented a graph-based pattern growth approach, where the graph depicts potential breakpoint connections, and pattern growth enables CSV detection without pre-defined models. Comprehensive evaluations on both simulated and real datasets revealed that Mako outperformed other algorithms. Notably, validation rates of CSV on real data based on experimental and computational validations as well as manual inspections are around 70%, where the medians of experimental and computational breakpoint shift are 13 bp and 26 bp, respectively. Moreover, the Mako CSV subgraph effectively characterized the breakpoint connections of a CSV event and uncovered a total of 15 CSV types, including two novel types of adjacent segments swap and tandem dispersed duplication. Further analysis of these CSVs also revealed the impact of sequence homology in the formation of CSVs. Mako is publicly available at https://github.com/xjtu-omics/Mako. [en_US]
dc.description.uri: https://doi.org/10.1016/j.gpb.2021.03.007 [en_US]
dc.language.iso: en [en_US]
dc.publisher: Elsevier B.V. [en_US]
dc.relation.ispartof: Genomics, Proteomics & Bioinformatics [en_US]
dc.rights: Copyright © 2021. Published by Elsevier B.V. [en_US]
dc.subject: Complex structural variants [en_US]
dc.subject: Formation mechanism [en_US]
dc.subject: Graph mining [en_US]
dc.subject: Next-generation sequencing [en_US]
dc.subject: Pattern growth [en_US]
dc.title: Mako: A Graph-based Pattern Growth Approach to Detect Complex Structural Variants [en_US]
dc.type: Article [en_US]
dc.identifier.doi: 10.1016/j.gpb.2021.03.007
dc.identifier.pmid: 34224879
dc.source.country: China