
Long-read mapping to repetitive reference sequences using Winnowmap2

Jain, C and Rhie, A and Hansen, NF and Koren, S and Phillippy, AM (2022) Long-read mapping to repetitive reference sequences using Winnowmap2. In: Nature Methods.

Official URL: https://doi.org/10.1038/s41592-022-01457-8


Approximately 5–10% of the human genome remains inaccessible due to the presence of repetitive sequences such as segmental duplications and tandem repeat arrays. We show that existing long-read mappers often yield incorrect alignments and variant calls within long, near-identical repeats, as they remain vulnerable to allelic bias. In the presence of a nonreference allele within a repeat, a read sampled from that region could be mapped to an incorrect repeat copy. To address this limitation, we developed a new long-read mapping method, Winnowmap2, by using minimal confidently alignable substrings. Winnowmap2 computes each read mapping through a collection of confident subalignments. This approach is more tolerant of structural variation and more sensitive to paralog-specific variants within repeats. Our experiments highlight that Winnowmap2 successfully addresses the issue of allelic bias, enabling more accurate downstream variant calls in repetitive sequences. © 2022, The Author(s), under exclusive licence to Springer Nature America, Inc.

Item Type: Journal Article
Publication: Nature Methods
Publisher: Nature Research
Additional Information: The copyright for this article belongs to Nature Research
Keywords: allele; article; human; paralogy
Department/Centre: Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited: 18 May 2022 08:50
Last Modified: 18 May 2022 08:50
URI: https://eprints.iisc.ac.in/id/eprint/71831
