
Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies

Mc Cartney, AM and Shafin, K and Alonge, M and Bzikadze, AV and Formenti, G and Fungtammasan, A and Howe, K and Jain, C and Koren, S and Logsdon, GA and Miga, KH and Mikheenko, A and Paten, B and Shumate, A and Soto, DC and Sovic, I and Wood, JMD and Zook, JM and Phillippy, AM and Rhie, A (2022) Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies. In: Nature Methods.

nat_met_2022.pdf - Published Version

Official URL: https://doi.org/10.1038/s41592-022-01440-3


Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first telomere-to-telomere human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Although derived from highly accurate sequences, evaluation revealed evidence of small errors and structural misassemblies in the initial draft assembly. To correct these errors, we designed a new repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly quality value from 70.2 to 73.9 measured from PacBio high-fidelity and Illumina k-mers. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both high-fidelity and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.

© 2022, This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply.

Item Type: Journal Article
Publication: Nature Methods
Publisher: Nature Research
Additional Information: The copyright for this article belongs to the authors.
Department/Centre: Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited: 18 May 2022 09:11
Last Modified: 18 May 2022 09:11
URI: https://eprints.iisc.ac.in/id/eprint/71841
