Co-linear chaining on pangenome graphs

Rajput, J and Chandra, G and Jain, C (2024) Co-linear chaining on pangenome graphs. In: Algorithms for Molecular Biology, 19 (1).

Full text: alg_mal_bio_19_1_2024_4.pdf (PDF, Published Version, 2MB)
Restricted to registered users only
Official URL: https://doi.org/10.1186/s13015-024-00250-w

Abstract

Pangenome reference graphs are useful in genomics because they compactly represent the genetic diversity within a species, a capability that linear references lack. However, efficiently aligning sequences to these graphs, which can have complex topology and cycles, is challenging. Seed-chain-extend alignment algorithms use co-linear chaining as a standard technique to identify a good cluster of exact seed matches that can be combined to form an alignment. Recent works show how the co-linear chaining problem can be solved efficiently for acyclic pangenome graphs by exploiting their small width, and how incorporating gap cost in the scoring function improves alignment accuracy. However, how to effectively generalize these techniques to general pangenome graphs, which contain cycles, remains open. Here we present the first practical formulation and an exact algorithm for co-linear chaining on cyclic pangenome graphs. We rigorously prove the correctness of the proposed algorithm and establish its computational complexity. We evaluate the empirical performance of our algorithm by aligning simulated long reads from the human genome to a cyclic pangenome graph constructed from 95 publicly available haplotype-resolved human genome assemblies. While the existing heuristic-based algorithms are faster, the proposed algorithm provides a significant advantage in terms of accuracy. Implementation: https://github.com/at-cg/PanAligner. © 2024, The Author(s).
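
For readers unfamiliar with co-linear chaining, the following is a minimal sketch of the idea on a linear reference, not the graph algorithm of the paper and not PanAligner's API. The anchor layout `(query_start, ref_start, length)`, the function name `chain_anchors`, the `gap_weight` parameter, and the gap cost (the absolute difference between the query gap and the reference gap) are all assumptions chosen for illustration. It uses a simple quadratic-time dynamic program.

```python
from typing import List, Tuple

# Hypothetical anchor layout: (query_start, ref_start, length) of an exact match.
Anchor = Tuple[int, int, int]


def chain_anchors(anchors: List[Anchor], gap_weight: int = 1) -> Tuple[int, List[int]]:
    """Return (best score, indices of the chained anchors in chain order).

    Score = sum of anchor lengths minus gap_weight * gap cost between
    consecutive anchors. The gap cost used here is an assumed one:
    |query gap - reference gap|.
    """
    n = len(anchors)
    if n == 0:
        return 0, []

    # Process anchors by query start so every valid predecessor comes first.
    order = sorted(range(n), key=lambda i: (anchors[i][0], anchors[i][1]))
    best = [0] * n    # best[i]: best score of a chain ending at anchor i
    prev = [-1] * n   # prev[i]: predecessor of anchor i in that chain

    for pos, i in enumerate(order):
        qi, ri, li = anchors[i]
        best[i] = li  # a chain consisting of anchor i alone
        for j in order[:pos]:
            qj, rj, lj = anchors[j]
            # Co-linearity: j must end before i starts on query and reference.
            if qj + lj <= qi and rj + lj <= ri:
                gap = abs((qi - (qj + lj)) - (ri - (rj + lj)))
                candidate = best[j] + li - gap_weight * gap
                if candidate > best[i]:
                    best[i] = candidate
                    prev[i] = j

    # Trace back from the highest-scoring chain end.
    end = max(range(n), key=lambda i: best[i])
    chain = []
    while end != -1:
        chain.append(end)
        end = prev[end]
    chain.reverse()
    return best[chain[-1]], chain


if __name__ == "__main__":
    seeds = [(0, 100, 10), (12, 112, 8), (25, 500, 10), (30, 130, 10)]
    print(chain_anchors(seeds))
```

In this toy input the third seed lies far off the diagonal shared by the others, so the gap penalty keeps it out of the best chain. On a graph, the reference coordinate is replaced by positions along paths, which is where the difficulty addressed by the paper arises.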

Item Type: Journal Article
Publication: Algorithms for Molecular Biology
Publisher: BioMed Central Ltd
Additional Information: The copyright for this article belongs to the author(s).
Department/Centre: Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited: 04 Mar 2024 05:33
Last Modified: 04 Mar 2024 05:33
URI: https://eprints.iisc.ac.in/id/eprint/84126
