
Comprehensive Detection of Structural Variation and Transposable Element Differences Between Wild Type Laboratory Lineages of C. elegans

Overview
Journal bioRxiv
Date 2023 Nov 14
PMID 37961628
Abstract

Genomic structural variations (SVs) and transposable elements (TEs) can be significant contributors to genome evolution, altered gene expression, and risk of genetic diseases. Recent advancements in long-read sequencing have greatly improved the quality of genome assemblies and enhanced the detection of sequence variants at the scale of hundreds or thousands of bases. Comparisons between two diverged wild isolates of C. elegans, the Bristol and Hawaiian strains, have been widely utilized in the analysis of small genetic variations. Genetic drift, including SVs and rearrangements of repeated sequences such as TEs, can occur over time from long-term maintenance of wild type isolates within the laboratory. To comprehensively detect both large and small structural variations as well as TEs due to genetic drift, we generated genome assemblies and annotations for each strain from our lab collection using both long- and short-read sequencing and compared our assemblies and annotations with those of other lab wild type strains. Within our lab assemblies, we annotate over 3.1 Mb of sequence divergence between the Bristol and Hawaiian isolates: 337,584 SNPs, 94,503 small insertion-deletions (<50 bp), and 4,334 structural variations (>50 bp). Further, we define the location and movement of specific DNA TEs between N2 Bristol and CB4856 Hawaiian wild type isolates. Specifically, we find the N2 Bristol genome has 20.6% more TEs from the family than the CB4856 Hawaiian genome. Moreover, we identified Zator elements as the most abundant and mobile TE family in the genome. Using specific TE sequences with unique SNPs, we also identify 38 TEs that moved intrachromosomally and 9 TEs that moved interchromosomally between the N2 Bristol and CB4856 Hawaiian genomes.
By comparing the genome assembly of our lab collection Bristol isolate to the VC2010 Bristol assembly, we also reveal that lab lineages display over 2 Mb of total variation: 1,162 SNPs, 1,528 indels, and 897 SVs, with 95% of the variation due to SVs. Overall, our work demonstrates the unique contribution of SVs and TEs to variation and genetic drift between wild type laboratory strains that are assumed to be isogenic despite growing evidence of genetic drift and phenotypic variation.
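The size classes used in the abstract (SNPs, small insertion-deletions under 50 bp, and structural variations over 50 bp) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `classify_variant` and `summarize` are hypothetical, variants are assumed to be given as simple reference/alternate allele strings, a difference of exactly 50 bp is assigned to the SV class by assumption, and rearrangements such as inversions or translocations, which do not change allele length, are not handled.

```python
from collections import Counter

# Threshold from the abstract; treating exactly 50 bp as an SV is an assumption.
SV_MIN_BP = 50


def classify_variant(ref: str, alt: str) -> str:
    """Label a variant 'SNP', 'indel', 'SV', or 'other' from its two alleles."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP" if ref != alt else "other"
    size = abs(len(ref) - len(alt))
    if size == 0:
        # Same-length substitutions are not discussed in the abstract.
        return "other"
    return "indel" if size < SV_MIN_BP else "SV"


def summarize(variants):
    """Return (count per class, affected bases per class) for (ref, alt) pairs."""
    counts, bases = Counter(), Counter()
    for ref, alt in variants:
        label = classify_variant(ref, alt)
        counts[label] += 1
        # A SNP affects one base; length-changing variants affect the size difference.
        bases[label] += max(1, abs(len(ref) - len(alt)))
    return counts, bases


if __name__ == "__main__":
    # Toy input, not data from the study.
    toy = [("A", "G"), ("AT", "A"), ("A", "A" + "C" * 120)]
    counts, bases = summarize(toy)
    total = sum(bases.values())
    print(dict(counts))
    print(f"fraction of varied bases in SVs: {bases['SV'] / total:.2f}")
```

Tallying affected bases per class, rather than variant counts alone, shows how a modest number of SVs can account for most of the total varied sequence, as the abstract reports for the two Bristol lineages.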
