PMID: 35134925

An Improved Ovine Reference Genome Assembly to Facilitate In-depth Functional Annotation of the Sheep Genome

Abstract

Background: The domestic sheep (Ovis aries) is an important agricultural species raised for meat, wool, and milk across the world. A high-quality reference genome for this species enhances the ability to discover genetic mechanisms influencing biological traits. Furthermore, a high-quality reference genome allows for precise functional annotation of gene regulatory elements. The rapid advances in genome assembly algorithms and emergence of sequencing technologies with increasingly long reads provide the opportunity for an improved de novo assembly of the sheep reference genome.

Findings: Short-read Illumina (55× coverage), long-read Pacific Biosciences (75× coverage), and Hi-C data from the same Rambouillet ewe used for Oar_rambouillet_v1.0 were retrieved from public databases, combined with an additional 50× coverage of Oxford Nanopore data, and assembled with Canu v1.9. The assembled contigs were scaffolded using the Hi-C data with Salsa v2.2, gaps were filled with PBsuite v15.8.24, and the assembly was polished with Nanopolish v0.12.5. After duplicate contig removal with PurgeDups v1.0.1, chromosomes were oriented and polished with 2 rounds of a pipeline that used freebayes v1.3.1 to call variants, Merfin to validate them, and BCFtools to generate the consensus FASTA. The ARS-UI_Ramb_v2.0 assembly is 2.63 Gb in length and has improved continuity (contig NG50 of 43.18 Mb), with 19- and 38-fold decreases in the number of scaffolds compared with Oar_rambouillet_v1.0 and Oar_v4.0, respectively. ARS-UI_Ramb_v2.0 also has greater per-base accuracy and fewer insertions and deletions identified from mapped RNA sequencing data than previous assemblies.
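For readers unfamiliar with the continuity statistic quoted above: contig NG50 is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of an estimated genome size (N50 uses the total assembly length instead). The following is a minimal Python sketch with toy values. The function names and the example lengths are illustrative assumptions and are not taken from the paper or from the authors' evaluation code.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of a set of contig lengths.

    NG50 is the length of the contig at which the running total of
    contig lengths (longest first) reaches half of `genome_size`.
    Returns None if the contigs together cover less than half of it.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    half = genome_size / 2
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total >= half:
            return length
    return None


def n50(contig_lengths):
    """N50 is NG50 computed against the total assembly length."""
    return ng50(contig_lengths, sum(contig_lengths))


# Toy contig lengths (arbitrary units), purely illustrative.
toy_lengths = [50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)          # half of 150 is 75: 50 + 40 reaches it
toy_ng50 = ng50(toy_lengths, 200)   # half of 200 is 100: 50 + 40 + 30 reaches it
```

Because NG50 is measured against a fixed genome size rather than each assembly's own total length, it is suited to comparing assemblies of the same species that differ in total size.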

Conclusions: The ARS-UI_Ramb_v2.0 assembly is a substantial improvement in contiguity that will optimize the functional annotation of the sheep genome and facilitate improved mapping accuracy of genetic variant and expression data for traits in sheep.

Citing Articles

Advancing the Indian cattle pangenome: characterizing non-reference sequences in Bos indicus.

Azam S, Sahu A, Pandey N, Neupane M, Van Tassell C, Rosen B. J Anim Sci Biotechnol. 2025; 16(1):21.

PMID: 39915889 PMC: 11804092. DOI: 10.1186/s40104-024-01133-1.


Identifying Genetic Predisposition to Dozer Lamb Syndrome: A Semi-Lethal Muscle Weakness Disease in Sheep.

Stegemiller M, Highland M, Ewert K, Neaton H, Biller D, Murdoch B. Genes (Basel). 2025; 16(1).

PMID: 39858630 PMC: 11764822. DOI: 10.3390/genes16010083.


Telomere-to-telomere sheep genome assembly identifies variants associated with wool fineness.

Luo L, Wu H, Zhao L, Zhang Y, Huang J, Liu Q. Nat Genet. 2025; 57(1):218-230.

PMID: 39779954 DOI: 10.1038/s41588-024-02037-6.


Gene expression profiles in specific skeletal muscles and meat quality characteristics of sheep and goats.

Leng D, Huang Z, Bai X, Wang T, Zhang Y, Chang W. Sci Data. 2024; 11(1):1390.

PMID: 39695159 PMC: 11655546. DOI: 10.1038/s41597-024-04260-6.


Telomere-to-telomere assemblies of cattle and sheep Y-chromosomes uncover divergent structure and gene content.

Olagunju T, Rosen B, Neibergs H, Becker G, Davenport K, Elsik C. Nat Commun. 2024; 15(1):8277.

PMID: 39333471 PMC: 11436988. DOI: 10.1038/s41467-024-52384-5.


References

1. Davenport K, Bickhart D, Worley K, Murali S, Salavati M, Clark E. An improved ovine reference genome assembly to facilitate in-depth functional annotation of the sheep genome. Gigascience. 2022; 11. PMC: 8848310. DOI: 10.1093/gigascience/giab096.

2. Ghurye J, Pop M, Koren S, Bickhart D, Chin C. Scaffolding of long read assemblies using long range contact information. BMC Genomics. 2017; 18(1):527. PMC: 5508778. DOI: 10.1186/s12864-017-3879-z.

3. Yardimci G, Noble W. Software tools for visualizing Hi-C data. Genome Biol. 2017; 18(1):26. PMC: 5290626. DOI: 10.1186/s13059-017-1161-y.

4. Bray N, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol. 2016; 34(5):525-7. DOI: 10.1038/nbt.3519.

5. Jiang Y, Xie M, Chen W, Talbot R, Maddox J, Faraut T. The sheep genome illuminates biology of the rumen and lipid metabolism. Science. 2014; 344(6188):1168-1173. PMC: 4157056. DOI: 10.1126/science.1252806.