PMID: 31230722

Bioinformatics-Based Identification of Expanded Repeats: A Non-reference Intronic Pentamer Expansion in RFC1 Causes CANVAS

Abstract

Genomic technologies such as next-generation sequencing (NGS) are revolutionizing molecular diagnostics and clinical medicine. However, these approaches have proven inefficient at identifying pathogenic repeat expansions. Here, we apply a collection of bioinformatics tools that can be utilized to identify either known or novel expanded repeat sequences in NGS data. We performed genetic studies of a cohort of 35 individuals from 22 families with a clinical diagnosis of cerebellar ataxia with neuropathy and bilateral vestibular areflexia syndrome (CANVAS). Analysis of whole-genome sequence (WGS) data with five independent algorithms identified a recessively inherited intronic repeat expansion [(AAGGG)exp] in the gene encoding Replication Factor C1 (RFC1). This motif, not reported in the reference sequence, localized to an Alu element and replaced the reference (AAAAG)11 short tandem repeat. Genetic analyses confirmed the pathogenic expansion in 18 of 22 CANVAS-affected families and identified a core ancestral haplotype, estimated to have arisen in Europe more than twenty-five thousand years ago. WGS of the four RFC1-negative CANVAS-affected families identified plausible variants in three, with genomic re-diagnosis of SCA3, spastic ataxia of the Charlevoix-Saguenay type, and SCA45. This study identified the genetic basis of CANVAS and demonstrated that these improved bioinformatics tools increase the diagnostic utility of WGS to determine the genetic basis of a heterogeneous group of clinically overlapping neurogenetic disorders.
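To make the idea of screening reads for a non-reference repeat motif concrete, the following is a minimal Python sketch. It is not one of the five algorithms used in the study, which this abstract does not name; it only illustrates counting back-to-back copies of a pentamer in a read. The function names and the toy reads are hypothetical, and the repeat lengths are chosen for illustration rather than taken from real data.

```python
import re


def longest_motif_run(read: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `read`."""
    # Match one or more consecutive copies of the motif (case-insensitive
    # via upper-casing the read; the motif is expected in upper case).
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = (len(m.group(0)) // len(motif) for m in pattern.finditer(read.upper()))
    return max(runs, default=0)


def classify_read(read: str, motifs=("AAGGG", "AAAAG")) -> dict:
    """Report the longest tandem run of each candidate motif in one read."""
    return {motif: longest_motif_run(read, motif) for motif in motifs}


# Toy reads (hypothetical): flanking bases plus a short tandem run.
reference_like = "TTGC" + "AAAAG" * 4 + "CCTA"
expanded_like = "TTGC" + "AAGGG" * 6 + "CCTA"

if __name__ == "__main__":
    print(classify_read(reference_like))
    print(classify_read(expanded_like))
```

A read dominated by the (AAGGG) motif rather than the reference (AAAAG) motif would be flagged for follow-up. Dedicated tools additionally handle sequencing errors, both strands, motif rotations, and reads that lie entirely within the repeat, none of which this sketch attempts.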

Citing Articles

Long-read sequencing revealed complex biallelic pentanucleotide repeat expansions in RFC1-related Parkinson's disease.

Liu P, Zhang F, Chen X, Zheng X, Chen M, Lin Z. NPJ Parkinsons Dis. 2025; 11(1):21.

PMID: 39833204 PMC: 11747075. DOI: 10.1038/s41531-025-00868-6.


Recent Advances in the Genetics of Ataxias: An Update on Novel Autosomal Dominant Repeat Expansions.

Pellerin D, Iruzubieta P, Xu I, Danzi M, Cortese A, Synofzik M. Curr Neurol Neurosci Rep. 2025; 25(1):16.

PMID: 39820740 DOI: 10.1007/s11910-024-01400-8.


Repeat expansions in RFC1 gene in refractory chronic cough.

Hirons B, Cho P, Rhatigan K, Shaw J, Curro R, Rugginini B. ERJ Open Res. 2025; 11(1.

PMID: 39811557 PMC: 11726589. DOI: 10.1183/23120541.00584-2024.


Triplex H-DNA structure: the long and winding road from the discovery to its role in human disease.

Hisey J, Masnovo C, Mirkin S. NAR Mol Med. 2024; 1(4):ugae024.

PMID: 39723156 PMC: 11667243. DOI: 10.1093/narmme/ugae024.


The ZFHX3 GGC Repeat Expansion Underlying Spinocerebellar Ataxia Type 4 has a Common Ancestral Founder.

Chen Z, Jerez P, Anderson C, Paucar M, Lee J, Nilsson D. Mov Disord. 2024; 40(2):363-369.

PMID: 39635987 PMC: 11832790. DOI: 10.1002/mds.30077.

