BRAKER2: Automatic Eukaryotic Genome Annotation with GeneMark-EP+ and AUGUSTUS Supported by a Protein Database

Overview
Specialty Biology
Date 2021 Feb 12
PMID 33575650
Citations 685
Abstract

The task of eukaryotic genome annotation remains challenging. Only a few genomes can serve as annotation standards, and these were achieved through a tremendous investment of human curation effort. Even in the best-annotated genomes, the correctness of all alternative isoforms remains a subject for further investigation. The new BRAKER2 pipeline generates and integrates external protein support into the iterative process of training and gene prediction by GeneMark-EP+ and AUGUSTUS. BRAKER2 continues the line started by BRAKER1, where self-training GeneMark-ET and AUGUSTUS made gene predictions supported by transcriptomic data. Among the challenges addressed by the new pipeline was the generation of reliable hints to protein-coding exon boundaries from likely homologous but evolutionarily distant proteins. In comparison with other pipelines for eukaryotic genome annotation, BRAKER2 is fully automatic. Under equal conditions, it compares favorably with other pipelines, e.g. MAKER2, in terms of accuracy and performance. Development of BRAKER2 should facilitate the harmonization of protein-coding gene annotation across genomes of different eukaryotic species. However, we fully understand that several more innovations are needed in transcriptomic and proteomic technologies, as well as in algorithmic development, to reach the goal of highly accurate annotation of eukaryotic genomes.

Citing Articles

Chromosome-level genome assembly of the clam, Xishi tongue Coelomactra antiquata.

Shen Y, Wang Y, Kong L Sci Data. 2025; 12(1):422.

PMID: 40069159 PMC: 11897284. DOI: 10.1038/s41597-025-04734-1.


IMA GENOME - F20 A draft genome assembly of , , , , and genomic resources for and .

D'Angelo D, Sorrentino R, Nkomo T, Zhou X, Vaghefi N, Sonnekus B IMA Fungus. 2025; 16:e141732.

PMID: 40052082 PMC: 11882029. DOI: 10.3897/imafungus.16.141732.


Chromosome-level genome of the brown lacewing Micromus angulatus (Stephens, 1836) (Neuroptera: Hemerobiidae).

Zhao Y, Zhan Q, Wang Y, Cao R, Jiang L, Xu Q Sci Data. 2025; 12(1):394.

PMID: 40050286 PMC: 11885443. DOI: 10.1038/s41597-025-04739-w.


A chromosome-level reference genome facilitates the discovery of clubroot-resistant gene in Chinese cabbage.

Yang S, Wang X, Wang Z, Zhang W, Su H, Wei X Hortic Res. 2025; 12(3):uhae338.

PMID: 40046320 PMC: 11879649. DOI: 10.1093/hr/uhae338.


The genome sequence of a pallopterid fly, (Harris, 1780).

Barclay M, Broad G, Sivell O Wellcome Open Res. 2025; 10:55.

PMID: 40046090 PMC: 11880760. DOI: 10.12688/wellcomeopenres.23670.1.

