PMID: 20691772

Detecting Novel Genes with Sparse Arrays

Abstract

Species-specific genes play an important role in defining the phenotype of an organism. However, current gene prediction methods can efficiently find only genes that share features, such as sequence similarity or general sequence characteristics, with previously known genes. Novel sequencing methods and tiling arrays can find genes without prior information, and they have demonstrated that novel genes can still be found in extensively studied model organisms. Unfortunately, these methods are expensive and thus not easily applicable, e.g., to finding genes that are expressed only under very specific conditions. We demonstrate a method for finding novel genes with sparse arrays, applying it to the 33.9 Mb genome of the filamentous fungus Trichoderma reesei. Our computational method does not require normalisation between arrays, and it takes into account the multiple-testing problem typical of microarray data analysis. In contrast to tiling arrays, which use overlapping probes, only one 25-mer oligonucleotide probe was used for every 100 b. Thus, relatively little space on a microarray slide was required to cover the intergenic regions of the genome. The analysis was done as a by-product of a conventional microarray experiment at no additional cost. We found at least 23 good candidates for novel transcripts that could code for proteins, all of which were expressed at high levels. Candidate genes were found neighbouring ire1, cre1 and many other regulatory genes. Our simple, low-cost method can easily be applied to finding novel species-specific genes without prior knowledge of their sequence properties.
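The probe spacing described in the abstract (one 25-mer per 100 b of intergenic sequence) can be illustrated with a short sketch. This is not the authors' probe-design procedure: the function name `sparse_probe_starts`, the 0-based half-open coordinates, and the choice to put each probe at the start of its 100 b window are all assumptions made for illustration. Only the probe length, the spacing, and the genome size come from the abstract.

```python
from typing import List

PROBE_LENGTH = 25         # 25-mer probes (from the abstract)
PROBE_SPACING = 100       # one probe per 100 b (from the abstract)
GENOME_SIZE = 33_900_000  # 33.9 Mb T. reesei genome (from the abstract)


def sparse_probe_starts(region_start: int, region_end: int,
                        probe_length: int = PROBE_LENGTH,
                        spacing: int = PROBE_SPACING) -> List[int]:
    """Start positions of probes in the half-open region [region_start, region_end).

    Assumption: each probe sits at the start of its window; a real design
    would likely choose the best-scoring oligonucleotide inside each window.
    """
    last_start = region_end - probe_length  # last position where a whole probe still fits
    return list(range(region_start, last_start + 1, spacing))


# A hypothetical 1 kb intergenic region needs only 10 probes.
example_starts = sparse_probe_starts(0, 1000)

# Upper bound on probe count if the whole genome were covered at this density;
# covering only the intergenic regions requires fewer.
max_probes = GENOME_SIZE // PROBE_SPACING

# Fraction of bases directly under a probe at this spacing.
probed_fraction = PROBE_LENGTH / PROBE_SPACING
```

At this density only a quarter of the bases lie directly under a probe, which is why a sparse design needs far less slide space than a tiling array with overlapping probes.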

Citing Articles

The effects of extracellular pH and of the transcriptional regulator PACI on the transcriptome of Trichoderma reesei.

Hakkinen M, Sivasiddarthan D, Aro N, Saloheimo M, Pakula T. Microb Cell Fact. 2015; 14:63.

PMID: 25925231 PMC: 4446002. DOI: 10.1186/s12934-015-0247-z.


Kinetic transcriptome analysis reveals an essentially intact induction system in a cellulase hyper-producer Trichoderma reesei strain.

Poggi-Parodi D, Bidard F, Pirayre A, Portnoy T, Blugeon C, Seiboth B. Biotechnol Biofuels. 2015; 7(1):173.

PMID: 25550711 PMC: 4279801. DOI: 10.1186/s13068-014-0173-z.


Systems biological approaches towards understanding cellulase production by Trichoderma reesei.

Kubicek C. J Biotechnol. 2012; 163(2):133-42.

PMID: 22750088 PMC: 3568919. DOI: 10.1016/j.jbiotec.2012.05.020.


A versatile toolkit for high throughput functional genomics with Trichoderma reesei.

Schuster A, Bruno K, Collett J, Baker S, Seiboth B, Kubicek C. Biotechnol Biofuels. 2012; 5(1):1.

PMID: 22212435 PMC: 3260098. DOI: 10.1186/1754-6834-5-1.


Correlation of gene expression and protein production rate - a system wide study.

Arvas M, Pakula T, Smit B, Rautio J, Koivistoinen H, Jouhten P. BMC Genomics. 2011; 12:616.

PMID: 22185473 PMC: 3266662. DOI: 10.1186/1471-2164-12-616.

