Genome-wide analysis of mouse transcripts
using exon tiling microarrays and factor graphs

Brendan J. Frey [1,2,3,7], Naveed Mohammad [1,2], Quaid D. Morris [1,2,3], Wen Zhang [1,2,4], Mark D. Robinson [2,3], Sanie Mnaimneh [2], Richard Chang [2], Qun Pan [2], Nancy Laurin [4], Eric Sat [5], Janet Rossant [4,5], Benoit G. Bruneau [4,6], Jane Aubin [4], Benjamin J. Blencowe [2,4], and Timothy R. Hughes [2,4,7]

[1] These authors contributed equally
[2] Banting and Best Department of Medical Research, University of Toronto, 112 College St., Toronto, ON, Canada, M5G 1L6
[3] Electrical and Computer Engineering, University of Toronto, 10 King’s College Rd., Toronto, ON, Canada, M5S 3G4
[4] Medical Genetics and Microbiology, University of Toronto, 1 King's College Ct., Toronto, ON, Canada, M5S 3G4
[5] Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, Ontario M5G 1X5
[6] The Hospital for Sick Children, 555 University Ave., Toronto, ON M5G 1X8
[7] To whom correspondence should be addressed:
frey@psi.utoronto.ca VOICE: 416-978-7001 FAX: 416-978-4425
t.hughes@utoronto.ca VOICE: 416-946-8260 FAX: 416-978-8528

Click below to access an on-line GenRate genome browser, which is linked to the UCSC genome browser:


Abstract

Recent mammalian microarray experiments have detected widespread transcription and raised the possibility that there may be a large number of undiscovered multi-exon protein-coding genes. To explore this possibility, we hybridized unamplified, polyadenylation-selected samples from 37 mouse tissues to microarrays encompassing 1.14 million exon probes. We analyzed these data using GenRate, a Bayesian algorithm that uses a genome-wide scoring function in a factor graph to infer genes. At a stringent exon false detection rate of 2.7%, GenRate detects 12,145 gene-length transcripts and confirms 81% of the 10,000 most highly expressed known genes. Surprisingly, our analysis shows that most of the 155,839 exons detected by GenRate are associated with known genes, providing, for the first time, microarray-based evidence that the vast majority of multi-exon genes have already been discovered. GenRate also detects tens of thousands of potential new exons and reconciles discrepancies in current cDNA databases by stitching novel transcribed regions into previously annotated genes.

Supplementary Data