Computational Methods for mRNA Transcriptome from RNA-Seq Data
Department of Veterinary Sciences
We are entering an era of "personalized genomics" in which individualized medical care will be based on a patient's clinical history, presenting clinical status, and the specific nucleotide sequence of her or his genome. Rapid innovation in next-generation sequencing (NextGen) technologies, based on massively parallel nucleotide sequencing-by-synthesis without dideoxy-based chain termination, is translating this vision into a practical reality.
However, individual genomic DNA data are only the beginning; personalized transcriptomics, the genome-wide measurement of gene expression and alternative splice variants in patients, will give even greater insight into the functional basis of disease. Biopsy samples, for example, will not only be assessed microscopically but can also have their mRNA transcriptome determined and analyzed. Today it is possible to use NextGen technology to sequence the transcriptome of a tissue sample for around $1000.
The biological analysis of these data, however, remains an unsolved problem. New computational resources and analytical methods must be developed to make the benefits of individualized medicine feasible. In a January 2009 editorial entitled "RNA-Seq: a revolutionary tool for transcriptomics" published in Nature Reviews Genetics (Wang et al. 2009), Wang, Gerstein, and Snyder from Yale University discuss how RNA-seq technology has the transformative potential to define the starts and ends of exons and transcripts precisely, resolve the extent of splicing heterogeneity, and capture the quantitative dynamics of the transcriptome.
The authors also recognized a major challenge for RNA-seq technology, however: the critical need for "computationally simple methods" to analyze these massive datasets. This project targets that challenge directly.
2010 Project Description
Results have been disseminated through public seminars and lectures. The seminars included presentations at one regional (Kentucky Biomedical Research Infrastructure Network) and two international (The Biology of Genomes at Cold Spring Harbor Laboratory in New York, and the Plant & Animal Genomes meeting in California) scientific meetings. The lectures included two local (University of Kentucky College of Medicine, Kentucky Association of Equine Practitioners), one regional (Kentucky Quarter Horse Annual Meeting), and two national (thoroughbred racing industry symposium on genetics, University of Minnesota) presentations.
The significance of this project for agriculture in Kentucky is based on the importance of equine-related industries. The estimated economic impact of Kentucky's horse industries is over $4 billion with direct or indirect employment of 100,000 people. Indeed, horses in general and thoroughbred horse racing in particular are a symbol of Kentucky that is recognized both nationally and internationally.
Research that determined the primary DNA base sequence of the horse genome was completed in 2007 and 2008. This major accomplishment is in many ways just a beginning. Distributed within the 2.7 billion bases of DNA that compose the equine genome are approximately 21,000 protein-encoding genes. The structure of these 21,000 genes, which tissues express which genes, when the genes are expressed, and how much they are expressed are functional parameters studied by many scientists working on equine health and disease.
The outcome and impact of this research during the reporting period center on new fundamental knowledge of the analytical and computational methods for data generated through a process called RNA sequencing. Sequencing of RNA enables an assessment of expression from all genes concurrently, not just individual genes or small groups of genes.
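As an illustration of what assessing expression from all genes concurrently can look like in practice, the minimal Python sketch below converts per-gene read counts into transcripts per million (TPM), a commonly used length-normalized expression measure. It is not the method developed in this project; the gene names, read counts, and gene lengths are hypothetical values chosen only for the example.

```python
def tpm(read_counts, gene_lengths_bp):
    """Convert raw per-gene read counts to transcripts per million (TPM).

    read_counts and gene_lengths_bp are dicts keyed by gene name.
    """
    # Step 1: divide each count by gene length in kilobases, so that long
    # genes are not favored simply because they yield more reads.
    rates = {
        gene: read_counts[gene] / (gene_lengths_bp[gene] / 1000.0)
        for gene in read_counts
    }
    total = sum(rates.values())
    if total == 0:
        # No reads at all: report zero expression for every gene.
        return {gene: 0.0 for gene in rates}
    # Step 2: rescale so that the values across all genes sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical data: three genes with made-up counts and lengths.
counts = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 0}
lengths = {"GENE_A": 1000, "GENE_B": 3000, "GENE_C": 2000}

expression = tpm(counts, lengths)
for gene, value in sorted(expression.items()):
    print(gene, round(value, 1))
```

In this toy input GENE_B has three times the reads of GENE_A but is also three times as long, so the two genes receive the same normalized value. Every gene is scaled against the same total, which is what makes the values comparable across the whole transcriptome.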
Research that leads to improved diagnostic and therapeutic strategies for equine health issues will have a direct beneficial impact on horse-related interests, and by extension, agriculture throughout the Commonwealth of Kentucky.
Mienaltowski, M.J., Huang, L., Bathke, A., Stromberg, A.J., and MacLeod, J.N. (2010). Transcriptional comparisons between articular repair tissue, neonatal cartilage, cultured chondrocytes, and mesenchymal stromal cells. Briefings in Functional Genomics & Proteomics, 9:238-250.
Wang, K., Singh, D., Zeng, Z., Coleman, S.J., Huang, Y., Savich, G.L., Xiaping, H., Mieczkowski, P., Grimm, S.A., Perou, C.M., MacLeod, J.N., Chiang, D.Y., Prins, J.F., and Liu, J. (2010). MapSplice: accurate mapping of RNA-seq reads for splice junction discovery. Nucleic Acids Research, 38(18):e178.
Coleman, S.J., Zeng, Z., Wang, K., Luo, S., Khrebtukova, I., Mienaltowski, M.J., Schroth, G.P., Liu, J., and MacLeod, J.N. (2010). Structural annotation of equine protein-coding genes determined by mRNA sequencing. Animal Genetics, 41(Suppl. 2):121-130.
Coleman, S.J., Zeng, Z., Miller, D., Klein, C., Troedsson, M.H.T., Antczak, D.F., Liu, J., and MacLeod, J.N. (2010). Generation of a consensus protein-coding equine gene set. Plant and Animal Genome XVIII (P619).
Detlefsen, L.G., Patterson-Kane, J., and MacLeod, J.N. (2010). Differential gene expression in equine tendon as a function of maturation and loading. Plant and Animal Genome XVIII (P632).
MacLeod, J.N., Coleman, S.J., Prins, J., and Liu, J. (2010). Analyses of the equine mRNA transcriptome with RNA-seq. Plant and Animal Genome XVIII (W030).
Coleman, S.J., Zeng, Z., Liu, J., and MacLeod, J.N. (2010). Analysis of equine protein-coding gene structure and expression by mRNA sequencing. BMC Bioinformatics, 11(Suppl 4):08.
Wang, K., Singh, D., Zeng, Z., Coleman, S.J., Xiaping, H., Mieczkowski, P., Perou, C.M., MacLeod, J.N., Chiang, D.Y., Prins, J.F., and Liu, J. (2010). MapSplice: mapping RNA-seq reads for splice discovery. The Biology of Genomes, Cold Spring Harbor Laboratory (54).