Research Accomplishment Reports 2010

Genome Sequence for the Apicomplexan Sarcocystis neurona

D. Howe, C. L. Schardl, J.C. Kissinger
Department of Veterinary Sciences

 

Non-Technical Summary

Sarcocystis neurona is a protozoan parasite that is the leading cause of neurologic disease in horses and an emerging pathogen of marine mammals. In addition, S. neurona is closely related to several important human parasites (e.g., Toxoplasma, Plasmodium). The primary goals of this project are to sequence the genome of Sarcocystis neurona, to characterize the genome sequence by comparing it to sequences from other organisms, and to make the information available to the research community. The S. neurona genome sequence will serve as a valuable resource for identifying important parasite genes, and it will allow researchers to make better use of state-of-the-art technologies and experimental approaches to investigate this pathogen. The S. neurona genome sequence will also be compared with the genomes of related human parasites, which may reveal valuable information about this important group of pathogens.

2010 Project Description

The Sarcocystis neurona genome has been sequenced to approximately 24X coverage by Roche 454 Titanium pyrosequencing of shotgun, 3-kb paired-end, and 8-kb paired-end fragment libraries. The 454 sequence data have been supplemented with paired-end Sanger sequencing of about 8,500 fosmid clones (17,000 reads). The most recent assembly of the sequences with Roche's Newbler software (November 2010) compiled the data into 4,889 contigs, which are joined into 272 scaffolds (supercontigs). The assembly suggests an approximate genome size of 124 Mb.
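As a rough illustration of what these figures imply, the short Python sketch below derives the approximate volume of 454 sequence and the average contig and scaffold sizes. It uses only the numbers reported above; the function and variable names are illustrative, and the averages assume, as a simplification, that the contigs and scaffolds together account for the full 124 Mb estimate.

```python
# Back-of-the-envelope figures derived from the reported assembly numbers.
# Assumption: contigs and scaffolds are each treated as spanning the whole
# estimated genome, which ignores gaps and unplaced sequence.

GENOME_SIZE_BP = 124_000_000  # approximate genome size (124 Mb)
COVERAGE = 24                 # approximate 454 sequencing depth (24X)
N_CONTIGS = 4889
N_SCAFFOLDS = 272


def total_bases(genome_size_bp: int, coverage: float) -> float:
    """Approximate number of sequenced bases needed for a given depth."""
    return genome_size_bp * coverage


def mean_length(genome_size_bp: int, n_pieces: int) -> float:
    """Average piece length if the pieces together span the genome."""
    return genome_size_bp / n_pieces


sequenced_bp = total_bases(GENOME_SIZE_BP, COVERAGE)
mean_contig_bp = mean_length(GENOME_SIZE_BP, N_CONTIGS)
mean_scaffold_bp = mean_length(GENOME_SIZE_BP, N_SCAFFOLDS)
contigs_per_scaffold = N_CONTIGS / N_SCAFFOLDS

print(f"Sequenced bases:      {sequenced_bp / 1e9:.2f} Gb")
print(f"Mean contig size:     {mean_contig_bp / 1e3:.1f} kb")
print(f"Mean scaffold size:   {mean_scaffold_bp / 1e3:.1f} kb")
print(f"Contigs per scaffold: {contigs_per_scaffold:.1f}")
```

Under these assumptions the data set amounts to roughly 3 Gb of sequence, with contigs averaging about 25 kb and scaffolds about 456 kb, or about 18 contigs per scaffold. These are averages only and say nothing about the size distribution (e.g., N50) of the actual assembly.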

All available ESTs have been mapped to the genome, and consensus sequences for intron donor and acceptor sites have been determined. Like the genes of T. gondii, the most closely related species with a sequenced genome, S. neurona genes contain multiple introns with an average size of 1,200 bp. Mapped ESTs have been used to generate the training data sets required by the Augustus, Twinscan, GlimmerHMM, and SNAP gene finders. A preliminary BLAST-searchable database and a sequence viewer database have been established, and an Apollo instance has been created to display the data needed for annotation.
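For readers unfamiliar with how a splice-site consensus is derived, the following Python sketch shows the basic idea: short sequence windows around EST-supported intron boundaries are aligned by position, and the most frequent base in each column is reported. This is a minimal illustration, not the method used in this project; the function name and the three example donor-site windows are hypothetical and are not S. neurona data.

```python
from collections import Counter


def consensus(sequences: list[str]) -> str:
    """Return the most frequent base at each position of equal-length,
    position-aligned sequences."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    columns = zip(*sequences)  # one tuple of bases per aligned position
    return "".join(Counter(col).most_common(1)[0][0] for col in columns)


# Hypothetical donor-site windows (first six bases of three introns).
donor_windows = ["GTAAGT", "GTGAGT", "GTAAGA"]
donor_consensus = consensus(donor_windows)
print(donor_consensus)
```

A real analysis would use many more sites and would normally report per-position base frequencies rather than a single majority base, since that is the kind of information gene finders are trained on.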

A cursory search of the S. neurona genome with coding sequences known to be retained in all other sequenced apicomplexan genomes (1,288 genes) revealed that 95% were detectable in S. neurona. Inspection of introns and the culled repeat sequences has not yet provided any insight into the larger genome size of S. neurona.

Once completed, the S. neurona genome information will be disseminated to the research community via a freely accessible online database (SarcoDB).

2010 Impact

Sarcocystis neurona has value for investigating aspects of apicomplexan cell and molecular biology, but its utility as an experimental model will be greatly enhanced by the availability of a genome sequence. This will foster interest in using S. neurona as an investigative tool across a much broader cross-section of the apicomplexan research community.

Phylogenetically, the Apicomplexa are placed with the ciliate and dinoflagellate protozoa and, more distantly, with multicellular organisms such as kelp, the majority of which contain a plastid organelle that was probably acquired by secondary endosymbiosis. The evolutionary history of this clade and its intriguing plastid remains somewhat enigmatic, so the S. neurona genome and plastid sequences will add to the base of information needed to help resolve uncertainties regarding the phylogeny of these diverse organisms.