James and the Giant Corn Genetics: Studying the Source Code of Nature

March 13, 2010

Sequenced Plant Genomes

Filed under: genomics,Plants,research stories — James @ 7:13 pm

Libe slope in Ithaca, NY. Behind you are student dorms. At the top of the hill, campus starts. Photo: foreverdigital, flickr

When I was an undergraduate, there were exactly two sequenced plant genomes: rice and arabidopsis. And sure, maybe I didn’t have to walk “ten miles to school, barefoot, in the snow, uphill, both ways,”* but the one way I did have to walk uphill (sometimes in the snow, but always with shoes) was very uphill. But where was I?

Oh yeah, plant genome sequences. Kids getting into plant genomics these days don’t realize how easy they’ve got it. By my count (which may be low but I’m getting to that) there are ten published plant genomes, with several more unpublished genomes that are available in various states of completion, and lots more on the way.

Which brings me to what I was doing yesterday instead of writing an update for this website: trying to document the published plant genomes, the unpublished genomes that are available, and which new genomes we can expect to see published in the near future.

Please, if you find mistakes or know of additional flowering plant genomes I should mention, let me know! jcs98 (@) jamesandthegiantcorn.com.

If you don’t work in biology, it might be interesting to see which plants have sequenced genomes and how they’re related to each other.

*An explanation of this phrase.


  1. Sometimes I am almost scared by how many plant genomes will be available within such a short time frame, relative to what is currently available. Our lab just finished a project involving A. lyrata which took three years or so (for various reasons), and while I am sure I could extend the same analyses much faster for new genomes, I still think it would be a huge undertaking. Not that every project will use data from every sequenced genome, but I have wondered whether we will end up overloaded with data and with too few people to analyze it.

    Comment by Noah Fahlgren — March 13, 2010 @ 11:47 pm

  2. I could definitely see there being a gap when sequence data starts piling up faster than people can analyze it (to some degree that might already be happening), but I wouldn’t be surprised if we also start to see a surge of people interested in studying comparative genomics.

    The people getting their PhDs today chose an area to focus on back around 2004-2005, when there really were only two complete plant genomes. The younger people applying for faculty positions were probably just starting grad school when the Arabidopsis genome was published. It may take a little time, but I think the research community will adjust.

    Comment by James — March 14, 2010 @ 4:28 pm

  3. I’m going to analyse it. I’m going to analyse my ass off. Bring all the good quality genomes you can!

    It’s awesome you are putting together a comprehensive spot for this information. Here is my help for the effort so far (I will take a better look when it’s not Sunday):

    P. patens – moss is done
    Castor bean at least has some of its information available at phytozome, although I don’t know the details.
    You can also expect one more asterid genome in the (hopefully) near future. Dr. Rieseberg is heading up the sequencing of the sunflower genome here at UBC in collaboration with a few other groups.

    Comment by Greg — March 14, 2010 @ 12:00 pm

  4. Thanks Greg, I’ve added both of these to the page. The Castor Bean project actually looks pretty good. I was almost scared off by the version 0.1 release and the XXXXXX’s that replaced actual statistics on phytozome, but they’ve already got a draft assembly based on 4x sequencing of the genome.

    Comment by James — March 14, 2010 @ 4:15 pm

  5. If you want to get an idea of what’s coming, you could check out the abstracts from the PAG conference: http://www.intl-pag.org/18/abstracts/

    I didn’t get to go, but by the sounds of it there are lots of genomes coming.

    Comment by Greg — March 14, 2010 @ 5:36 pm

  6. Thanks for this completely awesome resource! However, I can’t see wheat anywhere – if it’s not just me missing it then this is a glaring omission…..

    Comment by Tim — March 31, 2011 @ 11:23 pm

  7. It’s not an oversight. No group has yet released a credible plan to sequence and assemble the wheat genome. It’s not an issue of people not recognizing the importance of wheat, but simply that the wheat genome is larger and more complex than any genome sequenced to date.

    Comment by James — April 1, 2011 @ 8:22 am

  8. Hello
    If it isn’t a disturbance, I would like to know how many of these genome sequences are COMPLETE, and what the term “complete” would imply in the context of a genome.

    Comment by Donald — June 8, 2012 @ 9:41 am

  9. Hi Donald,

    Unfortunately, your two questions are linked. Different people have different standards for what constitutes a “complete” genome, and more or fewer plant genomes will meet different standards. Even the human genome still contains a (small) number of gaps made up of sequences that current technologies have a lot of difficulty with for some reason.

    Of the current plant genomes, arabidopsis and rice are by far the highest quality. Among the remaining genomes, a genome that was sequenced using Sanger technology and/or BAC-by-BAC sequencing will almost always have fewer gaps and a more accurate assembly than genomes where I have listed the sequencing technology as primarily Illumina.

    If you have questions about individual species I could go into more detail.


    Comment by James — June 8, 2012 @ 9:49 am
