James and the Giant Corn Genetics: Studying the Source Code of Nature

December 13, 2009

Panda Genome

Filed under: biology, Genetics — James @ 5:11 pm
Can you imagine how much easier it would be to get funding if you too worked on panda biology?

Nature just released a pre-publication copy of a paper detailing the sequencing of the panda genome. The genome was sequenced and assembled using entirely second-generation sequencing technologies (specifically the Illumina sequencer), which produced reads averaging only 53 base pairs long.*

The panda they chose was a three-year-old female, and they got such resolution (the average individual base pair was sequenced 73 times!) that they were even able to identify individual changes in sequence between her two copies of each chromosome.** From this they were able to estimate that a difference in the DNA sequence (called a SNP***) occurs once every 740 bases, which is almost twice the rate in humans.

Having so much genetic diversity is surprising in a species as endangered as the giant panda, with only 2,500-3,000 remaining in the wild and a few hundred more in captive breeding programs like the one where the panda whose genome was sequenced is kept.

Image from a friend who spent a summer in China working on panda genetics a couple of years ago

Much of the comparative analysis of the panda genome used the dog genome for comparison, as dogs are currently the best-sequenced Carnivora**** species (last I heard, the cat genome was only 2x coverage). By including data from the human and mouse genomes, the authors were able to conclude that mutations accumulated more slowly in the genome of giant pandas than in any of the other three lineages (human, mouse, and dog), which just makes the high level of diversity within pandas even more interesting.

That wraps up my initial thoughts on the panda genome. Here are a couple of links for further reading:

  • The Nature paper itself.
  • AFP article on the publication of the genome
  • If you’re interested, here’s the site where you can download the panda genome itself. Edit: The site appears to be down, I imagine they’re getting hammered with traffic today.
  • And of course, we couldn’t have a post on giant pandas without at least one panda YouTube video (only 20 seconds long)

In closing, consider that the pandas don’t seem very motivated to survive as a species considering all the work we humans (specifically the Chinese government) are putting into trying to keep them around (quote from the AFP article linked above):

The animals’ notoriously low libidos have frustrated efforts to boost their numbers. Breeders have resorted to tactics such as showing them “panda porn” videos of other pandas mating, and putting males through “sexercises” aimed at training up their pelvic and leg muscles for the rigours of copulation.

*For perspective, the epic novel War and Peace contains approximately 5 million letters in its English translation. Assembling the panda genome is the equivalent of piecing together a novel 500 times as long using random fragments of text such as:

nglish ambassador’s? Today is Wednesday. I must put

adron of varicolored horsemen. Two of them rode side

I am impressed.
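
To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. It uses only the approximate figures quoted in this post (5 million letters, 500 times as long, 73x coverage, 53 base pair reads). The variable names are my own, and the result is a ballpark estimate, not a number taken from the paper.

```python
# Back-of-the-envelope: how many short reads does an assembly like this need?
# Every input is a rough figure quoted in this post, not a value from the paper.
WAR_AND_PEACE_LETTERS = 5_000_000  # approximate length of the novel
GENOME_MULTIPLE = 500              # the genome is ~500 novels long
COVERAGE = 73                      # each base sequenced ~73 times on average
READ_LENGTH = 53                   # average read length in base pairs

genome_size = WAR_AND_PEACE_LETTERS * GENOME_MULTIPLE  # bases in the genome
total_bases = genome_size * COVERAGE                   # raw sequence generated
reads_needed = total_bases / READ_LENGTH               # number of fragments

print(f"Genome size:  ~{genome_size / 1e9:.1f} billion bases")
print(f"Raw sequence: ~{total_bases / 1e9:.1f} billion bases")
print(f"Reads needed: ~{reads_needed / 1e9:.1f} billion")
```

Under those assumptions the puzzle has somewhere over three billion pieces, each about as long as the fragments quoted above.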

**Normally, to avoid issues created by the fact that most organisms contain two equivalent but slightly different copies of each chromosome, the first draft genome of an organism is created using an inbred line, where both copies of each chromosome are essentially identical after many generations of selfing, or mating siblings. I know this was the case with the mouse genome, as well as many of the currently sequenced plant genomes (arabidopsis, rice, sorghum, maize, brachy). I’m not sure how the issue was dealt with in the human genome, since creating a highly inbred human for sequencing would be both unethical and impractical (human generation times are so long it’d take as much as a century to create someone appropriately inbred).

***SNP stands for single nucleotide polymorphism.

If my DNA reads:

ATGCACGTGTAG

and at the same position yours reads:

ATGCATGTGTAG

A single nucleotide has changed between us. That site is a SNP. For humans, differences like that occur, on average, once in every 1500 base pairs.
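
The same comparison can be sketched in a few lines of Python. This is only an illustration: the function name `snp_positions` is made up for this post, and it assumes the two sequences are already aligned and the same length, which real genome comparisons cannot take for granted.

```python
def snp_positions(seq_a, seq_b):
    """Return (position, base_a, base_b) for each mismatch, with 1-based positions.

    Hypothetical helper: assumes the sequences are already aligned.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length (already aligned)")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]

mine = "ATGCACGTGTAG"
yours = "ATGCATGTGTAG"
snps = snp_positions(mine, yours)
print(snps)  # [(6, 'C', 'T')]

# Per-base SNP rates, using the rounded figures quoted in this post.
panda_rate = 1 / 740
human_rate = 1 / 1500
ratio = panda_rate / human_rate
print(f"Panda SNP rate is ~{ratio:.1f}x the human rate")
```

The single mismatch is at the sixth base, where my C is your T. With the rounded rates above, the panda figure comes out at roughly double the human one.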

****Carnivora doesn’t mean all carnivores (although that is where the name comes from) but is, rather, a group of related species, many of them carnivores, including cats, dogs, weasels, bears, raccoons, and even seals and walruses.

3 Comments »

  1. (eye-roll)

    Why are we prioritizing such odd species for sequencing? I mean honestly there are over 40 vertebrates sequenced and only a dozen plants! The pay-off for plant genomes is huge! Don’t get me wrong, the more genomes the better, but if it comes down to obscure endangered mammals or crop plants it should be an easy choice. I have a tough time believing that sequencing the genome is going to make a serious contribution to saving these species.

    That said, if you can use an all next gen approach to sequence new genomes then the cost of sequencing is really going to drop. This is the really cool thing about this paper. And my what coverage. So cheaper genomes, Hurray! now for more plant genomes.

    Comment by Greg — December 13, 2009 @ 8:16 pm

  2. I think the Panda genome was primarily a prestige project for the Chinese government, so I don’t think this particular project took any resources away from plant genomes.

    On the other hand, funding the 10,000 vertebrate genome project at a cost of $100 million or more when we’re still missing so many economically and phylogenetically significant species on the plant side of things seems a poor investment of resources.

    The cucumber genome that recently came out was also from China, and also made extensive use of Illumina sequencing, but I really wasn’t impressed with the quality of the sequence and assembly. The panda genome assembly seems of much better quality (although I’ll need to get through to download a copy to check in more detail) which makes me more optimistic about the potential to cheaply sequence plant genomes using Illumina tech.

    Back of the envelope (based on the latest pricing data I have access to), generating all the sequence used to create this version of the panda genome could be done for ~$200,000 using Illumina sequencers.

    Comment by James — December 13, 2009 @ 10:20 pm

  3. I agree that sequencing species like panda doesn’t seem like the best use of resources. However, this sort of thing will become a good investment once third generation sequencing technologies come on line.

    Here’s another example, a press release saying that the genome of the snow leopard could aid in controlled breeding and in understanding why the species seems to have more health problems than other cats:
    http://oregonstate.edu/ua/ncs/archives/2009/oct/power-genome-sequencing-turns-snow-leopards-other-endangered-species

    Comment by JS Hoyer — December 30, 2009 @ 7:17 pm
