James and the Giant Corn


Dichanthelium oligosanthes (One in a Thousand Series)

Inflorescence of Dichanthelium oligosanthes. Accession “Kellogg 1175”

Out of the ~12,000 known grass species, the genomes of less than one in one thousand have been sequenced. The “One in a Thousand” series focuses on these rare grass species.

Dichanthelium oligosanthes is a wild grass that grows in forest glades throughout the American midwest. It is a small plant. Doesn’t grow particularly fast. Its flowers aren’t particularly striking. And it has enough issues with seed dormancy that growing it in captivity is a major pain. Dichanthelium is a one in one-thousand grass with a sequenced reference genome.*

The reason folks are interested in Dichanthelium isn’t because of what it is, but who it’s related to. Dichanthelium occupies a spot on the grass family tree between a tribe** of grasses that includes foxtail millet and switchgrass, each one in a thousand species themselves, and another tribe of grasses that includes corn and sorghum, two more one in a thousand species. The relationship looks something like this:

Phylogenetic relationship of Dichanthelium oligosanthes to related grasses with sequenced genomes.


In which I apologize to R

R, you may be a confusing and hard to understand language where every package comes with its own set of quirks and foibles. You may make me feel less like a programmer and more like a not-very-well trained magician fumbling around for the right incantation to make magic happen.

But when you work, you do awesome things.

Sex-specific splicing of a gene of unknown function that is syntenically conserved in all grass species.

With only four days’ work I was able to go from a giant pile of reads (from the still not properly appreciated Davidson 2011 The Plant Genome) to figures like the one above.

So what is the figure above showing you? One of a large number of genes which show a different pattern of splicing in male and female reproductive organs in maize.* The region “E8” is usually treated as exonic in female reproductive tissues but is spliced out like an intron in male reproductive tissues. What does it mean (if anything)? I have no idea yet! But it would have been a real pain to try to re-invent the wheel for identifying these differentially spliced genes in Python. In R, once I figured out the right incantation, it’s practically plug and play for any gene you could possibly be interested in. Including the software for the (actually quite useful) visualization shown above.

So thank you R. What you do — once I can figure out how to make you do it — you do incredibly well.

*Maize makes it easy for us by separating female and male flowers into two entirely different organs (the ear and tassel respectively).

Data from:

Davidson R. M., Hansey C. N., Gowda M., Childs K. L., Lin H., Vaillancourt B., Sekhon R. S., Leon N. de, Kaeppler S. M., Jiang N., Buell C. R., 2011  Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome 4: 191–203. doi:10.3835/plantgenome2011.05.0015.

Analyzed using the R package DEXSeq:

Anders S, Reyes A, Huber W. 2012 Detecting differential usage of exons from RNA-Seq Data. Unpublished. (Link is to a PDF)
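To make the idea of differential exon usage concrete, here is a minimal sketch in Python. It is not the DEXSeq method (DEXSeq fits a statistical model across replicates); it only compares each exon's share of a gene's reads between two tissues. The read counts and the exon names other than E8 are invented for illustration.

```python
# Toy illustration of differential exon usage; NOT the DEXSeq method.
# All read counts below are invented for the example.

def exon_usage(counts):
    """Return each exon's share of the gene's total reads."""
    total = sum(counts.values())
    return {exon: n / total for exon, n in counts.items()}

def usage_shift(counts_a, counts_b):
    """Difference in usage (a minus b) for every exon present in both samples."""
    usage_a = exon_usage(counts_a)
    usage_b = exon_usage(counts_b)
    return {exon: usage_a[exon] - usage_b[exon]
            for exon in usage_a if exon in usage_b}

# Hypothetical counts for one gene in ear (female) and tassel (male) tissue.
ear = {"E7": 100, "E8": 90, "E9": 110}
tassel = {"E7": 100, "E8": 0, "E9": 100}

shift = usage_shift(ear, tassel)
# The exon whose share of reads changes most between the two tissues.
most_shifted = max(shift, key=lambda exon: abs(shift[exon]))
print(most_shifted, round(shift[most_shifted], 2))
```

Comparing shares rather than raw counts is the point: a gene that is simply expressed at a higher level in one tissue shifts every exon together, while a splicing difference changes one exon relative to its neighbors.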

Transposon Mutagenesis

In yesterday’s Transposon Week post, I discussed how transposons can spread through a species without providing any benefit to the animals, plants, fungi, or micro-organisms that host them.

Adding a little extra useless DNA doesn’t help an organism survive, but it also doesn’t cause serious harm. But in yesterday’s post I completely avoided one serious question:

When new copies of a transposon get inserted across the genome, what happens to the DNA they land in? For that matter, what kind of DNA do transposons land in to begin with?

The answer to the second question is that different kinds of transposons each have their favorite places to land in the genome. Some transposons like to land in centromeres. Some transposons like to land in other transposons. Some transposons like to land near genes.

Then there are transposons like Mutator. Mutator is a maize/corn transposon that really likes to insert itself into genes. Transposons that usually land in other parts of the genome are also sometimes found in genes.

When a transposon lands in a gene, whether because that’s where it likes to insert or simply by accident, the gene stops working. Depending on which gene has the misfortune to be interrupted by a transposon, the effects can range from so-subtle-we-can’t-even-detect-them to so lethal the organism dies before we get a chance to study it. In between is a whole range of effects, from severe developmental mutants, to gorgeous and apparently random streaks of color in flowers, to the spotted corn kernels which were my first introduction to the world of transposons.* (**)

Transposons are always breaking genes. The deadliest mutations disappear from the population as quickly as they appear. More subtle mutations can linger on for generations, giving rise to all sorts of genetic disorders. And keep in mind many genes can be broken with no visible effect at all. Anything you eat from asparagus to zucchini has the potential to contain genes broken by transposons. And depending on the gene, you’d probably never even know it.

Sorry to put this post up so late (it’s technically already Thursday) and in such poor shape. I had some craziness in lab today and was waiting (unfortunately without any luck) to hear back about some more interesting stories I could tell about the Mutator transposon.

*To be fair, the last two are actually caused by transposons jumping OUT of genes, allowing them to resume their normal function. The original mutations, caused by transposons inserting into genes, broke the biochemical pathways used by dahlias to make red pigment in their petals and by corn to produce purple pigment (anthocyanin) in its kernels.

**I really wish there was a good source of freely usable pictures of things like transposon sectors in flowers and corn kernels. I can usually find pictures of normal plants on Flickr with Creative Commons licenses. But I really want to be able to show you guys the cool mutants that make genetics so exciting.

Transposons: The Difference Between Junk DNA and Selfish DNA

Transposons are one of those really cool features of genomes that never really seem to make the jump into the public eye. Most people at least have some conception of what a gene is. It’s a piece of DNA that contains the instructions for making a protein that plays some role in the cell. A lot of other people can recall hearing an off-hand statistic that only some tiny fraction of the human genome is made up of genes, with the rest being “junk DNA”. The question of why most of our genomes have no apparent function is why there’s a slow trickle of scientific research that gets picked up in the popular press as “scientists discover junk DNA not junk after all!”.

But the reason most genetics and genomics people aren’t in a huge rush to discover the hidden function behind most of this “junk DNA” is that we KNOW what most of it does and where it comes from. It’s not junk, it’s selfish DNA. (Although there’s certainly lots of cool stuff remaining to be discovered in the much smaller fractions of genomes we can’t classify at all.)

Welcome to transposon week here at James and the Giant Corn!

I’m just about wrapped up with the big project I’ve been working on recently. Hope to be able to say more about it in the not-too-distant future. Having to be secretive in science sucks.

But there’s a lot to be happy about! I’m done teaching for a long time. As much as I enjoyed working with the kids in my class, the other responsibilities of teaching (grading, sitting through lectures without the chance to break in for the discussions and arguments that make academia so fun, grading, designing assignments, grading) were really starting to wear me down.

And I’m only three weeks (June 22nd) from either passing my qualifying exam or becoming a beaten and broken shell of a man. For three hours four professors will question me on everything I’ve learned (or should have learned but didn’t) in my education up to this point, and everything I propose to spend the next few years of my life doing. This may not sound like a good thing, but it is, because my qualifying exam has been hanging over my head all semester.

The lab has a new paper in press, having run the sequential gauntlets of Peer Review, Editorial Evaluation, and finally (and perhaps most dreaded) Your-Figures-Aren’t-High-Resolution-Enough e-mails from the journal’s publication department. But more on the details of that whenever the paper actually shows up.

But what was the point of this entry again? Oh yeah. Transposons. I have a soft spot for transposons (I’m guessing most people who work with maize genetics do). Today we may know that transposons are found in practically every genome under the sun, but they were discovered first in maize using old school genetics (breeding plants together and counting traits in the offspring), before DNA sequencing was a gleam in its inventor’s eye.

And on top of that, some delightfully high-copy number transposons are in the middle of proving a major scientific point for me, so I figured the least I could do was devote a week to them here on the site.

If you’re not a geneticist, should you still care about transposons? Absolutely! Transposons are one of the best arguments, not for why genetic engineering is safe, but for why anyone worried about hypothetical unintended consequences of genetic engineering should be worried about any food with DNA in it (and as far as I know, that’s all food). To paraphrase a Seinfeld character: “No food for you!”

The week’s schedule: (more…)

The Peach Genome Is Out

1.1 pound peach from the Berkeley Farmer's market.

Here. I had no idea anyone was even considering sequencing the peach genome until I heard a single off-hand comment at the maize meeting last month, and all of a sudden here it is. And in better shape in its first release than some genomes are even after they’re published.

This is a pre-publication release, so the Fort Lauderdale Convention is still in effect,* but the peach genome looks really great from the quick and dirty analysis I have already run. They’ve already got the genome assembled into pseudomolecules (chromosomes), unlike some genomes I could mention that have already been published, and marked the locations and structures of genes in the genome (there was a weird period last summer when there were pre-release versions of the maize genome organized into chromosomes, and pre-release versions with the genes marked, but none that had both).

*In short, you or I can download the peach genome, play around and study it to our hearts’ content, but we can’t publish anything on it until the people who actually sequenced the peach genome publish a paper describing their work.

Helitron Capture Creating New Genes?

One of the things that has made annotating genes in the maize genome so difficult (there are currently two sets of gene models: one with only 32,000 genes, which is a low estimate, and the other with 100,000, which is far too many) is the presence of large numbers of gene fragments that have been captured and duplicated by a class of transposon called helitrons (yes, I know that sounds like a character from Transformers).

The helitron-captured fragments are copied from real genes (often multiple pieces are captured from different genes), which is why many gene annotation programs (trained to recognize the difference between genes and non-coding DNA) will identify the fragments as being genes themselves.

What if some of those fragments actually are genes? By combining pieces from completely different genes, helitrons could be a whole new source of crazy new genes that natural selection could act upon.

That is the question the authors of this poster are trying to get at, by identifying more helitron fragments and checking to see if those fragments were actually expressed in the genome.

Allison Barbaglia et al. “Accessing the transcriptional activity of Helitron-captured genes of maize” Poster #243 2010 Maize Meeting

Missing Genes on a Massive Scale

Edit: stripped out all the numbers as they clearly applied to an earlier version of the data and I don’t know if the new ones are intended for public release yet.

Last November when the maize genome was published, one of the companion papers looked at genes where a different number of copies were found in different breeds of maize (this is called Copy Number Variation) and genes found in B73 (the variety of maize that was sequenced) but completely missing from the genomes of other varieties. There’s a great post on that paper written up by Mary at OpenHelix.

A few months later, it sounds like this dataset has grown substantially. Over XXXX B73 genes (that’s X% of the filtered B73 gene set!) appear to be lost (or have sequences so different they no longer register) in at least some varieties of maize. And because the new dataset incorporates data from XX different maize breeds and XX different teosinte* lines, they’re able to identify some of the losses as older because they’re found in multiple comparisons, while some appear to be lost in only a single breed, and might represent more recent losses.
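The reasoning in that last sentence (a gene missing from several lines is probably an older loss, a gene missing from only one is probably a recent one) can be sketched in a few lines of Python. This is only my illustration of the logic, not the authors' pipeline, and the gene names, line names, and presence/absence calls are all invented.

```python
# Hypothetical presence/absence calls; gene and line names are invented.
# True = gene detected in that line, False = gene appears to be missing.
calls = {
    "geneA": {"line1": True, "line2": True, "line3": True},
    "geneB": {"line1": False, "line2": True, "line3": True},
    "geneC": {"line1": False, "line2": False, "line3": True},
}

def classify_loss(gene_calls):
    """Label a gene by how many lines it is missing from."""
    missing = sorted(line for line, present in gene_calls.items() if not present)
    if not missing:
        return "present everywhere", missing
    if len(missing) == 1:
        return "single line (possibly recent)", missing
    return "multiple lines (possibly older)", missing

for gene, gene_calls in calls.items():
    label, missing = classify_loss(gene_calls)
    print(gene, label, missing)
```

A real analysis would also need to account for how the lines are related to each other, since two closely related lines sharing a loss says less about its age than two distant ones.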

Sit back and think about that for a second. At least X% of the genes in corn sometimes go missing. This could have implications for everything from inbreeding depression and hybrid vigor, to the kind of basic research I’m actually working on myself.

As you can imagine I’d love to get my hands on this dataset myself, but the next best thing will be to take furious notes when Nathan Springer talks about the project on Friday morning**, and to be sure to swing by Steven Eichten’s poster to soak in the awesomeness.

Ruth A. Swanson-Wagner et al. “Combined Analysis of genomic structural variation and gene expression variation between maize and teosinte populations” Talk #1 2010 Maize Meeting (Presented by Nathan Springer)

Steven R. Eichten et al. “Extensive Copy Number Variation Among Maize Lines” Poster #139 2010 Maize Meeting

*Teosinte is the wild species from which maize/corn was domesticated.

**And he’s talking at 8:30 AM on a day when I still plan on being heavily jet lagged.

Abnormal Chromosome 10

There is a piece of DNA that is sometimes found on the end of the tenth maize chromosome. In plants that possess this extra chromosome segment, chromosome knobs* (including one that’s a part of the extra segment included in abnormal chromosome 10) start to act like centromeres**. But this story graduates from odd to downright weird when I tell you that possessing this extra centromere-like activity gives a chromosome an unfair advantage in being passed on to the next generation.

Plants, like animals, possess two complete genome copies, one from each parent. They’ll only pass on one copy (mixtures of pieces from each parent) to their offspring. Any given sequence has a 50% chance of being passed on which seems fair given the plant is passing on 50% of its total genetic material. But abnormal chromosome ten cheats (using those extra centromere-like sequences I mentioned earlier). It has up to an 83% chance of being passed on.
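To get a feel for how much that unfair advantage matters, here is a rough sketch of how the frequency of a cheating chromosome would change over generations. The assumptions are mine, not the poster's: random mating, the same transmission advantage through both parents, no fitness cost to carrying the chromosome, and the 83% upper bound from above used as a constant. The function names and the starting frequency of 1% are also just for illustration.

```python
def next_frequency(p, k):
    """Frequency of the driving chromosome after one generation of random mating.

    p: current frequency of the driving chromosome
    k: chance that a heterozygote passes on the driving chromosome
       (0.5 is fair Mendelian transmission)
    """
    q = 1.0 - p
    # Homozygous carriers (p*p) always transmit it; heterozygotes (2*p*q)
    # transmit it with probability k; non-carriers never do.
    return p * p + 2.0 * p * q * k

def trajectory(p0, k, generations):
    """List of frequencies from generation 0 through `generations`."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], k))
    return freqs

fair = trajectory(0.01, 0.5, 20)    # ordinary chromosome segment
drive = trajectory(0.01, 0.83, 20)  # chromosome that cheats
print(f"fair: {fair[-1]:.3f}  drive: {drive[-1]:.3f}")
```

With fair transmission the frequency stays where it started, while in this simplified model the cheating chromosome takes over within a couple dozen generations. Since abnormal chromosome 10 has not taken over maize, a real model needs costs or other constraints that this sketch leaves out.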

Since the breed of corn (B73) the maize genome was based on has the normal version of chromosome 10, we know very little about the extra DNA found in abnormal chromosome 10. The authors of this poster are going to correct that oversight, by sequencing the region, figuring out how (and how long ago) abnormal chromosome 10 came into being, and hopefully identifying the genes within the region that make chromosome-knobs act like centromeres.

*Knobs are dense segments of DNA that scientists have been able to spot visually within chromosomes since before we knew for sure that chromosomes carried genetic information.

**Centromeres are the part of the chromosomes that bind together during cell division (the center of the X in the traditional drawing of a chromosome). They’re also the place where the molecular machinery that pulls chromosomes apart at the end of cell division attaches.

Lisa Kanizay and Kelly R. Dawe “Uncovering the sequence and structure of maize abnormal chromosome 10” Poster #165 2010 Maize Meeting.


Who could have predicted maize geneticists would be so interested in maize genes? The entry I posted last night on Purple plant1 and Colored aleurone1 easily received more traffic in its first day on the site (it’s still got a long way to go before it catches long term readership attractors like water chestnuts and the NIPGR tomatoes), than any entry since the heady days of the maize genome release back in November.

The relationships of the four grass species with sequenced genomes. The branches are NOT to scale with how long ago the species split apart. Green stars represent whole genome duplications. The most important one to notice is the one in the ancestry of maize/corn. That duplication means that every region in sorghum, rice, or brachypodium is equivalent to two different places in the maize genome, one descended from each of the two copies of the genome that existed after the duplication.

And this morning the dataset I drew that example from, 464 classical maize genes mapped onto the maize genome assembly plus syntenic orthologs in up to four grass species: sorghum, rice, brachypodium, and the other region of the maize genome created by the maize whole genome duplication (technically syntenic homeologs since we started in maize to begin with, but the principle is the same), went out to the maize genetics community (thank you MaizeGDB!).

A postdoc in our lab tells me more people have visited CoGe today than any day on record (and we hit that mark before noon!).

Anyway, thank you guys, it’s great to feel appreciated!