James and the Giant Corn Genetics: Studying the Source Code of Nature

December 19, 2022

The Paspalum Genome

Filed under: agriculture,biology,genomics,Plants — James @ 4:41 pm

A paper eight years in the making and sixteen months in review. A real credit to Guangchao. I don’t think it ever would have come out without his dedication above and beyond what anyone could expect from a postdoc – James.

A team of researchers, led by Dr. Guangchao Sun and Prof. James Schnable of the University of Nebraska, just published the genome of paspalum (Paspalum vaginatum) alongside evidence that paspalum may employ some tricks that could help its relative, corn, grow better with less fertilizer.

Illustration of a Paspalum vaginatum inflorescence.
A flowering paspalum plant growing in the University of Nebraska’s Beadle Center greenhouses.


The Nebraska team, in collaboration with researchers from the Department of Energy’s Joint Genome Institute, the University of Georgia, and the HudsonAlpha Institute for Biotechnology, wanted to use the new genome sequence of this resilient grass to understand what makes it so much more stress tolerant than closely related crops, including corn and sorghum.

Using comparative transcriptomic and metabolomic analyses of paspalum, corn, and sorghum under optimal and stress conditions, they identified a specific metabolite, the sugar trehalose, that was being produced in paspalum, but not in corn and sorghum, in response to stress.

The team used a strategy called chemical genetics to convince corn plants to also start producing and accumulating trehalose and showed that these corn plants grew faster and larger in conditions without enough fertilizer than corn plants without extra trehalose. Finally, the team used a combination of experiments to show that these corn plants were able to grow more with less fertilizer because of a process called autophagy, essentially a recycling program within plant cells that breaks down old, damaged, and unneeded proteins into spare parts that can be used to make new proteins.

Guangchao Sun working with Aime Valentin Nishimwe to measure nitrogen-stressed plants in the Schnable lab at the University of Nebraska-Lincoln.
Dr. Guangchao Sun working with Aime Valentin Nishimwe to collect data from plants grown without enough nitrogen fertilizer.

“I’m so excited to see this story come out,” said Prof. Schnable, who is currently taking leave from the University of Nebraska while working at Google. “Paspalum is vegetatively propagated which means we cannot just save seeds, we always have to keep living plants for our research. There was a period where no one remembered to water the paspalum plant for a couple of months. But the plant was completely fine. In fact it usually grows so fast it’ll try to invade the pots of neighboring plants and the greenhouse manager has to yell at me or folks in my lab to come down and trim it. With this genome sequence and all the great work Guangchao Sun and the team have done, we finally are starting to understand just what makes this plant so resilient.”

Sun, G., Wase, N., Shu, S. et al. Genome of Paspalum vaginatum and the role of trehalose mediated autophagy in increasing maize biomass. Nature Communications 13, 773 (2022) doi: 10.1038/s41467-022-35507-8

December 29, 2021

Resequencing the sorghum association panel

A really nice thing about many crop plants is that through natural self-pollination it is possible to create true-breeding inbred lines. Inbred lines are plants that are homozygous across all or nearly all of their genomes. If the same inbred plant is then used as the mother and father to produce new seeds, all those seeds will be genetically identical to the parent plant. Just like identical twins. And like identical twins, inbred lines make it possible to understand a LOT more about the interplay of genetics and environment since we have a chance to see how different or similar the characteristics of genetically identical individuals turn out to be.
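Why does repeated self-pollination end up producing a nearly homozygous line? A rough way to see it is the textbook Mendelian expectation that each generation of selfing halves the fraction of loci that are still heterozygous. The little sketch below just runs that arithmetic. The function name and the starting point (a plant that is heterozygous everywhere, with no selection acting) are assumptions made for illustration, not anything taken from the resequencing project itself.

```python
def heterozygosity_after_selfing(generations, initial=1.0):
    """Expected fraction of loci still heterozygous after repeated selfing.

    Assumes simple Mendelian inheritance with no selection: selfing a
    heterozygous locus gives 1/4 AA, 1/2 Aa, 1/4 aa, so the heterozygous
    fraction halves every generation.
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return initial * 0.5 ** generations


# Print the expected heterozygous fraction every second generation.
for gen in range(0, 9, 2):
    fraction = heterozygosity_after_selfing(gen)
    print(f"generation {gen}: {fraction:.4f} of loci heterozygous")
```

Under these assumptions, six generations of selfing already leave under 2% of loci heterozygous, which is why "all or nearly all" is the honest way to describe an inbred line.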

(more…)

April 3, 2017

I’ve been saying it for nearly a decade: pineapples really are awesome

Filed under: genomics,Plants — James @ 9:17 am

With all these new third generation sequencing technologies coming out in 2010, hopefully someone will sequence the pineapple genome. If not, maybe the cost of sequencing will drop enough while I’m in grad school that I can sequence the genome myself (a guy can dream).

An incredibly overused graph, but the reason it’s so overused is that it really is a remarkably useful dataset. Source: https://www.genome.gov/sequencingcostsdata/

I was a bit overly optimistic back in 2010, though, about how fast the cost of sequencing (and critically assembling) genomes would decline. Back then we were all talking about sequencing prices dropping 10x every 1-2 years. This turned out to be a quick burst of innovation brought about by second generation sequencing technologies (primarily 454 at first, then Solexa which became Illumina later on). Like many technologies, there was a lot more low hanging fruit for optimization early on, and the cost of sequencing essentially plateaued from 2011 to 2015.

Of course now we’re finally starting to get those economically viable 3rd generation sequencing technologies I thought were right around the corner in 2010 (PacBio and Oxford Nanopore being the two most successful ones at the moment). And they still have so much headroom for optimization that maybe in another 6-7 years grad students really will be able to generate genome assemblies on a whim.

In the meantime, hey, we did get a pretty cool pineapple genome assembly a couple of years ago.

Ming R., VanBuren R., Wai C. M., Tang H., Schatz M. C., et al., 2015 The pineapple genome and the evolution of CAM photosynthesis. Nat Genet 47: 1435–1442.
Also, here’s a fun video of a 3D scan of the internal structure of a pineapple:

https://twitter.com/CygnusPlantXray/status/848882758692724736

Evidence of my ongoing obsession with pineapples.

Science is fun.

Editor’s note: Robert VanBuren, second author on the pineapple genome, and first author on at least one of the dozen or so published grass genome sequences, got his own research group out at MSU working on CAM photosynthesis and drought. Check it out!

March 30, 2017

Dichanthelium oligosanthes (One in a Thousand Series)

Filed under: biology,evolution,genomics,One in One Thousand — James @ 8:46 am

Inflorescence of Dichanthelium oligosanthes. Accession “Kellogg 1175”

Out of the ~12,000 known grass species, the genomes of fewer than one in one thousand have been sequenced. The “One in a Thousand” series focuses on these rare grass species.

Dichanthelium oligosanthes is a wild grass that grows in forest glades throughout the American Midwest. It is a small plant. Doesn’t grow particularly fast. Its flowers aren’t particularly striking. And it has enough issues with seed dormancy that growing it in captivity is a major pain. Dichanthelium is a one in one-thousand grass with a sequenced reference genome.*

The reason folks are interested in Dichanthelium isn’t because of what it is, but who it’s related to. Dichanthelium occupies a spot on the grass family tree between a tribe** of grasses that includes foxtail millet and switchgrass, each one in a thousand species themselves, and another tribe of grasses that includes corn and sorghum, two more one in a thousand species. The relationship looks something like this:

Phylogenetic relationship of Dichanthelium oligosanthes to related grasses with sequenced genomes.

(more…)

March 28, 2017

STAG-CNS: Finding smaller conserved promoter regions by throwing more genomes at the problem

Filed under: genomics,Plants — James @ 9:15 am

Functionless DNA changes more rapidly, functional DNA more slowly. This is one of the fundamental principles of comparative genomics. It’s why people look at the ratio of synonymous nucleotide changes to nonsynonymous nucleotide changes within the coding sequence of genes. It’s why the exons of two related genes will still have strikingly similar sequences after the sequences of the introns have diverged to the point where it’s impossible to even detect homology. It’s also a way to identify which parts of the noncoding sequence surrounding a set of exons are functionally constrained. The bits of noncoding sequence that determine where, and when, and how much, a gene is expressed are, by definition, functional, and should diverge more slowly, even between related species, than the big soup of functionless noncoding sequence that the functional bits of a genome float in. These conserved, functional, noncoding sequences are called, unimaginatively, conserved noncoding sequences (CNS).*
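To make the synonymous versus nonsynonymous idea concrete, here is a minimal sketch that walks two aligned coding sequences codon by codon and counts how many differing codons still encode the same amino acid (synonymous) and how many do not (nonsynonymous). It uses the standard genetic code. The function name and the toy sequences are made up for illustration, and this is a naive tally only: a real dN/dS estimate also normalizes by the number of synonymous and nonsynonymous sites and corrects for multiple hits, which this sketch does not attempt.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both sequences must be in-frame, gap-free, and the same length.
    Codons containing anything other than A, C, G, T are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal length")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue  # identical codons carry no information here
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # ambiguous bases: skip rather than guess
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy example: GCT -> GCC keeps alanine, AAA -> AGA swaps lysine for arginine.
print(count_codon_changes("ATGGCTAAA", "ATGGCCAGA"))  # (1, 1)
```

In a gene under functional constraint you would expect the first number to pile up much faster than the second, which is the whole point of the ratio.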

Comparison of a single syntenic orthologous gene pair in the genomes of peach and chocolate. Coding sequence marked in yellow, introns in gray, annotated UTRs in blue. Red boxes are regions of detectably similar sequence between the same genomic region in these two species. Taken from CoGePedia.

I’ve been playing with CNS since I first opened a command line window back as a first-year grad student. The smallest CNS we’d consider “real” were 15 base pair exact matches between the same gene in two species. On the one hand, this seemed a bit too big, because I knew lots of transcription factors bind to motifs as short as 6-10 base pairs. On the other hand, this seemed a bit too short because I’d see 15 base pair exact matches that couldn’t be real a bit too often (for example a match between a sequence in the intron of one gene, and the sequence after the 3′ UTR of another).

15 bp represented a compromise between the two concerns pushing in opposite directions. Then, in the fall of 2014, a computer science PhD student walked into my office and asked if I had any interesting bioinformatics problems he could work on. The result was a new algorithm (STAG-CNS) which was both more stringent at identifying conserved noncoding sequences and able to identify shorter conserved sequences than was previously possible. It achieved both of these goals through the expedient of throwing genomes from more and more species at the problem.
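Here is a back-of-the-envelope sketch of why more species lets you go shorter without going sloppier. It is not the STAG-CNS algorithm itself, just a naive stand-in: one function estimates how many exact k-mer matches you would expect by pure chance across several regions, and another finds k-mers shared by every sequence using plain set intersection. The function names, the 10 kb region length, and the random-sequence model (independent bases, each at 25%) are all assumptions for illustration. Real genomes are not random, and this sketch ignores the order in which matches occur.

```python
def expected_chance_matches(k, region_length, n_species):
    """Expected exact k-mer matches shared by chance across n_species regions.

    Assumes random sequence with equal base frequencies: there are
    (L - k + 1) ** n ways to pick one start position per species, and each
    combination matches exactly with probability (1 / 4**k) ** (n - 1).
    """
    positions = region_length - k + 1
    return positions ** n_species / 4 ** (k * (n_species - 1))


def shared_kmers(sequences, k):
    """Return the set of k-mers present in every input sequence."""
    kmer_sets = [
        {seq[i:i + k].upper() for i in range(len(seq) - k + 1)}
        for seq in sequences
    ]
    return set.intersection(*kmer_sets)


# Chance matches: 15 bp in two species versus 10 bp in four species.
print(expected_chance_matches(15, 10_000, 2))
print(expected_chance_matches(10, 10_000, 4))

# Toy regions from three hypothetical species sharing one 8 bp motif.
toy_regions = ["TTTACGTACGGAAA", "CCACGTACGGCC", "GGGGACGTACGGT"]
print(shared_kmers(toy_regions, 8))
```

Under this toy model, a 10 bp exact match shared by four species is less likely to be a fluke than a 15 bp match shared by two, which is the intuition behind trading match length for species count.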

(more…)

March 26, 2017

Correcting genotyping errors when constructing genetic maps from genotyping by sequencing — GBS — data.

When doing anything even vaguely related to quantitative genetics I would choose more missing data over more genotyping errors any day of the week. There are lots of approaches to making missing data less of a pain. The most straightforward of these is called imputation. Imputation essentially means using the genetic markers where you do have information to guess what the most likely genotypes would be at the markers where you don’t have any direct information on what the genotype is. This is possible because of a phenomenon known as linkage disequilibrium or “LD.” Both imputation and LD deserve their own entire write-ups and they are on the list of potential topics for when I have another slow Sunday afternoon. For now the only thing you have to know about them is that, when information on a specific genetic marker is missing, it is often possible to guess with fairly high accuracy what that missing information SHOULD be. But when the information on a specific genetic marker is WRONG… well it’s usually a bit more of a mess (but I think the software solutions for this are getting better! Details at the end of the post.)
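As a toy illustration of the imputation idea, the sketch below fills in missing calls along one chromosome of a single line whenever the nearest called markers on both sides agree. Everything here is an assumption made for illustration: the function name, the "A"/"B" parental codes, and "-" as the missing-data symbol. Real imputation software uses far more careful statistical models. Gaps sitting between disagreeing flanking markers are left alone, because a recombination breakpoint must fall somewhere in there and the sketch has no way of knowing where.

```python
def impute_flanking(calls, missing="-"):
    """Fill runs of missing calls when both flanking called markers agree.

    `calls` is a string of genotype calls ordered along a chromosome.
    Leading and trailing missing calls are left untouched, since they only
    have a called neighbor on one side.
    """
    result = list(calls)
    last_called = None  # index of the most recent non-missing marker
    for i, genotype in enumerate(calls):
        if genotype == missing:
            continue
        gap_exists = last_called is not None and i - last_called > 1
        if gap_exists and calls[last_called] == genotype:
            for j in range(last_called + 1, i):
                result[j] = genotype
        last_called = i
    return "".join(result)


# The gap between the A block and the B block is left alone.
print(impute_flanking("AA-A--BBB-B"))  # AAAA--BBBBB
```

Notice what happens if one of those flanking calls is wrong instead of missing: the error gets confidently copied into its neighbors, which is a small taste of why errors are so much nastier than gaps.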

Figure 1: Genotype calls along chromosome 1 for six recombinant inbred lines (RILs).

(more…)

November 27, 2010

Hybrid vigor and missing genes

Filed under: Genetics,genomics — James @ 5:51 pm

Thinking about defining the number of genes present in the maize genome reminded me of an old* story about the trouble of defining what truly represents a gene and how really awesome ideas can sometimes come years before the data needed to support them.

The year is 2002. The first complete version of the human genome is still a year away. The genomes of two plant species have already been published (rice and arabidopsis) but in terms of sheer genome size, both species are a drop in the bucket compared to the human genome, or other plant genomes like corn or wheat. But none of this is particularly important except to set the stage.

Two researchers at Rutgers University were sequencing a tiny piece of the maize genome (~0.01%) that surrounded a single gene called bronze1, the fifth most studied gene in maize, when they found something unexpected.

They had previously identified 10 genes in a single 32-kb stretch of the maize genome. (A similar gene density throughout the remainder of the maize genome would have resulted in a maize genome containing more than 700,000 genes!) However, it was already known that the maize genome was split between small gene-rich islands and vast desolate expanses of transposons (referred to as transposon nests**), and in fact the same study identified a couple of these nests of transposons on either side of their gene rich island (see part A of the second picture in this post).
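The extrapolation in that parenthetical is simple enough to check by hand, but here it is spelled out. The 10 genes and 32 kb come from the paragraph above. The total genome size of roughly 2.3 billion base pairs is my assumption for the sketch (a commonly quoted ballpark for maize), and the function name is just for illustration.

```python
GENES_IN_REGION = 10
REGION_BP = 32_000
ASSUMED_GENOME_BP = 2_300_000_000  # assumption: ~2.3 Gb maize genome


def extrapolate_gene_count(genes, region_bp, genome_bp):
    """Scale a local gene count up to the whole genome at constant density."""
    return genes * genome_bp // region_bp


estimate = extrapolate_gene_count(GENES_IN_REGION, REGION_BP, ASSUMED_GENOME_BP)
print(f"{estimate:,} genes if the whole genome looked like this island")
```

With that assumed genome size the estimate lands a bit above 700,000, consistent with the figure quoted above and wildly out of line with any real gene count, which is exactly why the gene-island picture matters.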

Below I'll use cartoons, but here's a real and to scale example of a gene rich island I picked at random from maize chromosome 3. Genes and intergenic spaces are to scale. Base image generated with GenomeViewer, part of the CoGe toolkit. http://www.genomevolution.org/CoGe/

Their initial sequencing used DNA from a breed of corn called McC, which I must admit I’ve only ever read about in this particular paper. However, when they decided to sequence the same region from the genome of B73*** they made three discoveries which I’ve listed in increasing order of strangeness: (more…)

June 3, 2010

Transposon Mutagenesis

Filed under: biology,genomics — James @ 12:30 am

In yesterday’s Transposon Week post, I discussed how transposons can spread through a species without providing any benefit to the animals, plants, fungi, or micro-organisms that host them.

Adding a little extra useless DNA doesn’t help an organism survive, but it also doesn’t cause serious harm. But in yesterday’s post I completely avoided one serious question:

When new copies of a transposon get inserted across the genome, what happens to the DNA they land in? For that matter, what kind of DNA do transposons land in in the first place?

The answer to the second question is that different kinds of transposons each have their favorite places to land in the genome. Some transposons like to land in centromeres. Some transposons like to land in other transposons. Some transposons like to land near genes.

Then there are transposons like Mutator. Mutator is a maize/corn transposon that really likes to insert itself into genes. Transposons that usually land in other parts of the genome are also sometimes found in genes.

When a transposon lands in a gene, whether because that’s where it likes to insert or simply by accident, the gene stops working. Depending on which gene has the misfortune to be interrupted by a transposon, the effects can range from so-subtle-we-can’t-even-detect-them to so lethal the organism dies before we get a chance to study it. In between is a whole range of effects. From severe developmental mutants, to gorgeous and apparently random streaks of color in flowers, to the spotted corn kernels which were my first introduction to the world of transposons.* (**)

Transposons are always breaking genes. The deadliest mutations disappear from the population as quickly as they appear. More subtle mutations can linger on for generations, giving rise to all sorts of genetic disorders. And keep in mind many genes can be broken with no visible effect at all. Anything you eat from asparagus to zucchini has the potential to contain genes broken by transposons. And depending on the gene, you’d probably never even know it.

Sorry to put this post up so late (it’s technically already Thursday) and in such a poor shape. I had some craziness in lab today and was waiting (unfortunately without any luck) to hear back about some more interesting stories I could tell about the Mutator transposon.

*To be fair, the last two are actually caused by transposons jumping OUT of genes, allowing them to resume their normal function. The original mutations caused by transposons inserting into genes were to break the biochemical pathways used by dahlias to make red pigment in their petals and by corn to produce purple pigment (anthocyanin) in its kernels.

**I really wish there was a good source of freely usable pictures of things like transposon sectors in flowers and corn kernels. I can usually find pictures of normal plants on Flickr with Creative Commons licenses. But I really want to be able to show you guys the cool mutants that make genetics so exciting.

June 1, 2010

Transposons: The Difference Between Junk DNA and Selfish DNA

Filed under: biology,evolution,genomics — James @ 3:47 pm

Transposons are one of those really cool features of genomes that never really seem to make the jump into the public eye. Most people at least have some conception of what a gene is. It’s a piece of DNA that contains the instructions for making a protein that plays some role in the cell. A lot of other people can recall hearing an off-hand statistic that only some tiny fraction of the human genome is made up of genes, with the rest being “junk DNA”. The question of why most of our genomes have no apparent function is why there’s a slow trickle of scientific research that gets picked up in the popular press as “scientists discover junk DNA not junk after all!”.

But the reason most genetics-genomics people aren’t in a huge rush to discover the hidden function behind most of this “junk DNA” is because we KNOW what most of it does and where it comes from. It’s not junk, it’s selfish DNA. <– although there’s certainly lots of cool stuff remaining to be discovered in the much smaller fractions of genomes we can’t classify at all. (more…)

May 31, 2010

Welcome to transposon week here at James and the Giant Corn!

Filed under: biology,Genetics,genomics — James @ 2:23 pm

I’m just about wrapped up with the big project I’ve been working on recently. Hope to be able to say more about it in the not-too-distant future. Having to be secretive in science sucks.

But there’s a lot to be happy about! I’m done teaching for a long time. As much as I enjoyed working with the kids in my class, the other responsibilities of teaching (grading, sitting through lectures without the chance to break in for the discussions and arguments that make academia so fun, grading, designing assignments, grading) were really starting to wear me down.

And I’m only three weeks (June 22nd) from either passing my qualifying exam or becoming a beaten and broken shell of a man. For three hours four professors will question me on everything I’ve learned (or should have learned but didn’t) in my education up to this point, and everything I propose to spend the next few years of my life doing. This may not sound like a good thing, but it is. Because my qualifying exam has been hanging over my head all semester,

The lab has a new paper in press, having run the sequential gauntlets of Peer Review, Editorial Evaluation, and finally (and perhaps most dreaded) Your-Figures-Aren’t-High-Resolution-Enough e-mails from the journal’s publication department. But more on the details of that whenever the paper actually shows up.

But what was the point of this entry again? Oh yeah. Transposons. I have a soft spot for transposons (I’m guessing most people who work with maize genetics do). Today we may know that transposons are found in practically every genome under the sun, but they were discovered first in maize using old school genetics (breeding plants together and counting traits in the offspring), before DNA sequencing was a gleam in its inventor’s eye.

And on top of that, some delightfully high-copy number transposons are in the middle of proving a major scientific point for me, so I figured the least I could do was devote a week to them here on the site.

If you’re not a geneticist, should you still care about transposons? Absolutely! Transposons are one of the best arguments, not for why genetic engineering is safe, but for why anyone worried about hypothetical unintended consequences of genetic engineering should be worried about any food with DNA in it (and as far as I know, that’s all food). To paraphrase a Seinfeld character: “No food for you!”

The week’s schedule: (more…)
