James and the Giant Corn Genetics: Studying the Source Code of Nature

December 30, 2021

It turns out genetics (and plant breeding) actually work

So I did a thing. For those who don’t want to click the link, it describes the results farmers are seeing in their first year of growing two new varieties of proso millet developed by a company called Dryland Genetics. Many farmers are getting 20% more grain from the same land than they did with the varieties they grew in the past. Since proso millet is grown on close to half a million acres in the USA (two hundred thousand hectares or three million mu (亩) for those of you reading internationally), that means these new varieties have the potential to produce a lot more calories from the same land, using the same water and the same nitrogen.

I helped found Dryland Genetics in 2014. At the beginning that meant reading a lot. Then writing a business plan. Then pitching that business plan. Winning over investors. Wrangling logistics. Hiring a full-time breeder. Crunching numbers and datasets. Losing sleep over logistics and seed processing and cleaning and inspections and sales. More recently, hiring more people who take over the job of wrangling and lose sleep over logistics and seed processing and cleaning and inspections and sales.


December 29, 2021

Resequencing the sorghum association panel

A really nice thing about many crop plants is that through natural self-pollination it is possible to create true-breeding inbred lines. Inbred lines are plants that are homozygous across all or nearly all of their genomes. If the same inbred plant is used as both the mother and the father to produce new seeds, all those seeds will be genetically identical to the parent plant. Just like identical twins. And like identical twins, inbred lines make it possible to understand a LOT more about the interplay of genetics and environment, since we have a chance to see how different or similar the characteristics of genetically identical individuals turn out to be.
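To put rough numbers on why self-pollination ends up producing true-breeding lines, here is a small sketch. It relies on the textbook simplification that each generation of selfing halves the expected fraction of heterozygous sites. The function name and the starting point of 100% heterozygosity are assumptions made for illustration, and real populations will deviate (selection, a bit of outcrossing, and so on).

```python
def expected_heterozygosity(generations, starting_fraction=1.0):
    """Expected fraction of heterozygous sites after repeated selfing.

    Assumes the standard simplification that each generation of
    self-pollination halves heterozygosity (no selection, no outcrossing).
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return starting_fraction * 0.5 ** generations


# Print the expected split every other generation, starting from a
# hypothetical fully heterozygous plant.
for gen in range(0, 9, 2):
    het = expected_heterozygosity(gen)
    print(f"after {gen} generations of selfing: "
          f"{het:.2%} heterozygous, {1 - het:.2%} homozygous")
```

Under this simplification, six generations of selfing leave roughly 1.6% of sites heterozygous, which is where the "all or nearly all" in the definition of an inbred line comes from.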


March 9, 2019

What a good day looks like

Filed under: Uncategorized — James @ 8:03 am

While yesterday was draining, it had some really high points and I want to get these written down to remember on harder days.

On my way up the stairs to my office, one of my newest colleagues stopped me to ask a few questions, and ended the conversation by telling me how much she likes the mentoring style she’s seen me exhibit with my students. We sympathized with each other on how hard it can be to thread the needle between leaving students without enough support and doing everything for them so they never get a chance to solve problems or figure out how to answer questions on their own.

Then an hour later I got an email from a former sandwich student (got his PhD at a Chinese university, but got a fellowship to come do two years of his thesis research in my lab). He is just starting up his own lab in Sichuan.

Congratulation in advance for your promotion and I believe you will be an extraordinary scientist in the short future (such as the guy building the atom bomb, hahaha). … I finally realized the hardships of building a laboratory as you told us before. We are now training some undergraduate students in our lab but the process is very hard and I have to do every experiment in person to make these students do not “blow up” the lab.

He sent me a couple of photos of his students, his lab, and the fieldwork they’re doing, which I showed off to everyone I met with for the rest of the day.

Planting potatoes in the mountains of southeast China.
Lang Yan (second from the right) and her first batch of undergraduate student trainees.

So yes. This is what a good day looks like as an assistant professor.

January 27, 2019

The Last Genome I’ll (Probably) Ever Publish: Proso millet (Panicum miliaceum)

Filed under: Uncategorized — James @ 4:05 pm
Proso millet growing in western Nebraska.

I’ve now been a part of the publication of three genomes, all grasses. One as a grad student (Brachypodium distachyon). One as a postdoc (Dichanthelium oligosanthes). And now one as a PI (Panicum miliaceum). Each project had a different motivation: Brachypodium was intended to be a genetic model, selected because it belonged to the same part of the grass family as wheat, barley, rye, and oats, but had a genome that was 1-2 orders of magnitude smaller. Dichanthelium was a comparative-grade genome, picked because it stood between two groups of C4 grasses with sequenced genomes (maize and sorghum on one side, foxtail millet and pearl millet on the other) yet still used C3 photosynthesis, the ancestral state. Panicum miliaceum (proso millet or broomcorn millet) was sequenced because it’s an actual crop people grow on some of the driest cultivated land in the world (like Inner Mongolia and western Nebraska), and having a reference genome sequence really does help with things like genomic selection, marker-assisted selection, and QTL mapping. And each was sequenced using completely different technologies: Sanger sequencing (Brachypodium), Illumina short reads and mate pairs, “next gen sequencing” (Dichanthelium), and PacBio long reads combined with Hi-C, “third gen sequencing” (proso millet). PacBio assemblies are SO MUCH BETTER than what we could manage with Illumina + mate pairs (I realize this is not news to most of you, but it’s one thing to hear it, it’s another to see it for yourself).

Different colors of proso millet grain, all sourced from the USDA NPGS’s amazing germplasm collections.

If I’ve learned one thing from these three experiences, it is that it makes sense to work together with a whole team of people to put together a genome. On the Dichanthelium genome project I was mostly working with a single other postdoc who also thought the potential for comparative genomics/biology of the species was cool. In retrospect we bit off way more than we could chew, and were lucky to make it across the finish line to a paper. For both proso millet and Brachypodium, I had the joy of working with big teams of people, including folks whose whole job was genome assembly and annotation, and they were really REALLY good at it.

A single proso millet spikelet flowering.

So what can I tell you about proso millet? It produces grain more efficiently per unit of water transpired than any other grain crop studied. It can produce grain in fewer days than any other crop I’ve worked with (some varieties are ready for harvest 50-60 days after planting!). It’s an allotetraploid, although so far we’ve only found a diploid lineage related to one of its subgenomes, not the other. One early approach we tried (see Ott et al. below) was to use a technology designed to separate and phase the haplotypes of a diploid human to separate and phase the two subgenomes of an inbred tetraploid individual of proso millet. I’ve actually met farmers in both China and the USA who grow the crop, which is a really nice feeling. With one of my private sector hats on, I’ll get to use this genome to try to make higher yielding varieties of proso millet for those exact farmers. With my main public sector hat on, I’m excited to have a model for NAD-ME C4 photosynthesis that is easier to germinate, grow, and propagate than Panicum hallii or Panicum virgatum. There is nothing like working with wild grasses to make you appreciate the work all of our ancestors did to select against seed dormancy and photoperiod sensitivity while they were domesticating crops from wild species over dozens and hundreds of generations.

Zou C, Miki D, Li D, Tang Q, Xiao L, Rajput S, Deng P, Peng L, Huang R, Zhang M, Sun Y, Hu J, Fu X, Schnable PS, Li F, Zhang H, Feng B, Zhu X, Liu R, Schnable JC, Zhu JK, Zhang H. (2019) “The genome of broomcorn millet.” Nature Communications doi: 10.1038/s41467-019-08409-5

Ott A, Schnable JC, Yeh CT, Wu L, Liu C, Hu HC, Dolgard CL, Sarkar S, Schnable PS. (2018) “Linked read technology for assembling large complex and polyploid genomes.” BMC Genomics doi: 10.1186/s12864-018-5040-z

July 6, 2018

Repost: Correcting genotyping errors when constructing genetic maps from genotyping by sequencing — GBS — data.

Filed under: Uncategorized — James @ 12:15 pm

Editor’s note: this is a repost of an article which originally ran on James and the Giant Corn March 26th, 2017. I’m choosing to post this new, slightly amended version a little more than a year later to mark the publication of the paper describing Genotype Corrector. All told it took approximately 18 months from initial submission to final publication. However, to be fair a lot of that time was spent waiting for a single round of peer review at a different journal from the one in which the paper finally appeared.  

When doing anything even vaguely related to quantitative genetics I would choose more missing data over more genotyping errors any day of the week. There are lots of approaches to making missing data less of a pain. The most straightforward of these is called imputation. Imputation essentially means using the genetic markers where you do have information to guess what the most likely genotypes would be at the markers where you don’t have any direct information on what the genotype is. This is possible because of a phenomenon known as linkage disequilibrium or “LD.” Both imputation and LD deserve their own entire write-ups and they are on the list of potential topics for when I have another slow Sunday afternoon. For now the only thing you have to know about them is that, when information on a specific genetic marker is missing, it is often possible to guess with fairly high accuracy what that missing information SHOULD be. But when the information on a specific genetic marker is WRONG… well it’s usually a bit more of a mess (but I think the software solutions for this are getting better! Details at the end of the post.)
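To make the idea of imputation a little more concrete, here is a deliberately tiny sketch in Python. It is not how Genotype Corrector or any real imputation software works. It just fills in a missing call when the nearest known markers on both sides agree, which is the simplest way to exploit the fact that neighboring markers tend to be inherited together. The function name, the "A"/"B" parental codes, and the "-" symbol for missing data are all assumptions made for this example.

```python
def impute_from_flanks(calls, missing="-"):
    """Fill missing genotype calls along one chromosome of one line.

    `calls` is a string of marker calls in chromosome order, e.g. "A" and
    "B" for the two parents. A missing call is filled only when the nearest
    known calls to its left and right agree; otherwise it stays missing.
    """
    filled = list(calls)
    n = len(calls)
    for i, call in enumerate(calls):
        if call != missing:
            continue
        # Nearest known call to the left and to the right (None if absent).
        left = next((calls[j] for j in range(i - 1, -1, -1)
                     if calls[j] != missing), None)
        right = next((calls[j] for j in range(i + 1, n)
                      if calls[j] != missing), None)
        if left is not None and left == right:
            filled[i] = left
    return "".join(filled)


# A hypothetical recombinant inbred line with one crossover in the middle.
print(impute_from_flanks("AA-A--BBB-B"))  # prints AAAA--BBBBB
```

Note that the two missing markers sitting right at the crossover stay missing, because the flanking calls disagree and there is no way to tell which side of the breakpoint they fall on. Also note that this toy trusts every observed call completely, which is exactly why a single WRONG call is so much more damaging than a missing one.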

Figure 1: Genotype calls along chromosome 1 for six recombinant inbred lines (RILs).


May 3, 2017

Play to your strengths!

Filed under: food,research stories — James @ 12:19 pm

A friend tries her own hand at chopping open coconuts on Hainan Island.

I have a confession to make here. I suck at organic chemistry. Chemistry in general (general chemistry, organic chemistry, biochemistry) was by far my weakest subject in college (Cs the whole way). I even managed to fail organic chemistry lab one semester, which brought me down below a full course load, and I had to organize an appeal to avoid being involuntarily suspended the following semester. It’s always fun to tell this story to new undergrads and/or grad students and watch their eyes get wider and wider as the tale goes on.

The reason I tell them that story (besides trying to help put things into perspective when a kid is worried about getting their first B and is convinced their own imagined future is crumbling before their eyes) is to make the point that it’s okay to be really really good at some things, and suck terribly at others. That’s why we come together as a society. If I’m really good at climbing trees to harvest coconuts, but suck at spearing fish, and you have the opposite skill set, one solution would be for me to spend all my time practicing fish spearing, and you to spend all your time practicing tree climbing. Or I could trade you some of my coconuts for some of your fish, and we’d both have a lot more to eat when we sit down to a delicious feast on the beach as the waves roll in.

I have no idea what these even are, let alone how to make them, but I remember them being really delicious (Beijing 2014).

There is also such a thing as over-specialization. If I’m so focused on harvesting a particular type of coconut that I develop my whole own coconut-focused vocabulary, to the point I cannot even communicate with people who spear fish, or farm taro, I’m going to have a bad time of it out in our hypothetical island world.

Thus ends this fable/analogy/whatever it is.

….also I’ve sucked at spelling since I first learned to write.

April 16, 2017

I may have a slightly skewed idea of what normal work habits are

Filed under: Uncategorized — James @ 9:24 am

I’ve now worked at four different scientific institutions in some capacity or another, and I’m always surprised how empty buildings are when I come in on Saturdays or Sundays. To be clear, I’m certainly not at work every weekend day myself, and I don’t expect the students or collaborators to work weekends.* I’m just realizing that, after 13 years of thinking “wow, people at University X really have a more relaxed approach to research than most places,” maybe my idea of how many hours it is normal for a researcher to log in a week might be a tiny bit skewed.**

 

Although, to be fair, 9:15 AM on Easter Sunday might be the MOST representative time point. 😉

*I always say that my mentoring style is to focus on productivity, not hours worked in lab. I’m still working out what that means in practice. For an entertaining glimpse of what the opposite sounds like (as long as the person writing the e-mail isn’t your boss), be sure to read this classic e-mail from 2002.

**Growing up, I thought every family had dinner around 8 pm once everyone got home from the office, and that once you got a real job, “weekend” actually meant “Sunday morning.”

April 3, 2017

A new chapter

Filed under: Feeding the world,Site Business — James @ 2:38 pm

Whatever anyone tells you, remember to play to your greatest strengths, not your weaknesses.

I’ve been saying it for nearly a decade: pineapples really are awesome

Filed under: genomics,Plants — James @ 9:17 am

With all these new third generation sequencing technologies coming out in 2010, hopefully someone will sequence the pineapple genome. If not, maybe the cost of sequencing will drop enough while I’m in grad school that I can sequence the genome myself (a guy can dream).

An incredibly overused graph, but the reason it’s so overused is that it really is a remarkably useful dataset. Source: https://www.genome.gov/sequencingcostsdata/

Admittedly, I was a bit overly optimistic back in 2010 about how fast the cost of sequencing (and, critically, assembling) genomes would decline. Back then we were all talking about sequencing prices dropping 10x every 1-2 years. This turned out to be a quick burst of innovation brought about by second generation sequencing technologies (primarily 454 at first, then Solexa, which later became Illumina). As with many technologies, there was a lot more low hanging fruit for optimization early on, and the cost of sequencing essentially plateaued from 2011 to 2015.
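For a sense of just how aggressive that “10x every 1-2 years” expectation was, here is a small sketch of the compounding. It uses only the rate quoted above, not the actual cost data from genome.gov, and the six-year span and the function name are arbitrary choices made for illustration.

```python
def fold_decrease(years, years_per_tenfold):
    """Total fold drop in cost if prices fall 10x every `years_per_tenfold` years."""
    if years_per_tenfold <= 0:
        raise ValueError("years_per_tenfold must be positive")
    return 10 ** (years / years_per_tenfold)


span = 6  # hypothetical number of years to extrapolate over
for period in (1, 2):
    print(f"10x every {period} year(s) for {span} years: "
          f"about {fold_decrease(span, period):,.0f}x cheaper")
```

Sustained for six years, those two rates would imply costs falling somewhere between a thousandfold and a millionfold, which is a lot to ask of any technology once the easy optimizations are used up.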

Of course, now we’re finally starting to get those economically viable 3rd generation sequencing technologies I thought were right around the corner in 2010 (PacBio and Oxford Nanopore being the two most successful ones at the moment). And they still have so much headroom for optimization that maybe in another 6-7 years grad students really will be able to generate genome assemblies on a whim.

In the meantime, hey, we did get a pretty cool pineapple genome assembly a couple of years ago.

Ming R., VanBuren R., Wai C. M., Tang H., Schatz M. C., et al., 2015 The pineapple genome and the evolution of CAM photosynthesis. Nat Genet 47: 1435–1442.
Also, here’s a fun video of a 3D scan of the internal structure of a pineapple:

Evidence of my ongoing obsession with pineapples.

Science is fun.

Editor’s note: Robert VanBuren, second author on the pineapple genome and first author on at least one of the dozen or so published grass genome sequences, got his own research group out at MSU working on CAM photosynthesis and drought. Check it out!

March 30, 2017

Dichanthelium oligosanthes (One in a Thousand Series)

Filed under: biology,evolution,genomics,One in One Thousand — James @ 8:46 am

Inflorescence of Dichanthelium oligosanthes. Accession “Kellogg 1175”

Out of the ~12,000 known grass species, the genomes of less than one in one thousand have been sequenced. The “One in a Thousand” series focuses on these rare grass species.

Dichanthelium oligosanthes is a wild grass that grows in forest glades throughout the American Midwest. It is a small plant. Doesn’t grow particularly fast. Its flowers aren’t particularly striking. And it has enough issues with seed dormancy that growing it in captivity is a major pain. Dichanthelium is a one in a thousand grass with a sequenced reference genome.*

The reason folks are interested in Dichanthelium isn’t because of what it is, but who it’s related to. Dichanthelium occupies a spot on the grass family tree between a tribe** of grasses that includes foxtail millet and switchgrass, each one in a thousand species themselves, and another tribe of grasses that includes corn and sorghum, two more one in a thousand species. The relationship looks something like this:

Phylogenetic relationship of Dichanthelium oligosanthes to related grasses with sequenced genomes.

