James and the Giant Corn Genetics: Studying the Source Code of Nature

May 31, 2015

Chocolate and Vaccines

Filed under: Uncategorized — James @ 12:28 pm

A colleague of mine from grad school has some thoughts on the recent “chocolate causes weight-loss” brouhaha. If I had to try to distill his piece down to a single sentence it would be:

Teaching people to ask critical questions about things reported as scientific discoveries in the popular press is good, but it is way too easy to fall into the trap of doing so in a way that undermines people’s confidence in science itself.

Three line sentence, arg. Must simplify more.

Making people feel stupid by abusing their trust isn’t a good way to encourage them to believe you in the future.

Maybe it’d be best just to go read the original post.

May 30, 2015

Species capitalization

Filed under: Uncategorized — James @ 11:19 am

How would you format the phrase “maize, sorghum, setaria, rice, and arabidopsis”?

If you don’t understand why this is a question that comes up in publishing scientific papers, you can stop reading now, never come back, and go on to live a happy and fulfilling life without ever revisiting this issue.

(more…)

May 29, 2015

Step 1: Grow Plants

Filed under: Uncategorized — James @ 12:20 pm

Comparative genomics (what I mostly did in grad school) just required a whole bunch of well-assembled genome sequences from different species. Comparative gene regulatory studies (one of the things I do now) require actual living plants. Ideally you can just repurpose datasets which others have already generated and published online, but eventually you hit a wall where the only way to move forward is to grow some plants of your own. And not just one or two plants, but trays and trays of them.

Why do we need so many plants to address even simple research questions? Because even with completely identical genomes, grown in carefully controlled environments, different individual plants will grow slightly differently,* and those growth differences will translate into variation in the levels at which different genes are expressed. So to make sure we’re actually identifying the differences in gene expression that result from [a mutation of a particular transcription factor/differences in growing conditions/different tissues/different species], we need to look at data across lots of individual plants so we can tell which differences have absolutely nothing to do with the thing we are actually trying to study.
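To put some made-up numbers on that, here is a minimal simulation sketch. Everything in it is an assumption for illustration: a single gene with a true expression level of 100 in both groups, plant-to-plant noise with a standard deviation of 20, and the function names themselves. It asks how big a "mutant vs. wild type" difference you would see by chance alone when there is no real difference at all.

```python
import random
import statistics

# Illustrative assumptions only: these values are not measurements.
TRUE_LEVEL = 100.0      # true expression level, identical in both groups
PLANT_NOISE_SD = 20.0   # plant-to-plant variation in measured expression


def simulate_expression(n_plants, rng):
    """Simulate one gene's measured expression in n_plants individual plants."""
    return [rng.gauss(TRUE_LEVEL, PLANT_NOISE_SD) for _ in range(n_plants)]


def chance_difference_spread(n_plants, n_trials=2000, seed=0):
    """Standard deviation of the (mutant mean - wild type mean) difference
    across many simulated experiments in which the true difference is zero."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_trials):
        wild_type = simulate_expression(n_plants, rng)
        mutant = simulate_expression(n_plants, rng)
        diffs.append(statistics.mean(mutant) - statistics.mean(wild_type))
    return statistics.stdev(diffs)


for n in (2, 8, 32):
    print(f"{n:>2} plants per group: chance differences vary with "
          f"SD of about {chance_difference_spread(n):.1f}")
```

Under these assumptions the spread of purely accidental differences shrinks roughly with the square root of the number of plants per group, which is why a handful of plants is rarely enough to separate a real effect from noise.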

All of which is made an awful lot harder when the majority of your seeds are killed by fungus before they even break the surface of the soil!

Each of those little pots should have a happy little corn, sorghum, or setaria plant growing in it. Normally I’d consider >90% good germination and <70% poor germination. This I would call “abysmal.” The corn (top right), on the other hand, seems to be fine.

*Micro-environmental variation, differences in seed size and quality, variation in soil mix, spooky epigenetic “stuff.” Take your pick. I’m inclined to blame it mostly on the first two.

May 27, 2015

James and the Tiny Corn

Filed under: Uncategorized — James @ 9:14 am

Mini maize, 33 days after planting.

Today is the day proposals are due for NSF Plant Genome. Well-organized scientists submitted their proposals back on Friday, before Memorial Day weekend. Scientists like me worked through the weekend and pulled a couple of late nights to finish up the proposal on the day of submission.

But this isn’t a story about grant writing. This is a story of feeling tired and burned out, waiting for people who are proofing said grant before we hit the final “submit” button, and wandering down to my greenhouse to check on my plants. And there I discovered our mini-maize plants*, already silking and with the very first anthers starting to emerge in the tassel! These plants were planted on April 24th and, as I write this, it is May 27th. That is a time to flowering even the very fastest millet species we work with (Japanese and proso) would be hard pressed to match!

Now I could decide to be upset that we didn’t catch it in time to put a shootbag over the emerging silks, but instead this tiny little plant just makes me very happy. For anyone else growing this variety, be aware that the ears really sneak up on you (in this plant the ear shoot never made it past the leaf ligule; it looks like just a bunch of silks). Honestly I’m not sure HOW we’re going to shootbag these plants in the future. We may just have to grow them in greenhouses without any other corn (which is actually reasonably feasible here, since there is a lot of dedicated space for sorghum).

*The mini-maize pictured here comes from seed provided by Morgan McCaw, a member of Jim Birchler’s group at Mizzou. For a detailed description of its genetic history see his abstract “Fast-Flowering Mini-Maize: Seed to Seed in 60 Days Update” from the 2015 Maize Genetics conference.

Editor’s note: if you’re curious, here’s an update from a month later at the end of the mini-maize lifecycle.

Now, here have some more mini-maize photos:

(more…)

May 12, 2015

Two scientific cultures: publication driven vs citation driven

Filed under: Uncategorized — James @ 5:44 pm

Author’s note: found in my “Unpublished Drafts” folder from April 12th 2012. Published May 12th 2015 without edits so as to accurately reflect my mindset at the time. Reflections of a much older (and, if possible, even balder) scientist forthcoming in a separate post.

I would much rather graduate with three papers cited twenty times each than twenty papers cited three times each.*

That fact drives how I think about publishing my results:

If I wanted to publish the maximum number of papers per dataset, I’d be worried about including too much data in any given paper because, once it was published, other researchers might take that data and do the same analyses I was planning to do in a followup paper.

If I want my paper to be cited as much as possible, though, the opposite is true. I WANT my data to be as useful and accessible as possible because it will increase the number of other groups who will use that data and cite my work when they publish their next paper.

It also changes the dynamics of when to publish. If I was trying to maximize my own publications, I would want to make sure I published before anyone else who could scoop me, but I also wouldn’t want to publish earlier than absolutely necessary to avoid being scooped. The longer I can go without publishing the data and analysis of paper #1, the larger the head start I have on paper #2, which builds upon those data and analyses.

Since I want to be cited as much as possible, I want to publish as soon as possible. Full stop. Every month I don’t publish, people go ahead with research projects without whatever small additional benefit my data and analysis could provide, and that means fewer final citations for my papers.

*I don’t expect to achieve either goal in the time remaining to me (well, I might hit the first if I count the giant genome paper where I was one of more than 100 authors and go off the much more rapidly updated citation counts of Google Scholar).

May 11, 2015

My post-graduation career in one figure

Filed under: Uncategorized — James @ 10:13 am

James’s travels Dec 2012-May 2015. Some trips not shown to increase legibility. Click to zoom in.

Figure generated in R starting from this tutorial at FlowingData. Frankly, I still prefer writing Python code to produce R code algorithmically over programming in R directly, but I tend to be stubborn like that.
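For anyone wondering what "Python code to produce R code" looks like in practice, here is a rough sketch of the general idea, not the actual script behind the figure. The trip list, place names, and coordinates are hypothetical placeholders, and I am assuming the generated R uses `map()` from the `maps` package and `gcIntermediate()` from the `geosphere` package to draw great-circle arcs.

```python
# Hypothetical trips as ((name, lon, lat), (name, lon, lat)) pairs.
# These are placeholder coordinates, NOT the real itinerary in the figure.
TRIPS = [
    (("Place A", -122.27, 37.87), ("Place B", -96.70, 40.81)),
    (("Place B", -96.70, 40.81), ("Place C", 116.40, 39.90)),
]


def r_vector(lon, lat):
    """Format a longitude/latitude pair as an R numeric vector literal."""
    return f"c({lon:.2f}, {lat:.2f})"


def build_r_script(trips):
    """Return the text of an R script that draws one arc per trip."""
    lines = [
        "library(maps)",
        "library(geosphere)",
        'map("world", col="#f2f2f2", fill=TRUE, bg="white", lwd=0.05)',
    ]
    for (start_name, lon1, lat1), (end_name, lon2, lat2) in trips:
        lines.append(f"# {start_name} -> {end_name}")
        lines.append(
            f"inter <- gcIntermediate({r_vector(lon1, lat1)}, "
            f"{r_vector(lon2, lat2)}, n=100, addStartEnd=TRUE)"
        )
        lines.append('lines(inter, col="black", lwd=0.8)')
    return "\n".join(lines) + "\n"


# Print the generated R code; redirect it to a file and run it with Rscript.
print(build_r_script(TRIPS))
```

The appeal of doing it this way is that the looping and string handling stay in Python, and R only ever sees a flat list of plotting commands.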
