How would you format the phrase “maize, sorghum, setaria, rice, and arabidopsis”?
If you don’t understand why this is a question that comes up in publishing scientific papers, you can stop reading now, never come back, and go on to live a happy and fulfilling life without ever revisiting this issue.
Comparative genomics (what I mostly did in grad school) just required a whole bunch of well-assembled genome sequences from different species. Comparative gene regulatory studies (one of the things I do now) require actual living plants. Ideally you can just repurpose datasets which others have already generated and published online, but eventually you hit a wall where the only way to move forward is to grow some plants of your own. And not just one or two plants, but trays and trays of them.
Why do we need so many plants to address even simple research questions? Because even with completely identical genomes, grown in carefully controlled environments, different individual plants will grow slightly differently,* and those growth differences will translate into variation in the levels at which different genes are expressed. So to make sure we’re actually identifying the differences in gene expression that result from [a mutation of a particular transcription factor/differences in growing conditions/different tissues/different species] we need to look at data across lots of individual plants so we can tell which differences have absolutely nothing to do with the thing we are actually trying to study.
All of which is made an awful lot harder when the majority of your seeds are killed by fungus before they even break the surface of the soil!
Each of those little pots should have a happy little corn, sorghum, or setaria plant growing in it. Normally I’d consider >90% good germination and <70% poor germination. This I would call “abysmal.” The corn (top right), on the other hand, seems to be fine.
*Micro-environmental variation, differences in seed size and quality, variation in soil mix, spooky epigenetic “stuff.” Take your pick. I’m inclined to blame it mostly on the first two.
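For anyone who wants to see the "why so many plants" argument in numbers, here is a rough sketch in Python. Everything in it is made up for illustration and is not data from any real experiment: a hypothetical gene with a true expression difference of 1 unit between control and mutant plants, and plant-to-plant noise with a standard deviation of 2. The function names are likewise hypothetical. It repeats the same simulated experiment many times and reports how much the estimated difference bounces around for a given number of plants per group.

```python
import random
import statistics


def simulate_expression(true_mean, plant_sd, n_plants, rng):
    """Draw one expression value per plant: the true mean plus plant-to-plant noise."""
    return [rng.gauss(true_mean, plant_sd) for _ in range(n_plants)]


def spread_of_estimates(n_plants, n_trials=2000, seed=0):
    """Repeat the experiment n_trials times and return the standard deviation
    of the estimated mutant-minus-control difference across those repeats."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_trials):
        # Assumed values: control mean 10, mutant mean 11, noise SD 2.
        control = simulate_expression(10.0, 2.0, n_plants, rng)
        mutant = simulate_expression(11.0, 2.0, n_plants, rng)
        estimates.append(statistics.mean(mutant) - statistics.mean(control))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    for n in (3, 12, 48):
        print(f"{n} plants per group: spread of estimate = {spread_of_estimates(n):.2f}")
```

Under these assumptions the spread shrinks roughly with the square root of the number of plants, so with only a handful of plants per group the noise is larger than the 1-unit difference being measured.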
Mini maize, 33 days after planting.
Today is the day proposals are due for NSF Plant Genome. Well-organized scientists submitted their proposals back on Friday, before Memorial Day weekend. Scientists like me worked through the weekend and pulled a couple of late nights to finish up the proposal on the day of submission.
But this isn’t a story about grant writing. This is a story of feeling tired and burned out, waiting for people who are proofing said grant before we hit the final “submit” button, and wandering down to my greenhouse to check on my plants. And there I discovered our mini-maize plants*, already silking and with the very first anthers starting to emerge in the tassel! These plants were planted on April 24th and as I write this, it is May 27th. That is a time to flowering even the very fastest millet species we work with (Japanese and proso) would be hard pressed to match!
Now I could decide to be upset that we didn’t catch it in time to put a shootbag over the emerging silks, but instead this tiny little plant just makes me very happy. For anyone else growing this variety, be aware that the ears really sneak up on you (in this plant the ear shoot never made it past the leaf ligule, it looks like just a bunch of silks). Honestly I’m not sure HOW we’re going to shootbag these plants in the future. We may just have to grow them in greenhouses without any other corn (which is actually reasonably feasible here, there is a lot of dedicated space for sorghum).
*The mini-maize pictured here comes from seed provided by Morgan McCaw, a member of Jim Birchler’s group at Mizzou. For a detailed description of its genetic history see his abstract “Fast-Flowering Mini-Maize: Seed to Seed in 60 Days Update” from the 2015 Maize Genetics conference.
Editor’s note: if you’re curious, here’s an update from a month later at the end of the mini-maize lifecycle.
Now here, have some more mini-maize photos:
Author’s note: found in my “Unpublished Drafts” folder from April 12th 2012. Published May 12th 2015 without edits so as to accurately reflect my mindset at the time. Reflections of a much older and (if possible) even balder scientist forthcoming in a separate post.
I would much rather graduate with three papers cited twenty times each than twenty papers cited three times each.*
That fact drives how I think about publishing my results:
If I wanted to publish the maximum number of papers per dataset, I’d be worried about including too much data in any given paper because, once it was published, other researchers might take that data and do the same analyses I was planning to do in a followup paper.
If I want my paper to be cited as much as possible, though, the opposite is true. I WANT my data to be as useful and accessible as possible because it will increase the number of other groups who will use that data, and cite my work when they publish their next paper.
It also changes the dynamics of when to publish. If I was trying to maximize my own publications, I would want to make sure I published before anyone else who could scoop me, but I also wouldn’t want to publish earlier than absolutely necessary to avoid being scooped. The longer I can go without publishing the data and analysis of paper #1, the larger the headstart I have at paper #2 which builds upon those data and analyses.
Since I want to be cited as much as possible, I want to publish as soon as possible. Full stop. Every month I don’t publish, people go ahead with research projects without whatever small additional benefit my data and analysis could provide, and that means fewer final citations for my papers.
*I don’t expect to achieve either goal in the time remaining to me (well, I might hit the first if I count the giant genome paper where I was one of more than 100 authors and go off the much more rapidly updated citation counts of Google Scholar).
James’s travels Dec 2012-May 2015. Some trips not shown to increase legibility. Click to zoom in.
Figure generated in R starting from this tutorial at FlowingData. Frankly I still prefer writing Python code to produce R code algorithmically to programming in R directly, but I tend to be stubborn like that.
This will be my third year attending the Plant and Animal Genome conference in sunny San Diego. I’ve been fortunate enough to get to experience the conference in a bunch of different roles.
- My first year I was an overwhelmed young grad student with a poster and the silly idea that I could pack my schedule full of sessions all day every day without suffering melting of the brain. (You really need to pick and choose at PAG. It’s like an all-you-can-eat buffet of science; it is all too easy to go overboard.)
- My second year I returned to PAG as an actual presenter giving two talks to packed sessions (which isn’t an endorsement of my own science; I was sandwiched between successful scientists who also happened to be gifted speakers both times).
- And now in my third year I’ll get to see PAG through the eyes of an exhibitor. No, this doesn’t represent my post-PhD career path. This year PAG happened to fall in the break between filing my dissertation and the start of my next “real” job.
Anyway, my plane is about to board so I should wrap this up. To all the rest of you who are coming to the conference, hope you have a great conference, don’t push yourselves too hard, and drop me a line if you’d like me to hook you up with a free t-shirt. 😉
An old PhDComics explains the change in perspective which comes with graduating:
My transformation obviously isn’t complete yet though. Lab meetings with pizza sounds like a wonderful idea.
Over the last couple of years my posts here have really dropped off. It hasn’t been because I ran out of material or lost interest in blogging but simply because more and more of my time and energy have been consumed by a single goal… graduating.
So it gives me great pleasure to report that, as of December 14th (last Friday), I have reached that goal.
Behold! The lollipop handed to every newly minted Berkeley PhD when their thesis is accepted.
What was my thesis about you ask? Well I still don’t have a good elevator speech, so let me simply say that the first part of my thesis has to do with how plant genomes change over time and the second part demonstrated a new method for learning the function of pieces of DNA which don’t code for proteins but instead determine where and when neighboring genes will be turned on or off.
So what’s next? This whole site traces its origins back to travel posts I put up to let friends and family know how I was doing as I interviewed at various graduate schools. So I suppose there would be a fair bit of symmetry to shutting it down as I leave grad school, but I don’t want to do that. Now that I’m finished with my PhD, I’m looking forward to rediscovering the things I used to do for fun, and I remember writing updates here used to be a lot of fun.
On a more practical level, what comes next for me is a 2,000-mile drive from California to the Midwest (with all my worldly possessions packed into the back of my car) to visit family for the holidays. I am suddenly very conscious of the fact I haven’t driven on snow in more than four years. After that it’ll be onward to a post-doc.
If you’ve left an unanswered comment in the last six months or so and are still interested in me getting back to you, let me know.
For now… it is good to be back.
Because I get so many questions about this step in one of my published papers. (Well more accurately, my PI gets questions about this step and he sometimes forwards them on to me for an answer). The paper referred to in this guide is this one.
There are two completely different steps to reconstructing maize subgenomes: 1) putting together ancestral chromosome pairs, and 2) grouping one copy of each ancestral chromosome together into subgenome 1 and the other copy of each ancestral chromosome into subgenome 2.
Ancestral chromosome pair reconstruction: (more…)
Success in grad school doesn’t come from working incredibly hard.
It comes from setting unrealistically fast deadlines for yourself. And then meeting them.
Sometimes that means working early mornings, late nights, and weekends. Sometimes it means coming up with a new approach, getting the results in three hours, and sneaking out of lab at 3:30. But the point is the results are what matter. If you can find ways to be unexpectedly productive you’re much less likely to burn out entirely than if you can only ever meet your own deadlines by burning the midnight oil at both ends (mixed metaphor intended).
Working hard for the sake of appearing to work hard (either to others or to yourself) is the surest road to burnout and lack of results.
P.S. Productivity goes up at least 5-fold when not also teaching. 😀
P.P.S. If the reagents you are working with are as old as you are, you need to worry. 😉 (That falls into the working hard but not getting results category.)