Happy Thanksgiving to my fellow residents of the US out there. I've been meaning to post on something that's been in the background of a project I've been thinking about lately, and holiday post-meal, beached-on-the-couch football time is as good a time as any for a little relaxed writing.
The question, then: What exactly is it that we mean when we talk about the ability of a scientific theory or model to offer predictions?
This is by no means a new question, but let me say something about why it's been on my mind lately. In a paper draft I've been hacking around on, and a recent talk at Mississippi State, I've been thinking about a distinction that's become very common in the philosophy of biology, one between what's been called "predictive" and "vernacular" notions of fitness (originally from Matthen and Ariew 2002).
In some cases, it seems pretty obvious what we'd mean when we say that a theory offers predictions. The Ptolemaic theory tells you that if you look over here, you'll see Mars, and you do.1 But predictive fitness doesn't seem to be one of those obvious cases. It's intended to track a common usage of fitness in population genetics, where we model change in a population (say, the ratio of two types p and q over time) with an equation like
p_t / q_t = w^t (p_0 / q_0)
The predictive fitness is this w variable – the parameter that, pretty directly, lets you make predictions about future trait ratios (p_t / q_t) on the basis of past trait ratios (p_0 / q_0).
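To make that concrete, here's a minimal sketch in Python (my own toy rendering, not anything from the literature) of the model above, taking w as the relative fitness of type p over type q – the trait ratio changes by a factor of w each generation, which is exactly what the closed-form equation says:

```python
# Toy sketch of the haploid, asexual, non-overlapping-generations model.
# Here w is the relative fitness of type p over type q: p-types leave w
# offspring for every one a q-type leaves, each generation.

def predicted_ratio(p0, q0, w, t):
    """Closed-form prediction: p_t / q_t = w**t * (p0 / q0)."""
    return (w ** t) * (p0 / q0)

def simulated_ratio(p0, q0, w, t):
    """Same model, stepped generation by generation."""
    p, q = float(p0), float(q0)
    for _ in range(t):
        p *= w  # the p subpopulation grows by a factor of w relative to q
    return p / q

# The generation-by-generation recursion and the closed form agree:
assert abs(predicted_ratio(100, 100, 1.1, 20)
           - simulated_ratio(100, 100, 1.1, 20)) < 1e-6
```

The point of the sketch is just how little the model asks of the world: one number per type, deterministic growth, nothing else.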
That does, of course, seem pretty predictive. But that simple formalization almost never applies in the natural world – it requires asexual reproduction, haploid organisms, and non-overlapping generations (to say nothing of the absence of a million other complicating factors).
You can, of course, complicate the model (and I've done some of that work myself). What you're doing when you do that is trying to describe the central tendency of the massive diversity of evolutionary outcomes that could occur for any particular organism or population. That central tendency is the best guess you'd have – the prediction you should offer if you were asked what would happen to that organism or population over time.
In a nutshell, that's predictive fitness. But the more I think about this, the weirder it seems. While that central tendency is surely informative about what will happen to organisms, it doesn't seem to me like it's actually a good predictor. Most of the time, it will be inferred from a tiny sample of data, unrepresentative of what those evolutionary outcomes will be like. And even when we have a strong inferential basis (say, the data about evolutionary outcomes in Richard Lenski and colleagues' Long-Term Evolution Experiment), we can still have evolutionary influences (e.g., chaotic population dynamics) that render these predictions pretty well meaningless.2
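To put a toy number on the small-sample worry: here's a quick Python sketch of estimating a fitness parameter from observed offspring counts. Everything in it – the Gaussian noise model, the sample sizes, the "true" value of 1.1 – is an illustrative assumption of mine, not anything from the biology. The estimate from a handful of observations swings all over the place, and any prediction built on it inherits that noise:

```python
# Toy illustration (all numbers and noise models are my own assumptions):
# estimating a relative-fitness parameter w from n observed offspring
# counts per type. Small n => wildly variable estimates of w.
import random

random.seed(1)

def estimate_w(n, true_w=1.1):
    # Offspring counts for p-types (mean true_w) and q-types (mean 1.0),
    # with arbitrary Gaussian noise standing in for real-world variation.
    p_off = [random.gauss(true_w, 0.5) for _ in range(n)]
    q_off = [random.gauss(1.0, 0.5) for _ in range(n)]
    return (sum(p_off) / n) / (sum(q_off) / n)

small = [estimate_w(5) for _ in range(1000)]    # 5 observations per type
large = [estimate_w(500) for _ in range(1000)]  # 500 observations per type

small_spread = max(small) - min(small)
large_spread = max(large) - min(large)
# The small-sample estimates are spread far more widely around 1.1.
```

Nothing deep here – it's just the familiar point that a tiny sample gives you a noisy estimate – but it's worth remembering that this noise sits underneath every real-world use of predictive fitness.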
But predictions can always fail – as Lakatos taught us, every scientific theory says at least a little something false, is "born falsified." So to figure out what to do with this, we need to think about how and why it is that predictions can fail. To that end, I want to start thinking my way through a taxonomy of predictive failure, one that I admittedly don't think is yet complete. Thinking through this is much of the reason I wanted to write up this material.
1. You might not have a model available to cover the empirical case you're worried about. This certainly has happened some in the history of the debate over fitness. For example, a variety of counterexamples to the propensity interpretation of fitness (PIF) had to do with the inability of the mathematical model offered by its proponents to encompass certain kinds of evolutionary scenarios, such as those (famously, and much overused in philosophy circles) described by Gillespie. That's plausibly less of a concern after the publication of the model that Ramsey and I offered a couple years ago, though I'm still working through some of the details there and trying to tighten up the mathematical and metaphysical nuts and bolts. But let's keep going.
2. You might just not be able to get enough data to make good predictions. That does seem to be a common problem with predictive fitness. If we take fitness properties to be relative to and defined by things like environmental influences, pleiotropic effects (the effect of traits on other traits, or the sensitivity of a particular trait to the rest of the genetic background), and so forth, then it's highly unlikely we'll ever be able to observe enough organisms (again, cases like the Long-Term Evolution Experiment excepted) to really make a solid prediction. But that's not a special kind of predictive failure specific to predictive fitness – it's just the same as not being able to predict where a thrown ball will land if you can't take enough data points.
3. You might not be able to compute predictions even if you had enough data. This is the case, for example, in some areas of quantum mechanics, where we simply don't have the computational power to derive predictions from sufficiently complicated wave functions, no matter how much data we had. It doesn't seem to be an issue with respect to fitness, because the models simply aren't computationally problematic in that way.
4. You might not be able to get good predictions even if you had enough data and could compute the best available prediction. This seems to me to be a problem in at least some cases with predictive fitness. For example, if long-term population dynamics are chaotic, then whatever the central tendency of evolutionary outcomes might be, it won't be at all meaningful for actually understanding what would happen to a particular population. (There's some simulation-based evidence that this could occur in natural populations.)
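This last failure mode is easy to see in a toy model. The discrete logistic map below is my stand-in, not a model from any of the biology discussed here, but it's a standard illustration of chaotic population dynamics: two populations whose starting densities differ by one part in a million end up on completely unrelated trajectories, so the central tendency across many such runs predicts neither one.

```python
# Toy illustration (the logistic map, a standard chaos example; not a model
# from the post): sensitive dependence on initial conditions makes
# central-tendency predictions useless for any particular trajectory.

def logistic_step(x, r=3.9):
    # r = 3.9 puts the map well inside its chaotic regime
    return r * x * (1 - x)

def trajectory(x0, steps=50, r=3.9):
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic_step(xs[-1], r))
    return xs

a = trajectory(0.200000)
b = trajectory(0.200001)  # initial density differs by one part in a million

# By the last ten steps the two trajectories have decorrelated entirely.
divergence = max(abs(x - y) for x, y in zip(a[-10:], b[-10:]))
```

The analogy is loose – real population-genetic models are richer than a one-dimensional map – but the structural point carries over: when the dynamics are chaotic, knowing the best available central tendency doesn't get you a meaningful prediction.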
So I think the question – for another time and another post – has to be this: if predictive fitness isn't actually useful for predicting things, then why do we analyze fitness using this dichotomy in the first place? And how could we do better?
But what other forms of prediction failure am I missing? Are there other kinds of prediction success or failure that are relevant here? Do biologists actually use this central-tendency prediction in some way that I'm not taking into account?
1. Perhaps apocryphally, I've heard (maybe from my undergraduate history of science instructor, the late Michael Mahoney?) that for a long time, Ptolemaic astronomy was taught to Navy sailors as the mathematical fundamentals for navigating using the stars. The mathematics is so much easier, and the precision close enough, that it makes no sense to actually teach people Newtonian mechanics. (Much less general relativity, of course.)
2. I'm also not dealing here with the question of whether or not biologists actually do this. They don't, as far as I can tell, in almost any context except for something like building approximated computer models.