By Hannah Fraser | December 22, 2017
[This post was originally published on ecoevotransparency.org]
The problem
As you probably already know, researchers in some fields are finding that it’s often not possible to reproduce others’ findings. Fields like psychology and cancer biology have undertaken large-scale coordinated projects aimed at determining how reproducible their research is. There has been no such attempt in ecology and evolutionary biology.
A starting point
Earlier this year Bruna, Chazdon, Errington and Nosek wrote an article arguing that this process should start with reproducing foundational studies. This echoes the early stages of the psychology and cancer biology reproducibility projects, which attempted to reproduce each field's most influential findings. Bruna et al.'s focus was on tropical biology, but I say: why not the whole of ecology and evolutionary biology!
There are many obstacles to this process, most notably obtaining funding and buy-in from researchers, but both are hard to obtain without a clear plan of attack. First off, we need to decide which 'influential' findings we will try to replicate and how we are going to replicate them.
Deciding what qualifies as an influential finding is tricky and can be controversial. The good news is that an article came out this year with the potential to answer this question for us, either directly or indirectly. Courchamp and Bradshaw's (2017) "100 articles every ecologist should read" provides a neat list of candidate influential articles/findings. There are some issues with biases in the list that may make it unsuitable for our purposes, but at least one list is currently being compiled with the express purpose of redressing these biases. Once this is released, it should be straightforward to use some combination of the two lists to identify influential findings and try to replicate them.
What is unique about ecology and evolutionary biology?
In psychology and cancer biology, where reproducibility has been scrutinised, research is primarily conducted indoors and is based on experiments. Work in ecology and evolutionary biology differs in two ways: 1) it is often conducted outdoors, and 2) a substantial portion of it is observational.
Ecology and evolutionary biology are outdoor activities
Conducting research outdoors means that results are influenced by environmental conditions. Environmental conditions fluctuate through time, influencing the likelihood of reproducing a finding in different years. Further, climate change is causing directional changes in environmental conditions, so you might not expect to reproduce a finding from 20 years ago this year. I've talked to a lot of ecologists about this troublesome variation and have been really interested to find two competing interpretations:
- trying to reproduce findings is futile, because you would never know whether any differences reflected the reliability of the original result or were purely due to changes in environmental conditions
- trying to reproduce findings is vital, because there is so much environmental variation that findings might not generalise beyond the exact instance in space and time in which the data were collected; and if that is true, the findings are not very useful.
Ecology and evolutionary biology use observation
Although some studies in ecology and evolutionary biology involve experimentation, many are based on observation. This adds even more variation and can limit and bias how sites/species are sampled. For example, in a study on the impacts of fire, 'burnt' sites are likely to be clustered together in space and to share characteristics that made them more susceptible to burning than the 'unburnt' sites, biasing the sample of sites. Also, fire intensity may have varied even within a single fire, introducing uncontrolled variation. In some ways, this reliance on observational data is one of the greatest limitations in ecology and evolutionary biology. However, I think it is actually a huge asset, because it could make attempts to reproduce findings much more feasible.
Previous reproducibility projects in experimental fields have either focussed on a) collecting and analysing the data exactly according to the methods of the original study, or b) using the data collected for the original analysis and re-running the original analysis. While ‘b’ is quite possible in ecology and evolutionary biology, this kind of test can only tell you whether the analyses are reproducible… not the pattern itself. Collecting the new data required for ‘a’ is expensive and labour intensive. Given limited funding and publishing opportunities for these ‘less novel’ studies, it seems unlikely that many researchers will be willing or able to collect new data to test whether a finding can be reproduced. In an experimental context, examining reproducibility is tied to these two options. However, in observational studies there is no need to reproduce an intervention, so only the measurements and the context of the study need to be replicated. Therefore, it should be possible to use data collected for other studies to evaluate how reproducible a particular finding is.
Even better, many measurements are standard and have already been collected in similar contexts by different researchers. For example, when writing the literature review for my PhD I collated 7 Australian studies that looked at the relationship between the number of woodland birds and tree cover, collected bird data using 2 ha, 20-minute bird counts, and recorded the size of the vegetation patches. It should be possible to use the data from any one of these studies to test whether the findings of another study are reproducible.
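To make that concrete, here is a minimal sketch of what such a re-test might look like: take an independent dataset that used the same standard measurements and re-fit the kind of simple model an original study might have reported, then compare the estimates. The file name, column names and model form below are hypothetical placeholders, not taken from any of the studies mentioned above.

```python
# Minimal sketch: re-fit a published model form on an independent dataset.
# "independent_sites.csv", "bird_richness" and "tree_cover" are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# Independent dataset: one row per 2 ha site, with a 20-minute bird count
# and the surrounding tree cover recorded, as in the studies described above.
new_data = pd.read_csv("independent_sites.csv")

# Re-fit the same simple model form assumed to have been reported by the
# original study (here, a linear regression of bird richness on tree cover).
model = smf.ols("bird_richness ~ tree_cover", data=new_data).fit()

# Compare the sign, size and uncertainty of the slope with the published estimate.
print(model.params["tree_cover"], model.conf_int().loc["tree_cover"].tolist())
```

Whether the slope has the same sign and a similar magnitude to the published estimate is then a (crude) indication of whether the finding reproduces in that dataset.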
Matching the context of the study is a bit trickier. Different inferences can be made from attempts to reproduce findings in closely matching contexts than from attempts in distinctly different contexts. For example, you might interpret a failure to reproduce a finding differently if the context was very similar (e.g. the same species in the same geographic and climatic region) than if it was more different (e.g. a sister species in a different country with the same climatic conditions). To test the reliability of a finding, you should match the context closely. To test the generalisability of a finding, you should match the context less closely. However, determining what matches a study's context is difficult. Do you try to match the conditions where the data were collected, or the conditions the article specifies its results should generalise to? My feeling is that the latter is more relevant but potentially problematic.
In a perfect world, all articles would provide a considered statement about the conditions to which they would expect their results to generalise (Simons et al. 2017). Unfortunately, many articles overgeneralise to increase their probability of publication, which may make their findings appear less reproducible than they would have been had the authors been more realistic about their generalisability.
Where to from here?
This brings me to my grand plan!
I intend to wait a few months to allow the competing list (or possibly lists) of influential ecological articles to be completed and published.
I’ll augment these lists with information on the studies’ data requirements and (where possible) statements from the articles about the generalisability of their findings. I’ll share this list with you all via a blog (and a page that I will eventually create on the Open Science Framework).
Once that’s done I will call for people to check through their datasets to see whether they have any data that could be used to test whether the findings of these articles can be reproduced. I’m hoping that we can all work together to arrange these reproduction attempts (regardless of whether you have data and/or the time and inclination to re-analyse things).
My dream is to have the reproducibility of each finding/article tested across a range of datasets so that we can 1) calculate the overall reproducibility of these influential findings, 2) combine them using meta-analytic techniques to understand the overall effect (a rough sketch of what this step could look like is included below), and 3) try to understand why they may or may not have been reproduced when using different datasets. Anyway, I’m very excited about this! Watch this space for further updates and feel free to contact me directly if you have suggestions or would like to be involved. My email is hannahsfraser@gmail.com.
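For readers curious about point 2, here is a minimal sketch, assuming an effect size and its variance have already been estimated from each replication dataset, of how those estimates could be pooled with a standard random-effects meta-analysis (the DerSimonian-Laird estimator). The numbers at the bottom are placeholders to show the mechanics, not real results.

```python
# Minimal sketch: pool per-dataset effect sizes with a DerSimonian-Laird
# random-effects meta-analysis. Input numbers below are placeholders.
import numpy as np

def random_effects_pool(effects, variances):
    """Return (pooled effect, standard error, between-dataset variance tau^2)."""
    effects = np.asarray(effects, dtype=float)
    variances = np.asarray(variances, dtype=float)
    w = 1.0 / variances                               # fixed-effect weights
    fixed_mean = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - fixed_mean) ** 2)       # heterogeneity statistic Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)     # between-dataset variance
    w_re = 1.0 / (variances + tau2)                   # random-effects weights
    pooled = np.sum(w_re * effects) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    return pooled, se, tau2

# Hypothetical slopes and variances from four replication datasets.
print(random_effects_pool([0.42, 0.15, 0.30, -0.05], [0.04, 0.02, 0.05, 0.03]))
```

The pooled estimate summarises the overall effect across datasets, while tau^2 indicates how much the effect varies between datasets, which speaks to point 3 above.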
[The opinions expressed in this blog post are those of the authors and are not necessarily endorsed by SORTEE.]