Why Meta-Analysis?

Why conduct a meta-analysis (MA)?

For a quick overview on what are meta-analysis (MA), please watch this 3-minute video:

Meta-analyses (MAs) can be useful several purposes, including: (1) theory building and evaluation and (2) practical decisions during study design. This section starts with some basics on why MAs are more useful than single studies for those two purposes.

Why are single studies not enough?

When thinking about development, we often look at published experiments testing whether infants have specific abilities, for example whether infants treat native vowels differently from non-native ones, and how those abilities change with age (for more details on this specific topic see [here] (/dataset/inphondb-native.html).

The results of a single experiment cannot answer those questions once and for all: Each experiment measures behavior of a set of infants in a very specific situation, which might not be generalizable to other situations. Moreover, there might be a measurement error in this one-time snapshot of reality. Finally, the literature likely contains some false positives and false negatives, simply due to the way we conduct statistics. With a significance threshold alpha set to .05, every study we run has a 5% chance of telling us that infants can do something when this is not true - this is a false positive. This likelihood becomes bigger when researchers engage in seemingly innocent and possibly common practices that increase the chance of a false positive, such as running multiple analyses until one is significant. Some people even propose that most published literature consists of false positives! With a beta set to .2, every study we run has a 20% chance of telling us that infants cannot do something when they actually can - this is a false negative. In fact, many studies are underpowered, meaning they test too few participants, so that a non-significant result is not due to a true lack of effect, but rather to lack of power to detect it. None of these necessarily are due to bad intentions, wrongdoing, or even poor research practices. Reality is complex and thus any one study can only give us a single, noisy estimate.

How can MAs help?

MAs may be the cheapest way to assess generalizability and test whether a certain factor matters. Instead of running 10 experiments, 1 on each vowel contrast, we collect 10 studies in the literature into a single analysis! If effects for e.g. native versus non-native vowels differ significantly in the literature as a whole, then we can be more confident that results will generalize to unobserved vowel contrasts.

MAs are therefore a tool to collaborate across space and time, instead of having one lab invest a lot of resources. This has the automatic benefit that MAs often also cover more varied participant populations than a single study usually can (for notable exceptions, see for example the ManyBabies Studies.

How can MAs help with false positives?

Collecting many study results from different researchers is a way to try and make up for the possibility that biases influenced the outcome. We can even use MAs to check for biases, such as asking whether a suspicious number of p-values is just below the significance threshold or whether results are systematically skewed in one direction. Why biases matter is wonderfully illustrated here: http://www.alltrials.net/news/the-economist-publication-bias/. Checking for biased results is a whole literature on its own, but as a start tools such as p-curving apps are easily available for every researcher. http://www.p-curve.com/ or http://shinyapps.org/apps/p-checker/ are two well-documented examples.

How can MAs help with false negatives?

MAs help in three ways. By pooling data together, we may be able to bring out a small effect that was too difficult to detect.

Additionally, we often do not know about these non-significant findings because it is quite difficult to publish them. Community-augmented MAs like those in MetaLab provide a home for unpublished results, and allow researchers to benefit from the experience of others.

Finally, MAs help us in experiment design so we can avoid false negatives due to low power. When the size of an effect is known and with a fixed significance threshold, calculating power is straightforward. Here is a simulation of how all ingredients fit together: <rpsychologist.com/d3/NHST/>.

Can MAs help in other ways?

The MAs in MetaLab can also help with study design, because often many design variables have been coded. Examples include the stimuli used, how long trials were, etc. Instead of doing a tiresome literature review, you can find out what is the most common procedure or which is associated with the biggest effect.

Previous