Effect Size – What makes our data meaningful?

While metrics like the p-value tell us whether a result is statistically significant, the effect size tells us whether that result is large enough to be meaningful.

Say we measured toe length in two populations of lizards. After collecting thousands of measurements, we find a statistically significant difference between the two populations.

But is the difference between the populations actually meaningful?

Say our control group has a mean toe length of 50 mm. If the second population has a mean toe length of 51 mm, that may not make much of a difference biologically. In essence, the results are significantly different, but not biologically meaningful.

But let's say instead that the difference between populations was still significant, and the second population's mean toe length was 60 mm. That difference might actually be meaningful! In the context of lizard toe length, generally speaking, the more arboreal a lizard is, the longer its toes. We might actually be able to conclude that the lizards with longer toes are indeed more arboreal!

Both results are significant, so both allow us to reject the null hypothesis. We would then use our understanding of the biology to say that one is biologically meaningful while the other is not. With enough data, we will almost always detect significant differences between populations, even when those differences are tiny.
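To see how sample size alone can push a tiny difference into significance, here is a minimal sketch. It assumes a normal approximation for the test (a t-test would be the usual choice for small samples), equal group sizes, and a shared standard deviation of 5 mm. The standard deviation and the function name two_sample_p_value are made up for illustration; only the 50 mm and 51 mm means come from the example above.

```python
import math


def two_sample_p_value(mean_a, mean_b, sd, n_per_group):
    """Two-sided p-value for a difference in means (normal approximation).

    Assumes both groups share the same standard deviation and sample size.
    """
    # Standard error of the difference between two independent means
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = (mean_b - mean_a) / standard_error
    # Two-sided tail probability of a standard normal: P(|Z| >= |z|)
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical numbers: means of 50 mm and 51 mm, assumed SD of 5 mm.
p_small_sample = two_sample_p_value(50.0, 51.0, sd=5.0, n_per_group=20)
p_large_sample = two_sample_p_value(50.0, 51.0, sd=5.0, n_per_group=5000)

print(f"n = 20 per group:   p = {p_small_sample:.3f}")
print(f"n = 5000 per group: p = {p_large_sample:.2e}")
```

The 1 mm difference is the same in both calls. Only the number of lizards measured changes, yet the p-value drops from well above 0.05 to far below it.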

Given this, how do we determine how meaningful a value is?

That is where effect size comes into play.

Effect size measures the magnitude of difference between populations or the strength of a relationship between two variables. It helps us understand whether a statistically significant result is also meaningful. Understanding effect size can help us interpret statistical results in the context of the research question and real-world implications.

Cohen’s d is a measure of effect size that describes the difference between two means. It is calculated by taking the difference between the means of two groups and dividing it by the pooled standard deviation. As a rule of thumb, a Cohen’s d of around 0.2 is considered a small effect size, around 0.5 a medium effect size, and 0.8 or higher a large effect size.

With that calculation, a Cohen’s d of 1 indicates that the mean difference between populations is equal to 1 standard deviation. A Cohen’s d of 0.5 tells us that the mean difference is only half a standard deviation.
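The calculation described above can be sketched in a few lines of Python. This is one possible implementation, not a reference one: the function name cohens_d and the toe-length measurements are invented for illustration, chosen so that the group means match the 50 mm, 51 mm, and 60 mm values from the lizard example.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    # Pooled SD weights each group's variance by its degrees of freedom
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_b) - statistics.mean(group_a)) / pooled_sd


# Hypothetical toe lengths in mm (illustrative values, not real data)
control = [40, 45, 50, 55, 60]        # mean 50
slightly_longer = [41, 46, 51, 56, 61]  # mean 51
much_longer = [50, 55, 60, 65, 70]      # mean 60

d_small = cohens_d(control, slightly_longer)
d_large = cohens_d(control, much_longer)

print(f"50 mm vs 51 mm: d = {d_small:.2f}")
print(f"50 mm vs 60 mm: d = {d_large:.2f}")
```

With these made-up measurements, the 1 mm difference is only a small fraction of a standard deviation, while the 10 mm difference is more than one full standard deviation.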

There are many metrics for effect size, and the right one depends on which statistical test you are performing. Another common one is the correlation coefficient (r). We’ll cover it in depth during our correlation section, but it is a value from -1 to 1 that tells us the direction and strength of the relationship between two continuous variables. We can visualize that relationship by how tightly the points cluster around a line of best fit.

The closer the r value is to 0, the weaker (and less meaningful) the relationship. The closer the value is to 1 (or to -1 for a negative correlation), the stronger (and more meaningful) the relationship. For example, a value of 0.2 indicates a pretty weak relationship, even if the result is statistically significant. Conversely, a value of 0.8 (closer to the maximum of 1) indicates a rather strong relationship!
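As a rough illustration, the correlation coefficient can be computed by hand. The sketch below assumes the Pearson form of r, and both the pearson_r helper and the data points are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data (illustrative values only)
x = [1, 2, 3, 4, 5]
y_strong = [2, 4, 5, 4, 5]    # points fall fairly close to a rising line
y_weak = [3, 5, 2, 4, 4]      # points scatter widely around the line
y_negative = [5, 4, 3, 2, 1]  # perfectly decreasing

r_strong = pearson_r(x, y_strong)
r_weak = pearson_r(x, y_weak)
r_negative = pearson_r(x, y_negative)

print(f"strong:   r = {r_strong:.2f}")
print(f"weak:     r = {r_weak:.2f}")
print(f"negative: r = {r_negative:.2f}")
```

The sign of r gives the direction of the relationship, and its distance from 0 gives the strength, which is why the perfectly decreasing data lands at the -1 end of the scale.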
