Statistically Significant
It’s a phrase that’s packed with both meaning and syllables. It’s hard to say and harder to understand. Yet it’s one of the most common phrases heard when dealing with quantitative methods. While the phrase statistically significant represents the result of a rational exercise with numbers, it has a way of evoking just as much emotion: bewilderment, resentment, confusion, and even arrogance (for those in the know). I’ve unpacked the most important concepts to help you the next time you hear the phrase.

Not Due to Chance

In principle, a statistically significant result (usually a difference) is a result that’s not attributed to chance. More technically, it means that if the Null Hypothesis is true (that is, there really is no difference), there’s a low probability of getting a result that large or larger. Statisticians get really picky about the definition of statistical significance, and use confusing jargon to build a complicated definition. While it’s important to be clear on what statistical significance means technically, it’s just as important to be clear on what it means practically. Consider these two important factors.
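One way to make the "low probability if the Null Hypothesis is true" idea concrete is a quick simulation: assume there's truly no difference between two pages, and count how often chance alone produces a difference as large as the one observed. This is just an illustrative sketch with hypothetical numbers (a 10% conversion rate, 200 users per page, and an observed 5-point difference), not a calculation from the article:

```python
# Simulation sketch of the p-value idea: under the null hypothesis
# (no real difference), how often does chance alone produce a
# difference at least as large as the one we observed?
# All numbers here are hypothetical illustrations.
import random

random.seed(1)
n, true_rate = 200, 0.10        # both pages truly convert at 10% under H0
observed_diff = 0.05            # the difference we saw in our sample

trials = 10_000
extreme = 0
for _ in range(trials):
    a = sum(random.random() < true_rate for _ in range(n)) / n
    b = sum(random.random() < true_rate for _ in range(n)) / n
    if abs(a - b) >= observed_diff:
        extreme += 1

print(f"simulated p-value: {extreme / trials:.3f}")
```

The simulated proportion of "extreme" chance results approximates the p-value a formal test would report for these inputs.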
What it Means in Practice

Let’s look at a common scenario of A/B testing with, say, 435 users. During a week, they are randomly served either website landing page A or website landing page B.
Do we have evidence that future users will click on landing page A more often than on landing page B? Can we reliably attribute the 5-percentage-point difference in click-through rates to the effectiveness of one landing page over the other, or is it random noise?

How Do We Get Statistical Significance?

The test we use to detect a statistical difference depends on our metric type and on whether we’re comparing the same users (within subjects) or different users (between subjects) on the designs. To compare two conversion rates in an A/B test, as we’re doing here, we use a test of two proportions on different users (between subjects). These can be computed using the online calculator or the downloadable Excel calculator. Below is a screenshot of the results using the A/B test calculator. To determine whether the observed difference is statistically significant, we look at two outputs of our statistical test: the p-value and the confidence interval around the difference.
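The kind of calculation those calculators perform can be sketched directly. Below is a minimal two-proportion z-test with a 95% confidence interval for the difference; the conversion counts (20 of 218 vs. 9 of 217) are hypothetical, chosen only to mirror the 435-user scenario, not taken from the article's data:

```python
# Minimal sketch of a between-subjects test of two proportions.
# The conversion counts are hypothetical illustrations.
from math import sqrt, erfc

def two_proportion_test(x1, n1, x2, n2, z_crit=1.96):
    """Two-sided z-test and 95% CI for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Pooled standard error under the null hypothesis (used for the test)
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    p_value = erfc(abs(z) / sqrt(2))    # two-sided p-value from the normal CDF
    # Unpooled standard error (used for the confidence interval)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    return diff, p_value, ci

# Hypothetical split of 435 users: 20/218 conversions vs. 9/217
diff, p, (lo, hi) = two_proportion_test(20, 218, 9, 217)
print(f"diff = {diff:.3f}, p = {p:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

When the p-value falls below the chosen threshold (conventionally 0.05) and the confidence interval around the difference excludes zero, we declare the difference statistically significant.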
What it Doesn’t Mean

Statistical significance does not mean practical significance. The word “significance” in everyday usage connotes consequence and noteworthiness. Just because you get a low p-value and conclude a difference is statistically significant doesn’t mean the difference will automatically be important. It’s an unfortunate consequence of the words Sir Ronald Fisher used when describing the method of statistical testing. To declare practical significance, we need to determine whether the size of the difference is meaningful. In our conversion example, one landing page is generating more than twice as many conversions as the other. This is a relatively large difference for A/B testing, so in most cases this statistical difference has practical significance as well. The lower boundary of the confidence interval around the difference also leads us to expect at least a 1% improvement. Whether that’s enough to have a practical (or meaningful) impact on sales or the website experience depends on the context.

Sample Size

As we might expect, the likelihood of obtaining statistically significant results increases as our sample size increases. For example, in analyzing the conversion rates of a high-traffic ecommerce website, two-thirds of users saw the current ad being tested and the other third saw the new ad.
The difference in conversion rates is statistically significant (p = 0.039) but, at 0.0006%, tiny, and likely of no practical significance. However, since the new ad now exists, and since a modest increase is better than none, we might as well use it (oh, and just in case you thought a lot of people clicked on ads, let this remind you that they don’t!). Conversely, small sample sizes (say, fewer than 50 users) make it harder to find statistical significance; but when we do find statistical significance with small sample sizes, the differences are large and more likely to drive action. Some standardized ways of expressing differences, called effect sizes, help us interpret the size of a difference. Here, too, the context determines whether the difference warrants action.

Conclusion and Summary

Here’s a recap of statistical significance:

Statistically significant means a result is unlikely due to chance.
The p-value is the probability of obtaining the difference we saw from a sample (or a larger one) if there really isn’t a difference for all users.
Statistical significance doesn’t mean practical significance; whether a difference is large enough to matter depends on the context.
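As a coda to the effect-size point above, Cohen's h is one common standardized effect size for the difference between two proportions. A minimal sketch (the conversion rates below are hypothetical illustrations, not the examples' actual data):

```python
# Sketch of Cohen's h, a standardized effect size for two proportions.
# The rates below are hypothetical illustrations.
from math import asin, sqrt

def cohens_h(p1, p2):
    """Cohen's h: arcsine-transformed difference between two proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))

# A huge sample can make a minuscule difference statistically significant...
print(f"h = {cohens_h(0.000020, 0.000014):.4f}")   # negligible effect
# ...while the same scale shows this difference is substantively larger.
print(f"h = {cohens_h(0.09, 0.04):.4f}")           # small-to-medium effect
```

By convention, h around 0.2 is considered small, 0.5 medium, and 0.8 large, which gives a context-independent yardstick for whether a statistically significant difference is also a big one.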
Now say statistically significant three times fast.