ANOVA for All Occasions
Authors: Diana Martinez, Ph.D. and Satya Kudapa
Analysis of Variance – commonly shortened to ANOVA – was introduced by the British statistician Sir Ronald Fisher (1890-1962) in 1925. An extremely useful tool for LSS practitioners, ANOVA is a type of hypothesis test to determine whether the means of different populations are the same or different, based on sample data. Note that, despite its name, ANOVA is NOT used to tell if there is a difference in the variance of different populations.
When is ANOVA appropriate to use, what questions can it help us answer, and what are the alternatives when Classical ANOVA cannot be used? It turns out there is another option which works just fine under many conditions – Welch’s ANOVA. More on that later.
First – a review of Classical Analysis of Variance. Using terminology from statistical textbooks, ANOVA determines whether changes in a factor level (input X) result in changes in the response (output Y). The X is a categorical variable and the Y is a continuous variable. Unlike the 2-sample T-test, ANOVA can handle three or more levels.
Consider the case of a Black Belt who must select a provider for a metal part from four different possible suppliers. In this case the factor (X) is supplier. The levels are specific values of the factor – suppliers A, B, C, and D. And the response (Y) is a critical customer requirement, in this case the part diameter in inches.
The null hypothesis for ANOVA is that all population means are the same, and the alternative hypothesis is that at least one population mean is different. In terms of our example, the null is that all suppliers have the same mean population diameter. In other words, they are equivalent: all four suppliers share the same population mean. The alternative hypothesis is that at least one of them has a different population mean. Other examples include comparing the impact of marketing strategies on sales, evaluating the performance of check-in processes at different hospitals, and assessing the reliability of equipment from different manufacturers.
For the sake of this blog we will limit our discussion to One-Way (One-Factor) ANOVA, which assumes a single categorical variable (X) with three or more levels and one continuous dependent variable (Y).
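To make the supplier example concrete, here is a minimal sketch of a one-way ANOVA in Python using SciPy's f_oneway. The diameter values are made up purely for illustration, not taken from a real project.

```python
# Minimal one-way ANOVA sketch (Python + SciPy).
# The diameter values below are made-up illustration data.
from scipy import stats

supplier_a = [1.002, 0.998, 1.001, 0.999, 1.003, 1.000]
supplier_b = [1.005, 1.007, 1.004, 1.006, 1.008, 1.005]
supplier_c = [0.997, 0.999, 0.998, 1.000, 0.996, 0.998]
supplier_d = [1.001, 1.000, 1.002, 0.999, 1.001, 1.003]

# H0: all four suppliers have the same mean diameter.
# Ha: at least one supplier has a different mean diameter.
f_stat, p_value = stats.f_oneway(supplier_a, supplier_b, supplier_c, supplier_d)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

alpha = 0.05
if p_value < alpha:
    print("Reject H0: at least one supplier mean differs.")
else:
    print("Fail to reject H0: no evidence the supplier means differ.")
```

A p-value below your chosen alpha would suggest at least one supplier's mean diameter differs from the others.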
Now to the question of when to use Analysis of Variance. As with many statistical tools, ANOVA has some assumptions that need to be confirmed before it is used. The assumptions below apply to the Response (Y) data at each factor level (X); a code sketch of these checks appears below the list:
- Data are normally distributed (not a rigid assumption). This can be checked with a normality test and probability plot, or visually with histograms.
- Data have equal variance (a more important assumption). This can be checked with a test for equal variances, such as Levene's test.
- Data are independent (a required assumption). Data should be randomly selected from the corresponding probability distribution and be independent of the responses for any other “X” (i.e., other factor levels). This can be checked with time series plots; independent data should display a random pattern on such charts.
Note: If the assumption of independence is violated, there is no fix within ANOVA itself. The appropriate approach is to find the source of the non-independence, correct it, and re-sample the data.
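As a rough illustration, here is one way these three checks might look in Python. The Shapiro-Wilk test, Levene's test, and a simple run-order plot are reasonable (but not the only) choices, and the supplier data are the same made-up values used in the earlier sketch.

```python
# Quick checks of the three ANOVA assumptions, per factor level (illustrative only).
from scipy import stats
import matplotlib.pyplot as plt

# Same made-up supplier diameter data as in the earlier sketch.
samples = {
    "A": [1.002, 0.998, 1.001, 0.999, 1.003, 1.000],
    "B": [1.005, 1.007, 1.004, 1.006, 1.008, 1.005],
    "C": [0.997, 0.999, 0.998, 1.000, 0.996, 0.998],
    "D": [1.001, 1.000, 1.002, 0.999, 1.001, 1.003],
}

# 1. Normality: Shapiro-Wilk test for each supplier (a large p suggests normality is plausible).
for name, data in samples.items():
    w_stat, p_norm = stats.shapiro(data)
    print(f"Supplier {name}: Shapiro-Wilk p = {p_norm:.3f}")

# 2. Equal variance: Levene's test across all suppliers.
lev_stat, p_lev = stats.levene(*samples.values())
print(f"Levene's test p = {p_lev:.3f}")

# 3. Independence: plot each sample in run order and look for a random pattern.
for name, data in samples.items():
    plt.plot(range(1, len(data) + 1), data, marker="o", label=f"Supplier {name}")
plt.xlabel("Run order")
plt.ylabel("Diameter (inches)")
plt.legend()
plt.show()
```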
In many practical applications of ANOVA, it is common for one or more of these assumptions not to be met. As noted above, the requirement of data independence (assumption #3) must be met. But what if your project has data that are not normally distributed (assumption #1) for one or more factor levels?
Normality Requirement: If one or more data sets are non-normal, ANOVA will still work, since it is quite robust to departures from normality thanks to the Central Limit Theorem. The normality requirement is even less important for balanced designs. A balanced design is one where the sample size is the same for all factor levels. So if you want to compare, say, the output of three different shifts in a factory, you should be OK as long as the number of data points is the same for all three shifts.
There is another option for the situation of non-normal data: a non-parametric test. These hypothesis tests work by analyzing differences in medians, not means. The non-parametric counterparts of one-way ANOVA are Mood's Median test and the Kruskal-Wallis test.
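Here is a brief sketch of both non-parametric tests in Python using SciPy's kruskal and median_test functions; the group values are invented for illustration.

```python
# Non-parametric alternatives when normality is questionable (illustrative data).
from scipy import stats

group_1 = [12.1, 14.3, 11.8, 13.5, 15.0, 12.7]
group_2 = [16.2, 15.8, 17.1, 16.5, 18.0, 15.9]
group_3 = [13.0, 12.5, 14.1, 13.8, 12.9, 13.3]

# Kruskal-Wallis H-test: do the groups come from populations with the same median?
h_stat, p_kw = stats.kruskal(group_1, group_2, group_3)
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_kw:.4f}")

# Mood's median test: SciPy exposes this as median_test.
chi2, p_med, grand_median, table = stats.median_test(group_1, group_2, group_3)
print(f"Mood's median test: chi2 = {chi2:.2f}, p = {p_med:.4f}")
```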
Equal Variance Requirement: When the equal variance assumption is violated, ANOVA results might not be accurate, since the likelihood of making a Type I error could be higher. Such errors result in a false positive: concluding there is a difference between factor levels when in reality there is none. A safer statistical approach is to use Welch's ANOVA. This test also works when data are not normally distributed.
Welch's ANOVA traces back to Bernard Lewis Welch (1911-1989) – yes, another British statistician – who developed Welch's T-test in 1947. That test is used when the categorical variable (X) has only two levels and the Y data have unequal variances. It was later generalized to more than two levels – hence, Welch's ANOVA.
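If your statistics package does not include Welch's ANOVA directly, the Welch F statistic can be computed from its textbook definition, as in the sketch below (packages such as pingouin also offer a ready-made version). The group data here are made up, with unequal spreads and sample sizes on purpose.

```python
# Welch's ANOVA sketch: compute the Welch F statistic and p-value directly.
# Group data are made up for illustration; unequal spreads and sizes on purpose.
import numpy as np
from scipy import stats

groups = [
    np.array([10.1, 10.4, 9.8, 10.2, 10.0, 10.3]),         # small spread
    np.array([11.5, 12.8, 10.2, 13.1, 11.9, 12.4, 10.8]),  # larger spread
    np.array([9.0, 11.2, 10.5, 8.7, 11.8]),                # different sample size
]

k = len(groups)
n = np.array([len(g) for g in groups])
means = np.array([g.mean() for g in groups])
variances = np.array([g.var(ddof=1) for g in groups])

# Weight each group by the inverse variance of its mean.
w = n / variances
w_total = w.sum()
weighted_mean = (w * means).sum() / w_total

# Numerator: weighted between-group variability.
a = (w * (means - weighted_mean) ** 2).sum() / (k - 1)

# Correction term that accounts for unequal variances and sample sizes.
lam = (((1 - w / w_total) ** 2) / (n - 1)).sum()
b = 1 + (2 * (k - 2) / (k**2 - 1)) * lam

f_welch = a / b
df1 = k - 1
df2 = (k**2 - 1) / (3 * lam)
p_value = stats.f.sf(f_welch, df1, df2)

print(f"Welch's F = {f_welch:.3f}, df = ({df1}, {df2:.1f}), p = {p_value:.4f}")
```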
The benefits of using Welch’s ANOVA are:
- Robust to unequal variances
- Robust to unbalanced design (different sample sizes)
- Robust to non-normal data
- Lower risk of Type I errors
In other words, Welch’s ANOVA works well in many practical situations likely to be encountered by LSS Practitioners. Hence, it is a valuable tool in your stats toolkit.
To summarize – Classical ANOVA works great when its assumptions are met, and it may be acceptable in some situations where the assumptions aren't strictly met (e.g., data that are not normal). The power of Welch's ANOVA is that it can be used even if you violate (or are unsure about meeting) these assumptions. Finally, it is important to remember: after running ANOVA you should always validate your results using residual analysis.
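As a closing illustration, residuals from a one-way ANOVA are simply each observation minus its own group mean. A minimal sketch of the usual residual checks (a normality test, a probability plot, and a run-order plot) might look like this, again with made-up data.

```python
# Residual analysis sketch for a one-way ANOVA (illustrative data).
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

groups = {
    "A": np.array([1.002, 0.998, 1.001, 0.999, 1.003, 1.000]),
    "B": np.array([1.005, 1.007, 1.004, 1.006, 1.008, 1.005]),
    "C": np.array([0.997, 0.999, 0.998, 1.000, 0.996, 0.998]),
}

# Residual = observation minus its own group mean.
residuals = np.concatenate([g - g.mean() for g in groups.values()])

# Residuals should look roughly normal...
w_stat, p_res = stats.shapiro(residuals)
print(f"Shapiro-Wilk on residuals: p = {p_res:.3f}")

# ...and show no obvious pattern in a probability plot or a run-order plot.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
stats.probplot(residuals, plot=ax1)
ax2.plot(residuals, marker="o", linestyle="none")
ax2.axhline(0)
ax2.set_xlabel("Observation order")
ax2.set_ylabel("Residual")
plt.show()
```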
More about Diana Martinez, Ph.D.:
Diana is a Lean Six Sigma Master Black Belt with 12+ years of experience working with dozens of companies in a variety of industries, from healthcare to government to aerospace. She has taught more than 30 LSS classes and conducted dozens of coaching sessions for Green Belts and Black Belts. She has also worked closely with many firms in applying lean best practices resulting in reduced costs, higher productivity, and a more skilled workforce. Diana has a PhD in Industrial Engineering.
More about Satya Kudapa:
Satya is a Lean Six Sigma Master Black Belt with 15+ years of experience working with over 100 firms, from small companies to large enterprises. He has taught dozens of LSS classes and conducted over 500 coaching sessions for Green Belts, Black Belts, and managers. He has also worked with several organizations to develop and maintain their ISO 9001/AS 9100 quality management systems. Satya is a leader in TMAC’s Smart Manufacturing and Cybersecurity Management Programs.