Analysis of Variance (ANOVA)

What is ANOVA - Using Some Common Sense to Explore!

At first, it may seem strange to use variances to test whether population means are equal or different. To shed light on the concept, consider the following example. Suppose that you want to test whether athletes in different sports are, on average, equally tall. You measure the heights of, say, 30 marathon runners, whose heights vary from 5'0" to 5'4" (feet and inches), and 30 basketball players, whose heights vary from 6'6" to 6'10". The height variability within each group is four inches. The variability in-between the groups, however, is 18 inches, from 5'2" to 6'8" (from the midpoint of one range to the midpoint of the other). Clearly, the variability in-between the groups is much greater than the variability within the groups. Also observe that the total variability in the data is 22 inches, from 5'0" to 6'10" (from the shortest person to the tallest person in the combined sample). Most of this total variability comes from the variability in-between the groups. Therefore one might conclude that, based on the samples, the mean heights of athletes in these two sports may be different. Note that this conclusion is reached by looking at the total variability and at the variability within and in-between the groups, rather than by simply comparing the means. Because there are only two categories in this case, a two-population t-test would also be an appropriate test and would lead to the same conclusion.
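The arithmetic in this example can be checked with a short Python sketch. It uses only the range endpoints given in the text; the helper names (to_inches, width, midpoint) are illustrative and are not part of any library.

```python
def to_inches(feet, inches):
    """Convert a height given in feet and inches to inches."""
    return 12 * feet + inches

# (shortest, tallest) in each sample, taken from the example above
marathon = (to_inches(5, 0), to_inches(5, 4))
basketball = (to_inches(6, 6), to_inches(6, 10))

def width(rng):
    """Range of a group: tallest minus shortest."""
    return rng[1] - rng[0]

def midpoint(rng):
    """Midpoint of a group's range."""
    return (rng[0] + rng[1]) / 2

within_marathon = width(marathon)
within_basketball = width(basketball)
in_between = midpoint(basketball) - midpoint(marathon)
total = max(marathon[1], basketball[1]) - min(marathon[0], basketball[0])

print("within (marathon):  ", within_marathon, "inches")
print("within (basketball):", within_basketball, "inches")
print("in-between:         ", in_between, "inches")
print("total:              ", total, "inches")
```

The range is used here only because the example uses it; it is a crude measure of variability, and ANOVA itself works with sums of squared deviations.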

Let's take a closer look at the above example. We will add some features to it to demonstrate how the total variability in the data, divided into in-between group variability and within group variability, can be used to test hypotheses about the equality of population means.

Note: The in-between group variability is considered to be caused by systematic differences between the groups, while the within group variability is considered to be randomness.

The animation below is built from 15 small data sets, labeled DATA 1 through DATA 15. The graph on the left reflects the data set on the right. The graph shows two vertical bars. The bar on the left represents the range of heights in a sample of marathon runners, and the bar on the right represents the range of heights in a sample of basketball players. As the data set changes, the graph changes accordingly. The longer the bar, the wider the range of heights measured in the sample.

Can you tell with some confidence, by looking only at the three measures of variability (total, in-between, and within), which data sets suggest a significant difference between the heights of the two groups of athletes and which do not? Sit back and watch the animation below!

DATA 1 shows a starting situation in which both sample groups of athletes are the same in terms of their height. In both groups the tallest person measures 6.8 feet and the shortest 6.2 feet. The variability (here, the range) of measurements within each group is the same, 0.6 feet. The variability in-between the groups is zero (i.e. the midpoints of the measurement ranges are the same). DATA 1 suggests that the population mean heights of marathon runners and basketball players may be the same.

DATA 2 through DATA 6 show samples in which marathon runner heights are changing while at the same time the within group height measurement variability (here, the range) remains at 0.6 feet.

DATA 4 shows an interesting borderline situation, in which the height of the tallest marathon runner equals the height of the shortest basketball player. Note that the within group variability of each group and the in-between group variability are all the same (0.6 feet).
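The three range-based measures for DATA 1 and DATA 4 can be reproduced with a small sketch. The DATA 1 endpoints come from the text. The DATA 4 endpoints are an assumption: they are inferred by keeping the basketball players at 6.2 to 6.8 feet and placing the tallest marathon runner at 6.2 feet with a 0.6 foot range. The function name range_summary is hypothetical.

```python
def range_summary(group_a, group_b):
    """Return (within_a, within_b, in_between, total) for two (low, high) ranges."""
    within_a = group_a[1] - group_a[0]
    within_b = group_b[1] - group_b[0]
    # In-between variability: distance between the midpoints of the two ranges
    in_between = abs(sum(group_b) / 2 - sum(group_a) / 2)
    # Total variability: shortest to tallest person in the combined sample
    total = max(group_a[1], group_b[1]) - min(group_a[0], group_b[0])
    # Round to one decimal to hide floating-point noise
    return tuple(round(v, 1) for v in (within_a, within_b, in_between, total))

# DATA 1: both groups range from 6.2 to 6.8 feet (from the text)
data_1 = range_summary((6.2, 6.8), (6.2, 6.8))

# DATA 4: assumed endpoints, see the note above
data_4 = range_summary((5.6, 6.2), (6.2, 6.8))

print("DATA 1:", data_1)
print("DATA 4:", data_4)
```

Under these assumptions, DATA 1 has no in-between variability at all, while in DATA 4 the in-between variability equals the within group variability of each group.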

DATA 5 and DATA 6 show clearly that the majority of the total variability comes from the variability in-between the groups.

DATA 7 through DATA 10 show four data sets in which the within group variability changes, and with it the proportion of the total variability that comes from the in-between group variability: as the within group variability decreases, that proportion increases, and vice versa. In DATA 10 the within group variability of the marathon runners collapses to zero, indicating that all runners in the sample were exactly equally tall.

DATA 11 through DATA 15 demonstrate situations in which the within group variability accounts for the majority of the total variability, suggesting that the population means may be equal. Note also the overlapping ranges of the bars.

Note: Observe that the total variability was divided above into within group variability (commonly called within treatment variability) and in-between group variability (commonly called in-between treatment variability). A comparison of these components of variability forms the basis for ANOVA.
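The animation measures variability with ranges; ANOVA itself measures it with sums of squared deviations. The sketch below shows that version of the same split and the resulting F ratio. The eight heights are made-up illustrative values, not data from the animation, and the variable names are assumptions.

```python
from statistics import mean

# Hypothetical sample heights in feet (illustrative only)
groups = {
    "marathon": [5.0, 5.1, 5.2, 5.3],
    "basketball": [6.5, 6.6, 6.7, 6.8],
}

all_values = [x for g in groups.values() for x in g]
grand_mean = mean(all_values)

# Total variability: every observation against the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# In-between treatment variability: group means against the grand mean
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())

# Within treatment variability: observations against their own group mean
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

k = len(groups)        # number of treatments
n = len(all_values)    # total number of observations

ms_between = ss_between / (k - 1)   # mean square in-between treatments
ms_within = ss_within / (n - k)     # mean square within treatments
f_stat = ms_between / ms_within     # large values suggest unequal means

print("SS total:  ", round(ss_total, 4))
print("SS between:", round(ss_between, 4))
print("SS within: ", round(ss_within, 4))
print("F:         ", round(f_stat, 2))
```

The two components add up to the total, and the F ratio compares them after each is divided by its degrees of freedom. Whether a given F value is significant depends on the F distribution with (k - 1, n - k) degrees of freedom, which this sketch does not evaluate.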

Note: The in-between treatment variability is considered to be caused by systematic differences between the treatments, while the within treatment variability is considered to be randomness.

Note: The relationship between regression and ANOVA can be stated simplistically as follows. If, in the above animation, the bars are 'at the same level', e.g. as in DATA 1, then the total variability in the data is only randomness. This corresponds to the situation in regression where SST=SSE. If, however, the bars are 'at significantly different levels (without overlap)', e.g. as in DATA 6, then the total variability in the data can be divided into randomness (variability within) and systematic difference (variability in-between). This corresponds to the situation in regression where SST=SSR+SSE. Or, more simplistically still, if you draw a ('regression') line connecting the midpoints of the data sets and the line has a significant slope, then there is 'regression', and there is a significant difference between the treatment means.
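This correspondence can be illustrated by fitting a least-squares line to two groups coded as x = 0 and x = 1. The heights are made-up illustrative values, and the 0/1 coding is an assumption chosen so that the fitted line passes through the two group means.

```python
from statistics import mean

# Hypothetical heights in feet; x = 0 marks marathon runners, x = 1 basketball players
x = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
y = [5.0, 5.1, 5.2, 5.3, 6.5, 6.6, 6.7, 6.8]

x_bar, y_bar = mean(x), mean(y)

# Ordinary least-squares slope and intercept for a single predictor
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total variability
ssr = sum((fi - y_bar) ** 2 for fi in fitted)           # explained by the line
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # left over as randomness

print("slope:", round(slope, 4))
print("SST:", round(sst, 4), " SSR:", round(ssr, 4), " SSE:", round(sse, 4))
```

With this coding the slope equals the difference between the two group means, SSR plays the role of the in-between treatment variability, and SSE plays the role of the within treatment variability. If both groups had the same mean, the slope and SSR would be zero and SST would equal SSE.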