3.5: Partitioning the Sum of Squares




While we are not going into calculating the ANOVA table in factorial ANOVA, we will look at how to partition the sums of squares. ANOVAs are all about partitioning the sums of squares. We already did some partitioning in the last chapter. What do we mean by partitioning?

Imagine you had a big empty house with no rooms in it. What would happen if you partitioned the house? What would you be doing? One way to partition the house is to split it up into different rooms. You can do this by adding new walls and making little rooms everywhere. That’s what partitioning means, to split up.

The act of partitioning, or splitting up, is the core idea of ANOVA. To use the house analogy, our total sum of squares (SS Total) is our big empty house. We want to split it up into little rooms. Previously, we partitioned SS Total using this formula:

\[SS_\text{TOTAL} = SS_\text{Effect} + SS_\text{Error} \nonumber \]

Remember, \(SS_\text{Effect}\) was the variation we could attribute to the means of the different groups, and \(SS_\text{Error}\) was the leftover variation that we couldn’t explain. \(SS_\text{Effect}\) and \(SS_\text{Error}\) are the partitions of \(SS_\text{TOTAL}\); they are the little rooms.

Now let's see how we can split up the house in factorial ANOVA. We will still use the example in section 3.1, the experiment on the effect of cell phone use (yes vs. no) and time of day (day vs. night) on driving ability. This is a 2 x 2 ANOVA. Cell phone use is IV1. Time of day is IV2. Remember the logic of the ANOVA is to partition the variance into different parts. The SS formula for the between-subjects 2 x 2 ANOVA looks like this:

\[SS_\text{Total} = SS_\text{Effect IV1} + SS_\text{Effect IV2} + SS_\text{Effect IV1xIV2} + SS_\text{Error} \nonumber \]

Now we split up the house into a lot more little rooms, \(SS_\text{Effect IV1}\), \(SS_\text{Effect IV2}\), \(SS_\text{Effect IV1xIV2}\), and \(SS_\text{Error}\).

SS Total

We calculate the grand mean (mean of all of the scores). Then, we calculate the differences between each score and the grand mean. We square the difference scores, and sum them up. That is \(SS_\text{Total}\), as always.
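The steps for \(SS_\text{Total}\) can be sketched in a few lines of Python. The scores below are made up purely for illustration (three hypothetical drivers per condition); they are not data from the cell phone experiment, and the variable names are our own.

```python
# Hypothetical driving-ability scores, invented for illustration only.
# Keys are (cell phone use, time of day); values are the scores in that condition.
scores = {
    ("no", "day"): [9, 10, 11],
    ("no", "night"): [3, 4, 5],
    ("yes", "day"): [5, 6, 7],
    ("yes", "night"): [3, 4, 5],
}

# Pool every score, ignoring which condition it came from.
all_scores = [x for cell in scores.values() for x in cell]

# Grand mean: the mean of all of the scores.
grand_mean = sum(all_scores) / len(all_scores)

# SS Total: squared difference between each score and the grand mean, summed.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

print("Grand mean:", grand_mean)
print("SS Total:", ss_total)
```

With these made-up scores the grand mean is 6 and \(SS_\text{Total}\) is 80. This is the whole house that the remaining steps split into rooms.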

SS Cell Phone Use (IV1)

We need to compute the SS for the main effect of cell phone use. This step is essentially the same as how we calculated the SS effect in single-factor between-subjects ANOVA. The key is that when we calculate the main effect of cell phone use, we ignore the other IV, time of day. We calculate the grand mean (mean of all of the scores). Then, we calculate the means for the two cell phone use conditions (yes vs. no). Then we treat each score as if it were the mean for its respective cell phone use condition. For each score, we find the difference between its cell phone use condition mean and the grand mean. Then we square the differences and sum them up. That is \(SS_\text{Cell phone use}\). Again, the key is that when we calculate the main effect of one independent variable, we ignore the other independent variable or pretend it doesn't exist.
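Here is a minimal sketch of that procedure in Python, following the wording above literally: every score is swapped for its cell phone condition mean before taking deviations from the grand mean. The scores are invented for illustration and the names are our own.

```python
# Hypothetical scores, invented for illustration only.
# Keys are (cell phone use, time of day).
scores = {
    ("no", "day"): [9, 10, 11],
    ("no", "night"): [3, 4, 5],
    ("yes", "day"): [5, 6, 7],
    ("yes", "night"): [3, 4, 5],
}

all_scores = [x for cell in scores.values() for x in cell]
grand_mean = sum(all_scores) / len(all_scores)

# Pool the scores by cell phone use, ignoring time of day entirely.
by_phone = {}
for (phone, _time), cell in scores.items():
    by_phone.setdefault(phone, []).extend(cell)

# Mean for each cell phone use condition (yes vs. no).
phone_means = {level: sum(v) / len(v) for level, v in by_phone.items()}

# Treat each score as its condition mean, take the difference from the
# grand mean, square it, and sum over all scores.
ss_phone = sum(
    (phone_means[phone] - grand_mean) ** 2
    for (phone, _time), cell in scores.items()
    for _score in cell
)

print("Cell phone means:", phone_means)
print("SS Cell phone use:", ss_phone)
```

With these made-up scores the condition means are 7 (no phone) and 5 (phone), the grand mean is 6, and \(SS_\text{Cell phone use}\) comes out to 12.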

SS Time of Day (IV2)

We need to compute the SS for the main effect of time of day. Similarly, when we calculate the main effect of time of day, we ignore the other IV, cell phone use. We calculate the grand mean (mean of all of the scores). Then, we calculate the means for the two time of day conditions (day vs. night). Then we treat each score as if it were the mean for its respective time of day condition. For each score, we find the difference between its time of day condition mean and the grand mean. Then we square the differences and sum them up. That is \(SS_\text{Time of day}\). Again, this step is essentially the same as how we calculated the SS effect in single-factor between-subjects ANOVA. The key is that when we calculate the main effect of one independent variable, we ignore the other independent variable or pretend it doesn't exist.
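Because both main effects follow the same recipe, one small helper can compute either of them. The sketch below is one possible way to write it; `ss_main_effect` and `factor_index` are hypothetical names of our own, and the scores are invented for illustration.

```python
def ss_main_effect(scores, factor_index):
    """SS for one IV, pooling over (ignoring) the other IV.

    scores maps (IV1 level, IV2 level) to a list of scores;
    factor_index is 0 for IV1 and 1 for IV2.
    """
    all_scores = [x for cell in scores.values() for x in cell]
    grand_mean = sum(all_scores) / len(all_scores)

    # Pool scores by the level of the chosen IV only.
    pooled = {}
    for key, cell in scores.items():
        pooled.setdefault(key[factor_index], []).extend(cell)

    # Replacing each score by its group mean means every group contributes
    # (number of scores) * (group mean - grand mean)^2.
    return sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in pooled.values()
    )


# Hypothetical scores, invented for illustration only.
# Keys are (cell phone use, time of day).
scores = {
    ("no", "day"): [9, 10, 11],
    ("no", "night"): [3, 4, 5],
    ("yes", "day"): [5, 6, 7],
    ("yes", "night"): [3, 4, 5],
}

ss_time = ss_main_effect(scores, factor_index=1)   # time of day (IV2)
ss_phone = ss_main_effect(scores, factor_index=0)  # cell phone use (IV1)
print("SS Time of day:", ss_time)
print("SS Cell phone use:", ss_phone)
```

With these made-up scores the day and night means are 8 and 4, so \(SS_\text{Time of day}\) is 48, while the same helper gives 12 for cell phone use.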

SS Cell Phone Use by Time

We need to compute the SS for the interaction effect between cell phone use and time of day. This is the new thing that we do in an ANOVA with more than one IV. How do we calculate the variation explained by the interaction?

The heart of the question is something like this: do the individual means for each of the four conditions do something a little bit different from the group means for the two independent variables?

For example, let's say the overall mean for all of the scores in the no cell phone group is 6.6. Now, was the mean for each no cell phone condition in the whole design 6.6? For example, in the day group, was the mean for the no cell phone condition also 6.6? Let's say the answer is no, it was 9.6. How about the night group? Was the mean for the night condition in the no cell phone group 6.6? Let's say the answer is no, it was 3.6. The mean of 9.6 and 3.6 is 6.6. If there was no hint of an interaction, we would expect the means for the no cell phone condition to be the same at both levels of time of day: they would both be 6.6. However, when there is an interaction, the means for the no cell phone group will depend on the levels of the other IV. In this case, it looks like there is an interaction because the means are different from 6.6: they are 9.6 for the day condition and 3.6 for the night condition. This is extra variance that is not explained by the mean for the no cell phone condition. We want to capture this extra variance and sum it up. Then we will have a measure of the portion of the variance that is due to the interaction between the cell phone use and time of day conditions.

What we will do is this. We will find the four condition means. Then we will see how much additional variation they explain beyond the group means for cell phone use and time of day. To do this we treat each score as the condition mean for that score. Then we subtract the mean for the cell phone use group, and the mean for the time of day group, and then we add the grand mean. This gives us the unique variation that is due to the interaction.

Here is a formula to describe the process for each score:

\[\bar{X}_\text{condition} -\bar{X}_\text{IV1} - \bar{X}_\text{IV2} + \bar{X}_\text{Grand Mean} \nonumber \]

We would apply this formula to calculate the difference score for each individual score. We then square the difference scores, and sum them up to get \(SS_\text{Interaction}\).
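To make the formula concrete, here is a small sketch in Python. The scores are invented for illustration, and the helper names (`mean`, `pooled_means`) are our own. Since every score in a condition gets the same difference score, the squared difference is multiplied by the number of scores in that condition.

```python
def mean(values):
    return sum(values) / len(values)


def pooled_means(scores, factor_index):
    """Means for each level of one IV, ignoring the other IV."""
    pooled = {}
    for key, cell in scores.items():
        pooled.setdefault(key[factor_index], []).extend(cell)
    return {level: mean(v) for level, v in pooled.items()}


# Hypothetical scores, invented for illustration only.
# Keys are (cell phone use, time of day).
scores = {
    ("no", "day"): [9, 10, 11],
    ("no", "night"): [3, 4, 5],
    ("yes", "day"): [5, 6, 7],
    ("yes", "night"): [3, 4, 5],
}

grand_mean = mean([x for cell in scores.values() for x in cell])
phone_means = pooled_means(scores, 0)  # IV1 means
time_means = pooled_means(scores, 1)   # IV2 means

ss_interaction = 0.0
for (phone, time), cell in scores.items():
    # condition mean - IV1 mean - IV2 mean + grand mean
    diff = mean(cell) - phone_means[phone] - time_means[time] + grand_mean
    # Every score in this condition shares the same difference score.
    ss_interaction += len(cell) * diff ** 2

print("SS Interaction:", ss_interaction)
```

With these made-up scores the four difference scores are +1, -1, -1, and +1, so \(SS_\text{Interaction}\) is 12.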

SS Error

The last thing we need to find is the SS Error. We can solve for that because we found everything else in this formula:

\[SS_\text{Total} = SS_\text{Effect IV1} + SS_\text{Effect IV2} + SS_\text{Effect IV1xIV2} + SS_\text{Error} \nonumber \]

Even though this textbook is meant to explain things in a step-by-step way, you are probably tired from reading how to work out the 2x2 ANOVA by hand. I have already shown you how to compute the SS for error before, so we will not do the full example here. In essence, not every score in a particular condition is the same. Within each condition, we subtract the condition mean from each score, square the differences, and add them up. We do this same step for each of the conditions, and combined, the results give us SS Error.
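As a check on the logic, the sketch below computes SS Error both ways, directly from the scores within each condition and by subtraction from the partition formula, and confirms that the two agree. The scores are invented for illustration and all names are our own. Note that this simple additive partition assumes equal numbers of scores in every condition, as in this made-up data set.

```python
def mean(values):
    return sum(values) / len(values)


def pooled_means(scores, factor_index):
    """Means for each level of one IV, ignoring the other IV."""
    pooled = {}
    for key, cell in scores.items():
        pooled.setdefault(key[factor_index], []).extend(cell)
    return {level: mean(v) for level, v in pooled.items()}


# Hypothetical scores, invented for illustration only.
# Keys are (cell phone use, time of day); equal n per condition is assumed.
scores = {
    ("no", "day"): [9, 10, 11],
    ("no", "night"): [3, 4, 5],
    ("yes", "day"): [5, 6, 7],
    ("yes", "night"): [3, 4, 5],
}

all_scores = [x for cell in scores.values() for x in cell]
grand_mean = mean(all_scores)
phone_means = pooled_means(scores, 0)
time_means = pooled_means(scores, 1)

ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

ss_phone = ss_time = ss_interaction = ss_error_direct = 0.0
for (phone, time), cell in scores.items():
    n = len(cell)
    cell_mean = mean(cell)
    ss_phone += n * (phone_means[phone] - grand_mean) ** 2
    ss_time += n * (time_means[time] - grand_mean) ** 2
    ss_interaction += n * (
        cell_mean - phone_means[phone] - time_means[time] + grand_mean
    ) ** 2
    # Direct SS Error: each score minus its own condition mean, squared.
    ss_error_direct += sum((x - cell_mean) ** 2 for x in cell)

# SS Error solved from the partition formula.
ss_error_by_subtraction = ss_total - ss_phone - ss_time - ss_interaction

print("SS Total:", ss_total)
print("SS IV1, IV2, IV1xIV2:", ss_phone, ss_time, ss_interaction)
print("SS Error (direct):", ss_error_direct)
print("SS Error (by subtraction):", ss_error_by_subtraction)
```

With these made-up scores the rooms are 12, 48, 12, and 8, which add back up to the whole house of 80.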

As mentioned earlier, we are not going into the details of ANOVA calculations here. Please refer to the lecture for those. The key is to know the difference between one-way ANOVA and factorial ANOVA. The advantage of factorial ANOVA over multiple one-way ANOVAs is its ability to examine potential interaction effects.