# 3.5: SAS Output for ANOVA

The first output of the ANOVA procedure, shown below, gives useful details about the model.

ANOVA of Greenhouse Data: The Mixed Procedure

| Model Information | |
| --- | --- |
| Data Set | WORK.GREENHOUSE |
| Dependent Variable | Height |
| Covariance Structure | Diagonal |
| Estimation Method | Type 3 |
| Residual Variance Method | Factor |
| Fixed Effects SE Method | Model-Based |
| Degrees of Freedom Method | Residual |

The output below titled ‘Type 3 Analysis of Variance’ is similar to the ANOVA table we are already familiar with. Note that it does not include the Total SS, however it can be computed as the sum of all SS values in the table.

Type 3 Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | Expected Mean Square | Error Term | Error DF | F Value | Pr > F |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| fert | 3 | 251.440000 | 83.813333 | Var(Residual) + Q(fert) | MS(Residual) | 20 | 27.46 | <.0001 |
| Residual | 20 | 61.033333 | 3.051667 | Var(Residual) | | | | |
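As a quick arithmetic check, the quantities in this table can be reproduced from one another in plain Python (values copied directly from the output above):

```python
# Reproduce the Type 3 ANOVA quantities from the table above.
ss_fert, df_fert = 251.44, 3
ss_resid, df_resid = 61.033333, 20

ms_fert = ss_fert / df_fert      # mean square for fert, ~83.8133
ms_resid = ss_resid / df_resid   # mean square error, ~3.0517
f_value = ms_fert / ms_resid     # F statistic, ~27.46

# Total SS is not printed by PROC MIXED, but it is just the sum
# of the SS values in the table.
ss_total = ss_fert + ss_resid    # ~312.4733
print(round(f_value, 2), round(ss_total, 4))
```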

The "Type 3 Analysis of Variance" output above displays the $$F_{calculated}$$ and $$p$$-value for the test of any variables specified in the model statement. Additional information can also be requested; for example, the method = type3 option includes the Expected Mean Squares for each source, which will prove useful in Chapter 6.

The Mixed procedure also produces a set of diagnostic plots (not reproduced here).

The following display is produced by the LSMEANS statement in the PLM procedure, which was included in the program.

fert Least Squares Means

| fert | Estimate | Standard Error | DF | t Value | Pr > \|t\| | Alpha | Lower | Upper |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Control | 21.0000 | 0.7132 | 20 | 29.45 | <.0001 | 0.05 | 19.5124 | 22.4876 |
| F1 | 28.6000 | 0.7132 | 20 | 40.10 | <.0001 | 0.05 | 27.1124 | 30.0876 |
| F2 | 25.8667 | 0.7132 | 20 | 36.27 | <.0001 | 0.05 | 24.3790 | 27.3543 |
| F3 | 29.2000 | 0.7132 | 20 | 40.94 | <.0001 | 0.05 | 27.7124 | 30.6876 |

In the "Least Squares Means" table above, note that the $$t$$-value and $$Pr > |t|$$ columns test the null hypothesis that each group mean equals 0. (These tests usually do not provide any useful information.) The Lower and Upper values are the 95% confidence limits for the group means. Note also that the least squares means are the same as the arithmetic means generated by the Summary procedure in Section 3.3, because all 4 groups have the same sample size. With unequal sample sizes, or when a covariate is present, the least squares means can differ from the original sample means.
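Each confidence limit can be reconstructed by hand as estimate $$\pm t \times SE$$. A sketch for the Control row, using the standard tabled critical value $$t_{0.975, 20} = 2.086$$ for 20 residual degrees of freedom:

```python
# Reconstruct the 95% confidence limits for the Control LS-mean
# from its estimate and standard error in the table above.
estimate, se = 21.0000, 0.7132
t_crit = 2.086                    # t_{0.975, 20}, standard table value

lower = estimate - t_crit * se    # ~19.51, matching the table's 19.5124
upper = estimate + t_crit * se    # ~22.49, matching the table's 22.4876
print(round(lower, 2), round(upper, 2))
```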

Next, the mean-plot option in the LSMEANS statement yields a mean plot and a diffogram, shown below. The confidence intervals in the mean plot are commonly used to identify significantly different treatment levels or groups. If two confidence intervals do not overlap, then the difference between the two associated means is statistically significant; this is a valid conclusion. If the intervals do overlap, however, the difference may still be significant. Consequently, conclusions based on visual inspection of the mean plot may not match those drawn from the "Differences of fert Least Squares Means" table, another output of the Tukey adjustment, displayed below.

Notice that this is different from the previous table because it displays the results of each pairwise comparison. For example, the first row shows the comparison between the control and F1. The interpretation of these results is similar to any other confidence interval for the difference in two means—if the confidence interval does not contain zero, then the difference between the two associated means is statistically significant.

Differences of fert Least Squares Means

| fert | _fert | Estimate | Standard Error | DF | t Value | Pr > \|t\| | Adj P | Alpha | Lower | Upper | Adj Lower | Adj Upper |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Control | F1 | -7.6000 | 1.0086 | 20 | -7.54 | <.0001 | <.0001 | 0.05 | -9.7038 | -5.4962 | -10.4229 | -4.7771 |
| Control | F2 | -4.8667 | 1.0086 | 20 | -4.83 | 0.0001 | 0.0006 | 0.05 | -6.9705 | -2.7628 | -7.6896 | -2.0438 |
| Control | F3 | -8.2000 | 1.0086 | 20 | -8.13 | <.0001 | <.0001 | 0.05 | -10.3038 | -6.0962 | -11.0229 | -5.3771 |
| F1 | F2 | 2.7333 | 1.0086 | 20 | 2.71 | 0.0135 | 0.0599 | 0.05 | 0.6295 | 4.8372 | -0.08957 | 5.5562 |
| F1 | F3 | -0.6000 | 1.0086 | 20 | -0.59 | 0.5586 | 0.9324 | 0.05 | -2.7038 | 1.5038 | -3.4229 | 2.2229 |
| F2 | F3 | -3.3333 | 1.0086 | 20 | -3.30 | 0.0035 | 0.0171 | 0.05 | -5.4372 | -1.2295 | -6.1562 | -0.5104 |
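The unadjusted Lower and Upper limits in this table follow the same estimate $$\pm t \times SE$$ recipe, now applied to a difference. A check for the first row (Control vs. F1), again using the tabled value $$t_{0.975, 20} = 2.086$$; the Adj Lower and Adj Upper limits use the wider Tukey quantile instead:

```python
# Unadjusted 95% CI for the Control - F1 difference, using the
# estimate and standard error from the first row of the table.
diff, se_diff = -7.6000, 1.0086
t_crit = 2.086                       # t_{0.975, 20}, standard table value

lower = diff - t_crit * se_diff      # ~ -9.70, matching -9.7038
upper = diff + t_crit * se_diff      # ~ -5.50, matching -5.4962
# Zero lies outside (lower, upper), so the difference is significant.
print(round(lower, 2), round(upper, 2))
```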

This discrepancy between the mean plot and the "Differences of fert Least Squares Means" results occurs because the test is based on the difference of two means and uses the standard error of that difference, whereas the confidence intervals in the mean plot are computed for the individual means and use the standard error of each individual sample mean. Consistent results can be achieved by using the diffogram, discussed below, or the confidence intervals displayed in the "difference in mean plot" available in SAS 14 but not included here.
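The two standard errors are related in a simple way when the group sizes are equal. Assuming $$n = 6$$ observations per fertilizer level (24 observations in 4 groups), the standard error of a single LS-mean is $$\sqrt{MSE/n}$$ and the standard error of a difference of two LS-means is $$\sqrt{2}$$ times larger, which reproduces both values seen in the tables:

```python
import math

# With equal group sizes (assumed n = 6 per fertilizer level), the
# standard error of one LS-mean is sqrt(MSE / n), and the standard
# error of a difference of two LS-means is sqrt(2) times larger.
mse, n = 3.051667, 6     # MSE from the Type 3 ANOVA table

se_mean = math.sqrt(mse / n)       # ~0.7132, as in the LS-means table
se_diff = math.sqrt(2 * mse / n)   # ~1.0086, as in the differences table
print(round(se_mean, 4), round(se_diff, 4))
```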

The diffogram has two useful features: it allows one to identify the significant mean pairs, and it also gives estimates of the individual means. The diagonal line shown in the diffogram serves as a reference line. Each group (or factor level) is marked on the horizontal and vertical axes and has vertical and horizontal reference lines whose intersection point falls on the diagonal reference line. The $$x$$ and $$y$$ coordinates of this intersection point, which are equal, give the sample mean of that group. For example, the sample mean for the Control group is about 21, which matches the estimate provided in the "Least Squares Means" table displayed above. Furthermore, each slanted line represents a mean pair. Start with any group label on the horizontal axis and run your cursor up along the associated vertical line until it meets a slanted line, and then follow the intersecting horizontal line across to identify the other group (or factor level). For example, the lowermost solid line (colored blue) represents the Control and F2 pair. As stated at the bottom of the chart, the solid (blue) lines indicate significant pairs, and the broken (red) lines correspond to non-significant pairs. A line corresponding to a non-significant pair will cross the diagonal reference line.

The non-overlapping confidence intervals in the mean plot above indicate that the average plant height under the control is significantly different from those of the other 3 fertilizer levels, and that the F2 fertilizer type yields an average plant height statistically different from F3's. The diffogram delivers the same conclusions, so in this example the two displays do not contradict each other. In general, the diffogram always yields the same conclusions as the confidence intervals for the differences of least squares means shown in the "Differences of fert Least Squares Means" table, but conclusions based on the mean plot may differ.

There are two contrasts of interest: a contrast to compare the control and F3 with F1 and F2 (i.e. $$\mu_{control} - \mu_{F1} - \mu_{F2} + \mu_{F3}$$) and a contrast to compare the control and F2 with F1 (i.e., $$\mu_{control} - 2 \mu_{F1} + \mu_{F2}$$). Since we are testing two contrasts, we should adjust for multiple comparisons; here we use the Bonferroni adjustment. In SAS, we can use the estimate statement under proc plm to make these computations.

In general, the estimate statement estimates linear combinations of model parameters and performs $$t$$-tests on them. Contrasts are linear combinations whose coefficients sum to zero. We will discuss the model parameters in Chapter 4.

Estimates
SAS returns both unadjusted and adjusted $$p$$-values. Suppose we want to make the comparisons at the 1% level. If we ignore the multiple comparisons (i.e., use the unadjusted $$p$$-values), both comparisons are statistically significant. However, if we use the adjusted $$p$$-values, we fail to reject the null hypothesis corresponding to the first contrast at the 1% level.
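The contrast $$t$$-statistics behind this conclusion can be sketched from the LS-means and the MSE alone. For a contrast with coefficients $$c_i$$ and equal group sizes (assumed $$n = 6$$ here), the standard error is $$\sqrt{MSE \sum c_i^2 / n}$$; the critical values 2.845 ($$t_{0.995,20}$$, unadjusted 1% two-sided) and 3.153 ($$t_{0.9975,20}$$, Bonferroni 1% over two tests) are standard table values:

```python
import math

# Contrast t statistics computed from the LS-means and the MSE.
means = {"control": 21.0, "F1": 28.6, "F2": 25.8667, "F3": 29.2}
mse, n = 3.051667, 6     # MSE and assumed per-group sample size

def contrast_t(coefs):
    """t statistic for sum(c_i * mean_i) with equal group sizes."""
    est = sum(c * means[g] for g, c in coefs.items())
    se = math.sqrt(mse * sum(c ** 2 for c in coefs.values()) / n)
    return est / se

t1 = contrast_t({"control": 1, "F1": -1, "F2": -1, "F3": 1})  # ~ -2.99
t2 = contrast_t({"control": 1, "F1": -2, "F2": 1, "F3": 0})   # ~ -5.92

# |t1| falls between 2.845 and 3.153: significant at the 1% level
# unadjusted, but not after the Bonferroni adjustment. |t2| clears
# both thresholds, so the second contrast stays significant.
print(round(t1, 2), round(t2, 2))
```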