Simultaneous Inference

We often want confidence intervals for more than one parameter that hold simultaneously. For example, we might want confidence intervals for $$\beta_0$$ and $$\beta_1$$ that are both correct with a stated joint confidence.

Bonferroni Joint Confidence Intervals

The confidence coefficient for each individual parameter is raised above $$1 - \alpha$$ so that the confidence coefficient for the whole collection of parameters is at least $$1 - \alpha$$. This is based on the following inequality:

Theorem (Bonferroni's Inequality)

$P(A_0\cap A_1)\geq1-P(A_0^c)-P(A_1^c) \label{Bonferroni}$

for any two events $$A_0$$ and $$A_1$$, where $$A_0^c$$ and $$A_1^c$$ are the complements of $$A_0$$ and $$A_1$$, respectively.

We take $$A_0 =$$ the event that the confidence interval for $$\beta_0$$ covers $$\beta_0$$, and $$A_1 =$$ the event that the confidence interval for $$\beta_1$$ covers $$\beta_1$$.

So, if $$P(A_0) = 1-\alpha_1$$ and $$P(A_1) = 1-\alpha_2$$, then $$P(A_0\cap A_1)\geq1-\alpha_1-\alpha_2$$ by Bonferroni's inequality (Equation \ref{Bonferroni}). Note that $$A_0\cap A_1$$ is the event that both confidence intervals cover their respective parameters. Therefore we take $$\alpha_1 = \alpha_2 = \alpha/2$$ to get joint confidence intervals with confidence coefficient at least $$1 - \alpha$$:

$$b_0 \pm t(1-\alpha/4;n-2) s(b_0)$$ and $$b_1 \pm t(1-\alpha/4;n-2) s(b_1)$$ for $$\beta_0$$ and $$\beta_1$$, respectively.
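The percentile $$1-\alpha/4$$ follows from splitting $$\alpha$$ equally over the intervals and then halving each share for the two tails. A minimal Python sketch of that bookkeeping is given below; the function name `bonferroni_percentile` is introduced here for illustration only.

```python
def bonferroni_percentile(alpha: float, g: int) -> float:
    """t percentile used for each of g two-sided Bonferroni intervals."""
    if g < 1 or not 0 < alpha < 1:
        raise ValueError("need g >= 1 and 0 < alpha < 1")
    per_interval_alpha = alpha / g       # error rate allotted to each interval
    return 1 - per_interval_alpha / 2    # two-sided: half of it in each tail


# Percentiles for a joint 95% statement about 1, 2 and 3 quantities
for g in (1, 2, 3):
    print(g, round(bonferroni_percentile(0.05, g), 5))
```

With $$g = 2$$ the function returns $$1-\alpha/4$$, the percentile used in the intervals for $$\beta_0$$ and $$\beta_1$$.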

Bonferroni Joint Confidence Intervals for Mean Response

We want to find simultaneous confidence intervals for $$E(Y|X = X_h) = \beta_0 + \beta_1X_h$$ at $$g$$ different values of $$X_h$$. Using Bonferroni's inequality for the intersection of $$g$$ events, the confidence intervals with joint confidence coefficient at least $$1-\alpha$$ are given by

$\widehat{Y_h} \pm t(1-\alpha/(2g); n-2)s(\widehat{Y_h}).$
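The step from two events to $$g$$ events can be written out as follows. Here $$A_k$$ is a label introduced for this sketch: it denotes the event that the $$k$$-th interval covers its mean response, and each interval is assumed to have individual confidence coefficient $$1-\alpha/g$$.

$P\left(\bigcap_{k=1}^{g} A_k\right) \geq 1-\sum_{k=1}^{g}P(A_k^c) = 1 - g\cdot\frac{\alpha}{g} = 1-\alpha.$

A two-sided $$t$$ interval with individual coefficient $$1-\alpha/g$$ places $$\alpha/(2g)$$ in each tail, which is why the percentile $$1-\alpha/(2g)$$ appears in the limits.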

Confidence band for regression line : Working-Hotelling procedure

The confidence band

$\widehat{Y_h} \pm\sqrt{2F(1-\alpha;2,n-2)}s(\widehat{Y_h})$

contains the entire regression line (that is, $$\beta_0 + \beta_1X$$ for all values of $$X$$) with confidence level $$1-\alpha$$. Consequently, these limits evaluated at any $$g$$ values of $$X_h$$ give simultaneous confidence intervals for the corresponding mean responses with joint confidence coefficient at least $$1-\alpha$$, whatever the value of $$g$$. This is the Working-Hotelling procedure.
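The multiplier $$\sqrt{2F(1-\alpha;2,n-2)}$$ can be computed without tables, because an $$F$$ distribution with 2 numerator degrees of freedom has a closed-form quantile. The sketch below relies on that standard fact, which is not stated in these notes; the function names are illustrative.

```python
import math


def f_quantile_num_df2(p: float, df2: float) -> float:
    """p-th quantile of F(2, df2).

    Uses the closed-form CDF of F(2, df2):
        P(F <= x) = 1 - (1 + 2x/df2) ** (-df2/2)
    which is solved for x below.
    """
    return (df2 / 2) * ((1 - p) ** (-2 / df2) - 1)


def working_hotelling_multiplier(alpha: float, n: int) -> float:
    """W = sqrt(2 F(1 - alpha; 2, n - 2)) for simple linear regression."""
    return math.sqrt(2 * f_quantile_num_df2(1 - alpha, n - 2))


# Example: alpha = 0.05 with n = 19 observations (17 error degrees of freedom)
W = working_hotelling_multiplier(0.05, 19)
print(f"W = {W:.4f}")
```

Note that `W` depends only on $$\alpha$$ and $$n$$, not on the number of $$X_h$$ values at which the band is evaluated.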

Simultaneous prediction intervals

Recall that the standard error of prediction for a new observation $$Y_{h(new)}$$ at $$X = X_h$$ is $$s(Y_{h(new)}-\widehat{Y_h}) = \sqrt{MSE\left(1+\frac{1}{n}+\frac{(X_h-\overline{X})^2}{\sum_i(X_i-\overline{X})^2}\right)}$$. To predict new observations at $$g$$ different values of $$X$$, we may use either of two procedures:

• Bonferroni procedure : $$\widehat{Y_h}\pm t(1-\alpha/(2g);n-2)s(Y_{h(new)}-\widehat{Y_h})$$.
• Scheffé procedure : $$\widehat{Y_h}\pm \sqrt{gF(1-\alpha;g,n-2)}s(Y_{h(new)}-\widehat{Y_h})$$.
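The standard error of prediction used in both procedures splits into two parts, which is convenient when $$s(\widehat{Y_h})$$ has already been computed for the mean response:

$s^2(Y_{h(new)}-\widehat{Y_h}) = MSE + s^2(\widehat{Y_h}), \qquad s^2(\widehat{Y_h}) = MSE\left(\frac{1}{n}+\frac{(X_h-\overline{X})^2}{\sum_i(X_i-\overline{X})^2}\right).$

The first term reflects the variability of the new observation around its mean, and the second the uncertainty in the estimated mean response.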

Remark : The Working-Hotelling multiplier for the mean response does not depend on $$g$$. In all the other procedures above, the multiplier grows with $$g$$, so the intervals become wider as $$g$$ increases.

Which method to choose : Choose the method which leads to narrower intervals. As a comparison between Bonferroni and Working-Hotelling (for finding confidence intervals for the mean response), the following can be said :

• If g is small, Bonferroni is better.
• If g is large, Working-Hotelling is better (the coefficient of $$s(\widehat{Y_h})$$ in the confidence limits remains the same even as g becomes large).
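The crossover between the two procedures can be located numerically. The sketch below is one way to do this with the Python standard library only: `t_quantile` is a hypothetical helper that inverts a numerically integrated $$t$$ distribution (a statistics library would normally supply this quantile), and the Working-Hotelling multiplier uses the standard closed-form quantile of an $$F$$ distribution with 2 numerator degrees of freedom. The values $$\alpha = 0.05$$ and 17 error degrees of freedom are chosen to match the housing example that follows.

```python
import math


def t_cdf(x: float, df: int, steps: int = 400) -> float:
    """P(T <= x) for x >= 0, by Simpson's rule on the t density (steps must be even)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u: float) -> float:
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3


def t_quantile(p: float, df: int) -> float:
    """Quantile of the t distribution for p > 0.5, by bisection on t_cdf."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:      # expand until the quantile is bracketed
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def working_hotelling_multiplier(alpha: float, df: int) -> float:
    """sqrt(2 F(1 - alpha; 2, df)) using the closed-form F(2, df) quantile."""
    f_crit = (df / 2) * (alpha ** (-2 / df) - 1)
    return math.sqrt(2 * f_crit)


alpha, df = 0.05, 17
W = working_hotelling_multiplier(alpha, df)
rows = []
for g in range(1, 7):
    B = t_quantile(1 - alpha / (2 * g), df)   # Bonferroni multiplier for g intervals
    narrower = "Bonferroni" if B < W else "Working-Hotelling"
    rows.append((g, B, W, narrower))
    print(f"g={g}: Bonferroni {B:.3f}, Working-Hotelling {W:.3f} -> {narrower}")
```

Under these assumptions the Bonferroni multiplier increases with $$g$$ while the Working-Hotelling multiplier stays fixed, so the comparison flips once $$g$$ is large enough.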

Housing data as an example

Fitted regression model : $$\widehat{Y_h} = 28.981 + 2.941X, n=19, s(b_0) = 8.5438, s(b_1) = 0.5412, MSE = 11.9512.$$

• Simultaneous confidence intervals for $$\beta_0$$ and $$\beta_1$$ : For 95% simultaneous confidence intervals, $$t(1-\alpha/4;n-2)=t(0.9875;17) = 2.4581$$. The intervals (for $$\beta_0$$ and $$\beta_1$$, respectively) are

$$28.981 \pm 2.4581 \times 8.5438 = 28.981 \pm 21.002, \qquad 2.941 \pm 2.4581 \times 0.5412 = 2.941 \pm 1.330$$
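The arithmetic for these two intervals can be reproduced with a few lines of Python. This is only a sketch: the estimates, standard errors and the critical value 2.4581 are copied from the text, and the helper name `interval` is illustrative.

```python
# Estimates and standard errors from the fitted housing model
b0, s_b0 = 28.981, 8.5438
b1, s_b1 = 2.941, 0.5412
t_crit = 2.4581  # t(0.9875; 17), taken from the text


def interval(estimate: float, std_err: float, multiplier: float) -> tuple:
    """Return (lower, upper) limits of estimate +/- multiplier * std_err."""
    half_width = multiplier * std_err
    return estimate - half_width, estimate + half_width


ci_b0 = interval(b0, s_b0, t_crit)
ci_b1 = interval(b1, s_b1, t_crit)
print(f"beta_0: ({ci_b0[0]:.3f}, {ci_b0[1]:.3f})")
print(f"beta_1: ({ci_b1[0]:.3f}, {ci_b1[1]:.3f})")
```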

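• Simultaneous inference for mean response at $$g$$ different values of $$X$$ : Say $$g = 3$$, with the following values of $$X_h$$, fitted values and standard errors:

| $$X_h$$ | 14 | 16 | 18.5 |
|---|---|---|---|
| $$\widehat{Y_h}$$ | 70.155 | 76.037 | 83.390 |
| $$s(\widehat{Y_h})$$ | 1.2225 | 0.8075 | 1.7011 |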

The Bonferroni and Working-Hotelling multipliers are

$$t(1-0.05/(2g);n-2) = t(0.99167;17) = 2.655, \qquad \sqrt{2F(0.95;2,n-2)} = \sqrt{2 \times 3.5915} = 2.6801$$

The 95% simultaneous confidence intervals for the mean responses are given in the following table:

| $$X_h$$ | 14 | 16 | 18.5 |
|---|---|---|---|
| Bonferroni | $$70.155 \pm 3.248$$ | $$76.037 \pm 2.145$$ | $$83.390 \pm 4.520$$ |
| Working-Hotelling | $$70.155 \pm 3.276$$ | $$76.037 \pm 2.164$$ | $$83.390 \pm 4.559$$ |

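As a rough check on the mean-response intervals, the half-widths can be recomputed from the fitted line, the standard errors and the two multipliers quoted in the text. The sketch below uses the rounded multipliers 2.655 and 2.6801, so its last digit may differ slightly from the tabulated half-widths.

```python
# Fitted line and inputs copied from the housing example
b0, b1 = 28.981, 2.941
x_values = [14, 16, 18.5]
s_yhat = [1.2225, 0.8075, 1.7011]   # s(Y_hat_h) at each X_h
B = 2.655    # Bonferroni multiplier t(0.99167; 17)
W = 2.6801   # Working-Hotelling multiplier sqrt(2 F(0.95; 2, 17))


def fitted(x: float) -> float:
    """Estimated mean response at X = x."""
    return b0 + b1 * x


table = []
for x, se in zip(x_values, s_yhat):
    y_hat = fitted(x)
    table.append((x, y_hat, B * se, W * se))
    print(f"X_h={x:>4}: Y_hat={y_hat:.3f}  "
          f"Bonferroni ±{B * se:.3f}  Working-Hotelling ±{W * se:.3f}")
```

Since the Bonferroni multiplier is the smaller of the two here, its intervals are the narrower ones at every $$X_h$$ for $$g = 3$$.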
• Simultaneous prediction intervals for $$g$$ different values of $$X$$ : Again, say $$g = 3$$ with $$X_h$$ values 14, 16 and 18.5. In this case, $$\alpha = 0.05$$ and $$t(1-\alpha/(2g);n-2) = t(0.99167; 17) = 2.655$$, while $$\sqrt{gF(1-\alpha;g,n-2)} = \sqrt{3F(0.95;3,17)} = \sqrt{3 \times 3.1968} = 3.0968$$. The standard errors of prediction and the simultaneous 95% prediction intervals are given in the following table:

| $$X_h$$ | 14 | 16 | 18.5 |
|---|---|---|---|
| $$\widehat{Y_h}$$ | 70.155 | 76.037 | 83.390 |
| $$s(Y_{h(new)}-\widehat{Y_h})$$ | 3.6668 | 3.5501 | 3.8529 |
| Bonferroni | $$70.155 \pm 9.742$$ | $$76.037 \pm 9.432$$ | $$83.390 \pm 10.237$$ |
| Scheffé | $$70.155 \pm 11.355$$ | $$76.037 \pm 10.994$$ | $$83.390 \pm 11.932$$ |
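The prediction standard errors and half-widths can be recomputed in the same way. The sketch below assumes the identity $$s^2(Y_{h(new)}-\widehat{Y_h}) = MSE + s^2(\widehat{Y_h})$$, which follows from the standard-error formula for prediction, and uses the rounded multipliers quoted in the text, so the last digit may differ slightly from the tabulated values.

```python
import math

mse = 11.9512
y_hat = [70.155, 76.037, 83.390]
s_yhat = [1.2225, 0.8075, 1.7011]   # standard errors of the estimated mean responses
B = 2.655    # Bonferroni multiplier t(0.99167; 17)
S = 3.0968   # Scheffe multiplier sqrt(3 F(0.95; 3, 17))

# Standard error of prediction: new-observation variance plus variance of Y_hat
s_pred = [math.sqrt(mse + s ** 2) for s in s_yhat]

bonferroni_half = [B * s for s in s_pred]
scheffe_half = [S * s for s in s_pred]

for y, s, hb, hs in zip(y_hat, s_pred, bonferroni_half, scheffe_half):
    print(f"Y_hat={y:.3f}  s_pred={s:.4f}  Bonferroni ±{hb:.3f}  Scheffe ±{hs:.3f}")
```

For $$g = 3$$ the Bonferroni multiplier is smaller than the Scheffé multiplier, so the Bonferroni prediction intervals are the narrower choice in this example.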