Statistics LibreTexts

Simultaneous Inference

We want to find confidence intervals for more than one parameter simultaneously. For example, we might want joint confidence intervals for \(\beta_0\) and \(\beta_1\).

Bonferroni joint confidence intervals

The confidence coefficient for each individual parameter is raised above \(1 - \alpha\) so that the confidence coefficient for the collection of parameters is at least \(1 - \alpha\). This is based on the following inequality:

Theorem (Bonferroni's Inequality)

\[P(A_0\cap A_1)\geq 1-P(A_0^c)-P(A_1^c)\]

for any two events \(A_0\) and \(A_1\), where \(A_0^c\) and \(A_1^c\) are the complements of \(A_0\) and \(A_1\), respectively.

We take \(A_0\) to be the event that the confidence interval for \(\beta_0\) covers \(\beta_0\), and \(A_1\) to be the event that the confidence interval for \(\beta_1\) covers \(\beta_1\).

So, if \(P(A_0) = 1-\alpha_1\) and \(P(A_1) = 1-\alpha_2\), then \(P(A_0\cap A_1)\geq 1-\alpha_1-\alpha_2\) by Bonferroni's inequality. Note that \(A_0\cap A_1\) is the event that both confidence intervals cover their respective parameters. Therefore we take \(\alpha_1 = \alpha_2 = \alpha/2\), so that each two-sided interval uses the \(1-\alpha/4\) quantile, to get joint confidence intervals with confidence coefficient at least \(1 - \alpha\):

\(b_0 \pm t(1-\alpha/4;n-2) s(b_0) \) and \(b_1 \pm t(1-\alpha/4;n-2) s(b_1) \) for \(\beta_0\) and \(\beta_1\), respectively.
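As a rough illustration of the formula above, the following sketch builds the two joint intervals from estimates, standard errors, and a critical value supplied by the caller. The function name `bonferroni_joint_intervals` is hypothetical; the numbers are the ones reported in the housing example later in this section, and the critical value 2.4581 is taken from that example rather than computed here.

```python
def bonferroni_joint_intervals(estimates, std_errors, t_crit):
    """Return (lower, upper) pairs: estimate -/+ t_crit * standard error."""
    return [(b - t_crit * s, b + t_crit * s) for b, s in zip(estimates, std_errors)]

alpha = 0.05
k = 2                       # number of parameters covered jointly
alpha_each = alpha / k      # error rate allotted to each interval
# Bonferroni lower bound on joint coverage: 1 minus the sum of individual error rates
joint_lower_bound = 1 - k * alpha_each

# Values from the housing example (assumed inputs); t_crit = t(1 - alpha/4; 17)
b = [28.981, 2.941]
s = [8.5438, 0.5412]
t_crit = 2.4581

intervals = bonferroni_joint_intervals(b, s, t_crit)
for name, (lo, hi) in zip(["beta_0", "beta_1"], intervals):
    print(f"{name}: ({lo:.3f}, {hi:.3f})")
print(f"joint coverage is at least {joint_lower_bound:.2f}")
```

The critical value is passed in rather than computed so that the sketch needs nothing beyond the standard library.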

Bonferroni joint confidence intervals for mean response

We want to find simultaneous confidence intervals for the mean response \(E(Y|X = X_h) = \beta_0 + \beta_1X_h\) at g different values of \(X_h\). Using Bonferroni's inequality for the intersection of g different events, the confidence intervals with joint confidence coefficient (at least) \(1-\alpha\) are given by \[\widehat{Y_h} \pm t(1-\alpha/(2g); n-2)\,s(\widehat{Y_h}).\]
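The step from two events to g events can be sketched as follows. Here \(A_i\) is notation introduced only for this illustration: it denotes the event that the i-th interval covers its mean response, and each interval is assumed to have individual confidence coefficient \(1-\alpha/g\).

```latex
\[
P\Big(\bigcap_{i=1}^{g} A_i\Big)
  \;\geq\; 1 - \sum_{i=1}^{g} P(A_i^c)
  \;=\; 1 - g\cdot\frac{\alpha}{g}
  \;=\; 1 - \alpha .
\]
```

A two-sided interval with individual confidence coefficient \(1-\alpha/g\) leaves \(\alpha/(2g)\) in each tail, which is why the multiplier is the \(1-\alpha/(2g)\) quantile of the t distribution.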

Confidence band for regression line : Working-Hotelling procedure

The confidence band \(\widehat{Y_h} \pm\sqrt{2F(1-\alpha;2,n-2)}\,s(\widehat{Y_h})\) contains the entire regression line (for all values of X) with confidence level \(1-\alpha\). Because the band covers the whole line, it also covers the mean responses at any g values of \(X_h\) simultaneously. The Working-Hotelling procedure for obtaining \(1-\alpha\) simultaneous confidence intervals for the mean responses is therefore to use these confidence limits at the g different values of \(X_h\).
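The multiplier in the band can be evaluated without statistical tables, as the sketch below suggests. It relies on a closed form for the F quantile that holds only when the numerator degrees of freedom equal 2, namely \(F(1-\alpha;2,\nu) = (\nu/2)(\alpha^{-2/\nu}-1)\). The function name `wh_multiplier` is hypothetical, and the standard errors are those listed in the housing example later in this section.

```python
import math

def wh_multiplier(alpha, df):
    """Working-Hotelling multiplier W = sqrt(2 * F(1 - alpha; 2, df)).

    Uses the closed form F(1 - alpha; 2, df) = (df / 2) * (alpha ** (-2 / df) - 1),
    which is valid only for numerator degrees of freedom equal to 2.
    """
    f_quantile = (df / 2) * (alpha ** (-2 / df) - 1)
    return math.sqrt(2 * f_quantile)

W = wh_multiplier(0.05, 17)

# Standard errors of the fitted mean response from the housing example (assumed inputs)
se_mean = {14: 1.2225, 16: 0.8075, 18.5: 1.7011}

# The same multiplier applies no matter how many X_h values are examined
half_widths = {x: W * s for x, s in se_mean.items()}
for x, hw in half_widths.items():
    print(f"X_h = {x}: half-width = {hw:.3f}")
```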

Simultaneous prediction intervals

Recall that the standard error of prediction for a new observation \(Y_{h(new)}\) at \(X = X_h\) is $$s(Y_{h(new)}-\widehat{Y_h}) = \sqrt{MSE\left(1+\frac{1}{n}+\frac{(X_h-\overline{X})^2}{\sum_i(X_i-\overline{X})^2}\right)}$$ To predict new observations at g different values of X, we may use one of two procedures:

  • Bonferroni procedure : $$\widehat{Y_h}\pm t(1-\alpha/(2g);n-2)\,s(Y_{h(new)}-\widehat{Y_h})$$
  • Scheffé procedure : $$\widehat{Y_h}\pm \sqrt{gF(1-\alpha;g,n-2)}\,s(Y_{h(new)}-\widehat{Y_h})$$

Remark : A point to note is that except for the Working-Hotelling procedure for finding simultaneous confidence intervals for mean response, in all the other cases, the confidence intervals become wider as g increases.

Which method to choose : Choose the method which leads to narrower intervals. As a comparison between Bonferroni and Working-Hotelling (for finding confidence intervals for the mean response), the following can be said :

  • If g is small, Bonferroni is better.
  • If g is large, Working-Hotelling is better (the coefficient of \(s(\widehat{Y_h})\) in the confidence limits remains the same even as g becomes large).
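To see where the two procedures cross over, one can tabulate both multipliers as g grows. The sketch below is a standard-library approximation, not a reference implementation: the t quantile is obtained by Simpson's rule plus bisection, and the Working-Hotelling multiplier uses a closed form that is valid only for 2 numerator degrees of freedom. All function names are hypothetical, and the 17 degrees of freedom match the housing example in this section.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """CDF for x >= 0: 0.5 plus a Simpson's-rule integral of the density over [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Quantile for 0.5 <= p < 1, found by bisection on the CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def bonferroni_multiplier(alpha, g, df):
    """t(1 - alpha/(2g); df), which grows with g."""
    return t_quantile(1 - alpha / (2 * g), df)

def wh_multiplier(alpha, df):
    """sqrt(2 * F(1 - alpha; 2, df)) via the closed form for 2 numerator df."""
    return math.sqrt(df * (alpha ** (-2 / df) - 1))

alpha, df = 0.05, 17
W = wh_multiplier(alpha, df)
for g in range(1, 7):
    B = bonferroni_multiplier(alpha, g, df)
    narrower = "Bonferroni" if B < W else "Working-Hotelling"
    print(f"g={g}: Bonferroni={B:.4f}  Working-Hotelling={W:.4f}  narrower: {narrower}")
```

Since both intervals share the same \(s(\widehat{Y_h})\), comparing the multipliers is enough to decide which interval is narrower.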

Housing data as an example

Fitted regression model : \(\widehat{Y} = 28.981 + 2.941X\), with \(n=19\), \(s(b_0) = 8.5438\), \(s(b_1) = 0.5412\), and \(MSE = 11.9512\).

  • Simultaneous confidence intervals for \(\beta_0\) and \(\beta_1\) : For 95% simultaneous C.I., \(t(1-\alpha/4;n-2)=t(0.9875;17) = 2.4581\). The intervals (for \(\beta_0\) and \(\beta_1\), respectively) are

\(28.981 \pm 2.4581 \times 8.5438 = 28.981 \pm 21.002\) and \(2.941 \pm 2.4581 \times 0.5412 = 2.941 \pm 1.330\).

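  • Simultaneous inference for mean response at different values of X : Say g = 3, with the following values of \(X_h\), fitted values, and standard errors:

| \(X_h\) | 14 | 16 | 18.5 |
|---|---|---|---|
| \(\widehat{Y_h}\) | 70.155 | 76.037 | 83.390 |
| \(s(\widehat{Y_h})\) | 1.2225 | 0.8075 | 1.7011 |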

The multipliers are \(t(1-0.05/(2g);n-2) = t(0.99167;17) = 2.655\) for Bonferroni and \(\sqrt{2F(0.95;2,n-2)} = \sqrt{2 \times 3.5915} = 2.6801\) for Working-Hotelling.

The 95% simultaneous confidence intervals for the mean responses are given in the following table:

| \(X_h\) | 14 | 16 | 18.5 |
|---|---|---|---|
| Bonferroni | \(70.155 \pm 3.246\) | \(76.037 \pm 2.144\) | \(83.390 \pm 4.516\) |
| Working-Hotelling | \(70.155 \pm 3.276\) | \(76.037 \pm 2.164\) | \(83.390 \pm 4.559\) |

  • Simultaneous prediction intervals for different values of X : Again, say g = 3 with \(X_h\) equal to 14, 16 and 18.5. In this case, \(\alpha = 0.05\) and \(t(1-\alpha/(2g);n-2) = t(0.99167; 17) = 2.655\), while \(\sqrt{gF(1-\alpha;g,n-2)} = \sqrt{3F(0.95;3,17)} = \sqrt{3 \times 3.1968} = 3.0968\). The standard errors and simultaneous 95% prediction intervals are given in the following table:

| \(X_h\) | 14 | 16 | 18.5 |
|---|---|---|---|
| \(\widehat{Y_h}\) | 70.155 | 76.037 | 83.390 |
| \(s(Y_{h(new)}-\widehat{Y_h})\) | 3.6668 | 3.5501 | 3.8529 |
| Bonferroni | \(70.155 \pm 9.735\) | \(76.037 \pm 9.426\) | \(83.390 \pm 10.229\) |
| Scheffé | \(70.155 \pm 11.355\) | \(76.037 \pm 10.994\) | \(83.390 \pm 11.932\) |
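The prediction standard errors in the example can be cross-checked against the mean-response standard errors. Comparing the formula for \(s(Y_{h(new)}-\widehat{Y_h})\) with the usual expression for \(s(\widehat{Y_h})\) gives \(s^2(Y_{h(new)}-\widehat{Y_h}) = MSE + s^2(\widehat{Y_h})\). The sketch below applies that identity to the reported values and then forms the half-widths under both procedures. The two critical values are taken from the example rather than computed, and the variable names are illustrative only.

```python
import math

# Quantities reported in the housing example (assumed inputs)
mse = 11.9512
x_values = [14, 16, 18.5]
fitted = [70.155, 76.037, 83.390]
se_mean = [1.2225, 0.8075, 1.7011]

# Identity: s^2(pred) = MSE + s^2(Y_hat_h)
se_pred = [math.sqrt(mse + s ** 2) for s in se_mean]

g = len(x_values)
bonferroni_mult = 2.655                 # t(1 - 0.05/(2g); 17) as quoted in the example
scheffe_mult = math.sqrt(g * 3.1968)    # sqrt(g * F(0.95; g, 17)) with F as quoted

bonferroni_hw = [bonferroni_mult * s for s in se_pred]
scheffe_hw = [scheffe_mult * s for s in se_pred]

for x, y, sp, b, sc in zip(x_values, fitted, se_pred, bonferroni_hw, scheffe_hw):
    print(f"X_h={x}: fit={y:.3f}, s(pred)={sp:.4f}, "
          f"Bonferroni +/- {b:.3f}, Scheffe +/- {sc:.3f}")
```

For g = 3 the Bonferroni multiplier is the smaller of the two, so the Bonferroni prediction intervals are the narrower choice here.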

Contributors

  • Anirudh Kandada(UCD)