Concepts Related to Hypothesis Tests


1 Review of concepts related to hypothesis tests

1.1 Type I and Type II errors

In hypothesis testing, there are two types of errors:

  • Type I error: reject null hypothesis when it is true
  • Type I error rate

P(reject H0 | H0 true)

  • When testing H0 at a pre-specified level of significance α, the Type I error rate is controlled to be no larger than α.
  • Type II error: fail to reject (accept) the null hypothesis when it is false.
  • Type II error rate

P(accept H0 | H0 false).

  • Power: probability of rejecting H0 when it is false

Power = P(reject H0 | H0 false)

= 1 - Type II error rate.
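The two error rates can be illustrated by simulation. The following is a minimal sketch, not part of the original notes: it repeatedly generates data for a one-way layout and records how often the classical F-test rejects H0. The group means, σ, group size, number of replications and the helper name `reject.rate` are all hypothetical choices made for illustration.

```r
# Monte Carlo estimate of the rejection rate of the one-way ANOVA F-test.
# All settings below are hypothetical and chosen only for illustration.
set.seed(1)
alpha <- 0.05
r <- 3; n <- 5; sigma <- 1
g <- factor(rep(seq_len(r), each = n))   # group labels, n observations per group

reject.rate <- function(mu, n.rep = 2000) {
  rejections <- replicate(n.rep, {
    y <- rnorm(r * n, mean = rep(mu, each = n), sd = sigma)
    # var.equal = TRUE gives the classical (equal-variance) F-test
    oneway.test(y ~ g, var.equal = TRUE)$p.value < alpha
  })
  mean(rejections)
}

type1.rate <- reject.rate(mu = c(0, 0, 0))    # H0 true: estimates the Type I error rate
power.est  <- reject.rate(mu = c(-1, 0, 1))   # H0 false: estimates the power
```

Under these assumptions, `type1.rate` should land close to α, while `power.est` estimates 1 minus the Type II error rate for this particular alternative.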

1.2 What determines the power?

The power of a testing procedure depends on

  • Significance level α (the maximum allowable Type I error rate): the larger α is, the higher the power.
  • Deviation from H0 (the strength of the signal): the larger the deviation is, the higher the power.
  • Sample size: the larger the sample size is, the higher the power.
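These three effects can be checked numerically for the F-test discussed in Section 2. The sketch below assumes a balanced single factor design and uses R's noncentral F distribution; the function name `anova.power` and the baseline values (r = 3, n = 5, s = 2, σ = 1) are hypothetical.

```r
# Power of the one-way ANOVA F-test for a balanced design (illustrative sketch).
# s     = sum over groups of squared deviations of the group means from their average
# sigma = standard deviation of the measurements
anova.power <- function(alpha, r, n, s, sigma) {
  df1 <- r - 1
  df2 <- r * n - r
  ncp <- n * s / sigma^2                  # noncentrality parameter in R's convention
  F.crit <- qf(1 - alpha, df1, df2)       # critical value of the central F distribution
  pf(F.crit, df1, df2, ncp = ncp, lower.tail = FALSE)
}

base          <- anova.power(alpha = 0.05, r = 3, n = 5,  s = 2, sigma = 1)
larger.alpha  <- anova.power(alpha = 0.10, r = 3, n = 5,  s = 2, sigma = 1)
larger.signal <- anova.power(alpha = 0.05, r = 3, n = 5,  s = 4, sigma = 1)
larger.n      <- anova.power(alpha = 0.05, r = 3, n = 10, s = 2, sigma = 1)
```

Each of the last three values should exceed `base`, matching the three bullet points above.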

2 Power of an F-test

2.1 Power calculation for F-test

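Test \(H_0: \mu_1 = \cdots = \mu_r\) under a single factor ANOVA model, given the significance level α:

  • Decision rule

\[
\begin{cases}
\text{reject } H_0 & \text{if } F^* > F(1-\alpha;\, r-1,\, n_T-r) \\
\text{accept } H_0 & \text{if } F^* \le F(1-\alpha;\, r-1,\, n_T-r)
\end{cases}
\]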

  • The Type I error rate is at most α.
  • Power depends on the noncentrality parameter

\[
\phi = \frac{1}{\sigma}\sqrt{\frac{\sum_{i=1}^{r} n_i(\mu_i-\mu_\cdot)^2}{r}}.
\]

Note that ϕ depends on the sample size (through the \(n_i\)'s) and on the signal size (through the \((\mu_i-\mu_\cdot)^2\)'s).

2.2 Distribution of F-ratio under the alternative hypothesis

The distribution of F* under an alternative hypothesis is as follows.

  • When the noncentrality parameter is ϕ, then

\[
F^* \sim F_{r-1,\, n_T-r}(\phi),
\]

i.e., a noncentral F-distribution with noncentrality parameter ϕ.

  • Power = \(P\big(F_{r-1,\, n_T-r}(\phi) > F(1-\alpha;\, r-1,\, n_T-r)\big)\).
  • Example: if α = 0.01, r = 4, nT = 20 and ϕ = 2, then Power = 0.61. (Use Table B.11 of the textbook.)
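As a cross-check of this table lookup, the same power can be computed with R's noncentral F distribution. This is a sketch under one assumption explained in Section 2.3: R's noncentrality parameter equals r × ϕ², not ϕ.

```r
# Power for the example alpha = 0.01, r = 4, nT = 20, phi = 2.
alpha <- 0.01
r <- 4; nT <- 20; phi <- 2
df1 <- r - 1                       # numerator degrees of freedom
df2 <- nT - r                      # denominator degrees of freedom
F.crit <- qf(1 - alpha, df1, df2)  # critical value from the central F distribution
# R's ncp is r * phi^2 (see Section 2.3), not phi itself
F.power <- 1 - pf(F.crit, df1, df2, ncp = r * phi^2)
```

The resulting `F.power` can be compared with the value 0.61 read from Table B.11.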

2.3 How to calculate power of the F test using R

  • The textbook defines the noncentrality parameter for a single factor ANOVA model as

\[
\phi = \frac{1}{\sigma}\sqrt{\frac{\sum_{i=1}^{r} n_i(\mu_i-\mu_\cdot)^2}{r}}
\]

where r is the number of treatment groups (factor levels), the \(\mu_i\)'s are the factor level means, \(n_i\) is the sample size (number of replicates) of the i-th treatment group, and \(\sigma^2\) is the variance of the measurements.

  • For a balanced design, i.e., when \(n_1 = \cdots = n_r = n\), the formula for ϕ reduces to

\[
\phi = \frac{1}{\sigma}\sqrt{\frac{n}{r}\sum_{i=1}^{r}(\mu_i-\mu_\cdot)^2}.
\]

Table B.11 gives the power of the F test given the values of the numerator degrees of freedom \(v_1 = r - 1\), the denominator degrees of freedom \(v_2 = n_T - r\), the level of significance α and the noncentrality parameter ϕ.

  • Example: For r = 3, n = 5, (so that v1 = 2 and v2 = 12), α = 0.05 and ϕ = 2, the value of power from Table B.11 is 0.78.
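To make the definition of ϕ concrete, the sketch below evaluates the textbook formula for a set of factor level means. The function name `phi.anova` and the means used are hypothetical, and μ. is taken to be the sample-size-weighted average of the μi's, which is an assumption not stated explicitly in these notes.

```r
# phi = (1 / sigma) * sqrt( sum(n_i * (mu_i - mu.)^2) / r )
# Assumption: mu. is the weighted mean sum(n_i * mu_i) / nT.
phi.anova <- function(mu, n, sigma) {
  r <- length(mu)
  n <- rep_len(n, r)                 # a single n is recycled for a balanced design
  mu.dot <- sum(n * mu) / sum(n)
  sqrt(sum(n * (mu - mu.dot)^2) / r) / sigma
}

# Hypothetical balanced design with r = 3, n = 5, sigma = 1.
# The means are chosen so that (n / r) * sum((mu - mu.)^2) = 4, i.e. phi = 2.
mu <- c(-sqrt(1.2), 0, sqrt(1.2))
phi <- phi.anova(mu, n = 5, sigma = 1)
```

With these hypothetical means, ϕ matches the value used in the example above, so the corresponding power can be read from the same row of Table B.11.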

However, if you want to use R to compute the power of the F-test, be aware that R defines the noncentrality parameter of the F distribution differently. Compared to the above setting, the noncentrality parameter to be passed to the R function is r × ϕ² instead of ϕ. Here is the R code for computing the power in the example described above, with r = 3, n = 5, α = 0.05 and ϕ = 2:

  • The critical value for the F-test when α = 0.05, v1 = r - 1 = 2 and v2 = nT - r = 12 is

F.crit = qf(0.95,2,12)

  • Then the power of the test, when ϕ = 2, is computed as

F.power = 1 - pf(F.crit, 2, 12, 3*2^2)

  • Note that the function qf is used to compute the quantile of the central F distribution. Its second and third arguments are the numerator and denominator degrees of freedom of the F distribution.
  • The function pf is used to calculate the probability under the noncentral F density curve to the left of a given value (in this case F.crit). Its second and third arguments are the numerator and denominator degrees of freedom of the F distribution, while the fourth argument is the noncentrality parameter r × ϕ² (we specify this explicitly in the above example).
  • The values of F.crit and F.power are 3.885294 and 0.7827158, respectively.
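The factor r × ϕ² can be traced to the two definitions. The short derivation below is a sketch: it writes λ for the noncentrality parameter in R's convention (the symbol λ is introduced here and is not used in the textbook formula), namely the noncentrality of the chi-square distribution of the treatment sum of squares divided by σ².

```latex
\[
\lambda = \frac{1}{\sigma^2}\sum_{i=1}^{r} n_i(\mu_i-\mu_\cdot)^2,
\qquad
\phi^2 = \frac{1}{\sigma^2}\cdot\frac{\sum_{i=1}^{r} n_i(\mu_i-\mu_\cdot)^2}{r}
\quad\Longrightarrow\quad
\lambda = r\,\phi^2 .
\]
\[
\text{For } r = 3,\ \phi = 2:\qquad \lambda = 3 \times 2^2 = 12 .
\]
```

This is the value passed as the fourth argument of pf in the example.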

3 Calculating sample size

Goal: find the smallest sample size needed to achieve

  • a pre-specified power γ;
  • with a pre-specified Type I error rate α;
  • for at least a pre-specified signal level s.

The idea behind the sample size calculation is as follows:

  • On one hand, we want the sample size to be large enough to detect practically important deviations from H0 (with a signal size of at least s) with high probability (with a power of at least γ), while allowing only a pre-specified low Type I error rate (at most α) when there is no signal.
  • On the other hand, the sample size should not be so large that the cost of the study becomes unnecessarily high.

3.1 An example of sample size calculation

  • For a single factor study with 4 levels and assuming a balanced design, i.e., \(n_1 = n_2 = n_3 = n_4\) (= n, say), the goal is to test H0: all the factor level means \(\mu_i\) are the same.
  • Question: What should be the sample size for each treatment group under a balanced design, such that the F-test can achieve γ = 0.85 power with at most α = 0.05 Type I error rate when the deviation from H0 is at least \(s = \sum_{i=1}^{r}(\mu_i-\mu_\cdot)^2 = 40\)?
  • One additional piece of information needed in order to answer this question is the residual variance \(\sigma^2\).
  • Suppose from a pilot study, we know the residual variance is about \(\sigma^2 = 10\).
  • Use a trial-and-error strategy to search Table B.11. This means, for a given n (starting with a small value such as n = 2, since n = 1 leaves \(n_T - r = 0\) denominator degrees of freedom),

(i) calculate \(\phi = \frac{1}{\sigma}\sqrt{\frac{n}{r}\sum_{i=1}^{r}(\mu_i-\mu_\cdot)^2} = \frac{1}{\sigma}\sqrt{\frac{n}{r}\,s}\);

(ii) fix the numerator degrees of freedom \(v_1 = r - 1 = 3\);

(iii) check the power of the test when the denominator degrees of freedom are \(v_2 = n_T - r\) (where \(n_T = nr\)), with the given ϕ and α;

(iv) keep increasing n until the power of the test first equals or exceeds the given value of γ.
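Steps (i) to (iv) can also be carried out in R instead of by table lookup. The following is a minimal sketch using the values of this example (r = 4, s = 40, σ² = 10, α = 0.05, γ = 0.85); the helper name `power.for.n` is hypothetical, and the noncentrality parameter passed to pf is r × ϕ², as in Section 2.3.

```r
# Trial-and-error search for the smallest per-group sample size n.
alpha <- 0.05; gamma <- 0.85
r <- 4; s <- 40; sigma2 <- 10

power.for.n <- function(n) {
  phi <- sqrt((n / r) * s) / sqrt(sigma2)   # step (i)
  df1 <- r - 1                              # step (ii)
  df2 <- n * r - r                          # step (iii): nT - r with nT = n * r
  F.crit <- qf(1 - alpha, df1, df2)
  1 - pf(F.crit, df1, df2, ncp = r * phi^2)
}

n <- 2                                      # n = 1 would give df2 = 0
while (power.for.n(n) < gamma) n <- n + 1   # step (iv)
n.min <- n
power.at.n.min <- power.for.n(n.min)
```

Here `n.min` is the smallest n whose computed power reaches γ, and `power.at.n.min` is the power attained at that n. Because Table B.11 lists only selected values of ϕ, a table-based search may need interpolation to reproduce the same answer.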

3.2 An alternative approach to sample size calculation

Suppose that we want to determine the minimum sample size required to attain a certain power of the test subject to a specified value of the maximum discrepancy among the factor level means. In other words, we want the test to attain power γ (= 1 - β, where β is the probability of Type II error) when the range of the treatment group means is at least Δ, where

\[
\Delta = \max_{1 \le i \le r}\mu_i - \min_{1 \le i \le r}\mu_i .
\]

  • Suppose we have a balanced design, i.e., \(n_1 = \cdots = n_r = n\), say. We want to determine the minimum value of n such that the power of the F test for testing \(H_0: \mu_1 = \cdots = \mu_r\) is at least a prespecified value \(\gamma = 1 - \beta\).
  • We need to also specify the level of significance α and the standard deviation of the measurements σ.
  • Table B.12 gives the minimum value of n needed to attain a given power 1 - β for a given value of α, for a given number of treatments r and a given "effect size" Δ/σ.
  • Example : For r = 4, α = 0.05, in order that the F-test achieves the power 1 - β = 0.9 when the effect size is Δ/σ = 1.5, we need n to be at least 14. That is, we need a balanced design with at least 14 experimental units in each treatment group.
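A similar search can be sketched in R for this approach. The code below rests on an assumption that is not stated in these notes: the power is evaluated at the least favorable configuration with range Δ, in which two means are Δ apart and all others sit at their midpoint, so that the sum of squared deviations of the means is Δ²/2. Whether Table B.12 was built exactly this way is assumed here, and the helper name `power.for.n` is hypothetical.

```r
# Smallest per-group n for a given effect size Delta / sigma (illustrative sketch).
alpha <- 0.05; target <- 0.90
r <- 4
effect <- 1.5                        # effect = Delta / sigma

power.for.n <- function(n) {
  df1 <- r - 1
  df2 <- n * r - r
  # Assumed least favorable configuration: sum((mu_i - mu.)^2) = Delta^2 / 2,
  # so R's noncentrality parameter is n * (Delta / sigma)^2 / 2.
  ncp <- n * effect^2 / 2
  pf(qf(1 - alpha, df1, df2), df1, df2, ncp = ncp, lower.tail = FALSE)
}

n <- 2
while (power.for.n(n) < target) n <- n + 1
n.min <- n
```

The resulting `n.min` can be compared with the value n = 14 read from Table B.12.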

Contributors

  • Yingwen Li (UCD)
  • Debashis Paul (UCD)


This page titled Concepts Related to Hypothesis Tests is shared under a not declared license and was authored, remixed, and/or curated by Debashis Paul.
