12.4: Robustness Simulation

• Contributed by David Lane
• Associate Professor (Psychology, Statistics, and Management) at Rice University

Skills to Develop

• State the effect of heterogeneity of variance on the Type I error rate
• State when heterogeneity of variance can lead to a very high Type I error rate
• State the effect of skew on the Type I error rate

Instructions

This demonstration allows you to explore the effects of violating the assumptions of normality and homogeneity of variance. When the simulation starts, you see the distributions of two populations. By default, both are normally distributed with means of $$0$$ and standard deviations of $$1$$. The default sample size for the simulations is $$5$$ per group. If you push the "simulate" button, $$2,000$$ simulated experiments are conducted. You can adjust the number of simulations from $$2,000$$ to $$10,000$$. A $$t$$-test is computed for each experiment, and the numbers of significant and non-significant tests are displayed along with the Type I error rate (the proportion significant).
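The procedure described above can be sketched in a few lines of Python. This is a minimal sketch, not the demonstration's own code. It assumes the test is the standard pooled-variance independent-groups $$t$$-test, and it hard-codes the two-tailed critical value of $$2.306$$ (for $$8$$ degrees of freedom, two groups of $$5$$, at the $$0.05$$ level). The names `pooled_t` and `simulate` are hypothetical.

```python
import random
import statistics

N_PER_GROUP = 5   # default sample size per group in the demonstration
T_CRIT = 2.306    # assumed: two-tailed critical t for df = 8, alpha = 0.05


def pooled_t(x, y):
    """Independent-groups t statistic using a pooled variance estimate."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    std_err = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    return (statistics.mean(x) - statistics.mean(y)) / std_err


def simulate(n_sims=2000, seed=None):
    """Run n_sims experiments with both populations normal, mean 0, sd 1.

    Returns (significant, not_significant, type_i_error_rate).
    """
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_sims):
        x = [rng.gauss(0, 1) for _ in range(N_PER_GROUP)]
        y = [rng.gauss(0, 1) for _ in range(N_PER_GROUP)]
        # The null hypothesis is true, so every rejection is a Type I error.
        if abs(pooled_t(x, y)) > T_CRIT:
            significant += 1
    return significant, n_sims - significant, significant / n_sims


if __name__ == "__main__":
    sig, not_sig, rate = simulate(n_sims=2000, seed=1)
    print(f"significant: {sig}, not significant: {not_sig}, "
          f"Type I error rate: {rate:.3f}")
```

Passing a different `seed` (or none) gives a different set of simulated experiments, which mirrors pushing the "simulate" button again.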

Since the null hypothesis is true and all assumptions are met with these default values, the Type I error rate should be close to $$0.05$$, especially if you run a large number of simulations. It will not equal $$0.05$$ exactly because of random variation. However, the larger the number of simulations you run, the closer the Type I error rate should come to $$0.05$$.
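The size of this random variation can be estimated. As a sketch, treat each simulated experiment as an independent trial that comes out significant with probability $$\alpha = 0.05$$, and let $$\hat{p}$$ be the observed Type I error rate over $$N$$ simulations (these symbols are introduced here for illustration). The standard error of $$\hat{p}$$ is then

$$\mathrm{SE}(\hat{p}) = \sqrt{\frac{\alpha(1-\alpha)}{N}}$$

For $$N = 2,000$$ this gives $$\sqrt{0.0475/2000} \approx 0.0049$$, and for $$N = 10,000$$ it gives $$\sqrt{0.0475/10000} \approx 0.0022$$. An observed rate within about two standard errors of $$0.05$$ would therefore fall roughly between $$0.04$$ and $$0.06$$ with $$2,000$$ simulations, and between about $$0.046$$ and $$0.054$$ with $$10,000$$.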

You can explore the effects of violating the assumptions of the test by making one or both of the distributions skewed and/or by making the standard deviations of the distributions different. You can also explore the effects of sample size and of the significance level used ($$0.05$$ or $$0.01$$).
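These violations can also be tried out in code. The following is a minimal sketch under several assumptions that are not taken from the demonstration itself: the test is the pooled-variance $$t$$-test, skew is produced with a shifted exponential distribution, the two group sizes may differ, and the critical values are hard-coded for $$8$$ degrees of freedom, so the group sizes must sum to $$10$$. The names `pooled_t`, `draw`, and `type_i_rate` are hypothetical.

```python
import random
import statistics

# Assumed: two-tailed critical t values for df = 8 (n1 + n2 = 10).
T_CRIT = {0.05: 2.306, 0.01: 3.355}


def pooled_t(x, y):
    """Independent-groups t statistic using a pooled variance estimate."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    std_err = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    return (statistics.mean(x) - statistics.mean(y)) / std_err


def draw(rng, n, sd, skewed):
    """Draw n values from a population with mean 0 and standard deviation sd."""
    if skewed:
        # Shifted exponential: mean 0, sd 1 before scaling, positive skew.
        # This shape is an assumption, not the demonstration's distribution.
        return [sd * (rng.expovariate(1.0) - 1.0) for _ in range(n)]
    return [rng.gauss(0, sd) for _ in range(n)]


def type_i_rate(n1=5, n2=5, sd1=1.0, sd2=1.0, skew1=False, skew2=False,
                alpha=0.05, n_sims=2000, seed=None):
    """Proportion of significant tests when both population means are 0."""
    if n1 + n2 != 10 or min(n1, n2) < 2:
        raise ValueError("critical values are hard-coded for n1 + n2 = 10, "
                         "with at least 2 observations per group")
    rng = random.Random(seed)
    crit = T_CRIT[alpha]
    significant = 0
    for _ in range(n_sims):
        x = draw(rng, n1, sd1, skew1)
        y = draw(rng, n2, sd2, skew2)
        if abs(pooled_t(x, y)) > crit:
            significant += 1
    return significant / n_sims


if __name__ == "__main__":
    print("all assumptions met:     ", type_i_rate(seed=1))
    print("unequal sd, equal n:     ", type_i_rate(sd2=4.0, seed=1))
    print("larger sd in smaller n:  ", type_i_rate(n1=3, n2=7, sd1=4.0, seed=1))
    print("one population skewed:   ", type_i_rate(skew1=True, seed=1))
    print("alpha = 0.01:            ", type_i_rate(alpha=0.01, seed=1))
```

Comparing the printed rates with the nominal significance level gives a rough sense of how much each violation matters. With unequal group sizes, giving the larger standard deviation to the smaller group is the configuration generally known to inflate the Type I error rate of the pooled test the most.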

By exploring various distributions, sample sizes, and significance levels, you can get a feeling for how well the test works when its assumptions are violated. A test that is relatively unaffected by violations of its assumptions is said to be "robust."

Illustrated Instructions

The video below begins by running $$2,000$$ simulations with two populations that each have a mean of $$0$$, a standard deviation of $$2$$, and no skewness, with sample sizes of $$5$$. The video continues by varying different aspects of the distributions and running more simulations. Note the number of significant tests after each set of simulations.

Contributor

• Online Statistics Education: A Multimedia Course of Study (http://onlinestatbook.com/). Project Leader: David M. Lane, Rice University.