I am a strong believer in the use of computer simulations to understand statistical concepts, and in later sessions we will dig deeply into their use. Here we will introduce the idea by asking whether we can confirm the need to subtract 1 from the sample size in computing the sample variance.
Let’s treat the entire sample of children from the NHANES data as our “population”, and see how well the sample variance, computed using either n or n-1 in the denominator, estimates the variance of this population across a large number of simulated random samples from the data. We will return to the details of how to do this in a later chapter.
|Estimate|Value|
|---|---|
|Variance estimate using n|715|
|Variance estimate using n-1|730|
This shows us that the theory outlined above was correct: the variance estimate using n-1 as the denominator is very close to the variance computed on the full data (i.e., the population), whereas the variance computed using n as the denominator is biased (smaller) compared to the true value.
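The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the code used to produce the table: the NHANES data are replaced by a synthetic "population" drawn from a normal distribution with an arbitrary mean and standard deviation, and the sample size, number of simulated samples, and sampling with replacement are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical stand-in for the NHANES children's data (an assumption):
# 5000 synthetic values with an arbitrary mean and standard deviation.
population = rng.normal(loc=130, scale=27, size=5000)

# Variance of the full "population", dividing by N.
population_variance = population.var(ddof=0)

n_samples = 10000   # number of simulated random samples (assumed)
sample_size = 50    # observations per sample (assumed)

var_n = np.empty(n_samples)
var_n_minus_1 = np.empty(n_samples)

for i in range(n_samples):
    # Draw a random sample from the population (with replacement).
    sample = rng.choice(population, size=sample_size, replace=True)
    var_n[i] = sample.var(ddof=0)          # denominator n
    var_n_minus_1[i] = sample.var(ddof=1)  # denominator n - 1

# Average each estimator across all simulated samples.
mean_var_n = var_n.mean()
mean_var_n_minus_1 = var_n_minus_1.mean()

print(f"Population variance:         {population_variance:.1f}")
print(f"Mean estimate using n:       {mean_var_n:.1f}")
print(f"Mean estimate using n-1:     {mean_var_n_minus_1:.1f}")
```

Averaged over many samples, the estimate that divides by n-1 should land close to the population variance, while the estimate that divides by n should come out smaller by a factor of roughly (n-1)/n. The gap shrinks as the sample size grows, so a small sample size makes the bias easier to see.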