This book starts by presenting an overview of the statistical thought process. By the end of chapter 2, students are already familiar with concepts such as hypotheses, level of significance, p-values, and errors. Normally these topics are not introduced until after a discussion of probability and sampling distributions. My approach to probability and sampling distributions is also very different.
- There are two types of data, categorical and quantitative. Categorical data is data that can be put into categories. Examples include yes/no responses, or categories such as color, religion, nationality, pass/fail, win/lose, etc. Quantitative data is data that consists of numbers resulting from counts or measurements. Examples include height, weight, time, amount of money, number of crimes, heart rate, etc. The graphs and statistics we use to understand the data depend upon the type of data.
- A primary role of statistics is to use evidence from stochastic populations to improve our understanding of the world. Deciding what evidence will be collected is an essential part of the process. Research design is that portion of the statistical process in which planning is done so that the conclusions are drawn with confidence and can be supported under scrutiny. Three research designs will be explored: observational studies, observational experiments, and manipulative experiments.
- When faced with a decision that will be based on data, it is the production of graphs and statistics that will be analogous to dating and interviews. The data that is collected must be useful to answer the questions that were asked. Previously, we focused on both the planning of the experiment and the random selection process that is important for producing good sample data. This chapter will now focus on what to do with the data once you have it.
- The objective of this chapter is to develop the theory that helps us understand why a relatively small sample size can actually lead to conclusions about a much larger population. The explanation is different for categorical and quantitative data. We will begin with categorical data.
- The best decisions are those based on the best available evidence. While the ideal situation would be to get data from the entire population, the reality is that data will almost always come from a sample. The researcher is therefore forced to use sample data, which varies with the random process used to select it, to draw a conclusion about the entire population. This is inference: using specific, partial evidence to reach a more general conclusion.
- The inferences that were discussed in chapters 5 and 6 were based on the assumption of an a priori hypothesis that the researcher had about a population. However, there are times when the researchers do not have a hypothesis. In such cases they would simply like a good estimate of the parameter.
- In some cases, random variables could be sampled and compared for two different populations, but that still makes it univariate data. In this chapter, we will explore bivariate quantitative data. This means that for each unit in our sample, two quantitative variables will be determined. The purpose of collecting two quantitative variables is to determine if there is a relationship between them.
- In chapter 5, the inferential theory for categorical data was developed based upon the binomial distribution. Recall that the binomial distribution shows the probability of each possible number of successes in a sample of size n when each independent trial has only two possible outcomes, success and failure. What happens, however, if there are more than two possible outcomes?
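
As a reminder of the binomial distribution recalled in the last item above, here is a minimal sketch in Python. The function names `binomial_pmf` and `binomial_distribution`, and the example values n = 10 and p = 0.5, are illustrative assumptions and are not taken from the book. The sketch computes the probability of each possible number of successes in a sample of size n, assuming independent trials with the same probability of success.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with success probability p (hypothetical helper name)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_distribution(n, p):
    """List of probabilities for 0, 1, ..., n successes."""
    return [binomial_pmf(k, n, p) for k in range(n + 1)]


if __name__ == "__main__":
    # Example values chosen only for illustration.
    n, p = 10, 0.5
    for k, prob in enumerate(binomial_distribution(n, p)):
        print(f"{k:2d} successes: {prob:.4f}")
```

The probabilities across all possible numbers of successes sum to 1, which is a quick check that the distribution is complete.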