1.1 What is Statistics?
- What is the definition of data? Give an example from everyday life.
- Explain the difference between a dataset and a single observation.
- What are statistics?
- A question with a variety of answers.
- A way to measure the entire population
- The science of collecting, organizing, analyzing and interpreting data.
- A question from a survey.
- Describe the process of statistics in your own words.
- What are some examples of sources of data that are commonly used in research?
- Describe one way you’ve encountered data in your life outside of school.
- What is the goal of statistics as a discipline?
- Why is variability important when working with data?
- Give one real-world example of a statistical investigation. State what was studied and what the researchers were trying to learn.
- A cognitive psychologist is interested in comparing the effects of two ways of presenting stimuli on subsequent memory. Twelve subjects are presented with each method and a memory test is given. What would be the roles of descriptive and inferential statistics in the analysis of these data?
- A teacher wishes to know whether the full-time students have better study habits than the part-time students. A questionnaire is distributed assessing study skills and the full-time and part-time students are compared. Is this an example of descriptive or inferential statistics?
- Describe the difference between descriptive statistics and inferential statistics. Illustrate with an example.
- When inferential statistics are used, the goal is to _________________ results from samples to the populations from which they were drawn.
- ________________ statistics include calculations such as the mean, median, and mode to summarize data from samples univariately.
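The summary calculations named in the exercise above (mean, median, and mode) can be tried out with a few lines of Python. This is only a minimal sketch: the scores are made-up data for twelve hypothetical subjects, and the variable names are illustrative.

```python
import statistics

# Hypothetical memory-test scores for twelve subjects (made-up data).
scores = [7, 9, 6, 8, 9, 5, 7, 9, 8, 6, 7, 9]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value of the sorted scores
mode_score = statistics.mode(scores)      # most frequent value

print(f"mean={mean_score}, median={median_score}, mode={mode_score}")
```

Each of these values describes only the twelve scores in hand; saying anything about a wider group of people would take the further step of inference.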
1.2 Data Types: Categorical vs. Quantitative
- Define the term "variable" in a statistical context.
- For each of the following, say whether the variable is categorical or numerical:
- Age
- Favorite color
- Number of siblings
- Grade level
- State whether the variable is discrete or continuous.
- A cat’s weight.
- The height of a building.
- A person’s age.
- The number of floors of a skyscraper.
- The number of clothing items available for purchase.
- State whether the variable is discrete or continuous.
- Temperature in degrees Celsius.
- The number of cars for sale at a car dealership.
- The time it takes to run a marathon.
- The amount of mercury in a tuna fish.
- The weight of a hummingbird.
- Identify the type of data that would be used to describe a response (numerical discrete, numerical continuous, or categorical), and give an example of the data.
- Household income
- Movie rating (G, PG, PG-13, R)
- Eye color
- Number of classes taken
- Daily temperature
- Age of executives in Fortune 500 companies
- Number of tickets sold to a concert
- Favorite baseball team
- Time in line waiting to buy groceries
- Number of students enrolled at Red Rocks Community College
- Most-watched television show
- Brand of toothpaste
- Distance to the closest movie theater
- Number of competing computer spreadsheet software packages
Use the following information to answer the next two exercises: A study was done to determine the age, number of times per week, and the duration (amount of time) of resident use of a local park in Lakewood. The first house in the neighborhood around the park was selected randomly and then every 8th house in the neighborhood around the park was interviewed.
- “Number of times per week” is what type of data?
- categorical
- numerical discrete
- numerical continuous
- “Duration (amount of time)” is what type of data?
- categorical
- numerical discrete
- numerical continuous
- The Well-Being Index is a survey that follows trends of U.S. residents on a regular basis. There are six areas of health and wellness covered in the survey: Life Evaluation, Emotional Health, Physical Health, Healthy Behavior, Work Environment, and Basic Access. Some of the questions used to measure the Index are listed below. Identify the type of data obtained from each question used in this survey: categorical, numerical discrete, or numerical continuous.
- Do you have any health problems that prevent you from doing any of the things people your age can normally do?
- During the past 30 days, for about how many days did poor health keep you from doing your usual activities?
- In the last seven days, on how many days did you exercise for 30 minutes or more?
- Do you have health insurance coverage?
- Select the measurement scale Nominal, Ordinal, Interval or Ratio for each scenario.
- A person’s age.
- A person’s race.
- Age groupings (baby, toddler, adolescent, teenager, adult, elderly).
- Clothing brand.
- A person’s IQ score.
- Temperature in degrees Celsius.
- The amount of mercury in a tuna fish.
- Select the measurement scale Nominal, Ordinal, Interval or Ratio for each scenario.
- Temperature in kelvins.
- Eye color.
- Year in school (freshman, sophomore, junior, senior).
- The weight of a hummingbird.
- The height of a building.
- The amount of iron in a person’s blood.
- A person’s blood type.
- Find a data set on the Kaggle page that includes both categorical and numerical variables. Describe the dataset as a whole and then list all the categorical and all the numerical variables. For the numerical variables, include if they are continuous or discrete.
- What is the difference between a categorical variable and a numerical variable?
- Explain the difference between a discrete and continuous variable. Give an example of each.
- Explain the difference between a nominal and ordinal variable. Give an example of each.
- "Star ratings" on a product review site are what kind of variable? Explain.
- Someone collects data on the breed, age, and weight of 50 dogs. Classify each variable.
- Give your own example for each type of variable:
- Nominal
- Ordinal
- Discrete
- Continuous
- Why does classifying your variables correctly matter when performing statistical analysis?
- Can a variable be categorical in one context and numerical in another? Give an example.
- Describe a situation where categorical and numerical variables appear together in a study.
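As a rough illustration of how the type of a variable shapes what can be done with it, the sketch below summarizes one categorical and one numerical variable. The records, field names (`favorite_fruit`, `hours_of_sleep`), and values are all hypothetical and chosen only for the example.

```python
from collections import Counter
import statistics

# Hypothetical survey records (made-up data): one categorical and one
# numerical variable per respondent.
records = [
    {"favorite_fruit": "apple", "hours_of_sleep": 7.5},
    {"favorite_fruit": "banana", "hours_of_sleep": 6.0},
    {"favorite_fruit": "apple", "hours_of_sleep": 8.0},
    {"favorite_fruit": "cherry", "hours_of_sleep": 6.5},
    {"favorite_fruit": "apple", "hours_of_sleep": 7.0},
]

# Categorical variable: count how often each category occurs.
fruit_counts = Counter(r["favorite_fruit"] for r in records)

# Numerical variable: arithmetic such as averaging is meaningful.
mean_sleep = statistics.mean(r["hours_of_sleep"] for r in records)

print(fruit_counts.most_common())
print(f"mean hours of sleep: {mean_sleep}")
```

Counting works for both kinds of variable, but an average of fruit names has no meaning, which is one reason the classification comes first.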
1.3 Populations, Samples, and Sampling Methods
For each of the following exercises, identify: a. the population, b. the sample, c. the parameter, d. the statistic, e. the variable, and f. the data. Give examples where appropriate.
- A fitness center is interested in the mean amount of time a client exercises in the center each week.
- Ski resorts are interested in the mean age that children take their first ski and snowboard lessons. They need this information to plan their ski classes optimally.
- A marriage counselor is interested in the proportion of clients she counsels who stay married.
- A cardiologist is interested in the mean recovery period of their patients who have had heart attacks.
- Insurance companies are interested in the mean health costs each year of their clients, so that they can determine the costs of health insurance.
- A politician is interested in the proportion of voters in their district who think the politician is doing a good job.
- Political pollsters may be interested in the proportion of people who will vote for a particular cause.
- A marketing company is interested in the proportion of people who will buy a particular product.
Use the following information to answer the next three exercises: A Red Rocks Community College instructor is interested in the mean number of days Red Rocks Community College math students are absent from class during a quarter.
- What is the population she is interested in?
- all Red Rocks Community College students
- all Red Rocks Community College English students
- all Red Rocks Community College students in the instructor's classes
- all Red Rocks Community College math students
- Consider the following: \(X\) = number of days a Red Rocks Community College math student is absent. In this case, \(X\) is an example of:
- variable
- population
- statistic
- data
- The instructor’s sample produces a mean number of days absent of 3.5 days. This value is an example of:
- parameter
- data
- statistic
- variable
- Suppose you want to determine the mean number of students per statistics class in your state. Describe a possible sampling method in three to five complete sentences. Make the description detailed.
- Suppose you want to determine the mean number of cans of soda drunk each month by students in their twenties at your school. Describe a possible sampling method in three to five complete sentences. Make the description detailed.
- The instructor takes a sample by gathering data on five randomly selected students from each Red Rocks Community College math class. The type of sampling used is:
- cluster sampling
- stratified sampling
- simple random sampling
- convenience sampling
- A study was done to determine the age, number of times per week, and the duration (amount of time) of residents using a local park in San Jose. The first house in the neighborhood around the park was selected randomly and then every eighth house in the neighborhood around the park was interviewed. The sampling method was:
- simple random
- systematic
- stratified
- cluster
- Name the sampling method used in each of the following situations:
- A person in the airport is handing out questionnaires to travelers asking them to evaluate the airport’s service. The person does not ask travelers who are hurrying through the airport with their hands full of luggage, but instead asks all travelers who are sitting near gates and not taking naps while they wait.
- A teacher wants to know if students are doing homework so they randomly select rows two and five and then call on all students in row two and all students in row five to present the solutions to homework problems to the class.
- The marketing manager for an electronics chain store wants information about the ages of its customers. Over the next two weeks, at each store location, 100 randomly selected customers are given questionnaires to fill out asking for information about age, as well as about other variables of interest.
- The librarian at a public library wants to determine what proportion of the library users are children. The librarian has a tally sheet on which they mark whether books are checked out by an adult or a child. The librarian records this data for every fourth patron who checks out books.
- A political party wants to know the reaction of voters to a debate between the candidates. The day after the debate, the party’s polling staff calls 1,200 randomly selected phone numbers. If a registered voter answers the phone or is available to come to the phone, that registered voter is asked whom they intend to vote for and whether the debate changed their opinion of the candidates.
- Give an example of a population and two different characteristics that may be of interest.
- Identify which value represents the sample mean and which value represents the claimed population mean.
- A recent article in a college newspaper stated that college students get an average of 5.5 hours of sleep each night. A student who was skeptical about this value decided to conduct a survey by randomly sampling 25 students. On average, the sampled students slept 6.25 hours per night.
- American households spent an average of about $52 in 2007 on Halloween merchandise such as costumes, decorations and candy. To see if this number had changed, researchers conducted a new survey in 2008 before industry numbers were reported. The survey included 1,500 households and found that average Halloween spending was $58 per household.
- The average GPA of students in 2001 at a private university was 3.37. A survey on a sample of 203 students from this university yielded an average GPA of 3.59 in Spring semester of 2025.
- In a study, the sample is chosen by separating all cars by size, and selecting 10 of each size grouping. What is the sampling method? Is the sample representative?
- In a study, the sample is chosen by writing everyone’s name on a playing card, shuffling the deck, then choosing the top 20 cards. What is the sampling method? Is the sample representative?
- In a study, the sample is chosen by asking people on the street. What is the sampling method? Is the sample representative?
- In a study, the sample is chosen by selecting a room of the house, and appraising all items in that room. What is the sampling method? Is the sample representative?
- In a study, the sample is chosen by surveying every third driver coming through a tollbooth. What is the sampling method? Is the sample representative?
- Define the term population in a statistical study.
- What’s a sample? How is it different from the population?
- Why do we usually collect data from a sample, not the entire population?
- Explain the concept of a representative sample. Why is it important?
- Which sampling method is being used in each case:
- A random number generator selects 200 voters from a list
- A company emails a survey to all its customers and records whoever replies
- A school surveys every third student entering the cafeteria
- A study randomly picks five neighborhoods, then surveys every household there
- Compare and contrast simple random sampling and stratified sampling.
- When might cluster sampling be more practical than other methods?
- Which sampling method might best ensure racial diversity in your city’s resident sample?
- A teacher surveys students in her own class to evaluate school-wide stress. Identify any concerns.
- You need to survey employees at a nationwide company. Suggest a good sampling method and explain why.
- What are the characteristics of a “bad” sample?
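The sampling methods that appear throughout this section can be sketched with Python's standard `random` module. The population below (120 students spread evenly over four class sections) is hypothetical, as are the sample sizes and names; the sketch only shows the mechanics of each method, not a recommended design.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 120 students, 30 in each of 4 class sections.
population = [(f"student_{i:03d}", f"section_{i % 4}") for i in range(120)]

# Simple random sample: every student has the same chance of selection.
simple_random = random.sample(population, 12)

# Systematic sample: random starting point, then every 10th student.
start = random.randrange(10)
systematic = population[start::10]

# Stratified sample: split into sections (strata), sample 3 from each.
sections = sorted({section for _, section in population})
stratified = []
for section in sections:
    stratum = [p for p in population if p[1] == section]
    stratified.extend(random.sample(stratum, 3))

# Cluster sample: randomly choose 2 whole sections, keep everyone in them.
chosen_sections = random.sample(sections, 2)
cluster = [p for p in population if p[1] in chosen_sections]

print(len(simple_random), len(systematic), len(stratified), len(cluster))
```

Note the structural difference between the last two: the stratified sample takes a few students from every section, while the cluster sample takes every student from a few sections.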
1.4 Sources of Bias and Randomness
- Define bias in the context of statistics.
- Give an example of a biased sample and explain the effect it might have on conclusions.
- Define each of the following:
- Selection bias
- Convenience bias
- Nonresponse bias
- Voluntary response bias
- A researcher collects data by placing a QR code on a receipt that links to a survey. What types of bias might occur?
- How does random sampling help reduce bias?
- What’s the difference between random error and bias?
- Explain the value of randomness when selecting a sample. What are we trying to protect against?
- Airline companies are interested in the consistency of the number of babies on each flight, so that they have adequate safety equipment. Suppose an airline conducts a survey. Over Thanksgiving weekend, it surveys six flights from Denver to Salt Lake City to determine the number of babies on the flights. It determines the amount of safety equipment needed by the result of that study.
- Using complete sentences, list three things wrong with the way the survey was conducted.
- Using complete sentences, list three ways that you would improve the survey if it were to be repeated.
- A “random survey” was conducted of 3,274 people of the “microprocessor generation” (people born since 1971, the year the microprocessor was invented). It was reported that 48% of those individuals surveyed stated that if they had $2,000 to spend, they would use it for computer equipment. Also, 66% of those surveyed considered themselves relatively savvy computer users.
- Do you consider the sample size large enough for a study of this type? Why or why not?
- Based on your “gut feeling,” do you believe the percents accurately reflect the U.S. population for those individuals born since 1971? If not, do you think the percents of the population are actually higher or lower than the sample statistics? Why?
- Additional information: The survey, reported by Intel Corporation, was filled out by individuals who visited the Los Angeles Convention Center to see the Smithsonian Institution's road show called “America’s Smithsonian.”
- With this additional information, do you feel that all demographic and ethnic groups were equally represented at the event? Why or why not?
- With the additional information, comment on how accurately you think the sample statistics reflect the population parameters.
- In advance of the 1936 Presidential Election, a magazine titled Literary Digest released the results of an opinion poll predicting that the Republican candidate Alf Landon would win by a large margin. The magazine sent postcards to approximately 10,000,000 prospective voters. These prospective voters were selected from the subscription list of the magazine, from automobile registration lists, from phone lists, and from club membership lists. Approximately 2,300,000 people returned the postcards. Think about the state of the United States in 1936. Explain why a sample chosen from magazine subscription lists, automobile registration lists, phone books, and club membership lists was not representative of the population of the United States at that time.
- What effect does the low response rate have on the reliability of the sample?
- Are these problems examples of sampling error or non-sampling error?
- During the same year, George Gallup conducted his own poll of 30,000 prospective voters. These researchers used a method they called "quota sampling" to obtain survey answers from specific subsets of the population. Quota sampling is an example of which sampling method described in this module?
- A scholarly article about response rates begins with the following quote: "Crime-related and demographic statistics for 47 US states in 1960 were collected from government agencies, including the FBI's Uniform Crime Report. One analysis of this data found a strong connection between education and crime indicating that higher levels of education in a community correspond to higher crime rates". Which of the potential problems with samples could explain this connection?
- Imagine you work for a polling company and a member of your team has proposed the following survey question: “Do you feel happy paying your taxes while some politicians are allowed to use loopholes and avoid paying their fair share of taxes?” As part of preliminary data collection, 11 people responded to this question. Each participant answered “NO!”. Which of the potential problems with samples could explain this result?
- “Declining contact and cooperation rates in random digit dial (RDD) national telephone surveys raise serious concerns about the validity of estimates drawn from such research.” (Scott Keeter et al., “Gauging the Impact of Growing Nonresponse on Estimates from a National RDD Telephone Survey,” Public Opinion Quarterly 70 no. 5 (2006), poq.oxfordjournals.org (accessed May 1, 2013)). The Pew Research Center for the People & the Press admits: “The percentage of people we interview – out of all we try to interview – has been declining over the past decade or more.” (Frequently Asked Questions, Pew Research Center for the People & the Press, www.people-press.org (accessed May 1, 2013)).
- What are some reasons for the decline in response rate over the past decade?
- Explain why researchers are concerned with the impact of the declining response rate on public opinion polls.
- The Gallup Poll uses a procedure called random digit dialing, which creates phone numbers based on a list of all area codes in America in conjunction with the associated number of residential households in each area code. Give a possible reason the Gallup Poll chooses to use random digit dialing instead of picking phone numbers from the phone book.
- A statistics student who is curious about the relationship between the amount of time students spend on social networking sites and their performance at school decides to conduct a survey. Three research strategies for collecting data are described below. In each, name the sampling method proposed and any bias you might expect.
- They randomly sample 40 students from the study's population, give them the survey, and ask them to fill it out and bring it back the next day.
- They give out the survey only to their friends, and make sure each one of them fills out the survey.
- They post a link to an online survey on their social media feed and ask friends to fill out the survey.
- Suppose we want to estimate family size, where family is defined as one or more parents living with children. If we select students at random at an elementary school and ask them what their family size is, will our average be biased? If so, will it overestimate or underestimate the true value?
- Identify the flaw in reasoning in the following scenarios. Explain what the individuals in the study should have done differently if they wanted to make such strong conclusions.
- Students at an elementary school are given a questionnaire that they are required to return after their parents have completed it. One of the questions asked is, "Do you find that your work schedule makes it difficult for you to spend time with your kids after school?" Of the parents who replied, 85% said "no". Based on these results, the school officials conclude that a great majority of the parents have no difficulty spending time with their kids after school.
- A survey is conducted on a simple random sample of 1,000 women who recently gave birth, asking them about whether or not they smoked during pregnancy. A follow-up survey asking if the children have respiratory problems is conducted 3 years later; however, only 567 of these women are reached at the same address. The researcher reports that these 567 women are representative of all mothers.
- An orthopedist administers a questionnaire to 30 of his patients who do not have any joint problems and finds that 20 of them regularly go running. He concludes that running decreases the risk of joint problems.
- Imagine you want to know how many hours per week students at your school spend studying, on average. Determine if the following sampling methods will reasonably produce representative samples. Justify your answers.
- Select 40 students randomly using a list of all student IDs and a random number generator.
- Select 80 students as they enter the library.
- Select 150 students randomly using a list of all student IDs and a random number generator.
- Select the 200 students enrolled in calculus III at the college this semester.
- Which of the methods above would produce the most representative sample and the best estimate? Explain.
- A professor is curious what motivates students to cheat. They want to know if cheating is more pervasive in STEM classes compared to other disciplines. They randomly select 50 students from various STEM classes and 50 students from various non-STEM classes. They ask participants if they have cheated in the class and compare the proportion of students who say yes in each group. What type of bias should the professor be concerned about? Explain.
- A market research company wants to gauge interest in a bingo facility in a small city. The researchers send out a text containing a link to the survey to randomly selected phone numbers in the city. What type of bias should the researchers be concerned about? Explain.
- Compare and contrast random sampling and random assignment.
- In practice, wording effects are often an extremely strong influence on the answers people give when surveyed. Suppose you were doing a survey of American voters' opinions of the president. Think of a way of asking a question which would tend to maximize the number of people who said they approved of the job he is doing. Then think of another way of asking a question which would tend to minimize that number [who say they approve of his job performance].
- Think of a survey question you could ask in a survey of the general population of Americans in response to which many [most?] people would lie. State what would be the issue you would be investigating with this survey question, as a clearly defined, formal variable and parameter on the population of all Americans. Also tell exactly what would be the wording of the question you think would get lying responses.
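The contrast between random error and bias raised in this section can be explored with a small simulation. Everything in the sketch is an assumption made for illustration: the population of weekly study hours is generated artificially, and "sampling only from the upper half" is a crude stand-in for any selection process that favors one kind of respondent.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical population: weekly study hours for 10,000 students.
population = [max(0.0, random.gauss(15, 5)) for _ in range(10_000)]
true_mean = statistics.mean(population)

# Random error: means of random samples vary, but center on the true mean.
random_means = [
    statistics.mean(random.sample(population, 40)) for _ in range(200)
]
average_of_random_means = statistics.mean(random_means)

# Bias: sampling only from the upper half of the population (a stand-in for
# a selection process that favors heavy studiers) shifts the estimate.
cutoff = statistics.median(population)
upper_half = [hours for hours in population if hours > cutoff]
biased_mean = statistics.mean(random.sample(upper_half, 40))

print(f"true mean:               {true_mean:.2f}")
print(f"average of random means: {average_of_random_means:.2f}")
print(f"biased sample mean:      {biased_mean:.2f}")
```

Individual random samples miss the true mean in both directions and the misses tend to cancel out over many samples; the biased procedure misses in the same direction every time, and taking a larger sample from the same source would not repair it.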
1.5 Exploring Statistical Questions
- What is a statistical question?
- A question where you expect to get a variety of answers and you are interested in the distribution and tendency of those answers
- A question using reported statistics.
- A question on a survey.
- A question on a census.
- Which of the following are statistical questions? Select all that apply.
- How old are you?
- What is the weight of a mouse?
- How tall are all 3-year-olds?
- How tall are you?
- What is the average blood pressure of adults?
- Which of the following are statistical questions? Explain your reasoning:
- How tall is Mount Everest?
- How much sleep do teenagers get on weekdays?
- What is my shoe size?
- Do students who drink coffee study more?
- Identify the explanatory and response variables for a study of whether playing music affects test scores.
- What makes a good study design? List three key features.
- In your own words, why is it important to have a plan before collecting data?
- Identify the steps in the statistical process in the context of the studies described below.
- Researchers collected data to examine the relationship between pollutants and preterm births in Southern California. During the study air pollution levels were measured by air quality monitoring stations. Specifically, levels of carbon monoxide were recorded in parts per million, nitrogen dioxide and ozone in parts per hundred million, and coarse particulate matter (\(PM_{10}\)) in \(\mu g/m^3\). Length of gestation data were collected on 143,196 births between the years 1989 and 1993, and air pollution exposure during gestation was calculated for each birth. The analysis suggested that increased ambient \(PM_{10}\) and, to a lesser degree, CO concentrations may be associated with the occurrence of preterm births.
- The Buteyko method is a shallow breathing technique developed by Konstantin Buteyko, a Russian doctor, in 1952. Anecdotal evidence suggests that the Buteyko method can reduce asthma symptoms and improve quality of life. In a scientific study to determine the effectiveness of this method, researchers recruited 600 asthma patients aged 18-69 who relied on medication for asthma treatment. These patients were split into two research groups: one practiced the Buteyko method and the other did not. Patients were scored on quality of life, activity, asthma symptoms, and medication reduction on a scale from 0 to 10. On average, the participants in the Buteyko group experienced a significant reduction in asthma symptoms and an improvement in quality of life.
- In a study of the relationship between socio-economic class and unethical behavior, 129 University of California undergraduates at Berkeley were asked to identify themselves as having low or high social-class by comparing themselves to others with the most (least) money, most (least) education, and most (least) respected jobs. They were also presented with a jar of individually wrapped candies and informed that they were for children in a nearby laboratory, but that they could take some if they wanted. Participants completed unrelated tasks and then reported the number of candies they had taken. It was found that those in the upper-class rank condition took more candy than did those in the lower-rank condition.
- The Buteyko method is a shallow breathing technique developed by Konstantin Buteyko, a Russian doctor, in 1952. Anecdotal evidence suggests that the Buteyko method can reduce asthma symptoms and improve quality of life. In a scientific study to determine the effectiveness of this method, researchers recruited 600 asthma patients aged 18-69 who relied on medication for asthma treatment. These patients were split into two research groups: one practiced the Buteyko method and the other did not. Patients were scored on quality of life, activity, asthma symptoms, and medication reduction on a scale from 0 to 10. On average, the participants in the Buteyko group experienced a significant reduction in asthma symptoms and an improvement in quality of life. [McGowan. “Health Education: Does the Buteyko Institute Method make a difference?” In: Thorax 58 (2003).] Which of the following is the main research question?
- The Buteyko method causes shallow breathing.
- The Buteyko method can reduce asthma symptoms and improve quality of life.
- Effectiveness of the Buteyko method.
- The patients' scores on quality of life, activity, asthma symptoms and medication reduction.
Read the following study description and answer the following questions: The United States Government recommends that to stay physically fit, middle-aged adults (ages 40 to 60) need to burn 150 to 400 calories per day doing exercise. Researchers at Minnesota State University, Mankato, wanted to learn whether middle-aged adults who used the Wii Fit video game exercised enough to meet the government’s fitness recommendations. The Wii Fit is a video game that includes exercises. The researchers taught 20 middle-aged adult volunteers how to use the Wii Fit video game. On the day after they were trained, the adults exercised for 20 minutes with the Wii Fit. Researchers measured the total amount of energy each of the adults in the study used in calories. They found that the average energy used was 116 calories for the 20-minute session. Based on the results of the study, the researchers concluded the Wii Fit video game could be a helpful form of exercise for middle-aged adults. But, for exercise with Wii Fit to meet the government’s recommendation, the researchers stated that the length of the exercise session should be increased from 20 minutes to 30 minutes.
- Which question below is a reasonable research question for this investigation? Explain how you came to this answer.
- Do people think that playing the Wii Fit video game burns calories?
- Does the Wii Fit video game burn enough calories to be considered suitable exercise?
- Does the Wii Fit video game burn more calories than traditional exercise?
- What is the average amount of time that middle-aged adults spend playing Wii Fit video games?
- What data did the researchers collect to answer the research question?
- The amount of time that the adults exercised.
- The name of the adults.
- The type of exercise the adults completed.
- The total amount of calories the adults burned through exercising.
- How are the data summarized?
- Researchers found the proportion of adults who exercise using the Wii Fit video game.
- Researchers found the proportion of adults who prefer exercising with the Wii Fit video game over traditional exercises.
- Researchers found the average amount of calories that adults consumed through exercising using the Wii Fit video game.
- Researchers found the average amount of time that adults spent exercising using the Wii Fit video game.
- What did the researchers conclude?
- The Wii Fit video game is a preferred exercise for some middle-aged adults.
- The Wii Fit video game does not appear to burn enough calories in a 20-minute session, but a 30-minute session would possibly be enough.
- The Wii Fit video game does appear to burn enough calories in a 20-minute session, but a 30-minute session would be even better.
- The sample size is too small to make any reasonable conclusions about all middle-aged adults.
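The arithmetic behind the session lengths mentioned in the Wii Fit study description can be checked with a short calculation. The sketch uses only the figures given in the description; the assumption that calories burned scale in direct proportion to exercise time is ours, not something the study description states, and the function name is illustrative.

```python
# Figures taken from the study description.
calories_per_session = 116   # average calories burned in the measured session
session_minutes = 20         # length of the measured session
recommended_minimum = 150    # lower end of the 150-400 calorie guideline

# Assumption (not stated in the study): calories scale linearly with time.
calories_per_minute = calories_per_session / session_minutes


def projected_calories(minutes):
    """Estimated calories burned in a session of the given length."""
    return calories_per_minute * minutes


for minutes in (20, 30):
    estimate = projected_calories(minutes)
    meets = estimate >= recommended_minimum
    print(f"{minutes} min: about {estimate:.0f} calories, meets minimum: {meets}")
```

The projection describes only the sample average of 20 volunteers; whether it carries over to middle-aged adults in general is a separate, inferential question.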
- Give an example of a research question that involves estimating a characteristic about the population of Registered Nurses in California.
- Improve this poorly stated research question: Do registered nurses work a lot?
- Create a cause-and-effect research question.
- For each of the following topics, provide a research question that involves that topic.
- Air quality in the Denver metro area.
- The bike paths of Lakewood.
- Access to fresh produce in Wheat Ridge.
- Cost of food in Colorado.
- Pokemon cards.
- Suppose that a research question has been formed. In 6 steps, describe the process of exploring this question. Think about what data we might need, how to collect that data and what we might do with that data. (This question is what this semester is about. At this point we are looking for some initial plans and we will refine this answer throughout the class)
- Find a public data set from the website www.Kaggle.com. Describe the data set and provide an example of a research question you could explore using this data.
1.6 Designing a Statistical Study
- List some practical difficulties involved in getting accurate results from a telephone survey.
- List some practical difficulties involved in getting accurate results from a mailed survey.
- Describe the difference between an observational study and an experiment.
- How is a survey different from an experiment? What limitations do surveys have?
- With your classmates, brainstorm some ways you could overcome these problems if you needed to conduct a phone or mail survey.
- A survey was conducted on 218 undergraduates from Duke University who took an introductory statistics course in Spring 2012. Among many other questions, this survey asked them about their GPA and the number of hours they spent studying per week. The scatterplot below displays the relationship between these two variables.

- What is the explanatory variable and what is the response variable?
- Describe the relationship between the two variables. Make sure to discuss unusual observations, if any.
- Is this an experiment or an observational study?
- Can we conclude that studying longer hours leads to higher GPAs?
- A large college class has 160 students. All 160 students attend the lectures together, but the students are divided into 4 groups, each of 40 students, for lab sections administered by different teaching assistants. The professor wants to conduct a survey about how satisfied the students are with the course, and he believes that the lab section a student is in might affect the student's overall satisfaction with the course.
- What type of study is this?
- Suggest a sampling strategy for carrying out this study.
- The scatterplot below shows the relationship between estimated life expectancy at birth as of 2012 and percentage of internet users in 2010 in 208 countries.
- Describe the relationship between life expectancy and percentage of internet users.
- What type of study is this?
- State a possible confounding variable that might explain this relationship and describe its potential effect.

- In order to assess the effectiveness of taking large doses of vitamin C in reducing the duration of the common cold, researchers recruited 400 healthy volunteers from staff and students at a university. A quarter of the patients were assigned a placebo, and the rest were evenly divided between 1g Vitamin C, 3g Vitamin C, or 3g Vitamin C plus additives to be taken at onset of a cold for the following two days. All tablets had identical appearance and packaging. The nurses who handed the prescribed pills to the patients knew which patient received which treatment, but the researchers assessing the patients when they were sick did not. No significant differences were observed in any measure of cold duration or severity between the four medication groups, and the placebo group had the shortest duration of symptoms.
- Was this an experiment or an observational study? Why?
- What are the explanatory and response variables in this study?
- Were the patients blinded to their treatment?
- Was this study double-blind?
- Participants are ultimately able to choose whether or not to use the pills prescribed to them. We might expect that not all of them will adhere and take their pills. Does this introduce a confounding variable to the study? Explain your reasoning.
- You would like to conduct an experiment in class to see if your classmates prefer the taste of regular Coke or Diet Coke. Briefly outline a design for this study.
- A researcher is interested in the effects of exercise on mental health and he proposes the following study: Use stratified random sampling to ensure representative proportions of 18-30, 31-40 and 41-55 year olds from the population. Next, randomly assign half the subjects from each age group to exercise twice a week, and instruct the rest not to exercise. Conduct a mental health exam at the beginning and at the end of the study, and compare the results.
- What type of study is this?
- What are the treatment and control groups in this study?
- Does this study make use of blocking? If so, what is the blocking variable?
- Does this study make use of blinding?
- Comment on whether or not the results of the study can be used to establish a causal relationship between exercise and mental health, and indicate whether or not the conclusions can be generalized to the population at large.
- Suppose you are given the task of determining if this proposed study should get funding. Would you have any reservations about the study proposal?
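The random-assignment step in the proposal above can be sketched in Python. This is only an illustration under stated assumptions: the subject IDs, the group sizes, and the function name `assign_within_groups` are hypothetical (the exercise gives no sample sizes), and the seed is fixed only so that a run can be repeated.

```python
import random

def assign_within_groups(subjects_by_age, seed=0):
    """Randomly split each age group in half: 'exercise' vs. 'no exercise'.

    subjects_by_age maps an age-group label to a list of subject IDs.
    All data passed in here is hypothetical.
    """
    rng = random.Random(seed)
    assignment = {}
    for age_group, subjects in subjects_by_age.items():
        shuffled = list(subjects)      # copy so the input is not modified
        rng.shuffle(shuffled)          # random order within this age group
        half = len(shuffled) // 2
        for subject in shuffled[:half]:
            assignment[subject] = (age_group, "exercise")
        for subject in shuffled[half:]:
            assignment[subject] = (age_group, "no exercise")
    return assignment

# Hypothetical sample: the three age groups named in the proposal.
sample = {
    "18-30": [f"A{i}" for i in range(10)],
    "31-40": [f"B{i}" for i in range(8)],
    "41-55": [f"C{i}" for i in range(6)],
}
assignment = assign_within_groups(sample)
```

Because the shuffle happens separately inside each age group, every age group contributes equally to both conditions, which is the feature of the design the questions above ask you to name.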
- Chia Pets - those terra-cotta figurines that sprout fuzzy green hair - made the chia plant a household name. But chia has gained an entirely new reputation as a diet supplement. In one 2009 study, a team of researchers recruited 38 individuals and randomly placed half of these participants into the treatment group and the other half into the control group. One group was given 25 grams of chia seeds twice a day, and the other was given a placebo. The subjects volunteered to be a part of the study. After 12 weeks, the scientists found no significant difference between the groups in appetite or weight loss.
- What type of study is this?
- What are the experimental and control treatments in this study?
- Has blocking been used in this study? If so, what is the blocking variable?
- Has blinding been used in this study?
- Comment on whether or not we can make a causal statement, and indicate whether or not we can generalize the conclusion to the population at large.
- State whether each study is observational or experimental.
- You want to determine if cinnamon reduces a person’s insulin sensitivity. You give patients who are insulin sensitive a certain amount of cinnamon and then measure their glucose levels.
- A researcher wants to evaluate whether countries with lower fertility rates have a higher life expectancy. They collect the fertility rates and the life expectancies of countries around the world.
- A researcher wants to determine if diet and exercise together help people lose more weight than exercise alone. The researcher solicits volunteers to be part of the study, and then randomly assigns the volunteers to the diet-and-exercise group or the exercise-only group.
- You collect the weights of tagged fish in a tank. You then add an extra-protein fish food to the water for the fish and measure their weights again a month later.
- State whether each study is cross-sectional, retrospective or prospective.
- To see if there is a link between smoking and bladder cancer, patients with bladder cancer are asked if they currently smoke or if they smoked in the past.
- The Nurses Health Survey was a survey where nurses were asked to record their eating habits over a period of time, and their general health was recorded.
- A new study is underway to track the eating and exercise patterns of people at different time-periods in the future, and see who is afflicted with cancer later in life.
- The prices of generic items are compared to the prices of the equivalent name-brand items.
- “Alkaline water: the secret to glowing skin” is the headline of an article that appeared in Scratch Magazine (February 10, 2021). The article claims that consuming alkaline water instead of tap water improves the hydration of skin and therefore improves skin appearance. Consider the following hypothetical study design. Two hundred students were selected at random from those enrolled at a large college in California. Each student in the sample was asked whether they drank alkaline water more than once in a typical week. A skin specialist rated skin health for each student on a scale of 1 to 10. It was concluded that skin health was significantly better on average for the group that reported drinking alkaline water more than once a week than it was for the group that did not.
- Explain why this is an observational study.
- Was random selection used to create the sample? Explain.
- Did the study use random assignment to experimental groups? If so, explain what method was used to randomly assign students.
- Is the conclusion “drinking alkaline water leads to healthier skin” reasonable given the study description? Explain your answer.
- Is it reasonable to generalize conclusions from this study to some larger population? Justify your answer. If so, what population?
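The selection step described in the alkaline water study ("selected at random from those enrolled") can be sketched as a simple random sample. This is a minimal illustration: the enrollment figure of 30,000, the numeric student IDs, and the function name `draw_simple_random_sample` are assumptions made for the example, not details from the study description.

```python
import random

def draw_simple_random_sample(population_ids, sample_size, seed=0):
    """Draw a simple random sample without replacement.

    Every subset of the requested size is equally likely to be chosen.
    """
    rng = random.Random(seed)
    return rng.sample(population_ids, sample_size)

# Hypothetical enrollment list: student IDs 1 through 30,000.
enrolled = list(range(1, 30001))
sample = draw_simple_random_sample(enrolled, 200)
```

Note that this code only chooses who is in the study. It does not decide who drinks alkaline water, which is worth keeping in mind when answering the questions above about random selection versus random assignment.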
- Complete the design-layout table.
A student would like to know which of two possible routes is faster for the daily trips to school. Route 1 is shorter but has many traffic lights. Route 2 is a little longer but doesn’t have traffic lights. Each morning, a coin flip will be used to determine the route taken to school. The time it takes for the commute will be measured with a stopwatch. After approximately 15 trials on each route, the average time for each will be compared.
| Research Design Table | |
|---|---|
| Research Question: | |
| Type of Research (choose one) | Observational Study / Observational Experiment / Manipulative Experiment |
| What is the response variable? | |
| What is the parameter that will be calculated? (choose one) | Mean / Proportion |
| List potential confounding variables. | |
| Grouping/explanatory Variable (if present) | Levels: |
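
The coin-flip procedure in the route scenario above can be sketched as a short simulation. This is a sketch under stated assumptions, not real data: the commute times are invented placeholders drawn from normal distributions, and the function name `simulate_commutes`, the 30-day length, and the seed are hypothetical choices. With 30 mornings, a fair coin gives roughly (not exactly) 15 trips per route.

```python
import random
from statistics import mean

def simulate_commutes(n_days=30, seed=1):
    """Flip a fair coin each morning to pick a route, then record a time."""
    rng = random.Random(seed)
    times = {"Route 1": [], "Route 2": []}
    for _ in range(n_days):
        # Fair coin flip: heads -> Route 1, tails -> Route 2.
        route = "Route 1" if rng.random() < 0.5 else "Route 2"
        # Placeholder commute times in minutes (invented for illustration):
        # Route 1 is more variable because of traffic lights.
        if route == "Route 1":
            minutes = rng.gauss(22, 4)
        else:
            minutes = rng.gauss(24, 1.5)
        times[route].append(minutes)
    return times

times = simulate_commutes()
# Average commute time per route, skipping a route with no recorded trips.
averages = {route: mean(t) for route, t in times.items() if t}
```

In a real study the stopwatch readings would replace the simulated `minutes` values; the coin flip and the comparison of averages would stay the same.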
- Many parents believe that their small children get a bit hyperactive when they eat or drink sweets (candies, sugary sodas, etc.), and so do not let their kids have such things before nap time, for example. A pediatrician at Euphoria State University Teaching Hospital [ESUTH] thinks instead that it is the parents’ expectations about the effects of sugar which cause their children to become hyperactive, and not the sugar at all.
- Describe a randomized, placebo-controlled, double-blind experiment which would collect data about this ESUTH pediatrician’s hypothesis. Make sure you are clear about which part of your experimental procedure addresses each of those important components of good experimental design.
- Is the experiment you described in the previous exercise an ethical one? What must the ESUTH pediatrician do before, during, and after the experiment to make sure it is ethical? Make sure you discuss (at least) the checklist of ethical guidelines from this chapter and how each point applies to this particular experiment.

