Statistics LibreTexts

2.6: Practice (Chapter 2)


    2.1: Measures of Central Tendency — Mean, Median, and Mode

    1. Define the mean, median, and mode in your own words. Which one is most affected by outliers?
    2. What symbol is used for the sample mean?
    3. What symbol is used for the population mean?
    4. Calculate the mean, median, and mode of the following dataset:
      {2, 5, 5, 7, 10, 14, 14, 14, 17, 20}
    5. Explain which measure of center is most appropriate for this dataset:
      {28, 30, 31, 34, 36, 38, 300}
      Why?
    6. A group of students reports how many hours they study per week: {10, 12, 12, 14, 16, 20}. Find the mean and the median. Which one better represents the group?
    7. Give an example of a small dataset (5–6 numbers) where:
      a) The mean is greater than the median
      b) The median is equal to the mode
    8. True or False: "The median is always one of the values in the dataset." Explain your reasoning.
    9. Describe a real-world situation where the mode might be more useful than the mean or median.
    10. Identify which value represents the sample mean and which value represents the claimed population mean.
      1. A recent article in a college newspaper stated that college students get an average of 5.5 hrs of sleep each night. A student who was skeptical about this value decided to conduct a survey by randomly sampling 25 students. On average, the sampled students slept 6.25 hours per night.
      2. American households spent an average of about $52 in 2007 on Halloween merchandise such as costumes, decorations and candy. To see if this number had changed, researchers conducted a new survey in 2008 before industry numbers were reported. The survey included 1,500 households and found that average Halloween spending was $58 per household.
      3. The average GPA of students in 2001 at a private university was 3.37. A survey on a sample of 203 students from this university yielded an average GPA of 3.59 in Spring semester of 2012.
    11. Workers at a particular mining site receive an average of 35 days of paid vacation, which is lower than the national average. The manager of this plant is under pressure from a local union to increase the amount of paid time off. However, he does not want to give more days off to the workers because that would be costly. Instead he decides he should fire 10 employees in such a way as to raise the average number of days off that are reported by his employees. In order to achieve this goal, should he fire employees who have the most days off, the fewest days off, or about the average number of days off?
    12. A factory quality control manager decides to investigate the percentage of defective items produced each day. Within a given work week (Monday through Friday) the percentage of defective items produced was 2%, 1.4%, 4%, 3%, 2.2%. Calculate the mean for these data.
    13. The following data show the lengths of boats moored in a marina. The data are ordered from smallest to largest: 16; 17; 19; 20; 20; 21; 23; 24; 25; 25; 25; 26; 26; 27; 27; 27; 28; 29; 30; 32; 33; 33; 34; 35; 37; 39; 40
      1. Calculate the mean.
      2. Identify the median.
      3. Identify the mode.
    14. Sixty-five randomly selected car salespersons were asked the number of cars they generally sell in one week. Fourteen people answered that they generally sell three cars; nineteen generally sell four cars; twelve generally sell five cars; nine generally sell six cars; eleven generally sell seven cars. Calculate the following:
      1. sample mean = \(\bar{x}\) = _______
      2. median = _______
      3. mode = _______
    15. A group of 10 children are on a scavenger hunt to find different color rocks. The results are shown in Table 2.57 below. The column on the right shows the number of colors of rocks each child has. What is the mean number of rock colors?

      Table for the rocks.
      Child Rock colors
      1 5
      2 5
      3 6
      4 2
      5 4
      6 3
      7 7
      8 2
      9 1
      10 10
      1. A group of children are measured to determine the average height of the group. The results are in Table 2.58 below. What is the mean height of the group to the nearest hundredth of an inch?
        Table of the heights of the children
        Child Height in inches
        Adam 45.21
        Betina 39.45
        Chen 43.78
        Donna 48.76
        Edhas 37.39
        Fran 39.90
        George 45.56
        Heather 46.24
      2. A person compares prices for five automobiles. The results are in Table 2.59. What is the mean price of the cars the person has considered?
        Prices in the table.
        Price
        $20,987
        $22,008
        $19,998
        $23,433
        $21,444
      3. When is it best to use the mode as a measure of center? Describe what type of data would lead you to choosing the mode over the mean or median.
      4. Suppose you are representing the employees at a large corporation during contract negotiations. You have a list of the salaries of all the employees at the corporation (the salaries include the many lower level employees and the few highly paid management employees) and you plan to find a measure of center (e.g. mean, median, or mode). Which measure of center would you use to represent the employees in an effort to support your claim that the average salary of lower level employees is much lower than the national average (for lower level employees) and thus should be increased? Why is this the better choice?
      5. A standardized test is given to ten people at the beginning of the school year with the results given in Table 2.79 below. At the end of the year the same people were again tested.
        1. What is the average improvement?
        2. Does it matter if the means are subtracted, or if the individual values are subtracted?
      Table of the scores for the students
      Student Beginning score Ending score
      1 1100 1120
      2 980 1030
      3 1200 1208
      4 998 1000
      5 893 948
      6 1015 1030
      7 1217 1224
      8 1232 1245
      9 967 988
      10 988 997

      1. A small class of 7 students has a mean grade of 82 on a test. If six of the grades are 80, 82, 86, 90, 90, and 95, what is the other grade?
      2. A class of 20 students has a mean grade of 80 on a test. Nineteen of the students have a mean grade between 79 and 82, inclusive.
        1. What is the lowest possible grade of the other student?
        2. What is the highest possible grade of the other student?
      3. If the mean of 20 prices is $10.39, and 5 of the items with a mean of $10.99 are sampled, what is the mean of the other 15 prices?
      4. In a recent issue of the IEEE Spectrum, 84 engineering conferences were announced. Four conferences lasted two days. Thirty-six lasted three days. Eighteen lasted four days. Nineteen lasted five days. Four lasted six days. One lasted seven days. One lasted eight days. One lasted nine days. Let X = the length (in days) of an engineering conference. If you were planning an engineering conference, which would you choose as the length of the conference: mean; median; or mode? Explain why you made that choice.
      5. Create a sample data set of size \(n = 3\) for which the mean \(\bar{x}\) is greater than the median \(\tilde{x}\).
      6. Create a sample data set of size \(n = 3\) for which the mean \(\bar{x}\) is less than the median \(\tilde{x}\).
      7. Create a sample data set of size \(n = 4\) for which the mean \(\bar{x}\), the median \(\tilde{x}\), and the mode are all identical.
      8. Create a sample data set of size \(n = 4\) for which the median \(\tilde{x}\) and the mode are identical but the mean \(\bar{x}\) is different.
      9. Which measure of central tendency is most often used for returns on investment?
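
One way to check hand calculations of the mean, median, and mode is with Python's standard `statistics` module. The sketch below uses the dataset from question 4 of this section; the variable names are illustrative only, and `multimode` is used so that a dataset with several modes would report all of them.

```python
from statistics import mean, median, multimode

# Dataset from question 4 of Section 2.1.
data = [2, 5, 5, 7, 10, 14, 14, 14, 17, 20]

x_bar = mean(data)        # sum of the values divided by the count
x_tilde = median(data)    # average of the two middle values when n is even
modes = multimode(data)   # list of the most frequent value(s)

print(x_bar)    # 10.8
print(x_tilde)  # 12.0
print(modes)    # [14]
```

Here the median (12) is not one of the values in the dataset, because an even number of observations forces an average of the two middle values.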

    2.2: Measures of Spread — Range, Variance, and Standard Deviation

    1. Define the following in your own words:
      1. Range
      2. Variance
      3. Standard deviation
    2. What symbol is used for the sample standard deviation?
    3. What symbol is used for the population standard deviation?
    4. Find the range, variance, and standard deviation of this dataset:
      {4, 6, 9, 10, 11} (assume it's a sample)
    5. Why do we divide by \( n - 1 \) when calculating the sample variance?
    6. A small dataset consists of the values {5, 7, 8, 8, 10}. Without calculating, which will be larger: the standard deviation or the range? Why?
    7. How does increasing an outlier in a dataset affect the standard deviation? Give a conceptual explanation.
    8. Use the six-step process to calculate the sample standard deviation for the set {2, 4, 6, 8}.
    9. What does a standard deviation of 0 mean? Give an example of a dataset with standard deviation equal to 0.
    10. In a class of 25 students, 24 of them took an exam in class and 1 student took a make-up exam the following day. The professor graded the first batch of 24 exams and found an average score of 74 points with a standard deviation of 8.9 points. The student who took the make-up the following day scored 64 points on the exam.
      1. Does the new student's score increase or decrease the average score?
      2. What is the new average?
      3. Does the new student's score increase or decrease the standard deviation of the scores?
    11. A random sample of data from 5 smokers is provided below.
      Sample of the collected information.
      gender  age  maritalStatus  grossIncome         smoke  amtWeekends  amtWeekdays
      Female  51   Married        $2,600 to $5,200    Yes    20 cig/day   20 cig/day
      Male    24   Single         $10,400 to $15,600  Yes    20 cig/day   15 cig/day
      Female  33   Married        $10,400 to $15,600  Yes    20 cig/day   10 cig/day
      Female  17   Single         $5,200 to $10,400   Yes    20 cig/day   15 cig/day
      Female  76   Married        $5,200 to $10,400   Yes    20 cig/day   20 cig/day

      1. Find the mean number of cigarettes smoked on weekdays and on weekends by these 5 respondents.
      2. Find the standard deviation of the number of cigarettes smoked on weekdays and on weekends by these 5 respondents. Is the variability higher on weekends or on weekdays?
    12. A factory quality control manager decides to investigate the percentage of defective items produced each day. Within a given work week (Monday through Friday) the percentage of defective items produced was 2%, 1.4%, 4%, 3%, 2.2%. Calculate the standard deviation for these data.
    13. For each part, compare distributions (1) and (2) based on their means and standard deviations. You do not need to calculate these statistics; simply state how the means and the standard deviations compare. Make sure to explain your reasoning. Hint: It may be useful to sketch dot plots of the distributions.
      1. (1) 3, 5, 5, 5, 8, 11, 11, 11, 13; (2) 3, 5, 5, 5, 8, 11, 11, 11, 20
      2. (1) -20, 0, 0, 0, 15, 25, 30, 30; (2) -40, 0, 0, 0, 15, 25, 30, 30
      3. (1) 0, 2, 4, 6, 8, 10; (2) 20, 22, 24, 26, 28, 30
      4. (1) 100, 200, 300, 400, 500; (2) 0, 50, 300, 550, 600
    14. If a set of numbers has a standard deviation of zero, what can you say about the numbers?
    15. If a set of grades for a class has a large range and a small standard deviation, what can you say about the class? Include an interpretation that is specific to grades in a class.
    16. Suppose you wish to invest money safely and are trying to decide between stock options in two companies. The average rates of return are the same for both companies, but one company has a much larger standard deviation of its rates of return. Which company should you invest in? Why is this the better choice?

    The population parameters below describe the full-time equivalent number of students (FTES) each year at Lake Tahoe Community College over a 29-year period.

    μ = 1000 FTES

    median = 1,014 FTES

    σ = 474 FTES

    first quartile = 528.5 FTES

    third quartile = 1,447.5 FTES

    n = 29 years

    1. A sample of 11 years is taken. About how many are expected to have an FTES of 1014 or above? Explain how you determined your answer.
    2. 75% of all years have an FTES:
      1. at or below: _____
      2. at or above: _____
    3. The population standard deviation = _____
    4. What percent of the FTES were from 528.5 to 1447.5? How do you know?
    5. What is the IQR? What does the IQR represent?
    6. How many standard deviations away from the mean is the median?

    Additional Information: The population for a specific six-year period was given in an updated report. The data are reported here.

    Table that houses the data for the above problem.
    Year 1 2 3 4 5 6
    Total FTES 1,585 1,690 1,735 1,935 2,021 1,890
    1. Calculate the mean, median, standard deviation, the first quartile, the third quartile and the IQR. Round to one decimal place.
    2. Compare the IQR for the FTES for the previous 29-year period with the IQR for the six-year period shown in Table 2.80. Why do you suppose the IQRs are so different?
    3. Three students were applying to the same graduate school. They came from schools with different grading systems. Which student had the best GPA when compared to other students at his school? Explain how you determined your answer.
    Table of student progress for the question
    Student GPA School Average GPA School Standard Deviation
    Thuy 2.7 3.2 0.8
    Vichet 87 75 20
    Kamala 8.6 8 0.4
    1. A music school has budgeted to purchase three musical instruments. They plan to purchase a piano costing $3,000, a guitar costing $550, and a drum set costing $600. The mean cost for a piano is $4,000 with a standard deviation of $2,500. The mean cost for a guitar is $500 with a standard deviation of $200. The mean cost for drums is $700 with a standard deviation of $100. Which cost is the lowest when compared to other instruments of the same type? Which cost is the highest when compared to other instruments of the same type? Justify your answer.
    2. An elementary school class ran one mile with a mean of 11 minutes and a standard deviation of three minutes. Rachel, a student in the class, ran one mile in eight minutes. A junior high school class ran one mile with a mean of nine minutes and a standard deviation of two minutes. Kenji, a student in the class, ran 1 mile in 8.5 minutes. A high school class ran one mile with a mean of seven minutes and a standard deviation of four minutes. Nedda, a student in the class, ran one mile in eight minutes.
      1. Why is Kenji considered a better runner than Nedda, even though Nedda ran faster?
      2. Who is the fastest runner with respect to their class? Explain why.
    3. Create a sample data set of size 𝑛 =3 for which the range is 0 and the sample mean is 2.
    4. Create a sample data set of size 𝑛 =3 for which the sample variance is 0 and the sample mean is 1.
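
The sample variance and standard deviation asked for in this section can be checked step by step. The sketch below uses the dataset from question 4 of this section, treated as a sample (so the sum of squared deviations is divided by \( n - 1 \)), and compares the manual result with the standard `statistics` module. The `z_score` helper and its example numbers are hypothetical; they only illustrate how a value is compared with its own group's mean and standard deviation, as in the GPA, instrument, and runner questions.

```python
from math import isclose, sqrt
from statistics import mean, stdev, variance

# Dataset from question 4 of Section 2.2, treated as a sample.
sample = [4, 6, 9, 10, 11]

x_bar = mean(sample)                                 # sample mean
squared_devs = [(x - x_bar) ** 2 for x in sample]    # squared deviations from the mean
s_squared = sum(squared_devs) / (len(sample) - 1)    # divide by n - 1 for a sample
s = sqrt(s_squared)                                  # sample standard deviation
data_range = max(sample) - min(sample)               # largest minus smallest value

# The library functions use the same n - 1 convention.
assert isclose(s_squared, variance(sample))
assert isclose(s, stdev(sample))

def z_score(value, center, spread):
    """Number of standard deviations that value lies from center (hypothetical helper)."""
    return (value - center) / spread

print(data_range, s_squared, round(s, 3))  # 7 8.5 2.915
print(z_score(85, 75, 5))                  # 2.0 (hypothetical numbers, not from the exercises)
```

A positive z-score means the value is above its group's mean and a negative one means it is below; whether that is "better" depends on context, such as running times where lower is faster.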

    2.3: The Five-Number Summary and Interquartile Range

    1. List the five numbers that make up the five-number summary and explain what each one represents.
    2. Given the dataset:
      {45, 50, 52, 55, 58, 60, 63, 65, 70, 75}
      Find the five-number summary and the IQR.
    3. Why is the IQR considered a "robust" measure of spread?
    4. A dataset has an IQR of 20. If Q1 = 40, what is Q3?
    5. Describe a situation where knowing the five-number summary would be more helpful than knowing just the mean and standard deviation.
    6. Explain how to find the IQR using only the sorted dataset and no formulas.
    7. True or False: "The median is always the average of Q1 and Q3." Explain your reasoning.
    8. For each part, compare distributions (1) and (2) based on their medians and IQRs. You do not need to calculate these statistics; simply state how the medians and IQRs compare. Make sure to explain your reasoning.
      1. (1) 3, 5, 6, 7, 9; (2) 3, 5, 6, 7, 20
      2. (1) 3, 5, 6, 7, 9; (2) 3, 5, 8, 7, 9
      3. (1) 1, 2, 3, 4, 5; (2) 6, 7, 8, 9, 10
      4. (1) 0, 10, 50, 60, 100; (2) 0, 100, 500, 600, 1000
    9. For each of the following, describe whether you expect the distribution to be symmetric, right skewed, or left skewed. Also specify whether the mean or median would best represent a typical observation in the data, and whether the variability of observations would be best represented using the standard deviation or IQR.
      1. Housing prices in a country where 25% of the houses cost below $350,000, 50% of the houses cost below $450,000, 75% of the houses cost below $1,000,000 and there are a meaningful number of houses that cost more than $6,000,000.
      2. Housing prices in a country where 25% of the houses cost below $300,000, 50% of the houses cost below $600,000, 75% of the houses cost below $900,000 and very few houses that cost more than $1,200,000.
      3. Number of alcoholic drinks consumed by college students in a given week.
      4. Annual salaries of the employees at a Fortune 500 company.
    10. For each of the quartiles Q1, Q2, and Q3, write a sentence using the definition of a percentile to interpret it. (Hint: A number x is at the kth percentile if k% of the data in the set is less than or equal to x.)
    11. Jesse was ranked 37th in his graduating class of 180 students. At what percentile is his ranking?
    12. For runners in a race, a low time means a faster run. The winners in a race have the shortest running times.
      1. Is it more desirable to have a finish time with a high or a low percentile when running a race?
      2. Is it more desirable to have a speed with a high or a low percentile when running a race?
    13. The 20th percentile of run times in a particular race is 5.2 minutes. Write a sentence interpreting the 20th percentile in the context of the situation.
    14. A bicyclist in the 90th percentile of a bicycle race completed the race in 1 hour and 12 minutes. Are they among the fastest or slowest cyclists in the race? Write a sentence interpreting the 90th percentile in the context of the situation.
    15. The 40th percentile of speeds in a particular race is 7.5 miles per hour. Write a sentence interpreting the 40th percentile in the context of the situation.
    16. On an exam, would it be more desirable to earn a grade with a high or low percentile? Explain.
    17. Mina is waiting in line at the Department of Motor Vehicles (DMV). Her wait time of 32 minutes is the 85th percentile of wait times. Is that good or bad? Write a sentence interpreting the 85th percentile in the context of this situation.
    18. In a survey collecting data about the salaries earned by recent college graduates, Li found that her salary was in the 78th percentile. Should Li be pleased or upset by this result? Explain.
    19. In a study collecting data about the repair costs of damage to automobiles in a certain type of crash tests, a certain model of car had $1,700 in damage and was in the 90th percentile. Should the manufacturer and the consumer be pleased or upset by this result? Explain and write a sentence that interprets the 90th percentile in the context of this problem.
    20. The University of Colorado has two criteria used to set admission standards for students to be admitted to a college in the UC system:
      1. Students' GPAs and scores on standardized tests (SATs and ACTs) are entered into a formula that calculates an "admissions index" score. The admissions index score is used to set eligibility standards intended to meet the goal of admitting the top 12% of high school students in the state. In this context, what percentile does the top 12% represent?
      2. Students whose GPAs are at or above the 96th percentile of all students at their high school are eligible (called eligible in the local context), even if they are not in the top 12% of all students in the state. What percentage of students from each high school are "eligible in the local context"?
    21. Suppose that you are buying a house. You and your realtor have determined that the most expensive house you can afford is the 34th percentile. The 34th percentile of housing prices is $240,000 in the town you want to move to. In this town, can you afford 34% of the houses or 66% of the houses?
    22. Use the following information to find the statistics below. Sixty-five randomly selected car salespersons were asked the number of cars they generally sell in one week. Fourteen people answered that they generally sell three cars; nineteen generally sell four cars; twelve generally sell five cars; nine generally sell six cars; eleven generally sell seven cars.
      1. First quartile = _______
      2. Second quartile = median = 50th percentile = _______
      3. Third quartile = _______
      4. Interquartile range (IQR) = _____ – _____ = _____
      5. 10th percentile = _______
      6. 70th percentile = _______
    23. Your younger brother comes home one day after taking a science test. He says that someone at school told him that "60% of the students in the class scored above the median test grade." What is wrong with this statement? What if he said "60% of the students scored below the mean?"
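
Quartile conventions differ between textbooks and software, so a computed five-number summary should be checked against the method used in class. The sketch below assumes the "median of each half" convention (Q1 is the median of the lower half of the sorted data, Q3 the median of the upper half, and the overall median is left out of both halves when \( n \) is odd). The function name `five_number_summary` is illustrative, and the data come from question 2 of this section.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return data[0], median(lower), median(data), median(upper), data[-1]

# Dataset from question 2 of Section 2.3.
data = [45, 50, 52, 55, 58, 60, 63, 65, 70, 75]
minimum, q1, med, q3, maximum = five_number_summary(data)
iqr = q3 - q1

print(minimum, q1, med, q3, maximum)  # 45 52 59.0 65 75
print(iqr)                            # 13
```

Library routines such as `statistics.quantiles` interpolate between data points by default and can return slightly different quartiles for the same data, which is why the convention is stated explicitly here.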

    2.4: Identifying Outliers with Boxplots

    1. What rule do we use to identify outliers using the IQR? Write out both conditions.
    2. A dataset has Q1 = 25 and Q3 = 45. Use the 1.5 × IQR rule to determine the bounds for outliers.
    3. Using the rule from the previous question, determine if the value 5 is an outlier.
    4. Draw a quick sketch of a boxplot for the five-number summary below:
      {10, 20, 30, 40, 50}
    5. In a boxplot, what does a long right whisker suggest about the data?
    6. Choose a variable from your semester project dataset. List its five-number summary and indicate whether it has any outliers using the IQR method.
    7. Describe one strength and one limitation of boxplots as visualization tools.
    8. Create a box plot for the five number summary provided below.
      Table of the summary for problem 8
      Min Q1 Q2 (Median) Q3 Max
      57 72.5 78.5 82.5 94
    9. What can you say about a data set when the “box” in the box plot is very wide but the “whiskers” do not go out very far from the box?
    10. Why is it important to identify outliers?
    11. Give an example of a situation where an outlier might be removed from a dataset.
    12. Give an example of a situation where an outlier from a dataset needs to be emphasized.
    13. Given the following box plot:
      Box-plot for question 13. The min is 0, Q1 is 2, the median 10, Q3 is 12, and the max 13.
      1. Which quarter has the smallest spread of data? What is that spread?
      2. Which quarter has the largest spread of data? What is that spread?
      3. Find the interquartile range (IQR).
      4. Are there more data in the interval 5–10 or in the interval 10–13? How do you know this?
      5. Which interval has the fewest data in it? How do you know this?
    14. The following box plot shows the U.S. population for 1990, the latest available year.
      A box plot with values from 0 to 105, with Q1 at 17, M at 33, and Q3 at 50.
      1. Are there fewer or more children (age 17 and under) than senior citizens (age 65 and over)? How do you know?
      2. 12.6% are age 65 and over. Approximately what percentage of the population are working age adults (above age 17 to age 65)?
    15. In a survey of 20-year-olds in China, Germany, and the United States, people were asked the number of countries they had visited in their lifetime. The following box plots display the results.
      Three box plots
      1. In complete sentences, describe what the shape of each box plot implies about the distribution of the data collected.
      2. Have more Americans or more Germans surveyed been to over eight foreign countries?
      3. Compare the three box plots. What do they imply about the foreign travel of 20-year-old residents of the three countries when compared to each other?
    16. Given the following box plot, answer the questions.
      Box plot for question 16.
      1. Think of an example (in words) where the data might fit into the above box plot. In 2–5 sentences, write down the example.
      2. What does it mean to have the first and second quartiles so close together, while the second to third quartiles are far apart?
    17. Twenty-five randomly selected students were asked the number of movies they watched the previous week. Construct a box plot of the data.
      Table for number of movies and the frequency.
      # of movies Frequency
      0 5
      1 9
      2 6
      3 4
      4 1
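
The 1.5 × IQR rule from questions 1 to 3 of this section can be sketched as a pair of small functions. The names `outlier_fences` and `is_outlier` are illustrative, and the quartiles used are those given in question 2 (Q1 = 25, Q3 = 45).

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) bounds beyond which a value is flagged as an outlier."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_outlier(x, q1, q3, k=1.5):
    """True when x lies below the lower fence or above the upper fence."""
    lower, upper = outlier_fences(q1, q3, k)
    return x < lower or x > upper

# Quartiles from question 2 of Section 2.4.
lower, upper = outlier_fences(25, 45)
print(lower, upper)            # -5.0 75.0
print(is_outlier(5, 25, 45))   # False
print(is_outlier(80, 25, 45))  # True
```

In a boxplot drawn with this rule, the whiskers extend only to the most extreme data values inside the fences, and any flagged points are plotted individually.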

    2.6: Practice (Chapter 2) is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
