2.2: Organizing Data - Frequency Distributions

    Categorical Frequency Distribution

    Once you have a set of data, you will need to organize it so that you can analyze how frequently each datum occurs in the set. When calculating with frequencies, you may need to round your answers; the rules below help keep rounded answers as precise as possible.

    Note: Answers and Rounding Off

    Traditional Rounding Rules:

    • Identify the digit in the place value you are asked to round to and look at the digit to its right.
      • If the digit on the right is 5 or more, then round the requested place value digit up by 1 and the digits after it become zeros.
        • For example: 0.3679 rounded to the hundredths place (or 2 decimal places) means we go to the 6, look at the number to its right, 7, and compare it to 5. It is 5 or more, so we round up, changing 6 into a 7 and the rest of the values become zeros. 0.3679 is approximately 0.3700. We call this rounding up because the final result is numerically larger than the original value. Note: With decimal numbers, we won't typically write the zeros after the value in the position we rounded to, so 0.3700 becomes 0.37, which is equivalent.
      • If the digit on the right is 4 or less, then leave the requested place value alone and digits after it become zeros.  
        • For example: 0.59218 rounded to the thousandths place (or 3 decimal places) means we go to the 2, look at the number to its right, 1, and compare it to 5. It is less than 5, so we round down, leaving 2 as is and the rest of the values become zeros. 0.59218 is approximately 0.59200. We call this rounding down because the final result is numerically smaller than the original value. Similarly, we would write 0.59200 as 0.592, which is equivalent.
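As a quick check on the two worked examples above, the following Python sketch applies the traditional (round-half-up) rule using the standard-library `decimal` module. The helper name `round_traditional` is made up for this illustration. Python's built-in `round` is avoided here because it rounds exact halves to the nearest even digit, which differs from the rule described above.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_traditional(value: str, places: int) -> Decimal:
    """Round a decimal string to `places` decimal places, with 5 or more rounding up."""
    quantum = Decimal(1).scaleb(-places)  # e.g. places=2 gives a step of 0.01
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_traditional("0.3679", 2))   # 0.37
print(round_traditional("0.59218", 3))  # 0.592
```

Passing the value as a string keeps the digits exactly as written, so the comparison with 5 is made on the decimal digits themselves.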

    For this class, you will use a combination of traditional rounding rules as well as some "always round up" rules. These rules are context specific and will be reviewed as needed. However, some general rounding advice is:

    • For final calculations of most statistics, unless told otherwise, round the final answer to one more decimal place than was present in the original data. For example, the average of the three quiz scores four, six, and nine is 6.3, rounded off to the nearest tenth, because the data are whole numbers. Most answers will be rounded off in this manner.
    • Round only the final answer (unless directed otherwise). In general, do not round off any intermediate results. Doing so introduces a round off error into subsequent calculations.
    • If it becomes necessary to round off intermediate results, carry them to at least twice as many decimal places as the final answer. For example, if you want 2 decimal places in your final answer, then intermediate results should be carried out to four decimal places. 
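The "one more decimal place than the data" guideline can be illustrated with the quiz-score example above. This is a minimal sketch; the variable names are illustrative only.

```python
from statistics import mean

scores = [4, 6, 9]                # whole-number data (zero decimal places)
average = mean(scores)            # 19/3 = 6.333..., kept unrounded until the end
final_answer = round(average, 1)  # one more decimal place than the data

print(final_answer)  # 6.3
```

Only the last step rounds, which matches the advice to round the final answer rather than intermediate results.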

    It is not necessary to reduce most fractions in this course. In Probability and Counting, it can be more helpful to leave an answer as an unreduced fraction. Use your instructor's guidance regarding whether to reduce fractions.

    Definition: Categorical Frequency Distribution

    A categorical frequency distribution is a table to organize data that can be placed in specific categories, such as nominal- or ordinal-level data.

    A frequency is the number of times a value of the data occurs.

    Twenty students were asked how many hours they worked per day. Their responses, in hours, are as follows:

    5; 6; 3; 3; 2; 4; 7; 5; 2; 3; 5; 6; 5; 4; 4; 3; 5; 2; 5; 3.

    Table \(\PageIndex{1}\) lists the different data values in ascending order and their frequencies.

    Table \(\PageIndex{1}\): Frequency Table of Student Work Hours
    DATA VALUE FREQUENCY
    2 3
    3 5
    4 3
    5 6
    6 2
    7 1

    According to Table \(\PageIndex{1}\), there are three students who work two hours, five students who work three hours, and so on. The sum of the values in the frequency column, 20, represents the total number of students included in the sample.
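The tally in the frequency table can be reproduced from the raw responses. The sketch below uses `collections.Counter` from the Python standard library; the variable names are illustrative.

```python
from collections import Counter

# The twenty responses (hours worked per day), as listed above.
hours = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3, 5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

# Count how many times each data value occurs.
frequency = Counter(hours)

print("DATA VALUE  FREQUENCY")
for value in sorted(frequency):  # ascending order, as in the table
    print(f"{value:<10}  {frequency[value]}")

# The frequencies should add up to the sample size.
print("Total:", sum(frequency.values()))  # Total: 20
```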

    Relative Frequencies 

    Definition: Relative frequencies

    A relative frequency is the ratio (fraction or proportion) of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes. To find the relative frequencies, divide each frequency by the total number of data values.

    Relative frequency =\(\frac{f}{n}\) where \(f\) is the frequency of the class and \(n\) is the total number of data values. If needed, rounding to at least three decimal places is suggested.

    Relative frequencies can be written as fractions, percents, or decimals.

    In this example, for the first category (the data value 2, which has a frequency of 3), the frequency is divided by 20 since we have 20 total students.

    Relative frequency =\(\frac{3}{20}\) = 0.15

    Note: Percentages

    Remember that "Percent" means "Per Hundred."

    To convert a decimal value to a percentage, you multiply by 100%. 

    Ex: 0.15 = \(0.15 \cdot 100 \% = 15 \%\)

    To convert a percentage to a decimal value, you divide by 100%. 

    Ex: 25% = \(\frac{25 \%}{100 \%}=0.25\)

    Table \(\PageIndex{2}\): Frequency Table of Student Work Hours with Relative Frequencies
    DATA VALUE FREQUENCY RELATIVE FREQUENCY
    2 3 \(\frac{3}{20}\) or 0.15 or 15%
    3 5 \(\frac{5}{20}\) or 0.25 or 25%
    4 3 \(\frac{3}{20}\) or 0.15 or 15%
    5 6 \(\frac{6}{20}\) or 0.30 or 30%
    6 2 \(\frac{2}{20}\) or 0.10 or 10%
    7 1 \(\frac{1}{20}\) or 0.05 or 5%

    The sum of the values in the relative frequency column of Table \(\PageIndex{2}\) is \(\frac{20}{20}\), or 1.
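The relative frequency column can be computed directly from the frequencies. The following is a small sketch using exact fractions from the standard library, so the column sums to exactly 1 with no rounding.

```python
from fractions import Fraction

# Frequencies copied from the student work-hours table.
frequency = {2: 3, 3: 5, 4: 3, 5: 6, 6: 2, 7: 1}
n = sum(frequency.values())  # total number of data values (20)

# Relative frequency = f / n for each data value.
relative = {value: Fraction(f, n) for value, f in frequency.items()}

for value, rel in relative.items():
    # Fraction reduces automatically (5/20 prints as 1/4); the value is the same.
    print(f"{value}: {rel} = {float(rel):.2f} = {float(rel):.0%}")

print("Sum:", sum(relative.values()))  # Sum: 1
```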

    Cumulative Relative Frequencies

    Definition: Cumulative relative frequency

    Cumulative relative frequency is the accumulation of the previous relative frequencies. To find the cumulative relative frequencies, add all the previous relative frequencies to the relative frequency for the current row, as shown in Table \(\PageIndex{3}\).

    Table \(\PageIndex{3}\): Frequency Table of Student Work Hours with Relative and Cumulative Relative Frequencies
    DATA VALUE FREQUENCY RELATIVE FREQUENCY CUMULATIVE RELATIVE FREQUENCY
    2 3 \(\frac{3}{20}\) or 0.15 0.15
    3 5 \(\frac{5}{20}\) or 0.25 0.15 + 0.25 = 0.40
    4 3 \(\frac{3}{20}\) or 0.15 0.40 + 0.15 = 0.55
    5 6 \(\frac{6}{20}\) or 0.30 0.55 + 0.30 = 0.85
    6 2 \(\frac{2}{20}\) or 0.10 0.85 + 0.10 = 0.95
    7 1 \(\frac{1}{20}\) or 0.05 0.95 + 0.05 = 1.00

    The last entry of the cumulative relative frequency column is one, indicating that one hundred percent of the data has been accumulated.
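The running total described above can be sketched with `itertools.accumulate`. Exact fractions are used here as an illustration choice so that the last entry is exactly one.

```python
from fractions import Fraction
from itertools import accumulate

values = [2, 3, 4, 5, 6, 7]
frequency = [3, 5, 3, 6, 2, 1]
n = sum(frequency)

relative = [Fraction(f, n) for f in frequency]
# Each entry is the running total of the relative frequencies so far.
cumulative = list(accumulate(relative))

for value, rel, cum in zip(values, relative, cumulative):
    print(f"{value}  {float(rel):.2f}  {float(cum):.2f}")
```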

    Note: Rounding Errors

    Due to rounding, the relative frequency column may not always sum to one, but it should be very close, typically within \(\pm\)0.001. Since the cumulative relative frequency also adds the rounded relative frequencies, its final value may not equal exactly one either.
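A small sketch of this effect follows. The frequencies are hypothetical (they do not come from any table in this section): three categories with one data value each, so every relative frequency is one third.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical frequencies (not from the text): three categories, one value each.
frequency = [1, 1, 1]
n = sum(frequency)

# Round each relative frequency to three decimal places before adding.
rounded = [
    (Decimal(f) / Decimal(n)).quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)
    for f in frequency
]

print(rounded)       # each entry is 0.333
print(sum(rounded))  # 0.999, which is within 0.001 of 1
```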

    Grouped Frequency Distribution

    Definition: Grouped Frequency Distribution

    A grouped frequency distribution is a table to organize data in which the data are grouped into classes with more than one unit in width. It is used when the data set is large or when it otherwise makes sense to group the data. Grouped frequency distributions are used for quantitative data.

    Table \(\PageIndex{4}\) represents the heights, in inches, of a sample of 100 male semiprofessional soccer players.

    Table \(\PageIndex{4}\): Frequency Table of Soccer Player Height
    HEIGHTS (INCHES) FREQUENCY RELATIVE FREQUENCY CUMULATIVE RELATIVE FREQUENCY
    59.95–61.95 5 \(\frac{5}{100} = 0.05\) \(0.05\)
    61.95–63.95 3 \(\frac{3}{100} = 0.03\) \(0.05 + 0.03 = 0.08\)
    63.95–65.95 15 \(\frac{15}{100} = 0.15\) \(0.08 + 0.15 = 0.23\)
    65.95–67.95 40 \(\frac{40}{100} = 0.40\) \(0.23 + 0.40 = 0.63\)
    67.95–69.95 17 \(\frac{17}{100} = 0.17\) \(0.63 + 0.17 = 0.80\)
    69.95–71.95 12 \(\frac{12}{100} = 0.12\) \(0.80 + 0.12 = 0.92\)
    71.95–73.95 7 \(\frac{7}{100} = 0.07\) \(0.92 + 0.07 = 0.99\)
    73.95–75.95 1 \(\frac{1}{100} = 0.01\) \(0.99 + 0.01 = 1.00\)
      Total = 100 Total = 1.00

    The data in this table have been grouped into the following intervals:

    • 59.95 to 61.95 inches
    • 61.95 to 63.95 inches
    • 63.95 to 65.95 inches
    • 65.95 to 67.95 inches
    • 67.95 to 69.95 inches
    • 69.95 to 71.95 inches
    • 71.95 to 73.95 inches
    • 73.95 to 75.95 inches

    In this sample, there are 5 players whose heights fall within the interval 59.95–61.95 inches, 3 players whose heights fall within the interval 61.95–63.95 inches, 15 players whose heights fall within the interval 63.95–65.95 inches, 40 players whose heights fall within the interval 65.95–67.95 inches, 17 players whose heights fall within the interval 67.95–69.95 inches, 12 players whose heights fall within the interval 69.95–71.95 inches, 7 players whose heights fall within the interval 71.95–73.95 inches, and 1 player whose height falls within the interval 73.95–75.95 inches.

    All heights fall between the endpoints of an interval and not at the endpoints. These types of endpoints are called class boundaries and are specifically chosen because it isn't possible that any of our data values will be equal to them since the data set that was used to create this table only had data written to one decimal place. 
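The relative and cumulative columns of the height table can be rebuilt from the class boundaries and frequencies alone. This is a sketch under the assumption that the boundaries and counts are copied correctly from the table; the raw heights are not given, so no binning is performed.

```python
from fractions import Fraction
from itertools import accumulate

# Class boundaries and frequencies copied from the soccer-player height table.
boundaries = [59.95, 61.95, 63.95, 65.95, 67.95, 69.95, 71.95, 73.95, 75.95]
frequency = [5, 3, 15, 40, 17, 12, 7, 1]
n = sum(frequency)  # 100 players

relative = [Fraction(f, n) for f in frequency]
cumulative = list(accumulate(relative))

# Each class runs from one boundary to the next.
rows = zip(boundaries, boundaries[1:], frequency, relative, cumulative)
for lower, upper, f, rel, cum in rows:
    print(f"{lower:.2f}-{upper:.2f}  {f:>3}  {float(rel):.2f}  {float(cum):.2f}")
```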

    This example is used again in the next section.

    Reading tables

    The next section will explain in detail how to create a grouped frequency distribution given a raw data set. For now, we will focus on understanding the information presented in the tables and how to read them. 

    Example \(\PageIndex{1}\)

    Use the heights of the 100 male semiprofessional soccer players in Table \(\PageIndex{4}\). Fill in the blanks and check your answers.

    1. The percentage of heights that are from 67.95 to 71.95 inches is: ____.
    2. The percentage of heights that are from 67.95 to 73.95 inches is: ____.
    3. The percentage of heights that are more than 65.95 inches is: ____.
    4. The number of players in the sample who are between 61.95 and 71.95 inches tall is: ____.
    5. What kind of data are the heights?
    6. Describe how you could gather this data (the heights) so that the data are characteristic of all male semiprofessional soccer players.

    Remember, you count frequencies. To find the relative frequency, divide the frequency by the total number of data values. To find the cumulative relative frequency, add all of the previous relative frequencies to the relative frequency for the current row.

    Answer

    1. Identify the classes that have the requested starting and ending values. These are row 5: 67.95 - 69.95 and row 6: 69.95 - 71.95. Add the relative frequencies: \(0.17 + 0.12 = 0.29\). To convert to a percentage, multiply by 100%: \(0.29 (100 \%) = 29 \%\)
    2. The requested values include the previous two rows, so all we need to do is add the information from row 7: 71.95 - 73.95. Add the relative frequencies: \(0.29 + 0.07 = 0.36\). Convert to a percentage: \(0.36(100 \%) = 36 \%\)
    3. To find the percentage of heights that are more than a value, we add up all of the relative frequencies starting with the class whose lower boundary is the requested value. Remember, we are using class boundaries, so data in the class 65.95 - 67.95 and beyond will all be greater than 65.95. Adding the relative frequencies, we get: \(0.40 + 0.17 + 0.12 + 0.07 + 0.01 =  0.77 = 77 \%\). Note: Since we know the total of the relative frequencies is 1.00, you may find it easier to use subtraction instead, where we remove the three classes we don't need from the total: \(1.00 - 0.05 - 0.03 - 0.15 = 0.77 = 77 \%\). Subtracting what we don't need, or the complement, is something we will revisit in Probability and Counting.
    4. To find the number of players, we need to look at the frequency column. We will use the row that starts with 61.95 as a lower boundary and the row that ends with 71.95 as an upper boundary. Add all of the frequencies: \(3+15+40+17+12 = 87\). 
    5. Since these data values are meaningfully numeric and do not have to be whole numbers, this is a quantitative continuous data set. 
    6. Answers may vary on this, but to ensure that we have a random and unbiased list of semiprofessional soccer players, we need to use simple random, systematic, stratified, or cluster sampling. The method that makes the most sense is a stratified technique: use the rosters from each team and choose a random sample from each roster.
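The numeric answers above can be checked with a short sketch. The helper names `count_between` and `percent_between` are invented for this illustration, and they assume the requested limits are class boundaries from the table.

```python
# Classes from the height table as (lower boundary, upper boundary, frequency).
classes = [
    (59.95, 61.95, 5), (61.95, 63.95, 3), (63.95, 65.95, 15), (65.95, 67.95, 40),
    (67.95, 69.95, 17), (69.95, 71.95, 12), (71.95, 73.95, 7), (73.95, 75.95, 1),
]
total = sum(f for _, _, f in classes)

def count_between(low, high):
    """Add the frequencies of every class lying entirely inside [low, high]."""
    return sum(f for lower, upper, f in classes if lower >= low and upper <= high)

def percent_between(low, high):
    """Percentage of all players whose class lies inside [low, high]."""
    return 100 * count_between(low, high) / total

print(percent_between(67.95, 71.95))  # 29.0
print(percent_between(67.95, 73.95))  # 36.0
print(percent_between(65.95, 75.95))  # 77.0  (more than 65.95 inches)
print(count_between(61.95, 71.95))    # 87
```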
    Example \(\PageIndex{2}\)
    1. From the Table \(\PageIndex{4}\), find the percentage of heights for male semiprofessional soccer players that are less than 65.95 inches.
    2. Find the percentage of heights that fall between 61.95 and 65.95 inches.

    Answer

    1. If you look at the first, second, and third rows, the heights are all less than 65.95 inches. There are \(5 + 3 + 15 = 23\) players whose heights are less than 65.95 inches. The percentage of heights less than 65.95 inches is then \(\frac{23}{100}\) or 23%. This percentage is the cumulative relative frequency entry in the third row.
    2. Add the relative frequencies in the second and third rows: \(0.03 + 0.15 = 0.18\) or 18%.
    Exercise \(\PageIndex{1}\)

    Table \(\PageIndex{5}\) shows the amount, in inches, of annual rainfall in a sample of towns.

    Table \(\PageIndex{5}\):
    Rainfall (Inches) Frequency Relative Frequency Cumulative Relative Frequency
    2.95–4.97 6 \(\frac{6}{50} = 0.12\) \(0.12\)
    4.97–6.99 7 \(\frac{7}{50} = 0.14\) \(0.12 + 0.14 = 0.26\)
    6.99–9.01 15 \(\frac{15}{50} = 0.30\) \(0.26 + 0.30 = 0.56\)
    9.01–11.03 8 \(\frac{8}{50} = 0.16\) \(0.56 + 0.16 = 0.72\)
    11.03–13.05 9 \(\frac{9}{50} = 0.18\) \(0.72 + 0.18 = 0.90\)
    13.05–15.07 5 \(\frac{5}{50} = 0.10\) \(0.90 + 0.10 = 1.00\)
      Total = 50 Total = 1.00  
    1. Find the percentage of rainfall that is less than 9.01 inches.
    2. Find the percentage of rainfall that is between 6.99 and 13.05 inches.
    Answer
    1. \(0.56\) or 56%
    2. \(0.30 + 0.16 + 0.18 = 0.64\) or 64%
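An alternative way to reach the same answers is to subtract cumulative relative frequencies instead of adding individual rows. The sketch below assumes the upper class boundaries and frequencies are copied from the rainfall table; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

# Upper class boundaries and frequencies copied from the rainfall table.
upper_boundaries = [4.97, 6.99, 9.01, 11.03, 13.05, 15.07]
frequency = [6, 7, 15, 8, 9, 5]
n = sum(frequency)  # 50 towns

# Cumulative relative frequency at each upper boundary.
running_counts = accumulate(frequency)
cumulative = dict(zip(upper_boundaries, (Fraction(c, n) for c in running_counts)))

less_than_9_01 = cumulative[9.01]
between_6_99_and_13_05 = cumulative[13.05] - cumulative[6.99]

print(float(less_than_9_01))          # 0.56
print(float(between_6_99_and_13_05))  # 0.64
```

The difference of two cumulative values covers exactly the classes between the two boundaries, which is why it matches the row-by-row sum.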
    Exercise \(\PageIndex{2}\)

    From Table \(\PageIndex{5}\), find the number of towns that have rainfall between 2.95 and 9.01 inches.

    Answer

    \(6 + 7 + 15 = 28\) towns

    Exercise \(\PageIndex{3}\)

    Table \(\PageIndex{5}\) represents the amount, in inches, of annual rainfall in a sample of towns. What fraction of towns surveyed get between 11.03 and 13.05 inches of rainfall each year?

    Answer

    \(\frac{9}{50}\)

    Example \(\PageIndex{3}\)

    Nineteen people were asked how many miles, to the nearest mile, they commute to work each day.

    The data are as follows: 2; 5; 7; 3; 2; 10; 18; 15; 20; 7; 10; 18; 5; 12; 13; 12; 4; 5; 10.

    Table \(\PageIndex{6}\) was produced:

    Table \(\PageIndex{6}\): Frequency of Commuting Distances
    DATA FREQUENCY RELATIVE FREQUENCY CUMULATIVE RELATIVE FREQUENCY
    3 3 \(\frac{3}{19}\) 0.1579
    4 1 \(\frac{1}{19}\) 0.2105
    5 3 \(\frac{3}{19}\) 0.1579
    7 2 \(\frac{2}{19}\) 0.2632
    10 3 \(\frac{3}{19}\) 0.4737
    12 2 \(\frac{2}{19}\) 0.7895
    13 1 \(\frac{1}{19}\) 0.8421
    15 1 \(\frac{1}{19}\) 0.8948
    18 1 \(\frac{1}{19}\) 0.9474
    20 1 \(\frac{1}{19}\) 1.0000
    1. Is the table correct? If it is not correct, what is wrong?
    2. If the table is incorrect, make the corrections.
    3. What fraction of the people surveyed commute five or seven miles?
    4. What fraction of the people surveyed commute 12 miles or more? Less than 12 miles? Between five and 13 miles (not including five and 13 miles)?

    Answer

    1. No.
      1. An easy item to check is the sum of the Frequency column. The frequency column sums to 18, not 19, so we missed some data.
      2. Check the categories. It looks like we missed 2 miles (should have a frequency of 2) and miscounted 3 miles (should only have 1) and 18 miles (should have 2).
      3. Strangely, the cumulative relative frequency column ends at 1.0000 even though we did not have all 19 of our data values. What also stands out is that each cumulative value should increase as you travel down the table (unless a frequency is 0), but in row 3 the value decreased. Something is definitely wrong.
    2. The corrected table is:
      DATA FREQUENCY RELATIVE FREQUENCY CUMULATIVE RELATIVE FREQUENCY
      2 2 \(\frac{2}{19}\) 0.1053
      3 1 \(\frac{1}{19}\) 0.1579
      4 1 \(\frac{1}{19}\) 0.2105
      5 3 \(\frac{3}{19}\) 0.3684
      7 2 \(\frac{2}{19}\) 0.4737
      10 3 \(\frac{3}{19}\) 0.6316
      12 2 \(\frac{2}{19}\) 0.7369
      13 1 \(\frac{1}{19}\) 0.7895
      15 1 \(\frac{1}{19}\) 0.8421
      18 2 \(\frac{2}{19}\) 0.9474
      20 1 \(\frac{1}{19}\) 1.0000
    3. We need to look at the rows for 5 miles and 7 miles. Since "or" means that it could be either, we want to add those values together: \(\frac{3}{19}+ \frac{2}{19} = \frac{5}{19}\)
    4. We will need to do some more fraction addition.
      1. To find the fraction of people who commute 12 or more miles, we need to start with the row for 12 miles and add the relative frequencies for that row through 20 miles. \(\frac{7}{19}\)
      2. To find the fraction of people who commute less than 12 miles, we could add all of the relative frequencies for rows with fewer than 12 miles, but we could also use the complement. If we know the total is \(\frac{19}{19}\) and take away what we found for 12 miles or more, \(\frac{7}{19}\), then we get \(\frac{12}{19}\)
      3. To find the fraction of people who commute between 5 and 13 miles (not including 5 and 13) we need to add the relative frequencies for 7 miles, 10 miles and 12 miles: \(\frac{7}{19}\)
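The checks described in this answer can be automated. The sketch below recounts the raw commuting data, compares it with the frequencies printed in the table under review, and then computes the requested fractions. The helper `fraction_where` is a name invented for this illustration.

```python
from collections import Counter
from fractions import Fraction

commutes = [2, 5, 7, 3, 2, 10, 18, 15, 20, 7, 10, 18, 5, 12, 13, 12, 4, 5, 10]
# Frequencies as printed in the table being checked.
table = {3: 3, 4: 1, 5: 3, 7: 2, 10: 3, 12: 2, 13: 1, 15: 1, 18: 1, 20: 1}

actual = Counter(commutes)
n = len(commutes)

print(sum(table.values()) == n)  # False: the table accounts for 18 values, not 19
for value in sorted(set(actual) | set(table)):
    if actual[value] != table.get(value, 0):
        print(f"{value} miles: table says {table.get(value, 0)}, data has {actual[value]}")

def fraction_where(condition):
    """Fraction of commuters whose distance satisfies the condition."""
    return Fraction(sum(f for v, f in actual.items() if condition(v)), n)

print(fraction_where(lambda v: v in (5, 7)))  # 5/19
print(fraction_where(lambda v: v >= 12))      # 7/19
print(fraction_where(lambda v: v < 12))       # 12/19
print(fraction_where(lambda v: 5 < v < 13))   # 7/19
```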
    Example \(\PageIndex{4}\)

    Table \(\PageIndex{7}\) contains the total number of deaths worldwide as a result of earthquakes for the period from 2000 to 2012.

    Table \(\PageIndex{7}\):
    Year Total Number of Deaths
    2000 231
    2001 21,357
    2002 11,685
    2003 33,819
    2004 228,802
    2005 88,003
    2006 6,605
    2007 712
    2008 88,011
    2009 1,790
    2010 320,120
    2011 21,953
    2012 768
    Total 823,356

    Answer the following questions.

    1. What is the frequency of deaths measured from 2006 through 2009?
    2. What percentage of deaths occurred after 2009?
    3. What is the relative frequency of deaths that occurred in 2003 or earlier?
    4. What is the percentage of deaths that occurred in 2004?
    5. What kind of data are the numbers of deaths?
    6. The Richter scale is used to quantify the energy produced by an earthquake. Examples of Richter scale numbers are 2.3, 4.0, 6.1, and 7.0. What kind of data are these numbers?

    Answer

    1. To find the frequency, we need to add up the frequencies for 2006 through 2009: 97,118. Note: If we were asked for the relative frequency or percentage of deaths for 2006 to 2009, then we would divide 97,118 by the total of 823,356 to get 0.11795... Rounded to three decimal places, we get 0.118, which converted to a percentage is 11.8%.
    2. To find the percentage of deaths after 2009, add the frequencies for 2010 through 2012 (2009 itself is not included) to get 342,841. Divide that by the total number to get 0.41639... Rounded to three decimal places, we get 0.416, which converted to a percentage is 41.6%.
    3. To find the relative frequency of deaths that occurred in 2003 or earlier, we add up the frequencies from 2000 to 2003 and then divide by the total: 67,092/823,356 = 0.08148... or 0.081, which converted to a percentage is 8.1%.
    4. Divide the frequency in 2004 by the total to get 27.8%.
    5. The numbers of deaths are meaningfully numeric, but can't take on a partial value, so those data values are quantitative discrete.
    6. The Richter scale values are meaningfully numeric and can take on non-whole number values, so those data values are quantitative continuous.
    Exercise \(\PageIndex{4}\)

    Table \(\PageIndex{8}\) contains the total number of fatal motor vehicle traffic crashes in the United States for the period from 1994 to 2011.

    Table \(\PageIndex{8}\):
    Year Total Number of Crashes Year Total Number of Crashes
    1994 36,254 2004 38,444
    1995 37,241 2005 39,252
    1996 37,494 2006 38,648
    1997 37,324 2007 37,435
    1998 37,107 2008 34,172
    1999 37,140 2009 30,862
    2000 37,526 2010 30,296
    2001 37,862 2011 29,757
    2002 38,491 Total 653,782
    2003 38,477    

    Answer the following questions.

    1. What is the frequency of fatal crashes measured from 2000 through 2004?
    2. What percentage of fatal crashes occurred after 2006?
    3. What is the relative frequency of fatal crashes that occurred in 2000 or before?
    4. What is the percentage of fatal crashes that occurred in 2011?
    5. What is the cumulative relative frequency for 2006? Explain what this number tells you about the data.
    Answer
    1. 190,800 (29.2%)
    2. 24.9%
    3. 260,086/653,782 or 39.8%
    4. 4.6%
    5. 75.1% of all fatal traffic crashes for the period from 1994 to 2011 happened from 1994 to 2006.
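These answers can be verified with a short sketch. The yearly counts are copied from the crash table; the helper names `frequency` and `percent` are invented for this illustration, and percentages are rounded to one decimal place to match the answers.

```python
# Fatal crashes per year, copied from the table for 1994 to 2011.
crashes = {
    1994: 36254, 1995: 37241, 1996: 37494, 1997: 37324, 1998: 37107, 1999: 37140,
    2000: 37526, 2001: 37862, 2002: 38491, 2003: 38477, 2004: 38444, 2005: 39252,
    2006: 38648, 2007: 37435, 2008: 34172, 2009: 30862, 2010: 30296, 2011: 29757,
}
total = sum(crashes.values())

def frequency(first_year, last_year):
    """Total crashes from first_year through last_year, inclusive."""
    return sum(n for year, n in crashes.items() if first_year <= year <= last_year)

def percent(first_year, last_year):
    """Percentage of all crashes in the inclusive year range, to one decimal place."""
    return round(100 * frequency(first_year, last_year) / total, 1)

print(total)                  # 653782
print(frequency(2000, 2004))  # 190800
print(percent(2000, 2004))    # 29.2
print(percent(2007, 2011))    # 24.9  (after 2006)
print(percent(1994, 2000))    # 39.8  (2000 or before)
print(percent(2011, 2011))    # 4.6
print(percent(1994, 2006))    # 75.1  (cumulative through 2006)
```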

    Glossary

    Categorical Frequency Distribution
    A table to organize data that can be placed in specific categories, such as nominal- or ordinal-level data.
    Cumulative Relative Frequency
    The term applies to an ordered set of observations from smallest to largest. The cumulative relative frequency is the sum of the relative frequencies for all values that are less than or equal to the given value.
    Frequency
    the number of times a value of the data occurs
    Relative Frequency
    the ratio of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes

    Contributors and Attributions

    • Barbara Illowsky and Susan Dean (De Anza College) with many other contributing authors. Content produced by OpenStax College is licensed under a Creative Commons Attribution 4.0 License. Download for free at http://cnx.org/contents/30189442-699...b91b9de@18.114.


    This page titled 2.2: Organizing Data - Frequency Distributions is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by OpenStax via source content that was edited to the style and standards of the LibreTexts platform.