Statistics LibreTexts

2.1: Organizing and Graphing Qualitative Data

    Learning Objectives

    By the end of this chapter, the student should be able to:

    • Display data graphically and interpret graphs: stemplots, histograms, and box plots.
    • Recognize, describe, and calculate the measures of location of data: quartiles and percentiles.
    • Recognize, describe, and calculate the measures of the center of data: mean, median, and mode.
    • Recognize, describe, and calculate the measures of the spread of data: variance, standard deviation, and range.

    Once you have collected data, what will you do with it? Data can be described and presented in many different formats. For example, suppose you are interested in buying a house in a particular area. You may have no clue about the house prices, so you might ask your real estate agent to give you a sample data set of prices. Looking at all the prices in the sample is often overwhelming. A better way might be to look at the median price and the variation of prices. The median and variation are just two of the ways that you will learn to describe data. Your agent might also provide you with a graph of the data.

    Figure \(\PageIndex{1}\): When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled together with similar ballots to keep them organized. (credit: William Greeson)

    In this chapter, you will study numerical and graphical ways to describe and display your data. This area of statistics is called "Descriptive Statistics." You will learn how to calculate, and even more importantly, how to interpret these measurements and graphs.

    A statistical graph is a tool that helps you learn about the shape or distribution of a sample or a population. A graph can be a more effective way of presenting data than a mass of numbers because we can see where data clusters and where there are only a few data values. Newspapers and the Internet use graphs to show trends and to enable readers to compare facts and figures quickly. Statisticians often graph data first to get a picture of the data. Then, more formal tools may be applied.

    Some of the types of graphs that are used to summarize and organize data are the dot plot, the bar graph, the histogram, the stem-and-leaf plot, the frequency polygon (a type of broken line graph), the pie chart, and the box plot. In this chapter, we will briefly look at stem-and-leaf plots, line graphs, and bar graphs, as well as frequency polygons and time series graphs. Our emphasis will be on histograms and box plots.

    Qualitative Data Discussion

    Below are tables comparing the number of part-time and full-time students at De Anza College and Foothill College enrolled for the spring 2010 quarter. The tables display counts (frequencies) and percentages or proportions (relative frequencies). The percent columns make comparing the same categories in the colleges easier. Displaying percentages along with the numbers is often helpful, but it is particularly important when comparing sets of data that do not have the same totals, such as the total enrollments for both colleges in this example. Notice how much larger the percentage for part-time students at Foothill College is compared to De Anza College.

    Table \(\PageIndex{1}\): Fall Term 2007 (Census day)
                 De Anza College        Foothill College
                 Number    Percent      Number    Percent
    Full-time     9,200     40.9%        4,059     28.6%
    Part-time    13,296     59.1%       10,124     71.4%
    Total        22,496      100%       14,183      100%
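
    The percent columns can be reproduced from the counts by dividing each count by its college's total. The following is a minimal Python sketch of that arithmetic; the function name `relative_frequencies` and the dictionary layout are illustrative assumptions, not part of the original text.

```python
def relative_frequencies(counts):
    """Return each category's share of the total, as a percent rounded to one decimal."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}

# Counts copied from the enrollment table above.
de_anza = {"Full-time": 9200, "Part-time": 13296}
foothill = {"Full-time": 4059, "Part-time": 10124}

de_anza_pct = relative_frequencies(de_anza)
foothill_pct = relative_frequencies(foothill)

print("De Anza: ", de_anza_pct)
print("Foothill:", foothill_pct)
```

    Because the two colleges have different totals, comparing the percents (rather than the raw counts) is what makes the part-time difference visible.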

    Tables are a good way of organizing and displaying data. But graphs can be even more helpful in understanding the data. There are no strict rules concerning which graphs to use. Two graphs that are used to display qualitative data are pie charts and bar graphs.

    • In a pie chart, categories of data are represented by wedges in a circle, and the size of each wedge is proportional to the percent of individuals in that category.
    • In a bar graph, the length of the bar for each category is proportional to the number or percent of individuals in each category. Bars may be vertical or horizontal.
    • A Pareto chart is a bar graph whose bars are sorted in order of category size (largest to smallest).
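
    Both definitions rest on the same idea of proportionality. As a rough sketch, assuming a full circle of 360 degrees for the pie chart and a text bar drawn with "#" characters, the wedge angles and bar lengths could be computed as follows; the helper names `wedge_angle` and `text_bar` and the scale of half a character per percent are hypothetical choices made for illustration.

```python
def wedge_angle(percent):
    """Central angle, in degrees, of a pie-chart wedge for a given percent of the whole."""
    return percent / 100 * 360

def text_bar(percent, chars_per_percent=0.5):
    """A horizontal bar of '#' characters whose length is proportional to the percent."""
    return "#" * round(percent * chars_per_percent)

# De Anza College percentages from the enrollment table above.
percents = {"Full-time": 40.9, "Part-time": 59.1}

angles = {category: wedge_angle(p) for category, p in percents.items()}
bars = {category: text_bar(p) for category, p in percents.items()}

for category in percents:
    print(f"{category:<10} {angles[category]:6.1f} degrees  {bars[category]}")
```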

    Look at Figures \(\PageIndex{3}\) and \(\PageIndex{4}\) and determine which graph (pie or bar) you think displays the comparisons better.

    Figure \(\PageIndex{3}\): Pie Charts

    It is a good idea to look at a variety of graphs to see which is the most helpful in displaying the data. We might make different choices of what we think is the “best” graph depending on the data and the context. Our choice also depends on what we are using the data for.

    Figure \(\PageIndex{4}\): Bar chart

    Percentages That Add to More (or Less) Than 100%

    Sometimes percentages add up to more than 100% (or less than 100%). In the table and graph below, the percentages add to more than 100% because students can be in more than one category. A bar graph is appropriate for comparing the relative sizes of the categories. A pie chart cannot be used, because its wedges must represent non-overlapping parts of a whole that together make up 100%. For the same reason, a pie chart could not be used if the percentages added to less than 100%.

    Table \(\PageIndex{2}\): De Anza College Spring 2010
    Characteristic/Category                                                Percent
    Full-Time Students                                                       40.9%
    Students who intend to transfer to a 4-year educational institution      48.6%
    Students under age 25                                                    61.0%
    TOTAL                                                                   150.5%
    Figure \(\PageIndex{2}\): Bar chart of data in Table \(\PageIndex{2}\).
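
    A quick check of whether a set of percentages could be drawn as a pie chart is to test whether they total 100%. The sketch below is one possible way to do this in Python; the function name `sums_to_whole` and the 0.5-point tolerance (to allow for rounding in published percentages) are assumptions, and passing the check is necessary but not sufficient, since the categories must also not overlap.

```python
def sums_to_whole(percents, tolerance=0.5):
    """True if the percentages total 100%, within a small allowance for rounding."""
    return abs(sum(percents) - 100) <= tolerance

# Overlapping categories (full-time, intends to transfer, under age 25).
overlapping = [40.9, 48.6, 61.0]
# Non-overlapping categories (full-time vs. part-time at De Anza College).
non_overlapping = [40.9, 59.1]

print(round(sum(overlapping), 1), sums_to_whole(overlapping))
print(round(sum(non_overlapping), 1), sums_to_whole(non_overlapping))
```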

    Omitting Categories/Missing Data

    The table below displays the ethnicity of students but omits the "Other/Unknown" category. This category contains people who did not feel they fit into any of the ethnicity categories or declined to respond. Notice that the frequencies do not add up to the total number of students. In this situation, create a bar graph rather than a pie chart, because the listed categories do not account for the whole.

    Table \(\PageIndex{3}\): Ethnicity of Students at De Anza College Fall Term 2007 (Census Day)
                        Frequency               Percent
    Asian               8,794                   36.1%
    Black               1,412                   5.8%
    Filipino            1,298                   5.3%
    Hispanic            4,180                   17.1%
    Native American     146                     0.6%
    Pacific Islander    236                     1.0%
    White               5,978                   24.5%
    TOTAL               22,044 out of 24,382    90.4% out of 100%
    Figure \(\PageIndex{3}\): Enrollment of De Anza College (Spring 2010)

    The following graph is the same as the previous graph, but the “Other/Unknown” percent (9.6%) has been included. The “Other/Unknown” category is large compared to some of the other categories (Native American, 0.6%; Pacific Islander, 1.0%). This is important to know when we think about what the data are telling us.
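
    The size of the omitted category follows from the ethnicity table: it is the total enrollment minus the sum of the listed frequencies. A minimal Python sketch of that subtraction, with variable names chosen here only for illustration:

```python
# Frequencies of the listed categories, copied from the ethnicity table.
listed = {
    "Asian": 8794,
    "Black": 1412,
    "Filipino": 1298,
    "Hispanic": 4180,
    "Native American": 146,
    "Pacific Islander": 236,
    "White": 5978,
}
total_students = 24382

# Students not accounted for by any listed category.
other_unknown = total_students - sum(listed.values())
other_unknown_pct = round(100 * other_unknown / total_students, 1)

print(other_unknown, other_unknown_pct)
```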

    This particular bar graph in Figure \(\PageIndex{4}\) can be difficult to understand visually. The graph in Figure \(\PageIndex{5}\) is a Pareto chart. The Pareto chart has the bars sorted from largest to smallest and is easier to read and interpret.

    Figure \(\PageIndex{4}\): Bar Graph with Other/Unknown Category
    Figure \(\PageIndex{5}\): Pareto Chart With Bars Sorted by Size
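
    The ordering used by a Pareto chart amounts to sorting the categories by size, largest first. The following sketch shows that sort in Python using the percentages from the ethnicity table plus the 9.6% “Other/Unknown” category; it produces only the bar order, not a drawn chart, and the variable names are illustrative.

```python
# Percentages from the ethnicity table, with Other/Unknown included.
percents = {
    "Asian": 36.1,
    "Black": 5.8,
    "Filipino": 5.3,
    "Hispanic": 17.1,
    "Native American": 0.6,
    "Pacific Islander": 1.0,
    "White": 24.5,
    "Other/Unknown": 9.6,
}

# Sort categories from largest to smallest percent, as in a Pareto chart.
pareto_order = sorted(percents.items(), key=lambda item: item[1], reverse=True)

for category, pct in pareto_order:
    print(f"{category:<17} {pct:5.1f}%")
```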

    Pie Charts: No Missing Data

    The following pie charts have the “Other/Unknown” category included (since the percentages must add to 100%). One of the two charts in Figure \(\PageIndex{6}\) is organized by the size of each wedge, which makes it more visually informative than the other chart, whose wedges are unsorted and in alphabetical order.

    Figure \(\PageIndex{6}\).

    Contributors and Attributions

     
    • Barbara Illowsky and Susan Dean (De Anza College) with many other contributing authors. Content produced by OpenStax College is licensed under a Creative Commons Attribution 4.0 License. Download for free at http://cnx.org/contents/30189442-699...b91b9de@18.114.

     


    This page titled 2.1: Organizing and Graphing Qualitative Data is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by OpenStax via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.