Statistics LibreTexts

11.2: Bar Graphs



    In the previous chapter we found that qualitative data can be summarized using frequency distributions that report frequencies (the number of observations in each category), relative frequencies (the proportion of observations in each category), and percentages (the percentage of observations in each category). A bar graph takes the information in a frequency distribution and presents it visually.
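As a reminder of what a bar graph starts from, the following is a minimal sketch of a frequency distribution computed in Python. The function name `frequency_distribution` and the sample observations are hypothetical and are not taken from the text.

```python
from collections import Counter


def frequency_distribution(observations):
    """Return {category: (frequency, relative frequency, percentage)}."""
    counts = Counter(observations)
    n = len(observations)
    return {
        category: (count, count / n, 100 * count / n)
        for category, count in counts.items()
    }


# Hypothetical categorical observations, used only for illustration
sample = ["A", "B", "A", "C", "A", "B", "A", "C", "B", "A"]
table = frequency_distribution(sample)

for category in sorted(table):
    freq, rel, pct = table[category]
    print(f"{category}: frequency={freq}, relative={rel:.2f}, percent={pct:.1f}%")
```

Any one of the three columns could serve as the bar heights; the shape of the graph is the same in each case because the columns differ only by a constant factor.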

    Definition: Bar Graph

    A bar graph of a frequency distribution is a graph whose horizontal axis shows the category headings of a qualitative variable. The vertical axis shows the observed frequency, relative frequency, percentage, or numerical summary (mean, median, etc.) for each category, and these values determine the bar heights. The bars are separated by spaces.

    We begin by considering the situation where we have a single categorical variable. To motivate the discussion, consider the data on race and debt from Table 10.2. Earlier we computed the mean, median, and standard deviation of the amount of debt for each category of race. First consider constructing a bar graph of the means for each category of race. The horizontal axis of this graph will have equally spaced positions for the categories of race. The vertical axis corresponds to the observed mean debt for the categories. This axis is numerical; it should start at 0 and end near the largest mean debt amount. The mean debt values for the categories are 26.9 (African American), 7.6 (Asian), 14.5 (Hispanic), and 21.9 (white). Since the largest mean debt is near 30, we will start the vertical axis at 0 and end the range at 30, with marks at 0, 5, 10, 15, 20, 25, and 30. The bar graph in Figure \(\PageIndex{1}\) provides a quick visual summary of the mean values. One can quickly observe not only that African Americans have the largest mean debt and that Asians have considerably lower mean debt, but also the relative sizes of all the means.

    Figure \(\PageIndex{1}\): Bar graph of mean debt by race information given in Table 10.2 (Public domain figure created by Alan M. Polansky).
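The choice of axis range described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used to produce the figure: the helper names `axis_ticks` and `text_bar_graph` are hypothetical, and the text rendering draws the bars sideways, one row per category, rather than vertically.

```python
import math

# Mean debt by category, as reported in the text
mean_debt = {"African American": 26.9, "Asian": 7.6, "Hispanic": 14.5, "White": 21.9}


def axis_ticks(values, step):
    """Ticks from 0 up to the first multiple of `step` at or above max(values)."""
    upper = math.ceil(max(values) / step) * step
    return list(range(0, upper + step, step))


def text_bar_graph(data, scale=1.0):
    """One row of '#' marks per category; bar length is value * scale, rounded."""
    width = max(len(name) for name in data)
    return [
        f"{name:<{width}} | {'#' * round(value * scale)} {value}"
        for name, value in data.items()
    ]


ticks = axis_ticks(mean_debt.values(), step=5)
print(ticks)
for row in text_bar_graph(mean_debt):
    print(row)
```

With a step of 5, the ticks run from 0 to 30, matching the marks chosen in the text.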

    As we discussed earlier, the mean is not a robust measure of location, meaning it can be skewed by outliers. Therefore, since debt can be extremely high but no lower than 0, we consider the median debt as well. A bar graph of the median debt for each race category is plotted in Figure \(\PageIndex{2}\). In this plot the general pattern of debt by race appears the same, with Asians having the lowest debt and African Americans having the highest.

    Figure \(\PageIndex{2}\): Bar graph of median debt by race information given in Table 10.2 (Public domain figure created by Alan M. Polansky).

    When constructing the horizontal axis of a bar graph, the order of the categories must be specified. If the categories are measured on a scale that is at least ordinal, the categories should be arranged in the order specified by their structure. For example, if the categories are based on a Likert scale of 1 to 5, then the categories on the horizontal axis should respect that ordering. When the measurement scale of the categories is nominal, there is no natural ordering to the categories and one ordering is as valid as the next. When reading a bar graph, one should be careful not to overinterpret trends in the bars when the categories are based on a nominal measurement scale. As an example, consider the bar graph shown in Figure \(\PageIndex{3}\), which is based on the same data used to construct the bar graph in Figure \(\PageIndex{2}\). The bar graph in Figure \(\PageIndex{3}\) has the categories on the horizontal axis arranged in a different order. Note that there appears to be an increasing trend in the bars. While the bars reflect the correct levels of the medians for the different categories, the trend is illusory because the ordering of the categories for these data has no meaning.

    Figure \(\PageIndex{3}\): Bar graph of median debt by race information given in Table 10.2 (Public domain figure created by Alan M. Polansky).
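A short sketch shows why such a trend carries no information for nominal categories: any set of bar heights looks increasing once the categories are sorted by value. The example uses the mean debts quoted earlier, since the median values are not listed in the text; the helper name `is_increasing` is hypothetical.

```python
# Mean debt by category, as reported in the text (the medians are not listed there)
mean_debt = {"African American": 26.9, "Asian": 7.6, "Hispanic": 14.5, "White": 21.9}


def is_increasing(values):
    """True when each value is strictly larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Two equally valid orderings of a nominal variable
alphabetical = sorted(mean_debt)
by_value = sorted(mean_debt, key=mean_debt.get)

for label, order in [("alphabetical", alphabetical), ("sorted by value", by_value)]:
    heights = [mean_debt[category] for category in order]
    print(label, order, heights, "increasing:", is_increasing(heights))
```

The same four bars show no pattern in alphabetical order and a steady rise when sorted by value, so the apparent trend is a property of the chosen ordering and not of the data.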

    Unfortunately, visual representations of data can be manipulated in several ways to make some trends appear stronger than the data indicate or to cover up other trends in the data. One common method is manipulating the order of the categories when they are measured on a nominal measurement scale. For example, a low value can be masked somewhat if it is surrounded by other small values, or it can be emphasized by purposely putting the low category next to the highest category. The vertical axis can also be manipulated by not starting the lower end of its range at zero. This can be used to overemphasize small differences.
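The effect of a numerical axis that does not start at zero can be quantified with a small sketch. The function `apparent_ratio` is a hypothetical helper, and the axis start of 20 is an arbitrary choice made for illustration; the two values are the mean debts for the white and African American categories quoted earlier.

```python
def apparent_ratio(smaller, larger, axis_start=0.0):
    """Ratio of drawn bar heights when the numerical axis begins at `axis_start`."""
    if axis_start >= smaller:
        raise ValueError("axis_start must lie below the smaller value")
    return (larger - axis_start) / (smaller - axis_start)


# Mean debt for two categories, as reported in the text
white, african_american = 21.9, 26.9

honest = apparent_ratio(white, african_american)                    # axis starts at 0
truncated = apparent_ratio(white, african_american, axis_start=20)  # axis starts at 20

print(round(honest, 2), round(truncated, 2))
```

With the axis starting at 0, the taller bar is about 1.23 times the height of the shorter one, which matches the ratio of the values. With the axis starting at 20, the taller bar is drawn more than three and a half times as tall, even though the data have not changed.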

    Bar graphs can also be used to show more than one measure for each category of a variable. Once again, we will consider the data on race and debt from Table 10.2. We computed two measures of location for debt in each category, the mean and the median. Both measures can be represented in a side-by-side bar graph, as shown in Figure \(\PageIndex{4}\). The horizontal and vertical axes are constructed in the same manner as before, with the vertical range chosen so that the larger of the two measures can be represented. The plot is similar in construction to the previous separate bar graphs of the mean and the median debt. The difference in Figure \(\PageIndex{4}\) is that the bars representing the mean and the median debt for each category are plotted next to each other within the category. Note that because both measurements belong to the same category, there is no space between the mean and median bars. From the plot we can observe that, except for individuals classified as Asian, there is little difference between the mean and the median of the observed debts. An alternative method for conveying the same information is to group the measures of location on the horizontal axis, essentially combining Figures \(\PageIndex{1}\) and \(\PageIndex{2}\) into a single plot. See Figure \(\PageIndex{5}\). Note that it is also possible to include measures of variation in these plots along with the means and medians. This is strongly discouraged because measures of location and measures of variation provide different types of information.

    Figure \(\PageIndex{4}\): Side-by-side bar graph of mean (blue) and median (orange) debt by race. Information given in Table 10.2 (Public domain image created by Alan M. Polansky).
    Figure \(\PageIndex{5}\): Side-by-side bar graph of debt grouped by measure of location, with bars for African American (blue), Asian (red), Hispanic (green), and white (purple). Information given in Table 10.2 (Public domain image created by Alan M. Polansky).
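The two layouts differ only in which variable defines the groups. The following minimal sketch computes the horizontal positions of the bars so that bars within a group touch and groups are separated by a gap. The function `grouped_bar_positions` and its default widths are assumptions made for illustration and do not describe how the figures were actually drawn.

```python
def grouped_bar_positions(n_groups, n_bars, bar_width=1.0, group_gap=1.0):
    """Left edges of bars: bars in a group touch, groups are separated by group_gap."""
    group_span = n_bars * bar_width + group_gap
    return [
        [g * group_span + b * bar_width for b in range(n_bars)]
        for g in range(n_groups)
    ]


races = ["African American", "Asian", "Hispanic", "White"]
measures = ["mean", "median"]

# Grouped by race: four groups of two bars each (mean, median)
by_race = grouped_bar_positions(len(races), len(measures))

# Grouped by measure: two groups of four bars each (one bar per race)
by_measure = grouped_bar_positions(len(measures), len(races))

print(by_race)
print(by_measure)
```

These positions could be passed, together with the bar heights and `bar_width`, to a plotting routine such as matplotlib's `bar` to draw either layout.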

    This page titled 11.2: Bar Graphs is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by .
