Statistics LibreTexts

4.2.3: Evaluating the Plot

    What are you looking for and why?

    Evaluating the categorical distribution plot means determining if the sample characteristics are a good estimate of the population's characteristics.

    Remember that all statistics involve comparison. In this case, you are comparing your sample characteristics to the population characteristics. The question is this – Did we get the “right” sample?

    Evaluating categorical distributions is good for determining if you have possible confounds. Common confounds include having too few participants in certain categories, or an imbalance of participants across the categories. Briefly, confounds are issues in your data or your research design that would invalidate the results of your research study. A goal of doing a descriptive statistical analysis is to discover if there are confounds to address that might affect your results. This issue will be continuously revisited.

    Evaluating the distribution plot.

    To evaluate the distribution plot, keep in mind that the shape of the distribution plot for a categorical variable can be anything, because the codes we use for each group are in arbitrary order. Take gender as an example: on the X-axis, males can go first and females second, or females can go first and males second.

    Along the Y-axis are the frequency counts, or the number of participants in each group. Continuing our example, gender: males, n = 16, females, n = 16. The plot is listed below. How do you evaluate it?
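
    As a minimal sketch of how these frequency counts could be tallied, the Python snippet below counts the participants in each group and prints the counts in both possible group orders. The list of responses is made up to match the example (16 males, 16 females); it is not real data.

```python
from collections import Counter

# Hypothetical sample: one gender label per participant (16 males, 16 females).
responses = ["male"] * 16 + ["female"] * 16

# Frequency count per group: this is what the Y-axis of the plot shows.
counts = Counter(responses)

# The group order on the X-axis is arbitrary, so either ordering is valid.
for order in (["male", "female"], ["female", "male"]):
    print([(group, counts[group]) for group in order])
```

    Both orderings carry the same information, which is why the shape of the plot says nothing about its quality.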

    Recall that all evaluations involve comparisons. Most comparisons take the form of judging something as good or bad relative to a reference point.

    Comparing the sample to the population characteristics.

    In this case, one method of evaluating your distribution plot is to ask how the sample compares to the population. The distribution plot represents your sample, and we want to check whether the sample distribution matches the distribution found in the population. We need the population characteristics in mind to make the comparison.

    Remember: you are always comparing what you are looking at to something else. In this case, ask yourself the following questions. Does the sample (what you have in front of you) represent the population (what you expect to see)? In other words, do the sample statistics reflect the population parameters? Are these sample characteristics expected or unexpected, given the population and the research question?

    Suppose we are evaluating the gender distribution in a college classroom to determine associations between gender and grade. In this case, males, n = 16, females, n = 16, might be a good representation: if there are 32 students in the class, the numbers of males and females are equal, and it looks like we got all the students. The sample characteristics are expected, given the 50/50 proportion of males and females we would expect in the population.

    But what if the context is students who receive clinical counseling services from a college counseling center? Suppose we got the following results: males, n = 16, females, n = 32.

    Is this a good or bad result? Well, it seems good. Here’s why. In the population, more females than males use counseling center services. So, while it is tempting to worry that the numbers of males and females are not balanced, given the context of a counseling center population, this imbalance and these numbers are probably what’s expected. Evaluated in context, the distribution of this nominal variable seems fine, because higher percentages of females than males seek counseling. In this case, the sample characteristics are expected, given what we know about the population.

    What if the context was adolescents in a juvenile detention center, and we got the same results: males, n = 16, females, n = 32? Is this a good or bad result? Well, it seems bad. Here’s why. In the population, more males than females are in juvenile detention centers. So, it is odd that the sample has more females than males. This is especially odd if we picked the sample at random, because a random selection should yield more males than females. In this case, the sample characteristics are unexpected, given what we know of the population.
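
    The comparison in these scenarios can be sketched in a few lines of Python. The snippet converts the sample counts to proportions and reports the largest gap from the proportions expected in the population. The expected proportions and the 0.10 tolerance below are placeholders chosen only for illustration; they are not real population figures, and the check is an informal comparison, not a statistical test.

```python
def proportions(counts):
    """Convert a dict of group counts into a dict of group proportions."""
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def largest_gap(sample_counts, expected_props):
    """Largest absolute difference between sample and expected proportions."""
    sample_props = proportions(sample_counts)
    return max(abs(sample_props[g] - expected_props[g]) for g in expected_props)


# Sample from the text: males, n = 16, females, n = 32.
sample = {"male": 16, "female": 32}

# ASSUMPTION: placeholder population proportions, not real figures.
counseling_expected = {"male": 0.35, "female": 0.65}
detention_expected = {"male": 0.75, "female": 0.25}

# ASSUMPTION: hypothetical cutoff for calling a gap "unexpected".
TOLERANCE = 0.10

for context, expected in [("counseling center", counseling_expected),
                          ("juvenile detention", detention_expected)]:
    gap = largest_gap(sample, expected)
    verdict = "expected" if gap <= TOLERANCE else "unexpected"
    print(f"{context}: largest gap = {gap:.2f} -> {verdict}")
```

    The same sample is judged differently in each context because only the expected population proportions change.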

    Are the previous scenarios a problem? It depends on the research question and if the variable you are interested in is the main focus of the research question. In the college classroom example, suppose the research question is to compare the males and females and their grade performance in the class. In this case, the results are good because you want an equal number of males and females to establish a balanced comparison. However, suppose the research question was about the number of hours studying for the course and their grade performance. In that case, the gender variable is not germane to this question, so whatever result you obtain about the distribution of males and females is not relevant.

    In the college counseling center example, if the research question is about gender differences in seeking counseling services, then yes, it is possible the imbalance with more females than males might be an issue. However, the point of that research question is that it is expected that more females would seek therapy than males. Despite the imbalance, it does reflect the reality of the situation, so it might not matter that you have the imbalance.

    The third scenario is interesting. It is not expected that there will be more females than males. If the research question is to compare gender differences in therapy outcomes in juvenile detention centers, then you might wonder whether this detention center, with more females than males, has some unique characteristic that might affect the results. It’s hard to say what this characteristic might be, but it does raise some questions, and it is best to obtain a consultation to be certain that the data are suitable for answering the research question.

    So why is it good to examine frequency distributions of your nominal (categorical) variables? It is to determine if your sample characteristics represent the population characteristics.

    If the population's characteristics are not well known, what else should you be looking for? Without any preconceived notion of population characteristics, we typically look for imbalances across demographic groups. For a demographic variable like race, you might be looking for equal numbers of participants across all races. If your research question is about using race as a predictor of an outcome, such as mental health, then yes, you would need to determine if the races are balanced in numbers.

    What does it mean when the population characteristics are not well known? There are some populations for which we simply do not know all the usual demographics. For example, for an undocumented immigrant population, we do not know if our distribution of gender, age ranges, race, economic status, or geographic location for our sample is similar to the population. Recall that the population is unknown because it is impossible to know everything about a population. For a population such as undocumented immigrants, it is difficult to know the true distribution of the population characteristics.

    Sometimes, the imbalances can be considered confounds for your research question. For example, suppose you are looking at race as a predictor of mental health outcomes. If your race variable has White, Black, Hispanic, and Asian groups, with 50+ participants in each of the White, Black, and Hispanic groups and only 10 in the Asian group, then consider whether including the Asian participants is worthwhile. It is difficult to establish a trend in any variable with 10 participants. This issue will be discussed in the sample size section. For now, you could remove those 10 Asian participants, or conduct the analyses and note as a limitation that conclusions about Asian participants are inconclusive. But at least when you examined the distribution of participants across the groups in the race variable, you spotted an imbalance. That is a good way to review your data and then bring the concern to a consultant to determine if the imbalance will pose problems in your statistical analysis.
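
    Spotting a sparse group like this can be sketched as a simple check on the frequency counts. The counts below are made up to match the example (50+ in three groups, 10 in the fourth), and the minimum group size of 30 is a hypothetical threshold for illustration, not a rule from this text.

```python
# Hypothetical frequency counts matching the example in the text.
race_counts = {"White": 52, "Black": 50, "Hispanic": 55, "Asian": 10}

# ASSUMPTION: illustrative cutoff only; pick one suited to your analysis.
MIN_GROUP_SIZE = 30

# Collect every group whose count falls below the cutoff.
small_groups = [group for group, n in race_counts.items() if n < MIN_GROUP_SIZE]

for group in small_groups:
    print(f"{group}: n = {race_counts[group]} (below {MIN_GROUP_SIZE})")
```

    A flagged group is a prompt to think about the research question, not an automatic reason to drop or merge participants.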

    What sometimes happens is that we collapse groups when some of them have too few participants. For a demographic such as religion, we might list several religious denominations. But if there are too few participants within a denomination, we might combine denominations to retain the participants. If you have religious denominations such as Protestant, Lutheran, and Catholic, but there are few participants in the Protestant and Lutheran denominations, you might consider putting all three denominations under a broader Christian religious group, or you might combine the Protestant and Lutheran groups. It depends on your research question and how you conceptualize the differences between the groups. If the groups are similar enough, collapsing the groups might not make a major difference in your outcome.

    How do you collapse groups? Collapsing groups is synonymous with combining groups. To combine groups, you add the groups together. A statistics program would allow you to substitute a single value for the combined groups. Suppose you want to collapse Blacks and Hispanics into one overall group. If Black was coded as “2” and Hispanic was coded as “3,” you would recode Hispanic as “2” so that both groups share the code “2.” Be wary of collapsing groups haphazardly. Combining Blacks and Hispanics from a racial framework makes little sense because of the racial differences between the two groups. However, combining the two groups could work if you are examining general disparities in accessing mental health care for minorities. So, the collapse could work here. Always think conceptually when collapsing groups. Doing so will help you defend your rationale for collapsing them. If you do not have a good conceptualization, then leave the groups alone and do not collapse.
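
    The recoding step described above can be sketched in Python as a lookup that replaces one code with another. The codes “2” (Black) and “3” (Hispanic) come from the example; the other codes and the list of participants are hypothetical.

```python
from collections import Counter

# Hypothetical coded race variable, one code per participant.
# Codes 2 (Black) and 3 (Hispanic) follow the text; 1 and 4 are assumed.
race_codes = [1, 2, 3, 2, 4, 3, 1, 3]

# Collapse Hispanic (3) into the same code as Black (2).
RECODE = {3: 2}

# Codes not listed in RECODE are left unchanged.
collapsed = [RECODE.get(code, code) for code in race_codes]

print("before:", sorted(Counter(race_codes).items()))
print("after: ", sorted(Counter(collapsed).items()))
```

    Keeping the original variable and writing the recoded values to a new one makes it easy to undo the collapse if the conceptual rationale does not hold up.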

    Recap

    For categorical distributions, you do not evaluate the distribution based on its shape. You are not looking for a bell-shaped curve. Categorical/nominal variables are variations by type, and type has no value in terms of variation from low to high. The order of the groups for each categorical variable is arbitrary. This means the frequency counts for each category can be plotted in any order, and the resulting distribution can take any shape. You cannot rely on the shape of the distribution as an indicator of its quality. There are no shape-based evaluation criteria for the quality of the distribution of categorical variables.

    The only way to evaluate the quality of the categorical distribution is to determine if the variable distribution you obtained in your sample is the same as what you expect from the population under consideration. If the proportions of the frequency counts across the groups in your sample are like what you might expect from the population, then you have a good quality distribution. If they are not, then you might have a problem, and it is best to check with a consultant about options. The only two answers you can obtain from evaluating the categorical distribution are: (a) yes, the sample distribution of the participants across the groups for a given categorical variable is expected given the distribution of the participants across the groups for the population, or (b) no, it is not expected.

    Definitely do not rely on having a balance of participants per group. Sometimes, we expect equal numbers of males and females in a group; sometimes, we do not. Equal numbers across the groups are not a good way to evaluate the quality of the distribution for a categorical variable, simply because the distribution may not be equal in the population.

    One option is to drop a group if there are too few participants in that group. Or you can collapse that group into other groups if it conceptually makes sense to do so. Or you can treat the small group as a limitation of your study and acknowledge it when considering how the results support your hypotheses.

    Above all else, think conceptually. If the distribution across your groups is for a nominal categorical variable that is part of your research question, then you have to consider if the imbalance is a confound that will disrupt your ability to answer your research question. Remember, a confound is any statistical or research design factor that makes others doubt that you arrived at a valid or correct answer to your research question. Others may panic and state that imbalances in the distribution might result in an incorrect outcome. These imbalances likely do not affect the veracity of the outcome. If the distribution across your groups is for a nominal variable that is not essential to your research question, then leaving the distribution alone, without collapsing categories, is likely just fine.


    This page titled 4.2.3: Evaluating the Plot is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by Peter Ji.