11.3: Commentary
The commentary sections for each analysis are intended to highlight nuances or myths about each statistical test. For t-tests, most of the commentary is about how the t-test seems like a simple statistical test, when in fact it is an essential test for simplifying how we organize our process of comparing groups.
11.3.1: “Looks Too Simple”
You would not be wrong if you thought, “This t-test looks simple, and kind of useless. All you can do is take two groups and compare their means. What’s the big deal about this test? Where’s the complicated statistical analysis?”
Statistics can sound and look complicated and advanced. However, we always strive to KISS, or “keep it simple, silly.” The simplest way to compare anything is in pairs, or two at a time. In fact, that is the only way we can compare anything. We always compare in pairs. We compare treatments to controls, we compare males to females, we compare good kids to bad kids, we compare bullies to survivors.
The problem arises when we have multiple groups. We compare treatment to control to usual care, we compare males to females to non-binary identified, we compare good kids to bad kids to somewhat bad kids, we compare bullies to survivors to bystanders. Comparing all of these groups at once is like trying to determine who wins “rock, paper, scissors” when there are three people playing. If the three players throw down rock, paper, and scissors, you have no idea who won that round. The problem in that scenario is that you do not know where to start the comparison. Do you start with the rock breaking the scissors? What about the paper covering the rock first, which negates the rock breaking the scissors? And if you play Rock, Paper, Scissors, Lizard, Spock, you are in a world of hurt (https://youtu.be/x5Q6-wMx-K8?si=-MDNe9-kc2IvFY9N).
If you compare treatment to control to usual care all at once, you have a conundrum. Where do you start? Where is the comparison? The only way to make this comparison is to keep it simple, and that can only happen by comparing pairs: treatment vs. control, control vs. usual care, usual care vs. treatment. Comparing pairs, and organizing the pairs so that all combinations are accounted for, is the process of making sense out of several groups at once. Comparing the pairs helps you rank the groups. Treatment did better than control, control did better than usual care, and treatment did better than usual care. Arranging the comparisons allows you to see that treatment does work, that treatment is better than the usual care clients receive, and that the control group, which does nothing, will not improve.
In the next analysis on ANOVA, you will find that for any analysis where multiple groups are used, the follow-up tests involve a version of the t-test. When you compare multiple groups, or multiple independent variables, or multiple dependent variables, the method of making sense of the multiple comparisons is to keep it simple by breaking the comparisons into pairs. Making statistics complicated and advanced is not a good goal. Always simplify and keep your analyses in order and organized. Comparing pairs is really the only way to make sense of the complication.
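To make the “compare in pairs” idea concrete, here is a minimal sketch (not from the text) that runs every pairwise comparison among three hypothetical groups using SciPy’s independent-samples t-test; the group names and scores are invented for illustration.

```python
# Pairwise t-tests among three made-up groups, two at a time.
from itertools import combinations
from scipy import stats

groups = {
    "treatment":  [14, 16, 15, 18, 17, 16, 19],
    "control":    [10, 11, 9, 12, 10, 11, 10],
    "usual_care": [12, 13, 12, 14, 13, 12, 14],
}

# Compare every possible pair of groups.
for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
    t, p = stats.ttest_ind(a, b)
    print(f"{name_a} vs {name_b}: t = {t:.2f}, p = {p:.4f}")
```

Note that formal post-hoc procedures (discussed with ANOVA) typically adjust the p-values for the number of pairs being compared; this sketch only shows the pairing logic.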
And that’s why the t-test is an important tool in any statistical analysis. So don’t knock it.
11.3.2: Dichotomous Groups? That’s Too Simple
You would not be wrong to say that there are many groups for a given variable. Race is more than two groups. Living in a binary world is not the greatest because it ignores continuous spectrums. Gender is not a binary term; there are more than just males and females. Sexuality is not a binary term; there is more than just heterosexual and gay. You are correct. The t-test functions as a test for two groups. For better and often for worse, we reduce and collapse complex and continuous experiences into a two-group, dichotomous format. You would not be wrong if you asked, “Is that okay?”
As you are starting to see, an ongoing theme throughout the instruction of statistics is that you must always think conceptually. In this case, the question is whether the dichotomous grouping is sufficient for our research question and analysis. For example, suppose you are analyzing whether a new treatment for anxiety is better than the usual treatment for anxiety, but you are concerned that gender should be considered part of the analysis. For better or for worse, there are differences in how males and females experience anxiety. There are differences in how cisgender males, self-identified males, transgender males, and non-binary males experience anxiety. Does collapsing all variations of being male into Male vs. Female alter the analysis of a treatment for anxiety? Unless you have a theoretically compelling reason and enough evidence from previous studies, you do not have a strong basis for including every version of the male identity spectrum in the analysis of the treatment effect. For the purpose of this example, collapsing the different male identity spectrums into a single “male” category would be sufficient.
Splitting the male identity into several categories means adding additional variables to your analysis. While that approach is comprehensive, you use up your sample size with every variable you add. When you use up your sample size, you use up your power, that is, the ability of the test to detect an effect on your outcome variable. This issue will be addressed when we discuss degrees of freedom as part of the chapter on correlations. For now, throwing all the groups and variables into an analysis, while apparently comprehensive, considerably reduces your power, or the likelihood that the statistical test will find an effect.
The bottom line is that you should not throw every possible group into an analysis unless you have a conceptually compelling reason to do so. By conceptually compelling, I mean there is a theoretical reason why all groups need to be included: they are important predictors of an outcome. If there is no reason, then keep it simple and use dichotomous groups.
11.3.3: Forming Dichotomous Groups
Forming dichotomous groups means collapsing multiple groups into two groups. Race has multiple groups, and those groups can be combined into two. For example, you could take Black, Hispanic, Asian, and White and collapse them into two groups, such as White vs. Minority. Here again, remember to keep it conceptual. If, for some reason, it makes sense to collapse into White vs. Minority, you are doing so because the comparison is meaningful. If you are interested in minority access to mental health services compared to White populations, then the collapse is meaningful because there is something in common about minorities and their access to mental health services.
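As a hypothetical illustration of this kind of recoding (not taken from the text), the short pandas sketch below collapses a multi-category race variable into a two-group White vs. Minority variable; the column names and data are invented.

```python
# Collapse a multi-category race variable into two groups (White vs. Minority).
import pandas as pd

df = pd.DataFrame({"race": ["White", "Black", "Hispanic", "Asian", "White"]})

# Keep "White" as-is; recode every other category into a single "Minority" group.
df["race_dichotomous"] = df["race"].where(df["race"] == "White", other="Minority")
print(df)
```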
You can also form groups by taking a continuous variable and applying a cutoff score. For example, you can have two groups, clinical and subclinical, based on MMPI scores. You would be correct in saying, “Wait a second, aren’t these groups supposed to be dichotomous and categorical, meaning no level of low to high?” SPSS does not make any decisions about whether the grouping variable is truly categorical; you can enter a dichotomous grouping variable whose two groups are, in effect, low and high. This situation often occurs. We divide people into clinical or non-clinical depression. We divide people into “above the cutoff” and “below the cutoff” for admissions scores. We sort people into “good prognosis” or “bad prognosis” after discharge. We can reconfigure a continuous variable into low and high groups for almost any purpose, and the t-test will not be rendered invalid.
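Here is a hedged sketch of applying a cutoff to a continuous score to form two groups; the cutoff value, column name, and scores are assumptions made purely for illustration, not an actual scoring rule.

```python
# Dichotomize a continuous score into "clinical" vs. "subclinical" groups.
import pandas as pd

df = pd.DataFrame({"score": [48, 72, 65, 59, 81, 63]})

CUTOFF = 65  # hypothetical cutoff chosen only for this example
df["group"] = (df["score"] >= CUTOFF).map({True: "clinical", False: "subclinical"})
print(df)
```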
Keep in mind that collapsing a continuous variable into two groups means losing precision. Collapsing a continuous variable into two groups essentially means you are forming an ordinal, rank variable. The ordinal, rank variable is basically “high” vs. “low,” and, if you recall the characteristics of a rank variable, it does not matter how high or low you are. You could be barely in the high group or the top person in the high group; either way, you are in the high group.
When would this matter? Reconfigure the continuous variable into an ordinal/rank variable only if having more of something does not really matter for your research question. For example, in determining alcoholism, it does not matter much whether you drink, say, 10 beers, 12 beers, or 25 beers in one night; once you reach a certain threshold, the effects of alcohol are enough to determine alcoholism. However, if you are interested in alcohol poisoning, then yes, you do need a precise record of the number of drinks consumed. Although at that point, a blood alcohol level is a more precise record than relying on a person’s recall of exactly how many drinks they consumed before reaching alcohol poisoning. The point is that configuring your variable as continuous or as ordinal/rank depends on what information you need to answer your research question.
11.3.4: The Outcome Variable is Continuous
Technically, the outcome variable for a t-test is a continuous variable. This means the variable could be ordinal, interval, or ratio. However, interpreting the outcome variable is easier when the variable is interval or ratio. Interval or ratio variables have equal intervals between the numbers: two is greater than one, and with equal intervals, a fractional value such as 2.5 is meaningful and greater than 1.5.
Using an ordinal variable as the outcome variable in a t-test makes little sense because we do not normally compare partial ranks. Getting a mean score for an ordinal variable makes little sense. If one person is ranked first and another second, the rankings are one and two. There is no such thing as a 1.5 rank. Ranks are whole numbers; they are not decimals. T-tests compare the means of two groups, and saying that a group has a mean rank of 2.5 does not make much sense because there is no such thing as a 2.5 ranking. A person is ranked either two or three, not 2.5. That is like saying the average family has 2.3 children. You either have two or three children. You do not have two children and a fraction of a child (although I’m sure some families feel that way, unfortunately).
11.3.5: Why the Term “Independent” T-test?
Often, we say “t-test” and do not bother with saying “independent t-test.” In most cases, they mean the same thing. The word “independent” refers to having groups that are independent of each other, or where the members of each group are mutually exclusive, which means that a member of one group cannot be considered a member of the other group.
Another way of putting this is that the participants cannot be part of both groups. To state the obvious, a participant cannot be in both the male and female groups. Yes, assuming a binary gender world, a participant must be in one or the other group.
This situation may be obvious, but there are situations where it is not clear whether someone belongs in one group and not the other. We like groups because they are clear, but most issues, diagnoses, and experiences are continuous, blended, and multi-dimensional. Assigning clients to alcohol abuse only, single-drug abuse only, or poly-drug use only may seem clear and definitive, but in the messy context of alcohol and drug use, it is likely that clients do not fit neatly into just one of those groups, given the vicissitudes of their reported alcohol and drug use. Assigning clients to emotional abuse, physical abuse, or sexual abuse may seem clear in definition, but given the messy context of abuse, it is likely that clients experienced multiple types of abuse at different times during their lifetime. These are unfortunate situations, but depending on your conceptualization of your research question, it behooves you to consider whether your groups are independent, or mutually exclusive. Consider your literature review, your theory, and consultation to be certain that you have independent groups.
Does the aforementioned discussion matter? Issues are messy and continuous, so no matter how we sort people into groups, there will be questions about the clarity of the assignment to groups. The t-test will run, and you will get a result. If the outcome shows no difference between the two groups, the messy assignment of people to the groups might account for the non-difference. If the outcome is what you wanted, and there is a difference between the two groups, perhaps the messy assignment did not matter, and there is a fundamental difference between the two groups.
Does it matter to say an “independent t-test” to test the means between two groups? Probably not. There are other t-tests, though, and the primary ones you want to be aware of are the paired t-test and the one-sample t-test, which will be discussed at the end of this chapter. It is necessary to say you are conducting a paired t-test or a one-sample t-test because those t-tests do serve specific purposes. However, for comparing two groups, the t-test is just fine; there is no need to add the extra “independent” word.
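For readers working in statistical software, here is a brief sketch of the three t-tests named in this section using SciPy; all of the data values are invented for illustration.

```python
# The three t-tests discussed in this chapter, with made-up data.
from scipy import stats

group_a = [21, 24, 23, 26, 22]   # e.g., one group of participants
group_b = [19, 20, 22, 18, 21]   # e.g., a second, mutually exclusive group
pre     = [30, 28, 35, 32, 31]   # the same clients measured before treatment
post    = [25, 24, 30, 27, 26]   # the same clients measured after treatment

# Independent t-test: two mutually exclusive groups.
print(stats.ttest_ind(group_a, group_b))

# Paired t-test: the same participants measured twice.
print(stats.ttest_rel(pre, post))

# One-sample t-test: one group compared against a fixed value (here, 20).
print(stats.ttest_1samp(group_a, popmean=20))
```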
11.3.6: Uses of T-tests
On the surface, we use t-tests to compare the means of two groups. In which situations do we use this t-test?
We could use a t-test to compare a treatment and a control group, but that would be ill-advised. Making a definitive statement about treatment effects requires more than comparing two groups on one outcome variable. Suffice it to say, you would need at least a pre-test and a post-test, covariates such as demographics, and a randomized controlled trial to detect a treatment effect. Using only a t-test to test the effects of a treatment is ill-advised.
T-tests are used at the outset of studies when you want to determine whether the demographics of samples are equivalent. In these situations, you do not want your t-test to be significant, because you want to show that the samples are similar; you are not looking for differences. You could use a t-test to determine whether males and females have a similar mean age, similar income, and a similar number of years of education. Doing so would allow you to determine that the males and females are similar in all demographic respects except for being male or female. Establishing that the males and females are equivalent would help your overall goal. For example, if you wanted to determine whether males and females differ in the average number of treatment sessions they complete, you could first determine whether they are equivalent demographically, such as in age, income, or years of education, so you could rule out the possibility that age, income, or years of education affects the number of sessions they complete. After establishing that the demographics are equivalent, you could rule out those demographics as possible confounds or alternative explanations and then proceed with comparing males and females on the number of sessions of treatment they complete.
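A rough sketch of this demographic-equivalence check appears below, again using SciPy with invented data; here, non-significant p-values are the hoped-for result.

```python
# Check that two groups look demographically equivalent before the main analysis.
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    "sex":       ["M", "M", "M", "M", "F", "F", "F", "F"],
    "age":       [34, 29, 41, 37, 33, 30, 39, 36],
    "income":    [52, 47, 61, 55, 50, 49, 58, 54],   # in thousands
    "education": [14, 16, 12, 15, 14, 16, 13, 15],   # years
})

males = df[df["sex"] == "M"]
females = df[df["sex"] == "F"]

for var in ["age", "income", "education"]:
    t, p = stats.ttest_ind(males[var], females[var])
    print(f"{var}: t = {t:.2f}, p = {p:.3f}")
```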
A t-test is used for scale validation. If you want to see if males and females have different scores on a scale, a t-test is sufficient. If you want to see if in-patient and out-patient clients have different scores on a scale, a t-test is sufficient.
The best way to determine if you need a t-test is to consider your research question and your research design. A series of t-tests can be sufficient for your needs. Although the test seems simple, simple is good when detecting effects.