3.6: Making Decisions About the Variables in Your Research Study
There are two reasons why you need a schematic for making decisions about your variables. The first reason is to decide how to structure your variables for your research design. The second reason is deciding what statistical test to use.
The decision you need to make is how you will sort and divide the variation you see in your “it,” that is, in the variables you are studying.
3.6.1: Identifying the Variables
- First decision: How many variables are there?
- Second decision: Are the variables independent or dependent variables?
- Third decision: How are the variables scaled? Are they categorical or continuous? If continuous, is the variable ordinal, interval, or ratio?
For your first decision, decide what the “it” is that you are studying. What is the “it” that you want to understand? You want to “operationally define” your variables. You want to determine what “it” is and what “it” is not. If you want to study depression, you must ask yourself what type of depression you are curious about. Is it general depression, chronic depression, treatment-resistant depression, seasonal depression, or depression associated with an event such as grief?
The process of “operationally defining” “it” often involves a literature review to build a conceptual understanding of what “it” is. Once you decide what “it” is, you may find that “it” has several dimensions, factors, or types. In psychology, we assess intelligence, and intelligence is subdivided into types such as verbal, math, and analytic. We assess for PTSD, but we view PTSD as encompassing an array of symptoms: physical symptoms, such as heart racing, sweating, and lack of sleep; social symptoms, such as not wanting to socialize; and neurological symptoms, such as flashbacks or nightmares. We choose which dimensions, factors, or types of “it” to study according to what information we need to predict our outcome.
Then, think about how “it” varies. Recall that not every variable is inherently categorical or continuous. Variables can be scaled in different ways. How you divide the variation you observe in your variables depends on your research question and your outcome.
A good place to start is deciding how many different variables are associated with “it.” Start with the different types of “it” that are associated with the overall “it.” For depression, you could start with a self-report of mood, then loss of interest in activities, loss of interest in social relationships, isolation, and suicidal thoughts. Each of these “types” constitutes its own variable associated with the overall “it.” Then, figure out how each of these variables varies and how you want it to vary: categorically, which is by type, or continuously, which is by level.
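One way to keep these decisions explicit is to write them down as a small variable plan before collecting data. The sketch below is only an illustration: the `Variable` class, its field names, and the scale assigned to each depression variable are assumptions made for this example, not recommendations from this section.

```python
from collections import Counter
from dataclasses import dataclass

# Scale labels used in this section.
ALLOWED_SCALES = ("categorical", "ordinal", "interval", "ratio")


@dataclass(frozen=True)
class Variable:
    """One operationally defined variable (hypothetical structure)."""
    name: str
    scale: str        # one of ALLOWED_SCALES
    measured_as: str  # plain-language description of the measurement

    def __post_init__(self) -> None:
        # Reject scale labels that the plan does not recognize.
        if self.scale not in ALLOWED_SCALES:
            raise ValueError(f"unknown scale {self.scale!r}")


# Illustrative plan for the depression example; every scaling choice is an assumption.
depression_plan = [
    Variable("mood", "ordinal", "self-report rating from 1 to 10"),
    Variable("loss_of_interest_in_activities", "ordinal", "rating from 1 to 5"),
    Variable("loss_of_interest_in_relationships", "ordinal", "rating from 1 to 5"),
    Variable("isolation", "ratio", "days in the past week spent entirely alone"),
    Variable("suicidal_thoughts", "categorical", "present or absent"),
]


def summarize(plan):
    """Return the number of variables and a count of variables per scale."""
    return len(plan), Counter(v.scale for v in plan)


n_variables, scale_counts = summarize(depression_plan)
print(n_variables, dict(scale_counts))
```

Writing the plan this way forces an answer to the first and third decisions for every variable: how many variables there are, and how each one is scaled.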
When designing your research study, why make a big deal of this? Because researchers can take a sloppy approach to constructing their variables. Sloppy definitions of “it,” sloppy descriptions of how “it” varies, and sloppy divisions of that variation can lead to sloppy variances. Sloppy variances lead to a statistical analysis that cannot detect patterns among the independent and dependent variables. Statistical analysis cannot “save” a study if the variables are poorly defined and their variations are poorly demarcated. A chef uses only quality ingredients; even a chef cannot make a five-star dish from subpar ingredients. Likewise, researchers and statistical analyses need quality variables.
The “it” is often messy in terms of how its variation is described by type and level. There is no straightforward way to define bullying by type and by level, including greater or lesser amounts and greater or lesser intensities of violent behavior. One reason these variables can be sloppy is that there are several dimensions to any “it.” Bullying can be divided into social, verbal, physical, and internet bullying. Bullying can differ among classmates, between males and females, between males and males, and between females and females. Verbal bullying could be teasing, shaming, name-calling, or threats.
The variation can also be scaled in several ways. You could count the number of bullying incidents. You could describe the intensity of a specific bullying incident, such as a physical shove or cornering a student in the hallway. You could catalog the bullying on an ordinal scale, such as mild, threatening, or hostile. You could rate how unsafe a student feels on a scale of 1 to 5 or 1 to 10. So, the “it” can consist of chaotic variation. However, careful planning and review can make the variables and their scaling more precise so that the data collection for each variable does not end up as messy data.
3.6.2: Selecting a Statistical Test
The decision to select a statistical test has four steps:
- How many variables are there?
- Are the variables independent or dependent variables?
- How are the variables scaled? Are they categorical or continuous? If continuous, is the variable ordinal, interval, or ratio?
- Does the variation of the variable range by amount or intensity?
First step: How many variables are there?
To most, this question might seem like a “duh” question. However, the number of variables is not always clear when deciphering research studies and analyses.
Consider this example. A psychologist wants to examine hospitalized clinical patients and their depression and suicide severity levels. These could be two variables: depression and suicide severity. Or there could be one primary variable, depression, where a high depression level calls for a suicide severity risk assessment. In most cases, research scenarios are not trying to “trick you” about the number of variables, but it is something to keep in mind.
Remember that not all variables are independent or dependent. Some variables are what we call covariates. Covariates are like independent variables, but they are not the focus of your study. Most covariates are demographics, such as age, gender, and race. These variables often predict the outcome variable, but they are secondary because another independent variable is the focus of your study, such as the number of therapy sessions or the number of mindfulness sessions completed.
You need to decide the number of independent and dependent variables. The only two answers to that decision are one or many.
The remaining steps were discussed in previous sections:
The second step is deciding whether the variables are independent or dependent, the third step is deciding how the variables are scaled, and the fourth step is deciding whether the variation is by amount or intensity.
Putting all this information together.
We will discuss the statistical tests in detail later, but suffice it to say, if there is one independent variable and it is categorical, you are likely looking at a t-test if the independent variable has only two categories or an ANOVA if it has three or more categories (you can actually use an ANOVA when the independent variable has two categories, but more on that twist later). If the one independent variable is continuous, with ordinal, interval, or ratio scaling, you are looking at a Pearson or a Spearman correlation. Each of these tests has only one dependent variable, which is continuous.
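The parenthetical note about using an ANOVA with two categories can be checked numerically. The sketch below computes both statistics by hand, using only the Python standard library, for two small groups of made-up scores. The function names and the data are assumptions for illustration, and the t statistic shown is the pooled-variance (equal variances assumed) version.

```python
from statistics import mean


def pooled_t(group_a, group_b):
    """Independent-samples t statistic with pooled variance."""
    na, nb = len(group_a), len(group_b)
    ma, mb = mean(group_a), mean(group_b)
    ss_a = sum((x - ma) ** 2 for x in group_a)
    ss_b = sum((x - mb) ** 2 for x in group_b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def one_way_f(*groups):
    """One-way ANOVA F statistic for two or more groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Made-up depression scores for two groups; illustration only.
therapy = [12, 15, 11, 14, 13]
control = [18, 17, 20, 16, 19]

t = pooled_t(therapy, control)
f = one_way_f(therapy, control)
print(round(t ** 2, 6), round(f, 6))
```

With two groups, the squared t statistic equals the F statistic, so both tests lead to the same conclusion. For these data, t is -5 and F is 25.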
If you have more than one independent variable and all are categorical, you are likely using an ANOVA. If you have more than one independent variable and a mix of categorical and continuous variables, or if they are all continuous variables, then you are likely using a regression. In both cases, there is only one dependent variable, and the variable is continuous.
If you have more than one dependent variable, and the variables are all continuous, then you are looking at a MANOVA, or Multivariate Analysis of Variance. There can be one or more independent variables, but they must be categorical.
If you have one independent variable and one dependent variable and both are categorical, or if you have more than one independent variable and more than one dependent variable and all are categorical, then you are looking at a Chi-square analysis.
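The rules of thumb in this section can be sketched as a small lookup function. This is a simplified illustration, not a complete test-selection procedure: the function name `suggest_test`, its parameters, and the two scale labels are assumptions for the example, and any combination the section does not mention falls through to a generic message.

```python
def suggest_test(iv_scales, dv_scales, iv_categories=2):
    """Suggest a test from the scaling of the variables.

    iv_scales, dv_scales: lists of "categorical" or "continuous" labels,
        one label per independent or dependent variable.
    iv_categories: number of categories of the independent variable;
        only used when there is a single categorical independent variable.
    """
    for label in list(iv_scales) + list(dv_scales):
        if label not in ("categorical", "continuous"):
            raise ValueError(f"unknown scale {label!r}")
    if not iv_scales or not dv_scales:
        raise ValueError("need at least one independent and one dependent variable")

    ivs_categorical = all(s == "categorical" for s in iv_scales)
    dvs_categorical = all(s == "categorical" for s in dv_scales)
    dvs_continuous = all(s == "continuous" for s in dv_scales)
    one_iv, one_dv = len(iv_scales) == 1, len(dv_scales) == 1

    # Everything categorical: one-and-one or many-and-many.
    if ivs_categorical and dvs_categorical and one_iv == one_dv:
        return "chi-square"
    # Several continuous dependent variables, categorical independent variables.
    if not one_dv and dvs_continuous and ivs_categorical:
        return "MANOVA"
    # A single continuous dependent variable.
    if one_dv and dvs_continuous:
        if one_iv:
            if ivs_categorical:
                return "t-test" if iv_categories == 2 else "ANOVA"
            return "Pearson or Spearman correlation"
        return "ANOVA" if ivs_categorical else "regression"
    return "not covered by this section"


print(suggest_test(["categorical"], ["continuous"], iv_categories=3))
```

For example, one categorical independent variable with three categories and one continuous dependent variable yields "ANOVA", while a mix of categorical and continuous independent variables with one continuous dependent variable yields "regression".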


