3.4: Continuous Variables - Variables That Vary by Level
Variables vary by level. By level, we mean a) an amount or frequency, or b) intensity or severity. By level, we mean we are sorting our data and observations as more or less of something. This person has more of something than another person. This person has less of something than another person. This person has more intense something than another person. This person has less intense something than another person.
General terms for variation by level can be considered as amount, frequency, or intensity. It all comes down to more or less. Any adjective we use to describe more or less fits here. By amount, we say, taller versus shorter, faster or slower, older or younger, farther or closer. By intensity, we say, hyper or hypo, “drives me crazy,” or “chill,” funny or boring, delicious or yucky, or spicy hot or bland.
Statistically, “level” as a term refers to a random effect. In contrast to a fixed effect, where the variable can be only one value, random effects mean that the variable can be any value. Granted, random means many things in statistics and research. For now, we are using random effects to represent continuous data compared to a fixed effect, which is usually reserved for categorical data. The fixed effect does mean that we are confident that everyone or everything belongs in that category and no other category. Random effect implies that there is allowance or uncertainty that the entity does vary along a continuum.
There is a distinction between how a continuous variable is demarcated according to more or less of an amount versus more or less of an intensity. When the numbers represent an amount, it is self-explanatory. Age, number of years of education, number of therapy sessions attended, and number of days spent on vacation are all examples where the number represents the amount of something.
The issue gets interesting when the number represents the intensity or severity of something. Here, the number, the starting number, the end number, the range, and the unit get murky. The problem is that when a number represents an amount of something, we can usually identify what that something is. However, for intensity, the number represents our experience of something, and we infer what that something is and how we experience it.
What is the best way to use a number to represent depression? We know that depression is on a continuum, from no depression to severe depression. We could divide the continuum more by adding mild to moderate depression. The question then becomes, how many numbers are needed, and what is the range of these numbers to represent the continuum of depression?
What is the best way to use a number to represent your attraction to your partner? We could use a scale from one to ten, with one being no attraction and ten being an intense attraction. There are several interesting issues to raise here. Who decided that attraction is rated from one to ten? What is the difference between a one and a two? What are the indicators that someone is a five versus a six? Does everyone use the same criteria to decide if someone is a one, two, five, six, or ten?
What is the best way to use a number to represent pain intensity? The same issues as the attraction scale apply, but I use this example to highlight the role of context. Does the same one-to-ten scale apply to all pain? Do we use the pain scale for a person with a headache versus someone who broke their arm, someone experiencing a heart attack, or someone who is giving birth to their child? Pain is an individual experience. The same event, such as stubbing their toe, might be a level 10 to some people but a three to others, depending on their pain tolerance. The point is that intensity is often in the eye of the beholder, and the intensity range is arbitrary. A discussion of this issue is beyond this chapter; your conceptualization will aid in deciding how best to measure the intensity of the variable.
These issues are interesting. For now, a useful rule of thumb is that if something does not vary by amount, it varies by intensity. The challenge is deciding how best to represent that intensity on a range from low to high. Looking ahead to the scaling options, the ordinal and interval scales are the choices for representing severity.
The variation is on a continuum from low to high; now, the numbers mean something because the number represents more or less of “it.” For categorical variables, the numbers are codes and have no inherent meaning. For continuous variables, however, the numbers do have meaning because they represent a value from low to high: 2 is greater than 1. The numbers stand for more of or less of something.
And having more of or less of something does matter. Something has more amount/more frequency, less amount/less frequency, or more intensity/more severity, less intensity/less severity. Some examples are more or fewer hours studying (14 per week), more or less stress during the study session (stress: 5 on a scale from 1 to 10), more encounters with microaggressions (2 per month), and more intense responses to each microaggression (frustration: 7 on a scale from 1 to 10).
The continuum is one continuous line from low to high. We could say that someone is more of “it” than another person. That would suffice, and we use this language all the time. We say someone is taller than another person, more depressed than another person, or has a greater work ethic than another person. For most discussions, this comparison of more or less is enough. But we need something more for statistical analysis, figuring out associations, and quantifying the comparison.
That “something more” is demarcating the continuous line. Demarcating simply means subdividing the line into units. Like a ruler, we take a continuous line and then subdivide it into units. The questions for demarcating the line are how large the unit is and whether the units are equal in length. When we demarcate the line, we do so in three ways: ordinal, interval, and ratio, and these are the three types of continuous variables.
This order does matter: ordinal, interval, and ratio. Consider these continuous variable types as arranged from less to more precise in measurement. Precision means how well the numbers can stand on their own. Less precision in the numbers means you need more information to understand the numbers. More precision means that the number itself gives you all the information you need. The ordinal is the least precise of the three, followed by the interval, then the ratio.
3.4.1: Ordinal Ranks Variables
First, the ordinal. Think of ordinal as a rank variable. Generally speaking, ordinal and rank are the same thing. When you read the term ordinal, think rank, and vice versa. Sometimes, you read the term “ordinal ranks.” Technically speaking, ordinal is referring to “order.” Rank is the noun version of arranging things into ranks. Ordinal and rank are the same thing, and there hasn’t been a major reason to separate the two terms.
Ordinal or rank variables have mutually exclusive categories, but the order of the ranks does matter. For example, we use ordinal variables for the degree of clinical severity: 1 = non-clinical, 2 = sub-clinical, and 3 = clinical. Ordinal variables have no set zero point, so you can assign numbers any way you want. There is nothing wrong with redoing the ordinal codes as 3 = non-clinical, 2 = sub-clinical, and 1 = clinical. However, the order does matter, and sub-clinical does have to be the middle category because it is less than clinical severity but greater than non-clinical severity.
No zero point means the numbers do not stand on their own. That means you cannot tell what a 2 means. You must get more information and context to interpret the score’s meaning. You have no idea what a 1, 2, or 3 means unless someone tells you. You need to know the lowest number in the range and the highest number, and what they represent. If someone gives you the number “2” for an ordinal variable, you must know the lowest and highest possible scores. Is it a “2” out of 3 ordinal ranks, or a “2” out of 10 ordinal ranks? You also need to know if a “2” is good or bad according to what the lowest and highest ranks represent.
We use rankings all the time. In sports, we say a team is in 1st place, 2nd place, or 3rd place. When we rank campuses on LGBTQ-friendly climate, we say one campus has the #1 ranking, another the #2 ranking, and so forth. We do switch what the first rank means. We could describe the #1 sports blooper this week or rank the worst albums by a favorite artist. For instance, “Yellow Submarine” is the #1 worst Beatles album. In psychology, we do rankings: non-clinical, sub-clinical, and clinical. We rank behavioral problems from worst-behaved students to best-behaved students.
What should the numbers for the rankings be? This is an issue that we revisit with interval variables. One way to decide the numbers for the rankings is to figure out if we want the lowest and highest numbers to mean a good thing or a bad thing. The numbers do matter for continuous variables, and they matter in terms of good or bad valence.
Intuitively, we use numbers that align with the rankings. So, if someone is in 1st place, we give that rank a “1.” For a non-clinical, sub-clinical, and clinical ranking, we can use 1, 2, and 3, or 3, 2, and 1. Does it matter for the ranking? It does not.
The problem might be when we associate the numbers with something else. Generally, we increase the number when we say there is more of something. It is a lot easier to say that more of “this” leads to more of “that.” It gets interesting for rankings where the 1st ranking is a good thing. The more hours an athlete spends on drills, the more likely the athlete will get first place instead of second place. So, an increase in hours means that the ranking number decreases from a “2” to a “1.” Is it confusing? Possibly. The problem comes when we think about how to describe associations between good and bad things. In psychology, the more days a person spends alone and isolated, the more likely they will be in the clinical range of depression. So, more days alone are associated with the ranking increasing from a “2” (sub-clinical) to a “3” (clinical). If you want to show improvement, the ranking direction becomes interesting. We want to say that the more time a person spends in treatment, the more the person improves. But the ranking decreases from “3” to “2.” So, we say that the more time the person spends in treatment, the more the person improves, going from a “3” clinical ranking down to a “2” sub-clinical ranking. You must be mindful of the mental flip-flops that sometimes occur when you associate the numbers with low or high amounts of the construct.
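The flip in direction can be made concrete with a small sketch. The function name `reverse_code` and the sample data below are hypothetical and only for illustration; the point is that recoding 1, 2, 3 as 3, 2, 1 keeps the order of the ranks but flips which direction counts as “more.”

```python
def reverse_code(code, lowest=1, highest=3):
    """Flip an ordinal code so the lowest rank becomes the highest."""
    return lowest + highest - code

# Hypothetical severity codes: 1 = non-clinical, 2 = sub-clinical, 3 = clinical
severity = [3, 3, 2, 2, 1]
# Hypothetical weeks in treatment for the same five people
weeks_in_treatment = [1, 2, 5, 6, 10]

# After reversing, 3 = non-clinical, so a higher code now means "better"
improvement = [reverse_code(code) for code in severity]

print(severity)     # [3, 3, 2, 2, 1]
print(improvement)  # [1, 1, 2, 2, 3]
```

With the original codes, more weeks in treatment go with lower numbers. With the reversed codes, more weeks go with higher numbers. The people and their order have not changed; only the direction of the label has.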
An important characteristic of the ordinal rank variable is that the distance between and within the categories is not equal and can vary across the variable's full range. When we say one team is in 1st place, another is in 2nd place, and another is in 3rd place, we do not say by how much. A team can be in 1st place by 1 or 10 points. It doesn’t matter because the team is in 1st place. A team can be in 1st place by 1 point, and the team in 2nd place can be ahead of the 3rd place team by 10 points. The distance between the 1st and 2nd place teams and the distance between the 2nd and 3rd place teams are unequal. All that matters is how the teams are ranked: 1st place, 2nd place, and 3rd place.
The distance, or the range, within the ranks can vary across the ranks. This situation happens all the time in psychology. On the MMPI, we give scores from 1 to 100. The scores of 1 to 55 indicate a normal range, 56 to 65 indicate a sub-clinical range, and 66 to 100 indicate a clinical range. Counting every possible score, the normal rank spans 55 points (1 to 55), the sub-clinical rank spans only 10 points (56 to 65), and the clinical rank spans 35 points (66 to 100). Within the ranks themselves, the ranges differ. In education, passing is 70 to 100, and failing is 0 to 69. So, the passing rank covers 31 possible scores, while the failing rank covers 70. The range intervals are different within each rank.
We also group entities or people in terms of ranks, and within that rank, there could be people who just made the cut-off score for the rank and people who clearly vaulted the cut-off score. We do this all the time. We give students an “A” as the top grade. However, within that “A” group are students who got a 91 and just made the “A,” and students who got a 99 and clearly earned the “A.” To the students, it may matter who got the 99 or the 91, but to assign a grade or a rank, these students get the “A.” What matters is that the group as a whole is ranked above all the students who got a “B.”
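One way to tally the width of each rank is to count every whole-number score in the band, both endpoints included. The sketch below does that for the three score bands in the example above; the names `bands` and `band_width` are hypothetical.

```python
# Score bands as (label, lowest score, highest score), taken from the example above
bands = [
    ("normal", 1, 55),
    ("sub-clinical", 56, 65),
    ("clinical", 66, 100),
]

def band_width(lowest, highest):
    """Number of whole-number scores in the band, counting both ends."""
    return highest - lowest + 1

widths = {label: band_width(lo, hi) for label, lo, hi in bands}
print(widths)  # {'normal': 55, 'sub-clinical': 10, 'clinical': 35}
```

The three widths add up to the 100 possible scores, yet no two ranks are the same width.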
In psychology, we use rankings all the time. Usually, we use a three-ranking system: normal range, sub-clinical, or clinical range. Admissions interviews sort candidates into three categories: admit, possibly admit or put on a waitlist, and no admit.
Why use ordinal as a variable type?
Ordinal ranks might suffice when you believe that the distribution of “it” can be summarized into groups. For example, suppose you want to measure which students are good, not-so-good, or bad. You could count every disciplinary referral for each student each month. However, it might be easier to sort the students into general categories such as the following: 0 = none, 1 = 1 to 3, 2 = 4 to 10, 3 = more than 10.
You can use ordinal variables when you are unsure whether there is an equal interval between the ranks. This situation usually occurs when there are “grey areas.” For situations where you are sorting basketball skills, you may sort them into three groups: make the team as a starter, be on the bench, or not make the team. It is difficult to assign a number that designates how to “score” basketball ability, so we might just sort the players into groups and rank them from low to high.
Ordinal variables are used when the distribution of the variable seems “lumpy,” when the groups do not have to be equal in size, or when there are usually more people in one group than another. For example, when measuring the number of alcoholic drinks someone consumes, you could count the number of drinks. However, one and two drinks may be roughly equivalent, and the difference between them might not matter in terms of the effect on driving home. But the difference between two drinks and three, four, or five might matter. So, the first cutoff is two drinks, and the range from three to five drinks may matter when deciding whether the person is too drunk to drive. Any drinks beyond six don’t matter; at six or more, the person is not safe to drive.
3.4.2: Continuous Interval Variables
The second type is the continuous interval variable type. For this variable, the distance between scores is (supposed to be) equal among all points on the scale. Ordinal variables have unequal distances between the ranks and within the ranks. For interval variables, the scores do have an equal distance between them.
Interval and ordinal variables have one thing in common. Neither variable type has a true zero point on its measurement scale. So, you can technically use any number you want to arrange the interval scale. Like the ordinal variable, because there is no zero point and you can use whatever number you want, you must get additional information and context (range, average score) to interpret the score’s meaning.
For interval scales, you assume the scores have equal intervals: the steps from 1 to 2, 2 to 3, 3 to 4, and 4 to 5 are all the same distance. That means the distance from 1 to 2 is the same as the distance from 2 to 3. In ordinal scales, the distance between the first and second ranks can differ. A heavy drinker is ranked first in terms of the number of drinks, and a social drinker is ranked second. In one scenario, the heavy drinker has 10 beers and the social drinker has 2 beers. In a second scenario, another heavy drinker has 15 beers and another social drinker has 1 beer. The distance in scenario one is 8 beers, and the distance in scenario two is 14 beers. However, both heavy drinkers still receive the same rank, and both social drinkers still receive the same rank, even though the two pairs of drinkers differ in the number of beers drunk between them.
The classic example of the continuous interval scale in psychology is the Likert scale. Likert-type items look like the following: “rate your customer satisfaction on a scale from 1 to 5”; “rate your depression on a scale from 1 to 10”; “rate your child’s temper tantrum severity on a scale from 1 to 5.”
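The two beer scenarios can be worked through in a few lines. This is only a sketch with hypothetical variable names; it shows that the gap in ranks is the same in both scenarios while the gap in beers is not.

```python
# Beers consumed in the two scenarios described above
scenario_one = {"heavy": 10, "social": 2}
scenario_two = {"heavy": 15, "social": 1}

# Ordinal view: heavy drinker is rank 1, social drinker is rank 2
rank = {"heavy": 1, "social": 2}

gaps = []
for beers in (scenario_one, scenario_two):
    rank_gap = rank["social"] - rank["heavy"]    # distance in ranks
    beer_gap = beers["heavy"] - beers["social"]  # distance in beers
    gaps.append((rank_gap, beer_gap))

print(gaps)  # [(1, 8), (1, 14)]
```

The ranks are one step apart in both scenarios, but that one step stands for 8 beers in the first and 14 in the second.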
The scale itself can be any number range. These scales can be from 0 to 4, 0 to 5, 1 to 5, 1 to 3, 1 to 10, 1 to 100.
The important note about the interval variable is that you have no idea what the lowest or highest number stands for. On a scale from 1 to 10, 1 could be bad, or 1 could be good, 10 could be bad, and 10 could be good. For example, you could rate someone’s physical attractiveness on a scale from 1 to 10, with 1 being unattractive and 10 being attractive. A 1 is bad, and 10 is good. You can also rate how much a student in a classroom is driving you crazy, with 1 as good and 10 as bad. We rate pain on this scale – 1 as no pain and 10 as excruciating pain. In these cases, 1 could be good or bad, and 10 could be good or bad. Someone needs to tell you what the lowest score and highest score mean. The numbers cannot stand independently because there is no context under which you can determine what values are good or bad.
Why use this variable type?
We use this variable type when we want a conventional interval scale to measure something. Scales from 1 to 10 are ubiquitous because converting our experiences into a scale of 1 to 10 or 1 to 5 is easy. We have 10 fingers, so we can easily use them to show our experience of something. A range of 1 to 10, 1 to 5, or 1 to 7 implies a midpoint where we can say our experience does not fit either extreme.
We use this variable type when we really do not have a “standard” measurement for something. Usually, this “something” is theoretical. There are no “standard” measurements of racism, life satisfaction, hope or worry about the country's economic future, or being a team player. We have measures of psychological phenomena that are in common use, such as the MMPI, the Beck Depression Inventory, the PTSD Inventory, the Positive and Negative Syndrome Scale, the Autism Diagnostic Observation Schedule, the Pittsburgh Sleep Quality Index, and the Wechsler Intelligence Scale. But if you really think about the history of how those measures were assembled, there really is no basis for the interval scoring protocol: “Rate the child’s behavior on a scale of 0 to 3, rate the patient’s delusions on a scale from 0 to 7, have you felt you are no longer interested in things you were interested in?” I would argue that there really are no “standard” measures of depression, alcoholism, PTSD, autism, and so forth. Some measures are more widely used and have more evidence of their validity, but as far as a standard goes, psychological phenomena do not have one.
At this point, I will remind the reader that the ordinal and the interval variables are the only choices we have to measure the intensity of our experience of the variation of a phenomenon. The ordinal and interval variables are used when the experience of the phenomenon is arbitrary. No rule or standard says that the starting point for the experience is zero or one, or that the endpoint for the experience is 10, 20, or 100. The unit of intensity does not have physical properties. There is no such thing as one unit or two units of a phenomenon such as depression. These issues mean the ordinal and interval variables are the only choices we have to measure the intensity of a phenomenon because neither, by definition, is based on a definitive zero or on a physical unit of measurement.
Note the following stance from this author: interval continuous variables, or Likert scales, are flawed. In psychology research, it is hard to determine whether the intervals are equal between two points. For example, what is the difference between a “4” and a “5” or Agree vs. Strongly Agree on a scale measuring an adolescent’s liking for school? What is the difference between a “9” vs. “10” in pain thresholds for a woman who is giving birth compared to a man who is suffering from a leg cramp? What is the difference between a 90 vs. 100 in a stress index? Researchers have pointed out the flaws of Likert scales, and the discussion is beyond this textbook. Suffice it to say that interval variables are inefficient ways to measure something. However, given that most psychological, educational, social, and cultural phenomena that we encounter are based on perceptions and experiences, and don’t have any physical indication of what is nothing, absent, more, or maximum, we are stuck with interval variables, or Likert scales, because there really is no other alternative.
3.4.3: Ratio Variables
Ratio variables have equal interval measurement; the key distinction is that they have a zero point. The zero has meaning: zero does mean zero, or nothing, or the absence of the “it.” The ratio variables are called ratio variables because the number generated by a measurement is compared to zero. So, the number does have meaning, and the number can stand on its own, meaning there is no need for context to interpret the number.
Ratio variables are used when there are physical measurements: height, weight, distance, and speed. In these cases, the number stands independently and does not need context. A person who is 5 feet 2 inches is 5 feet 2 inches. A person who weighs 150 lbs. weighs 150 lbs. A car that goes 50 miles per hour goes 50 miles per hour. Each value is compared to 0: five feet two inches to 0, 150 to 0, 50 to 0. So, the number can stand on its own.
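A short sketch may help show what a true zero buys us. The numbers and names below are hypothetical. For a ratio variable such as weight, a statement like “twice as heavy” holds no matter which unit we use. For an interval-style rating, relabeling a 1-to-5 scale as 0-to-4 leaves the differences alone but changes the ratios, which is one way to see that those numbers need context.

```python
# Ratio variable: weight in pounds, with a true zero
weight_a_lb, weight_b_lb = 150, 75
LB_TO_KG = 0.45359237  # kilograms per pound

ratio_in_lb = weight_a_lb / weight_b_lb
ratio_in_kg = (weight_a_lb * LB_TO_KG) / (weight_b_lb * LB_TO_KG)
print(ratio_in_lb, round(ratio_in_kg, 6))  # 2.0 2.0

# Interval-style rating: the same two answers on a 1-to-5 scale,
# then relabeled so the scale runs from 0 to 4
rating_a, rating_b = 4, 2
shifted_a, shifted_b = rating_a - 1, rating_b - 1

print(rating_a - rating_b, shifted_a - shifted_b)  # 2 2  (differences agree)
print(rating_a / rating_b, shifted_a / shifted_b)  # 2.0 3.0  (ratios do not)
```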
In other disciplines, ratio variables often appear as frequency counts: the population of a town, the number of churches per neighborhood, the number of flu cases during the winter season, the number of deer in a park preserve, and the number of shoplifting incidents for a franchise retail store.
In psychology, we use ratio variables, especially in terms of demographics. Age, income, and years of education are ratio variables. For example, a person is 30 years old, earns $75,000 per year, and has 16 years of education, ending with a college degree.
Psychology also uses ratio variables as frequency counts: the number of counselling sessions attended, the number of student disciplinary referrals, and the number of alcoholic drinks consumed per day.
When do you use this variable? The ratio variable is strict in terms of what it can be used for. It is used when there is a clear zero point and the number represents a physical measurement or a frequency count. Everything else would be considered an interval or an ordinal variable.
You might want to be careful, however. The entity might be a ratio variable, but how it is scaled as a variable for a statistical analysis might be an ordinal variable. Consider this scenario. You want to study the number of drinks a person has and their decision to drive afterwards. You create a variable called drinks, and the variable is divided into the following: 0-1, 2-5, 6-10, more than 10. What is the scaling of the variable? In this case, the variable that is entered into the analysis is ordinal. The number of drinks is originally a ratio variable because it is a frequency count of the number of drinks. But the variable is rescaled into an ordinal variable. The variable has a zero point, but the intervals are not equidistant. The first rank is zero or one drink; the second rank covers four values (2 to 5), the third rank covers five values (6 to 10), and the fourth rank has an indefinite range of drinks. Why rescale the ratio variable into an ordinal variable? For parsimony, theory, and ease of interpretation. It may not matter if a person has two versus three drinks, but it may matter if they have five or six drinks. It definitely does not matter if the person has 10, 11, or 12 drinks, because at that point, the person is definitely in no condition to drive. For conceptual purposes, entering the variable as a ratio variable does not enhance the understanding of the outcome because it does not matter if the person has two or three or four drinks. It may matter if there is a general threshold, such as two drinks versus six drinks, at which the outcome is affected. In this case, the number of drinks as a variable could originally be a ratio variable, but it is re-scaled as an ordinal variable for the statistical analysis.
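The rescaling of drinks can be sketched as a small lookup. This is one possible way to do it, not a prescribed procedure. It assumes the last rank starts above 10 so that the ranks do not overlap, and the names `UPPER_EDGES`, `LABELS`, and `drinks_to_rank` are hypothetical.

```python
from bisect import bisect_left

# Highest drink count in each of the first three ranks: 0-1, 2-5, 6-10.
# Anything above 10 falls into the last rank.
UPPER_EDGES = [1, 5, 10]
LABELS = ["0-1", "2-5", "6-10", "more than 10"]

def drinks_to_rank(drinks):
    """Map a raw drink count (ratio) to an ordinal rank code from 0 to 3."""
    # bisect_left keeps a count equal to an edge (1, 5, or 10) in the lower rank
    return bisect_left(UPPER_EDGES, drinks)

raw_counts = [0, 1, 2, 3, 5, 6, 10, 11, 12]
ranks = [drinks_to_rank(n) for n in raw_counts]

print(ranks)                          # [0, 0, 1, 1, 1, 2, 2, 3, 3]
print(LABELS[drinks_to_rank(4)])      # 2-5
```

Nine different raw counts collapse into four rank codes. The order from fewer to more drinks is kept, but the exact count, and the equal spacing between counts, is lost.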
Why the Variables are Arranged from Ordinal to Interval to Ratio
Continuous variables are arranged from Ordinal to Interval to Ratio in terms of the level of precision in demarcating the variation of “it” from low to high. The ordinal variable is the sloppiest; the ratio is the most precise. Sloppiness versus precision is based on how the variation of the variable is divided into units that we count.
The phrase “units that we count” is interesting. We divide the variable “length” into units called inches (assuming the English system here). We divide the variable “weight” into units called ounces. We divide the variable “time” into units called seconds. We divide “money” into units called dollars. In psychology, we divide the variable “depression” into … well, something.
We know that people are more or less depressed, but what is the unit? We know that some people have a bad day, some are sad based on an external event (e.g., a death in the family or the Cubs losing), and some are chronically sad. But what is the unit? The challenge in psychology is that these variables are internal experiences you cannot see. All we can do is infer that something is there and that something varies in amount from low to high. When we divide “depression” into units, the best we can do is infer something to count. For depression, we could ask the number of days that a person is depressed. The days become the unit. Is that the best way to count the number of units for an internal state such as depression? Putting aside any debate for now, we divide the variable into units, determine whether the units are equal or unequal, and determine whether the number assigned to each unit can be interpreted with or without context.
You might wonder how a unit can be equal or unequal in terms of the physical world. The measurement apparatus itself needs to be equal in units. However, how we use the apparatus might lend itself to unequal units. Take baking. A heaping tablespoon of sugar is different than a tablespoon of sugar. Levelling off one cup of flour is different from scooping a cup of flour and different than weighing a cup of flour, especially if the flour is all-purpose or bread flour (or strong flour, as the British say). The measurement apparatus needs to be precise in terms of equal units. The act of measuring something might yield unequal amounts of the variable.
Back to the dimensions of sloppiness and precision. The factors to consider are: a) is the variable divided into unequal or equal units? b) do the numbers need context to interpret them, or can they be interpreted without context? By context, we mean someone else must tell you what the number represents.
Ordinal variables are the sloppiest because a) the units are unequal, and b) the number associated with each unit needs additional context. Interval variables are less sloppy because a) the units are equal, but b) the numbers associated with each unit need added context. Ratio variables are the most precise because a) the units are equal, and b) the numbers associated with each unit do not need additional context.
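The two questions can be written as a small decision rule. This is only a sketch of the summary above, with a hypothetical function name; the one combination the section does not discuss (unequal units that need no context) is left undefined.

```python
def variable_type(equal_units, needs_context):
    """Return the continuous variable type implied by the two criteria."""
    if not equal_units and needs_context:
        return "ordinal"
    if equal_units and needs_context:
        return "interval"
    if equal_units and not needs_context:
        return "ratio"
    return None  # unequal units without context is not covered in this section

print(variable_type(False, True))  # ordinal
print(variable_type(True, True))   # interval
print(variable_type(True, False))  # ratio
```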


