
6.4: Standardized Values of Reporting Scores


    We learn about standardized values for reporting scores so that we can convert a person’s raw score into z-scores, t-scores, and percentiles. For those taking the psychology EPPP exam, some questions ask you to compare which value, the z-score, t-score, or percentile, is higher or lower.

    6.4.1: Standard Deviation

    The standard deviation is a ubiquitous statistic. When you see a mean, you should see a standard deviation reported alongside it. Together, the two values give good information about the distribution of scores for a variable.

    The standard deviation is the square root of the variance. Because the variance is expressed in squared units (it is built from a sum of squared deviations), it is hard to judge what counts as a high value. The variance also depends on the measurement scale of the variable: a Likert variable ranging from one to five will have a different variance than an interval variable ranging from one to 100. So, a variance of 50 cannot be compared to a variance of 500, because the two values are based on different variable value ranges.

    Taking the square root of the variance gives us the average deviation of scores from the mean. This value gauges how spread out the scores are around the mean and allows easy communication about that spread. Think of the standard deviation as a unit of distance: it tells us how far a score is from the mean. We call it a “standard” deviation because everyone agrees on what it is and on its properties, which makes it an easy, shared communication system.

    These are the two common ways we use standard deviation: location and range.

    Location, location, location.

    First, we use the standard deviation to communicate the location of a variable’s score. When you see a pair like Mean = 10, SD = 2, what do you do?

    First, you need a particular score; your goal is to determine its location. If someone tells you that a score is 12, how do you interpret it? Is that a good or bad score? Determine whether it is above or below the mean. Add the SD to the mean; in this case, you get 12, so the score in question is one standard deviation above the mean. That seems like a good score because it is higher than the mean by one standard deviation. If someone tells you that a score is 8, is that good or bad? Subtract the SD from the mean; in this case, you get 8, so the score in question is one standard deviation below the mean. That seems like a bad score because it is lower than the mean by one standard deviation.

    If someone wants to know what a score would be if someone scored two standard deviations above the mean, then you add the standard deviation value to the mean. In this case, you add two standard deviation values, and 10 + 2 + 2 = 14. For scores that are two standard deviations below the mean, subtract the values: 10 – 2 – 2 = 6.
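    The arithmetic above can be sketched in a few lines of Python. This is only an illustration; the function name `score_at` is ours, not a standard statistical routine:

```python
def score_at(mean, sd, k):
    """Return the raw score that lies k standard deviations from the mean.

    Positive k is above the mean; negative k is below.
    """
    return mean + k * sd

# Mean = 10, SD = 2
print(score_at(10, 2, 2))   # two SDs above the mean -> 14
print(score_at(10, 2, -2))  # two SDs below the mean -> 6
```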

    We use the standard deviation to communicate whether a particular score is above or below the mean. A score’s distance is expressed as + or - so many SDs because a score can fall above or below the mean; the sign communicates the direction.

    The number of SDs communicates how far away a score is from the mean. We use the number of standard deviations and the value of the standard deviation to communicate how far the score is from the mean.

    For example, take a score of 12 with a mean of 10 and an SD of 2: that score is 1 SD above the mean. A score of 6 with the same mean and SD is 2 SD below the mean.

    Notice the notation of the standard deviation. This, SD = 2, is different than this, 2 SD. This notation, SD = 2, tells you the value of each standard deviation unit. This notation, 2 SD, tells you how many standard deviation units a score is away from the mean.

    Think of SD = 2 like using different measuring systems. An inch is this long ––––, but a centimeter is this long ––. Both are useful, but we must decide on what unit we are using. In baking, we often communicate using cups or tablespoons, yet advanced bakers advocate weighing ingredients in grams on a scale, because a measured cup of flour is not as precise as a weighed amount of flour (hint: Google the reasons). The point of this baking example is that we have to decide on a measuring unit. The notation SD = 2 names the unit we use to measure how far a particular score is from the mean; the “2” means that this distance unit is based on the variable’s range and the variance of the scores. For a variable with a range of 1 to 10, an SD = 2 reflects that range and the spread of scores around the variable’s mean. Likewise, for a variable with a range of 1 to 100, an SD = 20 reflects that range and spread.

    The notation 2 SD is different because it communicates the location of a particular score. We never say “2 SD” on its own; we say, “A score is 2 SD above the mean.” We usually want to know how far a particular score is from the mean in order to evaluate its distance, or what a score would be if it were one, two, or three standard deviations above or below the mean. Stating how many SDs away a score falls indicates its location.

    Why do we want to know about the location of a score? The location of the score is an indication of distance away from the mean. The distance away from the mean is part of the process of determining significance, and this issue is addressed in chapter eight. For now, we want to know if the score of interest is well beyond the mean. Put another way, we want to know if the score is interesting (significant, rare) because it is well beyond average (something that is typical and frequent). A particular score is significant if it generally falls two SD above/below the mean. A particular score is not significant if it generally falls within one SD above/below the mean.

    What do we mean when we say the location of the score? Recall that all values, or scores, are placed on a continuum. For continuous variables, this continuum is from low to high. It is always low to high. We always discuss scores in terms of how low or high they are relative to the starting point.

    When we say the location of the score, we mean where the score falls along the continuum, that is, along the X-axis of an X-Y frequency chart. Location matters when we discuss significance. Fast forward to the significance of a statistical test: we determine significance by asking whether a result occurs rarely and falls far from the mean. When we discuss how far a score is from the mean, we mean its location, and we use the standard deviation to determine exactly how far away it is.

    Does it matter if the standard deviation value is small or large? There is no inherent value in a small or large standard deviation; the value matters if you want a homogeneous or heterogeneous sample. High standard deviations suggest considerable variability in the sample. High variability → heterogeneous sample: the higher the SD, the greater the spread and the wider the distribution. Low standard deviations suggest less variability in the sample. Low variability → homogeneous sample.

    Studies that describe the extent of an issue usually have heterogeneous variation. We want to see that everyone is different from each other and that there is a range of differences. In descriptive studies, such as epidemiological studies, you want a wide standard deviation indicating a heterogeneous sample; for example, discovering that there is a considerable range of people who need services, such as students who experience stress over homework. You want to see that there is a range of students who have some, more, or considerable stress over homework. In correlation studies, you want a high standard deviation, indicating a heterogeneous sample, to establish patterns between variables. For example, those who suffer from microaggressions and have low social support also have mental health concerns, yet those who suffer from microaggressions but have high social support have fewer mental health concerns. You cannot establish this pattern unless you have variation.

    The instances where you want studies to have a small standard deviation usually come from comparing groups, especially treatment and control groups. You do not want those groups to overlap; you want them to be distinct, so you want each group’s variation to be small. For example, to examine a treatment effect, your treatment and control groups at the pre-test likely have similar standard deviations. At the post-test, you want everyone in the treatment group to have a positive outcome; for example, you want everyone who completed the mindfulness treatment to come out more centered and better able to handle stress. You want your treatment group to be homogeneous, hence a low standard deviation value.

    There are other uses for evaluating the size of the standard deviation value.

    You might have wondered what happens if the SD equals zero. If a variable’s SD = 0, there is no deviation or spread; the variable is a constant, and completely useless. In practice this almost never happens, because there is usually some variation in every variable.

    • Recall that if the SD value is greater than the mean, that is a red flag that the distribution is not normal.
    • If the SD value is about the same as the mean value, that suggests wide variability in the sample. You might consider whether the distribution is normal or not.
    • Or take the value of the mean and divide it in half; if the SD is about half the mean, that also suggests wide variability.
    • If the SD value is less than ¼ of the mean, that suggests little variability.
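    The rules of thumb above can be collected into a small helper. This is a rough sketch of the heuristics as listed, not a formal test of normality; the cutoffs and the function name `sd_heuristic` are illustrative:

```python
def sd_heuristic(mean, sd):
    """Apply the rough SD-to-mean rules of thumb from the text."""
    if sd > mean:
        return "red flag: distribution may not be normal"
    if sd >= mean / 2:
        return "wide variability; check whether the distribution is normal"
    if sd < mean / 4:
        return "little variability"
    return "moderate variability"

print(sd_heuristic(10, 12))  # SD greater than the mean -> red flag
print(sd_heuristic(10, 2))   # SD less than 1/4 of the mean -> little variability
```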

    Why do we take the square root of the variance to calculate the standard deviation?

    The answer to that question is mathematically complicated. Maybe one way to approach this question is to think about how we want the standard deviation units to be equal in length. Although dividing something in half would provide that result, continuing to divide it in half may not produce units that are equal in length. So, it is easier to take the square root to achieve that goal.

    If you bring home a pizza, you want it all to yourself. If your roommate wants some pizza, you will take the square root of the pizza or divide it in half. Then, if your roommate brings their partner to the impromptu pizza party, you will have to figure out how to provide equal proportions. Taking the square root of the pizza would do just that.

    The reason we have different values for the standard deviation is that no matter the size of the pizza, taking the square root gives equal proportions, and each of those proportions differs in size depending on the size of the original pizza. Still, the square root keeps all these proportions equivalent, no matter what size of pizza it is. Hence, the square root keeps all the standard deviation units equivalent, no matter the size of the number associated with the variance.

    Range

    The range is a simple way to describe variance. It is based on the highest and lowest scores for a variable. Simple as it is, it has its uses.

    The range is good for describing the variability of continuous variables without limits. It is good for studies where you need the upper and lower limits to define the sample. Age, height, and weight are important variables for diet and eating disorder studies, and it is useful to know the lowest and highest values for each of the variables to understand the outcome of these studies. In medical studies, variables such as HIV viral loads and blood cell counts often report the lowest and highest values in their sample. Studies about incarceration have the lowest and highest number of criminal acts to contextualize the severity of their sample.
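    As a minimal illustration, the range is simply the highest value minus the lowest. The `ages` values below are made up for the example:

```python
ages = [18, 22, 25, 31, 47, 63]  # hypothetical sample of ages

low, high = min(ages), max(ages)
value_range = high - low

print(f"lowest = {low}, highest = {high}, range = {value_range}")
```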

    6.4.2: Z-scores

    Z-scores are used to communicate the location of scores. To find out how far a score is from the mean, you need three pieces of information: the score, the mean, and the standard deviation. With a z-score, all you need is the z-score. Simpler….

    If a score is 1 SD above the mean, the z-score is +1; 2 SD above the mean, +2; 3 SD above, +3. If a score is 1 SD below the mean, the z-score is -1; 2 SD below the mean, -2; 3 SD below, -3.

    Z-scores transform raw scores; they express raw scores in terms of standard deviations. Raw scores are scores whose meaning you cannot understand from the score alone. X = 20. What does 20 represent? I have no clue; you need context: the high score, the low score, the mean, and the SD. But if I said the raw score of 20 transforms into a z-score of +2, then you know exactly what’s going on. You know that this score of 20 is two standard deviations above the mean. In other words, it’s a high score, possibly a significant score.

    The common use of z-scores is to compare distributions with different scale ranges. Z-scores are useful because distributions from two or more variables have different means and standard deviations, different ranges, and different low and high scores; there is no way to compare them directly. For example, we have many ways to measure depression. The Center for Epidemiological Studies Depression Scale (CESD) ranges from 0 to 60, while the Patient Health Questionnaire (PHQ-9) ranges from 0 to 27. How do you compare a CESD of 50 with a PHQ-9 of 17? Another example is academic aptitude: the SAT score range is 400 to 1600, and the ACT score range is 1 (low) to 36 (high). How do you compare an SAT of 1000 with an ACT of 27?

    The solution is to standardize them so that all the scores’ distributions have a mean of 0 and an SD of 1. Using the same metric allows for easy comparison: it gives a quick picture of how far above or below the mean a score is, and standardized scores communicate better than raw scores.
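    A minimal sketch of this standardization in Python, assuming the population formula for the standard deviation; the `standardize` helper is illustrative, not a library function:

```python
def standardize(scores):
    """Transform scores so the result has mean 0 and SD 1 (population SD)."""
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((x - mean) ** 2 for x in scores) / n) ** 0.5
    return [(x - mean) / sd for x in scores]

# Example data with mean 5 and population SD 2
z = standardize([2, 4, 4, 4, 5, 5, 7, 9])
print([round(v, 2) for v in z])  # -> [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```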

    Using our depression example: how do you compare a CESD of 50 with a PHQ-9 of 17? The CESD z-score is +3 and the PHQ-9 z-score is +2, so the person scored higher on the CESD than on the PHQ-9. Academic aptitude: the SAT z-score is +3 and the ACT z-score is +2, so the person scored higher on the SAT than on the ACT.

    To calculate a z-score, take the score of interest, subtract the mean from it, then divide by the standard deviation. For the Self-Esteem Scale, the range is from 20 (no self-esteem) to 80 (high self-esteem). Sample M = 63, SD = 5. An individual client’s raw score is 76; the score is over 2 SD away from the mean. Raw score converted to z-score: (76 - 63)/5 = 2.60. Just say the z-score is 2.60, and you know exactly where that score is located and whether it is significant.

    For the Satisfaction with Life Scale, the range is from 5 (not satisfied) to 35 (satisfied). Sample M = 20, SD = 4.

    An individual client’s raw score is 29; the score is over 2 SD away from the mean. Raw score converted to z-score: (29 - 20)/4 = 2.25. Just say the z-score is 2.25, and you have all the information you need.
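    Both worked examples follow the same formula, which can be sketched as:

```python
def z_score(raw, mean, sd):
    """Convert a raw score to a z-score: subtract the mean, divide by the SD."""
    return (raw - mean) / sd

# Self-Esteem Scale: M = 63, SD = 5, client score = 76
print(z_score(76, 63, 5))   # -> 2.6

# Satisfaction with Life Scale: M = 20, SD = 4, client score = 29
print(z_score(29, 20, 4))   # -> 2.25
```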

    Helpful features of Z scores are:

    • You quickly get information from the z-score alone. If a z-score is 2.5, the score is well above the mean, in roughly the top 1% of scores. If a z-score is -.5, the score is below the mean and within the middle 68% of scores around the mean (within one SD).
    • You can compare the scores from different ranges. For example, CESD and BDI have different score ranges. For the same subject, the CESD: Z score is 1.5, and the BDI: Z score is 1. The subject did worse on the CESD. For the same subject, the SAT: Z score is .6, and the ACT: Z score is .9. The subject did better on the ACT.

    Z-scores, so.... They are kind of useless. Honestly, I have never used z-scores in any statistical test, reported them, or read about them in journal articles. I’ve only used them for outlier analysis: any score with a z-score beyond ±3 is flagged as a possible outlier. To be an outlier, the score needs to be alone, i.e., with no other scores around it. But I never report a z-score for outliers; the z-score analysis is a behind-the-scenes diagnostic I use when checking for normal distributions.

    6.4.3: T-scores

    A T-score is a standardized score based on a score distribution with a mean of 50 and a standard deviation of 10. For example, a raw score that is 1 standard deviation above its mean would be converted to a T score of 60.

    Note that the t-score is not to be confused with a statistical t-test. A t-test compares the mean scores of two groups. The t-score is a standardized score, and it has nothing to do with a t-test.

    T-scores are used in psychometrics. A t-score in psychometric (psychological) testing is a specialized term that is not the same thing as a t-score that you get from a t-test. T-scores in psychometric testing are always positive, with a mean of 50. A difference of 10 (positive or negative) from the mean is a difference of one standard deviation. For example, a score of 70 is two standard deviations above the mean, while a score of 40 is one standard deviation below the mean.

    A t-score is similar to a z-score in that it represents the number of standard deviations from the mean. While z-scores typically fall between -3 and +3, t-scores run over a larger range, from 0 to 100, with most scores falling between 20 and 80. Many people prefer t-scores because the lack of negative numbers makes them easier to work with, and the larger range all but eliminates decimals. Table 6.1 shows z-scores and their equivalent t-scores.

    Table 6.1 Equivalent Z-scores and T-scores

    z-score    t-score
    -5         0
    -4         10
    -3         20
    -2         30
    -1         40
     0         50
     1         60
     2         70
     3         80
     4         90
     5         100
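    The conversion t = 10z + 50 reproduces Table 6.1; a quick sketch:

```python
def t_score(z):
    """Convert a z-score to a t-score (mean 50, SD 10)."""
    return 10 * z + 50

# Reproduce Table 6.1
for z in range(-5, 6):
    print(f"z = {z:+d}  ->  t = {t_score(z):.0f}")
```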

    6.4.4: Percentiles

    Percentiles tell you the location of a person’s score along the normal distribution and what percentage of subjects scored below it. Percentiles are used for comparisons; for example, person A scored in the 90th percentile, and person B in the 80th.

    Percentiles are not to be confused with percentages. Percentages need a denominator, which can shift, making comparisons impossible. Take 50%: you have no idea whether that is good or bad because you do not know the denominator. 50% could mean 2/4 or 50/100; as the denominator shifts, 50% takes on a different context. A percentile, by contrast, always refers to the same thing: the percentage of the distribution falling below a given score.
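    If scores are normally distributed, a z-score maps to a percentile through the standard normal CDF, which Python’s standard library can compute via `math.erf`. A sketch (the function name `percentile_from_z` is ours):

```python
import math

def percentile_from_z(z):
    """Percentile rank implied by a z-score under a normal distribution.

    Uses the standard normal CDF, Phi(z) = (1 + erf(z / sqrt(2))) / 2.
    """
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(percentile_from_z(0)))     # the mean sits at the 50th percentile
print(round(percentile_from_z(1.28)))  # roughly the 90th percentile
```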

    Summary Tables

    z-score = (raw score - mean) / SD

    t-score = 10(z-score) + 50


    This page titled 6.4: Standardized Values of Reporting Scores is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by Peter Ji.