2.3: Statistics as Estimates, or Making Guesses
A hallmark of statistics is estimation. Estimations are guesses. What are we guessing at? We are guessing the truth.
Statistics is about how we get to the truth. How do we know that our conclusions about patterns of variation among our variables are the truth? When we say something is true, in statistics and in research, we are guessing that what we found is true for everyone, everywhere.
When we say that studying more leads to good grades, we want that pattern to be true for everyone and across all contexts. When we say that attending more therapy sessions leads to better outcomes, we want that pattern to be true for everyone and across all contexts.
The art of making a guess involves understanding the relationship between Sample Statistics and Population Parameters. And these are the first two “official” statistics terms that you want to know (insert your celebration emoji here).
Your first official statistical term is the population parameter. The population parameter is the “truth” or “This is what’s really going on.” Describing the parameter is our goal.
The problem is that we can’t know the truth. It is impossible to know what is really going on, and there are logistical reasons for this. The truth means everything, everywhere, everyone, but we can’t take measures of every variable for everyone and everywhere. We do not have large-scale testing of most phenomena for the population.
We often do not know who is included in the population of interest. For example, we can do nationwide testing of standardized test scores for all students because, in a sense, we have a roster of all students registered in public high schools. However, we don’t have a “roster” of all high school students who have an eating disorder. No such roster exists, and determining who has an eating disorder and who does not is logistically difficult.
We do not have standardized measures of psychological phenomena. Although our measurement of eating disorders has improved, debates continue about how to diagnose them accurately. So we cannot be sure that we have the correct prevalence rate or level of severity of eating disorders among high school students.
We cannot simply ask, “Who has an eating disorder?” and expect everyone with an eating disorder to come forward. The stigma accompanying eating disorders means students will not readily identify themselves as having one. Therefore, any estimate of prevalence we obtain is likely to be too low because of underreporting.
For issues such as racial identity, career outlook, and risk of dropping out of school, the constructs are conceptualized in many ways, and, rightfully so, there is no single standard for how they should be assessed.
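The underreporting problem described above can be made concrete with a small simulation. This is only a sketch: the prevalence, the disclosure rate, and the number of students are hypothetical values chosen for illustration, not figures from any study. It shows that when only some affected students come forward, the rate we observe falls below the true rate.

```python
import random

# All numbers below are hypothetical, chosen only for illustration.
TRUE_PREVALENCE = 0.10   # assumed share of students who truly have the condition
DISCLOSURE_RATE = 0.40   # assumed chance that an affected student reports it
N_STUDENTS = 10_000      # assumed size of the simulated group

rng = random.Random(42)  # fixed seed so the run is repeatable

# The "truth": which simulated students actually have the condition.
has_condition = [rng.random() < TRUE_PREVALENCE for _ in range(N_STUDENTS)]

# What a survey sees: only affected students who choose to come forward.
reports_condition = [
    affected and rng.random() < DISCLOSURE_RATE
    for affected in has_condition
]

true_rate = sum(has_condition) / N_STUDENTS
observed_rate = sum(reports_condition) / N_STUDENTS

print(f"True rate in the simulated group: {true_rate:.3f}")
print(f"Rate based on self-report:        {observed_rate:.3f}")
```

In real research we never get to see `true_rate`; the simulation can show it only because we invented the population ourselves.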
Given logistical constraints, the lack of a roster or census, and no agreed-upon standard for assessing a construct, it is difficult to know the “truth” about an issue.
So, if we cannot know the truth of an issue, the best we can do is to make a guess at the “truth.” In statistics, this guess is called an estimate. When you hear people say we are estimating a parameter, they are basically saying that we are making a “guess” about the truth of something. This estimation involves using sample statistics, which is your second official statistical term. Sample statistics describe what we find in our corner of the world, which is the sample, and we use them to guess what is true of the population, which is everyone or everything else in the world. The reason it is called a “sample” is that all we can do is take a piece of the puzzle, or a “sample,” and hope that what we found in the sample is true for everyone else. That hope is an estimate or a “guess”: it is our estimate that what we found is true for everyone.
The trick is making the best guess possible by using a good research design, one that convinces others that we did everything we could to make that guess as good as it can be. This issue will be revisited throughout this textbook.
Both population parameter and sample statistic are conceptual terms. They do not refer to specific statistical tests. They describe the relationship between what we found and what we guess the truth to be.
So, to recap, this is the process of getting to the truth:
- The concepts you want to know are population parameters and sample statistics.
- Start here: Parameter – a number that represents the value (level) of a variable that describes what is going on with this issue in the population. Think: Parameter is theoretical: We think this issue exists out there. This is the “truth” about an issue. This is what we want to know.
- Statistic – a number that represents the value (level) of a variable that we use to estimate (or guess) the level of the variable in the population. Think: locally, right here, right now, my corner.
- The process: I take what I found in my sample as my best guess at the truth for everyone, that is, for the population.
This is the process of statistics.
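The recap above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the "population" is simulated, and the variable (weekly study hours), its average of 12, its spread of 4, and the sample size of 100 are all hypothetical. In real research the parameter is unknown; here we can compute it only because we created the population ourselves.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: weekly study hours for 50,000 simulated students.
population = [rng.gauss(12, 4) for _ in range(50_000)]

# Parameter: the value in the whole population (the "truth").
parameter = statistics.mean(population)

# Sample: the small piece of the puzzle we actually get to measure.
sample = rng.sample(population, 100)

# Statistic: the value in our sample, used as a guess at the parameter.
statistic = statistics.mean(sample)

print(f"Population parameter (mean hours): {parameter:.2f}")
print(f"Sample statistic (mean hours):     {statistic:.2f}")
print(f"Gap between guess and truth:       {statistic - parameter:+.2f}")
```

Changing the seed or the sample size changes the statistic but not the parameter, which is one way to see why the statistic is a guess and the parameter is the thing being guessed.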


