2: Descriptive Statistics

Anonymous
LibreTexts

Statistics naturally divides into two branches, descriptive statistics and inferential statistics. Our main interest is in inferential statistics: inferring from the data what the population might think, or evaluating the probability that an observed difference between groups is a dependable one rather than one that might have happened by chance in this study. Nevertheless, the starting point for dealing with a collection of data is to organize, display, and summarize it effectively. These are the objectives of descriptive statistics, the topic of this chapter.

2.1: Three Popular Data Displays
Graphical representations of large data sets provide a quick overview of the nature of the data. A population or a very large data set may be represented by a smooth curve. This curve is a very fine relative frequency histogram in which the exceedingly narrow vertical bars have been omitted. When a curve derived from a relative frequency histogram is used to describe a data set, the proportion of data with values between two numbers a and b is the area under the curve between a and b.
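As a rough illustration of the area interpretation, the sketch below computes relative frequencies for a small data set and adds up those that fall between a and b. The data values and the function names `relative_frequencies` and `proportion_between` are hypothetical, invented only for this example. For a smooth curve, the sum would be replaced by an area.

```python
from collections import Counter


def relative_frequencies(data):
    """Map each distinct value to the fraction of the data equal to it."""
    counts = Counter(data)
    n = len(data)
    return {value: count / n for value, count in sorted(counts.items())}


def proportion_between(data, a, b):
    """Sum the relative frequencies of all values x with a <= x <= b."""
    rel = relative_frequencies(data)
    return sum(freq for value, freq in rel.items() if a <= value <= b)


# Hypothetical data set, invented for this illustration.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6]

print(relative_frequencies(data))
print(proportion_between(data, 2, 4))  # 7 of the 10 values lie in [2, 4]
```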
2.2: Measures of Central Location - Three Kinds of Averages
The mean, the median, and the mode each answer the question “Where is the center of the data set?” The nature of the data set, as indicated by a relative frequency histogram, determines which one gives the best answer.
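The three averages can be compared with a minimal sketch that uses Python's standard `statistics` module. The data set is hypothetical. The second part appends one extreme value to suggest why the shape of the data affects which average is most representative.

```python
import statistics

# Hypothetical data set, invented for this illustration.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # arithmetic average: sum divided by count
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequently occurring value

# Adding one extreme value shifts the mean far more than the median.
skewed = data + [100]
skewed_mean = statistics.mean(skewed)
skewed_median = statistics.median(skewed)

print(mean, median, mode)
print(skewed_mean, skewed_median)
```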
2.3: Measures of Variability
The range, the standard deviation, and the variance each give a quantitative answer to the question “How variable are the data?”
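The following sketch computes the three measures for a hypothetical data set using Python's standard `statistics` module. Both the sample versions (dividing by n - 1) and the population versions (dividing by n) are shown, since which one applies depends on whether the data are a sample or an entire population.

```python
import statistics

# Hypothetical data set, invented for this illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: largest measurement minus smallest measurement.
data_range = max(data) - min(data)

# Sample variance and standard deviation (divide by n - 1).
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

# Population variance and standard deviation (divide by n).
population_variance = statistics.pvariance(data)
population_sd = statistics.pstdev(data)

print(data_range)
print(sample_variance, sample_sd)
print(population_variance, population_sd)
```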
2.4: Relative Position of Data
The percentile rank and z-score of a measurement indicate its relative position with regard to the other measurements in a data set. The three quartiles divide a data set into fourths. The five-number summary and its associated box plot summarize the location and distribution of the data.
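These measures of relative position can be sketched as follows. Several conventions exist, so the choices here are assumptions rather than the textbook's definitions: the percentile rank is taken as the percentage of measurements less than or equal to x, the z-score uses the sample standard deviation, and the quartiles come from `statistics.quantiles` with `method="inclusive"` (Python 3.8 or later). Other quartile rules can give slightly different values. The helper names and data are hypothetical.

```python
import statistics


def z_score(x, data):
    """Number of sample standard deviations that x lies from the mean."""
    return (x - statistics.mean(data)) / statistics.stdev(data)


def percentile_rank(x, data):
    """Percentage of measurements less than or equal to x (one convention)."""
    at_or_below = sum(1 for value in data if value <= x)
    return 100 * at_or_below / len(data)


def five_number_summary(data):
    """Return (minimum, Q1, Q2, Q3, maximum) using the inclusive method."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (min(data), q1, q2, q3, max(data))


# Hypothetical data set, invented for this illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print(z_score(5, data))
print(percentile_rank(3, data))
print(five_number_summary(data))
```

The five values returned by `five_number_summary` are the ones a box plot would display.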
2.5: The Empirical Rule and Chebyshev's Theorem
The Empirical Rule is an approximation that applies only to data sets with a bell-shaped relative frequency histogram. It estimates the proportion of the measurements that lie within one, two, and three standard deviations of the mean. Chebyshev’s Theorem is a fact that applies to all possible data sets. It describes the minimum proportion of the measurements that must lie within one, two, or more standard deviations of the mean.
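To make the contrast concrete, the sketch below counts the proportion of a data set lying within k standard deviations of the mean and compares it with the Chebyshev lower bound 1 - 1/k^2, which is informative only for k > 1. The data set is hypothetical and deliberately skewed, so the Empirical Rule percentages (approximately 68%, 95%, and 99.7%) are listed only for reference and are not expected to hold for it. The function names are illustrative.

```python
import statistics

# Approximate Empirical Rule proportions; valid only for bell-shaped data.
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}


def proportion_within(data, k):
    """Fraction of the data within k sample standard deviations of the mean."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    inside = sum(1 for x in data if abs(x - mean) <= k * s)
    return inside / len(data)


def chebyshev_lower_bound(k):
    """Minimum proportion within k standard deviations, for any data set."""
    if k <= 1:
        raise ValueError("the bound is informative only for k > 1")
    return 1 - 1 / k ** 2


# Hypothetical, strongly skewed data set, invented for this illustration.
data = [1, 1, 1, 2, 2, 3, 4, 8, 15, 40]

for k in (2, 3):
    print(k, proportion_within(data, k), chebyshev_lower_bound(k))
```

For every k, the observed proportion should be at least the Chebyshev bound, whatever the shape of the data.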
2.E: Descriptive Statistics (Exercises)
These are homework exercises to accompany the Textmap created for "Introductory Statistics" by Shafer and Zhang. Complementary General Chemistry question banks can be found for other Textmaps and can be accessed here. In addition to these publicly available questions, access to a private problem bank for use in exams and homework is available to faculty only, on an individual basis; please contact Delmar Larsen for an account with access permission.