10.1: Bivariate Data and Scatter Plots
- Define bivariate data and recognize that it involves pairs of related values.
- Interpret scatter plots to identify patterns, trends, or relationships between variables.
- Use scatter plots to detect correlations, clusters, or outliers in a dataset.
Bivariate data involves two variables that are analyzed together to determine their relationship. One common way to visualize bivariate data is through a scatter plot, a graph displaying data points on a coordinate plane. Each point represents a pair of values, with one variable plotted on the x-axis and the other on the y-axis. Scatter plots help identify patterns, trends, and correlations between the variables, such as positive, negative, or no correlation, making them useful for statistical analysis and predictions.
Bivariate data involves two variables, typically measured simultaneously for each observation or data point. The two values are paired, and each pair represents a single observation in the dataset. Each pair is written as \((x,y)\), where \(x\) is called the independent variable and \(y\) is called the dependent variable.
Bivariate data is used to analyze and understand the relationship between two variables. Below are a few key reasons for using bivariate data.
- Exploring Relationships: Bivariate data allows you to determine if and how two variables are related. For example, you might explore the relationship between study hours and test scores.
- Identifying Correlation: It helps quantify the strength and direction of the relationship between two variables. Correlation coefficients (e.g., Pearson's r) are often used.
- Predicting Outcomes: By analyzing bivariate data, you can predict the value of one variable based on the value of another. For example, height can be used to predict weight.
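For reference, Pearson's r mentioned above is commonly defined as shown below. The notation is an assumption made here for illustration and is not introduced elsewhere in this section: \(n\) is the number of pairs, \((x_i, y_i)\) is the \(i\)-th pair, and \(\bar{x}\) and \(\bar{y}\) are the means of the \(x\) and \(y\) values.

\[ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} \]

The value of \(r\) always lies between \(-1\) and \(1\). Values near \(1\) indicate a strong positive linear relationship, values near \(-1\) indicate a strong negative linear relationship, and values near \(0\) indicate little or no linear relationship.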
An educational researcher is interested in studying the relationship between the amount of sleep (measured in hours) a student gets the night before an exam and their performance on the exam (measured out of 100 points). The collected data will consist of ordered pairs, where the first value represents the hours of sleep and the second value represents the exam score. One data point collected during the study was (6, 87). Use this pair to answer the following questions.
- Determine the independent variable value in this context and explain what it represents.
- Determine the dependent variable value in this context and explain what it represents.
Solution
- The independent variable value is 6, representing the number of hours the student slept before the exam.
- The dependent variable value is 87, representing the number of points out of 100 that the student earned on the exam.
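One way to keep the roles of the two values explicit is to store each observation as a named pair. The sketch below is only an illustration: the class name `Observation` and its field names are hypothetical choices, not part of the study.

```python
from typing import NamedTuple


class Observation(NamedTuple):
    """One bivariate observation; the field names are illustrative."""
    hours_of_sleep: float  # independent variable, x
    exam_score: float      # dependent variable, y


# The data point (6, 87) from the example above.
point = Observation(hours_of_sleep=6, exam_score=87)

# A named tuple still behaves like the ordered pair (x, y).
x, y = point
print(f"x = {x} hours of sleep, y = {y} points")
```

Because the first field is always the independent variable and the second the dependent variable, the order of the pair carries the same meaning as in \((x, y)\).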
Definition of Independent and Dependent Variables
The independent or explanatory variable (commonly represented as \(x\)) explains or causes changes in the response variable. The independent variable can be manipulated or changed by the researcher.
The dependent or response variable (commonly represented by \(y\)) reflects the outcome of a study. This variable is measured or observed by the researcher. For example, if we are studying how the amount of time spent studying impacts exam scores, study time serves as the independent (explanatory) variable, while the exam score is the dependent (response) variable.
In experimental data, it is usually straightforward to identify the independent and dependent variables. However, this distinction can be more challenging in observational data. The dependent variable is the one you aim to study or gain insights about.
When examining the relationship between the unemployment rate and the economic growth rate, it may not be immediately clear which variable should be \(x\) and which should be \(y\). The choice depends on whether we aim to predict the unemployment rate or the economic growth rate. It is important to avoid drawing cause-and-effect conclusions from observational data. A strong relationship between these two variables does not necessarily imply that one directly causes changes in the other. Other factors, such as retirements or pandemics, could influence both rates simultaneously.
Scatter Plot
A common method for examining the relationship between independent and dependent variables is to plot all the data points on a coordinate system. This type of graph, known as a scatter plot, helps visualize the relationship (if any) between \(x\) and \(y\).
A scatter plot is a type of graph used to display and analyze the relationship between two numerical variables. Each data point in a scatter plot represents an ordered pair \((x, y)\) and is plotted on a two-dimensional coordinate system. The independent variable \(x\) is typically placed on the horizontal axis, while the dependent variable \(y\) is on the vertical axis. Scatter plots are commonly used to identify patterns, trends, correlations, or potential outliers in the data.
An educational researcher is interested in studying the relationship between the amount of sleep (measured in hours) a student gets the night before an exam and their performance on the exam (measured out of 100 points). The collected data will consist of ordered pairs, where the first value represents the hours of sleep and the second value represents the exam score. A list of the collected data points is provided below. Construct a scatter plot of the data and describe the relationship (if any) between the two variables.
| x: Hours of Sleep on the Night Before Exam | y: Points (out of 100) Earned on the Exam |
|---|---|
| 8 | 75 |
| 6 | 86 |
| 3 | 72 |
| 4 | 65 |
| 2 | 68 |
| 0 | 50 |
| 12 | 90 |
| 7 | 98 |
| 10 | 84 |
Table 1: The Data Consists of Hours of Sleep the Night Before an Exam and Exam Score
Solution
The scatter plot of the data is shown in the image below.
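The plot and the strength of the relationship can also be checked with a short script. The following is a minimal sketch using only the Python standard library, with the nine pairs copied from the table above. The helper names `text_scatter` and `pearson_r` are hypothetical, and the text rendering is a rough stand-in for a proper plotting library: it assumes non-negative whole-number \(x\) values and at least two distinct \(y\) values.

```python
import math

# (hours of sleep, exam score) pairs copied from the table above.
data = [(8, 75), (6, 86), (3, 72), (4, 65), (2, 68),
        (0, 50), (12, 90), (7, 98), (10, 84)]


def text_scatter(points, rows=10):
    """Return a rough text scatter plot with x across and y up the side.

    Assumes x values are non-negative integers and y is not constant.
    """
    x_max = max(x for x, _ in points)
    y_min = min(y for _, y in points)
    y_max = max(y for _, y in points)
    step = (y_max - y_min) / (rows - 1)
    # One column per whole unit of x, `rows` bands of y.
    grid = [[" "] * (x_max + 1) for _ in range(rows)]
    for x, y in points:
        band = round((y - y_min) / step)  # band 0 is the lowest y
        grid[rows - 1 - band][x] = "*"    # row 0 is the top of the plot
    lines = []
    for i, row in enumerate(grid):
        label = y_min + (rows - 1 - i) * step
        lines.append(f"{label:5.0f} |" + "".join(f"{cell:>3}" for cell in row))
    # Final line labels the x-axis, aligned with the columns above.
    lines.append(" " * 7 + "".join(f"{x:>3}" for x in range(x_max + 1)))
    return "\n".join(lines)


def pearson_r(points):
    """Compute Pearson's correlation coefficient for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


print(text_scatter(data))
print(f"Pearson's r = {pearson_r(data):.2f}")
```

For these nine points the sketch gives an r of about 0.78, which suggests a fairly strong positive linear association: in this sample, students who slept more tended to earn higher scores. Because the data are observational, this pattern alone does not show that more sleep causes higher scores.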
Author
"10.1: Bivariate Data and Scatter Plots" by Alfie Swan is licensed under CC BY-SA 4.0
Attributions
"12.1: Correlation" by Rachel Webb is licensed under CC BY-SA 4.0


