15.2.1: Using Linear Equations
Before we start practicing calculating all of the variables in a regression line equation, let's work a little with just the equation on its own.
Regression Line Equations
As we just learned, linear regression for two variables is based on a linear equation:
\[\widehat{\mathrm{Y}}=\mathrm{a}+(\mathrm{b}*{X}) \nonumber \]
where \(a\) and \(b\) are constant numbers. This means that, within a given sample, the intercept (a) and the slope (b) are the same for every score. The X score will change, and that affects Y (or predicted Y, or \(\widehat{\mathrm{Y}}\)). Some consider the predictor variable (X) as an IV and the outcome variable (Y) as the DV, but be careful that you aren't confusing prediction with causation!
We also just learned that the graph of a linear equation of the form \(\widehat{\mathrm{Y}}=\mathrm{a}+(\mathrm{b}*{X})\) is a straight line.
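To make the equation concrete, here is a minimal Python sketch of \(\widehat{\mathrm{Y}}=\mathrm{a}+(\mathrm{b}*{X})\). The function name `predict_y` and the constants are made up for illustration only; they do not come from any data set in this chapter.

```python
def predict_y(a, b, x):
    """Return the predicted Y for a score x, given intercept a and slope b."""
    return a + (b * x)

# Hypothetical constants, chosen only to illustrate the equation.
a, b = 2.0, 0.5

# The intercept and slope stay fixed; only the X score changes.
predictions = [predict_y(a, b, x) for x in [0, 1, 2, 3]]
print(predictions)
```

Notice that the same \(a\) and \(b\) are reused for every X score, which is what it means for them to be constants.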
Exercise \(\PageIndex{1}\)
Is the following an example of a linear equation? Why or why not?
 Answer

No, the graph is not a straight line; therefore, it is not a linear equation.
The minimum criterion for using a linear regression formula is that there be a linear relationship between the predictor and the criterion (outcome) variables.
Exercise \(\PageIndex{2}\)
What statistic shows us whether two variables are linearly related?
 Answer

Pearson's r (correlation).
If two variables aren’t linearly related, then you can’t use linear regression to predict one from the other! The stronger the linear relationship (the further Pearson’s correlation is from zero, in either the positive or the negative direction), the more accurate the predictions based on linear regression will be.
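As a rough illustration of why the linear relationship matters, the sketch below computes Pearson's r by hand for two small, made-up sets of scores: one that falls exactly on a straight line and one that follows a curve. The helper name `pearson_r` and all of the scores are assumptions for this example, not data from the text.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation: sum of products over the square root of SSx * SSy."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_products / math.sqrt(ss_x * ss_y)

# Made-up scores for illustration only.
x_scores = [1, 2, 3, 4, 5]
linear_y = [3, 5, 7, 9, 11]   # falls exactly on a straight line
curved_y = [4, 1, 0, 1, 4]    # U-shaped, so not linearly related to x

r_linear = pearson_r(x_scores, linear_y)
r_curved = pearson_r(x_scores, curved_y)
print(r_linear, r_curved)
```

The curved scores are clearly related to X, but not in a straight-line way, so Pearson's r comes out at zero and a regression line would not help predict them.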
Slope and Y-Intercept of a Linear Equation
As we learned previously, \(b =\) slope and \(a = y\)-intercept. From algebra, recall that the slope is a number that describes the steepness of a line, and the \(y\)-intercept is the \(y\) coordinate of the point \((0, a)\) where the line crosses the \(y\)-axis. Figure \(\PageIndex{2}\) shows three possible graphs of the regression equation (\(y = a + b\text{x}\)). Panel (a) shows what the regression line looks like if the slope is positive (\(b > 0\)); the line slopes upward to the right. Panel (b) shows what the regression line looks like if there's no slope (\(b = 0\)); the line is horizontal. Finally, Panel (c) shows what the regression line looks like if the slope is negative (\(b < 0\)); the line slopes downward to the right.
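The three cases for the sign of the slope can be summarized in a few lines of Python. This is only a sketch; the function name `describe_slope` is made up for this example.

```python
def describe_slope(b):
    """Describe the direction of a regression line from the sign of its slope b."""
    if b > 0:
        return "slopes upward to the right"
    if b < 0:
        return "slopes downward to the right"
    return "horizontal"

# Try one slope of each kind (values chosen arbitrarily).
for b in [2, 0, -2]:
    print(b, describe_slope(b))
```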
I get it, everything has been pretty theoretical so far. So let's get practical. Let's try constructing the regression line equation even when you don't have the scores for either of the variables. First, we'll start by identifying the variables in the examples.
Example \(\PageIndex{1}\)
Svetlana tutors to make extra money for college. For each tutoring session, she charges a one-time fee of $25 plus $15 per hour of tutoring. A linear equation that expresses the total amount of money Svetlana earns for each session she tutors is \(y = 25 + 15\text{x}\).
What are the predictor and criterion (outcome) variables? What is the \(y\)-intercept and what is the slope? Answer using complete sentences.
Answer
The predictor variable, \(x\), is the number of hours Svetlana tutors each session. The criterion (outcome) variable, \(y\), is the amount, in dollars, Svetlana earns for each session.
The \(y\)-intercept is the constant, the one-time fee of $25 (\(a = 25\)). The slope is 15 (\(b = 15\)) because Svetlana earns $15 for each hour she tutors.
Although it doesn't make sense in these examples, the \(y\)-intercept (a) is determined when \(x = 0\). I guess with Svetlana, you could say that she gets $25 for any sessions that you miss or don't cancel ahead of time. But geometrically and mathematically, the \(y\)-intercept is based on when the predictor variable (x) has a value of zero.
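Here is a small sketch of Svetlana's equation, \(y = 25 + 15\text{x}\), in Python. The function name `svetlana_earnings` and the hours plugged in are illustrative choices, not part of the original example.

```python
def svetlana_earnings(hours):
    """Total dollars earned for one session: $25 one-time fee plus $15 per hour."""
    return 25 + 15 * hours

# Hours are illustrative; 0 hours shows the y-intercept on its own.
for hours in [0, 1, 2, 3]:
    print(hours, svetlana_earnings(hours))
```

When the number of hours is zero, the only thing left is the intercept, $25. Each additional hour adds the slope, $15.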
Exercise \(\PageIndex{3}\)
Jamal repairs household appliances like dishwashers and refrigerators. For each visit, he charges $25 plus $20 per hour of work. A linear equation that expresses the total amount of money Jamal earns per visit is \(y = 25 + 20\text{x}\).
What are the predictor and criterion (outcome) variables? What is the \(y\)-intercept and what is the slope? Answer using complete sentences.
 Answer

The predictor variable, \(x\), is the number of hours Jamal works each visit. The criterion (outcome) variable, \(y\), is the amount, in dollars, Jamal earns for each visit.
The \(y\)-intercept is 25 (\(a = 25\)). At the start of a visit, Jamal charges a one-time fee of $25 (this is when \(x = 0\)). The slope is 20 (\(b = 20\)). For each visit, Jamal earns $20 for each hour he works.
Now, we can start constructing the regression line equations.
Example \(\PageIndex{2}\)
Alejandra's Word Processing Service (AWPS) does word processing. The rate for services is $32 per hour plus a $31.50 one-time charge. The total cost to a customer depends on the number of hours it takes to complete the job.
Find the equation that expresses the total cost in terms of the number of hours required to complete the job. For this example,
 \(x =\) the number of hours it takes to get the job done.
 \(y =\) the total cost to the customer.
Answer
The $31.50 is a fixed cost. This is the number that you add after calculating the rest, so it must be the intercept (a).
If it takes \(x\) hours to complete the job, then \((32)(x)\) is the cost of the word processing only.
Thus, the total cost is: \(y = 31.50 + 32\text{x}\)
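One way to check an equation like this is to plug in a few values. The sketch below builds the line from a fixed cost and a per-hour rate; the helper name `make_line` and the job lengths are assumptions made for this illustration.

```python
def make_line(intercept, slope):
    """Return a function that computes y = intercept + slope * x."""
    def line(x):
        return intercept + slope * x
    return line

# AWPS: $31.50 one-time charge (intercept) plus $32 per hour (slope).
awps_total_cost = make_line(31.50, 32)

# Job lengths in hours, chosen only for illustration.
for hours in [0, 1, 3]:
    print(hours, awps_total_cost(hours))
```

A job that takes 3 hours would cost \(31.50 + (32)(3) = 127.50\) dollars.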
Let's try another example of constructing the regression line equation.
Exercise \(\PageIndex{4}\)
Elektra's Extreme Sports hires hang-gliding instructors and pays them a fee of $50 per class as well as $20 per student in the class. The total cost Elektra pays depends on the number of students in a class. Find the equation that expresses the total cost in terms of the number of students in a class.
 Answer

For this example,
 \(x =\) number of students in class
 \(y =\) the total cost
The constant is $50 per class, so that must be the intercept (a).
So $20 per student is the slope (b).
The resulting regression equation is: \(y = 50 + 20\text{x}\)
You can also use the regression equation to graph the line: input scores from your X variable into the equation to get the corresponding Y values, then plot those points. Let's see what that might look like in Figure \(\PageIndex{3}\) for the equation: \(y = 1 + 2\text{x}\)
In the example in Figure \(\PageIndex{3}\), the intercept (a) is replaced by 1 and the slope (b) is replaced by 2 to get the regression equation (\(y = 1 + 2\text{x}\)). Right now, you are being provided these constants. Soon, you'll be calculating them yourself!
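If you want to see where the points on such a line come from, this sketch computes a handful of (x, y) pairs for \(y = 1 + 2\text{x}\). The particular x values are arbitrary choices for illustration; plotting the resulting pairs and connecting them would trace out the straight line.

```python
a, b = 1, 2  # intercept and slope from the equation y = 1 + 2x

# Arbitrary x values chosen for illustration.
x_values = [-2, -1, 0, 1, 2]

# Each point pairs an x value with the y that the equation gives for it.
points = [(x, a + b * x) for x in x_values]
print(points)
```

The point where \(x = 0\) is \((0, 1)\), which is the intercept, and each step of 1 in x raises y by 2, which is the slope.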
Summary
The most basic type of association is a linear association. This type of relationship can be defined algebraically by the equations used, numerically with actual or predicted data values, or graphically from a plotted line. Algebraically, a linear equation typically takes the form \(y = mx + b\), where \(m\) and \(b\) are constants, \(x\) is the independent variable, and \(y\) is the dependent variable. In a statistical context, a linear equation is written in the form \(y = a + bx\), where \(a\) and \(b\) are the constants. This form is used to help readers distinguish the statistical context from the algebraic context. In the equation \(y = a + b\text{x}\), the constant \(b\) that multiplies the \(x\) variable is called the slope (\(b\) is also called a coefficient). The constant \(a\) is called the \(y\)-intercept.
The slope of a line is a value that describes the rate of change between the two quantitative variables. The slope tells us how the criterion variable (\(y\)) changes for every one unit increase in the predictor (\(x\)) variable, on average. The \(y\)-intercept is used to describe the criterion variable when the predictor variable equals zero.
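The "change per one-unit increase" idea can be checked directly. The sketch below uses Jamal's equation from earlier in this section, \(y = 25 + 20\text{x}\); the function name `jamal_earnings` and the hours tested are illustrative assumptions.

```python
def jamal_earnings(hours):
    """Jamal's total per visit: $25 fee plus $20 per hour of work."""
    return 25 + 20 * hours

# Change in y for each one-unit (one-hour) increase in x.
changes = [jamal_earnings(h + 1) - jamal_earnings(h) for h in [0, 1, 2, 3]]
print(changes)

# The value of y when the predictor equals zero is the y-intercept.
print(jamal_earnings(0))
```

Every one-hour increase changes the total by the same amount, the slope, no matter where on the line you start.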
Contributors and Attributions
Barbara Illowsky and Susan Dean (De Anza College) with many other contributing authors. Content produced by OpenStax College is licensed under a Creative Commons Attribution 4.0 License. Download for free at http://cnx.org/contents/30189442699...b91b9de@18.114.
