# 10: Correlation and Regression

Our interest in this chapter is in situations in which we can associate with each element of a population or sample two measurements $$x$$ and $$y$$, particularly when it is of interest to use the value of $$x$$ to predict the value of $$y$$. For example, the population could be the air in automobile garages, $$x$$ could be the electrical current produced by an electrochemical reaction taking place in a carbon monoxide meter, and $$y$$ the concentration of carbon monoxide in the air. In this chapter we will learn statistical methods for analyzing the relationship between variables $$x$$ and $$y$$ in this context.

• 10.1: Linear Relationships Between Variables
In this chapter we will analyze situations in which variables x and y exhibit a linear relationship with some randomness. The level of randomness will vary from situation to situation.
• 10.2: The Linear Correlation Coefficient
The linear correlation coefficient measures the strength and direction of the linear relationship between two variables x and y. The sign of the linear correlation coefficient indicates the direction of the linear relationship between x and y.
• 10.3: Modelling Linear Relationships with Randomness Present
For any statistical procedure, given in this book or elsewhere, the associated formulas are valid only under specific assumptions. The set of assumptions in simple linear regression is a mathematical description of the relationship between x and y. Such a set of assumptions is known as a model.
• 10.4: The Least Squares Regression Line
How well a straight line fits a data set is measured by the sum of the squared errors. The least squares regression line is the line that best fits the data. Its slope and y-intercept are computed from the data using formulas. The slope of the least squares regression line estimates the size and direction of the mean change in the dependent variable y when the independent variable x is increased by one unit.
• 10.5: Statistical Inferences About β₁
The parameter β₁, the slope of the population regression line, is of primary importance in regression analysis because it gives the true rate of change in the mean E(y) in response to a unit increase in the predictor variable x.
• 10.6: The Coefficient of Determination
The coefficient of determination estimates the proportion of the variability in the variable y that is explained by the linear relationship between y and the variable x. There are several formulas for computing it; the choice of which one to use can be based on which quantities have already been computed.
• 10.7: Estimation and Prediction
The least squares regression line can be used in two ways: to estimate the mean value of y at a given value of x, and to predict the value of y for an individual element with that value of x.
• 10.8: A Complete Example
In this section we will go through a complete example of the use of correlation and regression analysis of data from start to finish, touching on all the topics of this chapter in sequence.
• 10.9: Formula List
• 10.E: Correlation and Regression (Exercises)
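
As a rough preview of the quantities named in Sections 10.2, 10.4, and 10.6, the following Python sketch computes the linear correlation coefficient, the slope and y-intercept of the least squares regression line, the sum of the squared errors, and the coefficient of determination for a small data set. It uses the standard sums-of-squares formulas; the function name `regression_summary`, the dictionary keys, and the five data pairs are made up for illustration and are not taken from the chapter.

```python
from math import sqrt


def regression_summary(xs, ys):
    """Summarize the linear relationship between paired measurements x and y.

    Hypothetical helper for illustration only; uses the usual
    sums-of-squares formulas for simple linear regression.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sums of squares and cross products about the means.
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if ss_xx == 0 or ss_yy == 0:
        raise ValueError("x and y must each take at least two distinct values")

    # Least squares regression line: y_hat = intercept + slope * x.
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar

    # Linear correlation coefficient; its sign matches the sign of the slope.
    r = ss_xy / sqrt(ss_xx * ss_yy)

    # Sum of the squared errors of the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

    # Coefficient of determination, computed as the proportion of the
    # variability in y that the fitted line accounts for.
    r_squared = (ss_yy - sse) / ss_yy

    return {
        "slope": slope,
        "intercept": intercept,
        "r": r,
        "sse": sse,
        "r_squared": r_squared,
    }


# Illustrative data only (not from the text).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
summary = regression_summary(xs, ys)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The coefficient of determination is computed here from the sum of the squared errors, but squaring `r` gives the same number, which illustrates the remark in Section 10.6 that the formula can be chosen according to which quantities are already available.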