13.2 The Correlation Coefficient r
- one group of subjects, some of whom possess characteristics of trait A, the remainder possessing those of trait B
- measures of trait A on one group of subjects and of trait B on another group
- two groups of subjects, one which could be classified as A or not A, the other as B or not B
- 81% of the variation in the money spent for repairs is explained by the age of the auto
- 81% of money spent for repairs is unexplained by the age of the auto
- 90% of the money spent for repairs is explained by the age of the auto
- none of the above
- 20
- 16
- 40
- 80
- plus and minus 10% from the means includes about 68% of the cases
- one-tenth of the variance of one variable is shared with the other variable
- one-tenth of one variable is caused by the other variable
- on a scale from -1 to +1, the degree of linear relationship between the two variables is +.10
- X and Y have standard distributions
- the variances of X and Y are equal
- there exists no relationship between X and Y
- there exists no linear relationship between X and Y
- none of these
- Approximately 0.9
- Approximately 0.4
- Approximately 0.0
- Approximately -0.4
- Approximately -0.9
- height is expressed in centimeters.
- weight is expressed in kilograms.
- both of the above will affect r.
- neither of the above changes will affect r.
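The effect of a change of units on r can be checked numerically. The following is a minimal sketch, not part of the original exercise: the heights and weights are hypothetical values made up for illustration, the starting units (inches and pounds) are an assumption, and `pearson_r` is a helper name introduced here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for this sketch only.
heights_in = [60, 62, 65, 68, 70, 72]         # assumed inches
weights_lb = [115, 120, 140, 155, 160, 180]   # assumed pounds

# Convert to centimeters and kilograms (positive rescaling of each variable).
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

r_original = pearson_r(heights_in, weights_lb)
r_converted = pearson_r(heights_cm, weights_kg)
print(r_original, r_converted)
```

The two printed values agree up to floating-point rounding, because multiplying a variable by a positive constant rescales the numerator and the denominator of r by the same factor.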
13.3 Testing the Significance of the Correlation Coefficient
- anxiety causes neuroticism
- those who score low on one test tend to score high on the other.
- those who score low on one test tend to score low on the other.
- no prediction from one test to the other can be meaningfully made.
13.4 Linear Equations
13. True or False? If False, correct it: Suppose a \(95 \%\) confidence interval for the slope \(\beta\) of the straight line regression of Y on X is given by \(-3.5<\beta<-0.5\). Then a two-sided test of the hypothesis \(H_0\) : \(\beta=-1\) would result in rejection of \(H_0\) at the \(1 \%\) level of significance.
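One way to reason about this statement is the link between a confidence interval and a two-sided test at the matching level: the test rejects exactly when the hypothesized value falls outside the interval. The sketch below only encodes that rule; the function name `rejects_at_matching_level` is hypothetical.

```python
def rejects_at_matching_level(ci_lower, ci_upper, hypothesized):
    """Two-sided test at level alpha rejects H0 exactly when the
    hypothesized value lies outside the (1 - alpha) confidence interval."""
    return not (ci_lower <= hypothesized <= ci_upper)

# 95% interval from the exercise and the hypothesized slope beta = -1.
rejected_at_5_percent = rejects_at_matching_level(-3.5, -0.5, -1.0)
print(rejected_at_5_percent)
```

Since a 99% interval built from the same data is wider than the 95% interval, a value that sits inside the 95% interval also sits inside the 99% interval.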
X: Number of widgets purchased – 1, 3, 6, 10, 15
Y: Cost per widget (in dollars) – 55, 52, 46, 32, 25
Suppose the regression line is \(\hat{y}=-2.5 x+60\). We compute the average price per widget if 30 are purchased and observe which of the following?
a. \(\hat{y}=-15\) dollars; obviously, we are mistaken; the prediction \(\hat{y}\) is actually +15 dollars.
b. \(\hat{y}=15\) dollars, which seems reasonable judging by the data.
c. \(\hat{y}=-15\) dollars, which is obvious nonsense. The regression line must be incorrect.
d. \(\hat{y}=-15\) dollars, which is obvious nonsense. This reminds us that predicting Y outside the range of X values in our data is a very poor practice.
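The arithmetic behind this question can be sketched as follows. The helper `predict` and the flag `inside_range` are names introduced here for illustration; the line and the data are the ones stated in the exercise.

```python
# Data from the exercise.
xs = [1, 3, 6, 10, 15]        # number of widgets purchased
ys = [55, 52, 46, 32, 25]     # cost per widget, in dollars

def predict(x, slope=-2.5, intercept=60.0):
    """Evaluate the stated regression line y-hat = -2.5 x + 60."""
    return slope * x + intercept

x_new = 30
y_hat = predict(x_new)

# Check whether x_new lies within the range of X values that produced the line.
inside_range = min(xs) <= x_new <= max(xs)
print(y_hat, inside_range)
```

Comparing `x_new` with the smallest and largest observed X is a quick check of whether a prediction is an interpolation or an extrapolation.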
13.5 The Regression Equation
Information:
- miles driven per day
- weight of car
- number of cylinders in car
- average speed
- miles per gallon
- number of passengers
- there is no relationship between Y and X in the sample
- there is no relationship between Y and X in the population
- there is a perfect negative relationship between Y and X in the population
- there is a perfect negative relationship between Y and X in the sample.
- negative.
- low.
- heterogeneous.
- between two measures that are unreliable.
13.6 Interpretation of Regression Coefficients: Elasticity and Logarithmic Transformation
X | Y |
4 | 8 |
2 | 4 |
8 | 18 |
6 | 22 |
10 | 30 |
6 | 8 |
Regression equation: \(\hat{y}_i=-3.6+3.1 \cdot X_i\)
What is your estimate of the average height of all trees having a trunk diameter of 7 inches?
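The stated regression equation can be reproduced from the table with the usual least-squares formulas. This is a minimal sketch in plain Python; `fit_line` is a helper name introduced here, and the reading of X as trunk diameter and Y as tree height is taken from the wording of the question.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Data from the table in the exercise.
xs = [4, 2, 8, 6, 10, 6]
ys = [8, 4, 18, 22, 30, 8]

intercept, slope = fit_line(xs, ys)

# Point estimate of the mean of Y at X = 7.
estimate_at_7 = intercept + slope * 7
print(round(intercept, 2), round(slope, 2), round(estimate_at_7, 2))
```

The fitted values round to an intercept of -3.6 and a slope of 3.1, which agrees with the equation given above, and the estimate at X = 7 is about 18.1 in the units of Y.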
Suppose that a test has been conducted and results from a computer include:
Intercept = 60
Slope = −4
Standard error of the regression coefficient = 1.0
Degrees of Freedom for Error = 2000
95% confidence interval for the slope: (−5.96, −2.04)
Is this evidence consistent with the claim that the number of fleas is reduced at a rate of 5 fleas per unit chemical?
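The interval in the printout can be rebuilt from the slope and its standard error. The sketch below assumes a critical value of 1.96, the standard normal approximation to Student's t with 2000 degrees of freedom, and reads the claim "reduced at a rate of 5 fleas per unit chemical" as a slope of -5; both are assumptions made for this illustration.

```python
slope = -4.0       # estimated slope from the printout
std_error = 1.0    # standard error of the regression coefficient
t_crit = 1.96      # assumption: normal approximation for 2000 df, 95% level

lower = slope - t_crit * std_error
upper = slope + t_crit * std_error

claimed_slope = -5.0   # assumption: the claim expressed as a slope
consistent = lower <= claimed_slope <= upper
print(round(lower, 2), round(upper, 2), consistent)
```

The computed endpoints match the two values reported in the printout, and the check reduces to asking whether the claimed slope lies between them.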
13.7 Predicting with a Regression Equation
The fitted trend line is
\(\hat{y}_j=80+1.5 \cdot X_j\)
(\(Y_j\): average yield in year \(j\) after introduction)
(\(X_j\): year \(j\) after introduction).
- What is the estimated average yield for the fourth year after introduction?
- Do you want to use this trend line to estimate yield for, say, 20 years after introduction? Why? What would your estimate be?
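Evaluating the trend line at the two years mentioned is a matter of substitution. The sketch below uses a hypothetical helper, `trend_yield`. The exercise does not state how many years of data the line was fitted to, so whether year 20 lies outside that range is left as an open question in the code.

```python
def trend_yield(year, intercept=80.0, slope=1.5):
    """Evaluate the fitted trend line y-hat = 80 + 1.5 * year."""
    return intercept + slope * year

yield_year_4 = trend_yield(4)
yield_year_20 = trend_yield(20)

# The number of years used to fit the line is not given in the exercise,
# so this code cannot decide whether year 20 is an extrapolation.
print(yield_year_4, yield_year_20)
```

The formula returns a number for any year, so the value for year 20 is only as trustworthy as the assumption that the linear trend continues that far.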
- most
- half
- very little
- one quarter
- none of these
- \(r=1.18\)
- \(r=-.77\)
- \(r=.68\)
13.8 How to Use Microsoft Excel® for Regression Analysis
Part of the computer output includes:
i | \(b_i\) | \(S_{b_i}\) |
0 | 8 | 1.6 |
1 | 2.2 | .24 |
2 | -.72 | .32 |
3 | 0.005 | 0.002 |
Calculation of the confidence interval for \(b_2\) consists of _______ ± (a Student's t value) (_______)
- The confidence level for this interval is reflected in the value used for _______.
- The degrees of freedom available for estimating the variance are directly concerned with the value used for _______
Variable | Coefficient | Standard Error of \(b_i\) |
1 | 0.45 | 0.21 |
2 | 0.80 | 0.10 |
3 | 3.10 | 0.86 |
- 0.80 is an estimate of ___________.
- 0.10 is an estimate of ___________.
- Assuming the responses satisfy the normality assumption, we can be 95% confident that the value of \(\beta_2\) is in the interval _______ ± [\(t_{.025}\) ⋅ _______], where \(t_{.025}\) is the critical value of the Student's t-distribution with ____ degrees of freedom.