Two-Variable Data • Topic 2 of 5

Correlation

Correlation describes how strongly two variables move together and in what direction. A correlation can be positive (both variables tend to rise together), negative (one tends to rise as the other falls), or near zero (little or no linear relationship). The correlation coefficient is measured on a scale from −1 to 1: values near 1 or −1 indicate a strong linear relationship, and values near 0 indicate a weak one. Crucially, correlation is not causation: two variables can correlate without one causing the other. A useful related quantity is the residual, which measures how far an actual data value sits above or below the line of best fit. It is found as actual minus predicted, so a positive residual means the point lies above the line and a negative residual means it lies below.
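To make the −1 to 1 scale concrete, here is a minimal Python sketch that computes the standard Pearson correlation coefficient by hand. The function name pearson_r and the data (hours, score_up, score_down) are made up for illustration and do not come from this lesson.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each value from its own mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))  # how x and y vary together
    sxx = sum(a * a for a in dx)              # spread of x
    syy = sum(b * b for b in dy)              # spread of y
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data, invented for this example only.
hours = [1, 2, 3, 4, 5]
score_up = [52, 55, 61, 64, 70]    # rises as hours rise
score_down = [70, 64, 61, 55, 52]  # falls as hours rise

r_up = pearson_r(hours, score_up)
r_down = pearson_r(hours, score_down)
print(round(r_up, 3), round(r_down, 3))
```

With this data, r_up comes out close to 1 (strong positive) and r_down close to −1 (strong negative). Neither value says anything about whether one variable causes the other.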

[Figure: Correlation & residuals. Points near a rising line of best fit, with one residual marked. Axes: x and y. Residual = actual − predicted.]

✅ Solved examples

1. A line of best fit gives predicted y = 30 at x = 5, but the actual y is 34. Residual?
Actual − predicted = 34 − 30 = 4.
2. Predicted y = 50, actual y = 45. Residual?
45 − 50 = −5.
3. Does a correlation near 0 indicate a strong relationship?
No — near 0 means little or no linear relationship.
4. Ice-cream sales and drownings both rise in summer. Does one cause the other?
No — correlation is not causation; heat drives both.
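The residual calculations in examples 1 and 2 can be checked with a short Python sketch. The helper names residual and position are hypothetical, chosen only for this illustration.

```python
def residual(actual, predicted):
    """Residual = actual minus predicted."""
    return actual - predicted


def position(res):
    """Say where a point sits relative to the line of best fit."""
    if res > 0:
        return "above the line"
    if res < 0:
        return "below the line"
    return "on the line"


# Solved example 1: predicted 30, actual 34.
example_1 = residual(actual=34, predicted=30)
# Solved example 2: predicted 50, actual 45.
example_2 = residual(actual=45, predicted=50)

print(example_1, position(example_1))  # 4 above the line
print(example_2, position(example_2))  # -5 below the line
```

Using keyword arguments keeps the order explicit, which guards against the common slip of computing predicted minus actual.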

✏️ Practice — try these, take hints as needed

1. Predicted y = 20, actual y = 26. What is the residual?
Actual − predicted.
26 − 20.
6.
2. Predicted y = 40, actual y = 33. Residual?
33 − 40.
−7.
3. A correlation of −0.9 indicates what kind of relationship?
Sign = direction, size = strength.
Close to −1.
Strong negative.
4. A residual of 0 at a point means what?
Actual = predicted.
The point lies exactly on the line of best fit.
5. Predicted y = 15, actual y = 15. Residual?
15 − 15.
0.
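The "sign = direction, size = strength" rule from question 3 can be sketched in Python as below. The cutoffs 0.7 and 0.3 are an assumption made for this example: the lesson only says "near 1 or −1" and "near 0", and different textbooks draw these lines in different places. The function name describe_correlation is also hypothetical.

```python
def describe_correlation(r, strong=0.7, weak=0.3):
    """Describe a correlation coefficient in words.

    The strong/weak cutoffs are illustrative assumptions, not fixed rules.
    """
    if not -1 <= r <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    size = abs(r)
    if size < weak:
        return "little or no linear relationship"
    strength = "strong" if size >= strong else "moderate"
    direction = "positive" if r > 0 else "negative"  # sign gives direction
    return f"{strength} {direction}"


print(describe_correlation(-0.9))  # strong negative
print(describe_correlation(0.05))  # little or no linear relationship
```

The size of r is read from its absolute value and the direction from its sign, so −0.9 and 0.9 are equally strong but point in opposite directions.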

📝 Topic test — 8 questions

Auto-graded with full solutions; saved to your dashboard. Use the calculator and formula sheet (top-right) any time.
