Lines of Regression

Regression fits the best-fitting (least-squares) straight line to paired data $(x_i,y_i)$; the fitted line is then used to predict one variable from the other.

Two regression lines

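There are two distinct lines: one for predicting $y$ from $x$, the other for predicting $x$ from $y$. Here $b_{yx}$ and $b_{xy}$ denote the slopes (regression coefficients) of $y$ on $x$ and of $x$ on $y$ respectively: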

$$\text{Line of }y\text{ on }x:\quad y-\bar y=b_{yx}\,(x-\bar x),$$

$$\text{Line of }x\text{ on }y:\quad x-\bar x=b_{xy}\,(y-\bar y).$$
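As a rough illustration of how the two lines might be obtained from data, here is a minimal sketch. It assumes the standard least-squares definitions $b_{yx}=S_{xy}/S_{xx}$ and $b_{xy}=S_{xy}/S_{yy}$, which this section does not spell out; the function name `regression_lines` and the data are made up for the example.

```python
def regression_lines(xs, ys):
    """Return (x_bar, y_bar, b_yx, b_xy) for paired data.

    Assumes the usual least-squares slopes:
    b_yx = Sxy / Sxx (y on x) and b_xy = Sxy / Syy (x on y).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each take more than one value")
    return x_bar, y_bar, sxy / sxx, sxy / syy


# Made-up illustrative data, not taken from the text.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
x_bar, y_bar, b_yx, b_xy = regression_lines(xs, ys)
print(f"y on x: y - {y_bar} = {b_yx:.3f} (x - {x_bar})")
print(f"x on y: x - {x_bar} = {b_xy:.3f} (y - {y_bar})")
```

For this data the two slopes differ, so the two lines are genuinely different lines through the same mean point.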

Two key facts

  • Both regression lines pass through the mean point $(\bar x,\bar y)$, so solving them simultaneously recovers the means.
  • Use the $y$-on-$x$ line to predict $y$ and the $x$-on-$y$ line to predict $x$. The two lines coincide only when the correlation is perfect, so in general rearranging one line to predict the other variable does not give the regression estimate.
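Both facts can be checked numerically. The sketch below uses hypothetical summary values (means and slopes chosen only for illustration) and compares the $x$-on-$y$ estimate with the value obtained by rearranging the $y$-on-$x$ line; the helper names are assumptions, not part of the text.

```python
# Hypothetical summary statistics, chosen only for illustration.
X_BAR, Y_BAR = 3.0, 4.0
B_YX, B_XY = 0.6, 1.0


def predict_y(x):
    """Line of y on x: y - y_bar = b_yx (x - x_bar)."""
    return Y_BAR + B_YX * (x - X_BAR)


def predict_x(y):
    """Line of x on y: x - x_bar = b_xy (y - y_bar)."""
    return X_BAR + B_XY * (y - Y_BAR)


def invert_y_on_x(y):
    """Rearrange the y-on-x line for x (the 'wrong line' for estimating x)."""
    return X_BAR + (y - Y_BAR) / B_YX


# Fact 1: both lines pass through the mean point.
on_mean = (predict_y(X_BAR) == Y_BAR, predict_x(Y_BAR) == X_BAR)

# Fact 2: estimating x at y = 5 gives different answers from the two lines.
right = predict_x(5.0)
wrong = invert_y_on_x(5.0)
print(on_mean, right, round(wrong, 3))
```

With these values the correct line gives $x=4$, while rearranging the $y$-on-$x$ line gives roughly $4.667$.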
Worked Example 1
The regression line of $y$ on $x$ is $y=2x+3$. Predict $y$ when $x=5$.
Solution

$y=2(5)+3=13.$

Worked Example 2
The regression line of $x$ on $y$ is $x=0.5y+2$. Predict $x$ when $y=4$.
Solution

Substitute $y=4$: $x=0.5(4)+2=2+2=4.$ (Use the $x$-on-$y$ line precisely because we are predicting $x$.)

Worked Example 3
The two regression lines are $2x+3y=8$ and $x+2y=5$. Find $\bar x$ and $\bar y$.
Solution

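Both lines pass through $(\bar x,\bar y)$, so solve them simultaneously. From the second, $x=5-2y$; substituting into the first: $2(5-2y)+3y=8\Rightarrow 10-4y+3y=8\Rightarrow y=2$, then $x=5-2(2)=1$. Check in the first line: $2(1)+3(2)=8$. So $\bar x=1,\ \bar y=2$.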
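The same calculation can be sketched in code. This is one possible approach, using Cramer's rule with exact fractions; the function name `intersection` and the `(a, b, c)` encoding of a line $ax+by=c$ are assumptions made for the example.

```python
from fractions import Fraction


def intersection(line1, line2):
    """Intersect two lines given as (a, b, c), meaning a*x + b*y = c.

    Uses Cramer's rule; Fractions keep the arithmetic exact.
    """
    a1, b1, c1 = map(Fraction, line1)
    a2, b2, c2 = map(Fraction, line2)
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel or identical; no unique intersection")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# The two regression lines from the worked example: 2x + 3y = 8 and x + 2y = 5.
x_bar, y_bar = intersection((2, 3, 8), (1, 2, 5))
print(x_bar, y_bar)  # prints "1 2"
```

The intersection is the mean point, matching $\bar x=1,\ \bar y=2$ from the hand calculation.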

Worked Example 4
Which line should you use to estimate $x$ from a known $y$?
Solution

The line of $x$ on $y$: $x-\bar x=b_{xy}(y-\bar y)$.

Key Points

  • Line of $y$ on $x$: $y-\bar y=b_{yx}(x-\bar x)$; line of $x$ on $y$: $x-\bar x=b_{xy}(y-\bar y)$.
  • Both lines pass through $(\bar x,\bar y)$.
  • Predict $y$ with the $y$-on-$x$ line; predict $x$ with the $x$-on-$y$ line.
  • Solve the two lines simultaneously to get the means.
Q1. Both regression lines pass through:
Explanation: They intersect at the mean point $(\bar x,\bar y)$.
Q2. To predict $y$ from $x$, use the line of:
Explanation: Use $y$ on $x$.
Q3. If $y=2x+3$ ($y$ on $x$), then at $x=5$, $y=$
Explanation: $2(5)+3=13$.
Q4. Solving the two regression lines simultaneously gives:
Explanation: Their intersection is the mean point.