Regression finds the best-fitting straight line through paired data $(x_i,y_i)$, used to predict one variable from the other.
Two regression lines
There are two distinct lines — one for predicting $y$ from $x$, the other for predicting $x$ from $y$:
$$\text{Line of }y\text{ on }x:\quad y-\bar y=b_{yx}\,(x-\bar x),$$
$$\text{Line of }x\text{ on }y:\quad x-\bar x=b_{xy}\,(y-\bar y).$$
Two key facts
- Both regression lines pass through the mean point $(\bar x,\bar y)$ — so solving them simultaneously recovers the means.
- Use the $y$-on-$x$ line to predict $y$ and the $x$-on-$y$ line to predict $x$; using the wrong line gives wrong predictions.