Linear regression looks at correlation in terms of predictability.
Linear regression finds the best-fitting line: Y' = a + bX [or, for predicting X from Y, X' = a_x + b_x Y].
This is called the least squares regression line [where Y' is the predicted value of Y]: it minimizes the sum of squared deviations between the actual observed values of Y and the predicted values Y'. These deviations (Y - Y') are called residuals, and they sum to 0. [Note that if you were to then plot the residuals Yres against X, there would be no linear relation; the correlation would be 0.]
[The linear regression line can be thought of as the straight line that summarizes the linear relationship in a scatterplot by, on average, passing through the average of the Y scores for each X.]
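The two residual properties described above (residuals sum to 0 and are uncorrelated with X) can be checked with a small sketch. The data values and the helper names least_squares_line and pearson_r are made up for illustration; they are not from these notes.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (a, b) for the line Y' = a + b*X that minimizes squared residuals."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = my - b * mx        # intercept: the line passes through (mean X, mean Y)
    return a, b

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical scores, chosen only for illustration.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 2.5, 4.5, 4.0, 6.5, 6.0]

a, b = least_squares_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print("a =", a, "b =", b)
print("sum of residuals (should be about 0):", sum(residuals))
print("r between X and residuals (should be about 0):", pearson_r(xs, residuals))
```

Both printed checks should come out at 0 up to floating-point rounding, whatever data are used, as long as X is not constant and the fit is not perfect.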
For perfect correlations (r = ±1.0):
1) Every participant who obtained a given value of X obtained one, and only one, value of Y: there are no differences in Y scores for a given X.
2) Y scores are perfectly predictable from X scores: the data points for a given X are all on top of one another and all data points fall along the regression line.
For intermediate correlations:
1) There are different values of Y for each X; however, these different Ys are relatively close in value (the variability in Y associated with a given X is less than the overall variability in Y).
2) Knowing X allows prediction of approximately what Y will be: data points will fall near the regression line but not on it.
For zero correlation:
1) Y scores are as variable at a given value of X as in the overall sample.
2) The best prediction of Y, regardless of X, will be the average of Y: the regression line is flat (b = 0), so X contributes nothing to the prediction.
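The three cases above can be compared numerically with a rough sketch. It uses the spread of the residuals around the line as a stand-in for "variability in Y at a given X" (an assumption of this sketch, reasonable only when that spread is similar at every X). The data sets and the names fit_line and unexplained_fraction are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least squares intercept and slope for Y' = a + b*X."""
    mx, my = mean(xs), mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def unexplained_fraction(xs, ys):
    """Residual sum of squares divided by total sum of squares of Y.

    This equals 1 - r**2: the share of Y's variability left over
    after predicting Y from X.
    """
    a, b = fit_line(xs, ys)
    my = mean(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return ss_res / ss_tot

# Hypothetical data sets, one per case.
xs = [1, 2, 3, 4, 5]
perfect = [3, 5, 7, 9, 11]        # r = 1.0
intermediate = [2, 1, 4, 3, 5]    # r = 0.8
zero = [2, 1, 3, 1, 2]            # r = 0.0

for label, ys in [("perfect", perfect), ("intermediate", intermediate), ("zero", zero)]:
    a, b = fit_line(xs, ys)
    print(label, "slope:", b, "unexplained fraction:", unexplained_fraction(xs, ys))
```

In the zero-correlation data the fitted slope is 0 and the intercept is the mean of Y, so every prediction is simply the average of Y.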
Using standard scores and 2 variables [1 IV], the regression coefficient (b) [or raw score regression weight] = the standardized regression weight (or β) = the correlation coefficient (r).
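A minimal sketch of this equality, assuming made-up data and hypothetical helper names (slope, z_scores): with raw scores the slope is r times (S.D. of Y / S.D. of X), and once both variables are converted to z scores the slope is r itself.

```python
from statistics import mean, pstdev

def slope(xs, ys):
    """Least squares slope for predicting ys from xs."""
    mx, my = mean(xs), mean(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def z_scores(values):
    """Convert raw scores to standard scores (population S.D.)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical raw scores, chosen only for illustration.
xs = [10, 12, 15, 19, 24]
ys = [3.0, 4.5, 4.0, 6.5, 7.0]

zx, zy = z_scores(xs), z_scores(ys)

# Pearson r as the mean cross-product of z scores.
r = mean([u * v for u, v in zip(zx, zy)])

b_raw = slope(xs, ys)     # raw score regression weight
b_std = slope(zx, zy)     # standardized regression weight

print("r =", r)
print("slope on z scores =", b_std)
print("raw slope =", b_raw, " r * sd(Y) / sd(X) =", r * pstdev(ys) / pstdev(xs))
```

This holds only for the one-predictor case described here; with several predictors the standardized weights generally differ from the simple correlations.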


As the correlation grows less strong, Y' moves less in response to a given change in X (the slope, b, approaches 0). If standard scores (z scores) are plotted, the slope of the least squares regression line = r [r = the change in S.D. units in Y' (the predicted value of Y) associated with a change of 1 S.D. in X]. If r = 0, the best predictor of Y from X is the mean of Y, and the best predictor of X from Y is the mean of X. If r = ±1.0, the regression line from regressing Y on X and the regression line from regressing X on Y are the same line (and it passes through the point (mean of X, mean of Y)).
As the correlation between X and Y weakens from ±1.0, the predicted value Y' for Zx = 1 will be Zy' < 1, and the predicted value X' for Zy = 1 will be Zx' < 1. The regression lines predicting Y' from X and X' from Y diverge with decreasing correlation until, at r = 0.0, they are perpendicular: horizontal and vertical lines passing through the means of Y and X respectively. This can lead to regression artifacts (e.g., Rushton: women less brainy than men).
And remember the cautions, which are the same as for correlations: regression assumes linear relations among variables, truncated ranges can reduce correlations or regressions, and outliers, heteroscedasticity, etc. can also cause problems.
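The divergence of the two regression lines can be sketched in z-score units. On a plot with Zx horizontal and Zy vertical, predicting Y from X gives the line Zy' = r * Zx, and predicting X from Y gives Zx' = r * Zy. The function names predicted_z and angle_between_regression_lines are hypothetical, and the r values are arbitrary examples.

```python
import math

def predicted_z(r, z):
    """Standardized prediction, Z' = r * Z (the same form in either direction)."""
    return r * z

def angle_between_regression_lines(r):
    """Angle in degrees between the two regression lines on a Zx-by-Zy plot."""
    y_on_x = math.atan2(r, 1.0)   # line Zy' = r * Zx has direction (1, r)
    x_on_y = math.atan2(1.0, r)   # line Zx' = r * Zy has direction (r, 1)
    diff = math.degrees(x_on_y - y_on_x) % 180.0
    # Lines have no direction, so report the smaller of the two angles.
    return min(diff, 180.0 - diff)

for r in (1.0, 0.8, 0.5, 0.2, 0.0):
    print("r =", r,
          "| Zy' at Zx=1:", predicted_z(r, 1.0),
          "| Zx' at Zy=1:", predicted_z(r, 1.0),
          "| angle between lines:", round(angle_between_regression_lines(r), 1))
```

The angle is 0 when r = ±1.0 (one shared line) and grows to 90 degrees at r = 0, while the prediction for a score 1 S.D. above the mean shrinks from 1 toward 0. That shrinkage toward the mean is what underlies regression artifacts.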