Entry 10 of 13
ML Fundamentals Series

MSE Penalizes Big Mistakes Harder: R² Tells You If Your Model Even Learned Anything

You've trained a regression model. It outputs numbers. How do you know if those numbers are any good? Two metrics cover this from different angles: MSE measures the size of your errors, while $R^2$ measures whether your model learned anything beyond predicting the mean.

Mean Squared Error: for each prediction, compute the difference between actual and predicted value, square it, average across all predictions:

$$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

The squaring does two things: negative and positive errors don't cancel, and large errors get penalized disproportionately (error of 10 → squared error of 100; error of 1 → 1). This makes MSE sensitive to outliers. RMSE (the square root of MSE) is just as sensitive to large errors, but it is expressed in the units of your original variable, which makes it easier to interpret.
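To make the outlier sensitivity concrete, here is a minimal sketch in plain Python. The helper names `mse` and `rmse` and the sample values are made up for illustration; they are not from any particular library. Both sets of predictions have the same total absolute error (10), but one spreads it evenly and the other concentrates it in a single prediction.

```python
import math


def mse(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of MSE, in the same units as the target variable."""
    return math.sqrt(mse(y_true, y_pred))


# Hypothetical actual values and two hypothetical sets of predictions.
y_true = [10.0, 20.0, 30.0, 40.0]
spread_out = [12.5, 17.5, 32.5, 37.5]     # every prediction off by 2.5
one_big_miss = [10.0, 20.0, 30.0, 50.0]   # three exact, one off by 10

mse_spread = mse(y_true, spread_out)      # (6.25 * 4) / 4 = 6.25
mse_big = mse(y_true, one_big_miss)       # (0 + 0 + 0 + 100) / 4 = 25.0

print(mse_spread, rmse(y_true, spread_out))   # 6.25 2.5
print(mse_big, rmse(y_true, one_big_miss))    # 25.0 5.0
```

The single large miss produces an MSE four times higher than the evenly spread errors, even though the total absolute error is identical.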

$R^2$ (the coefficient of determination) measures goodness of fit. It is at most 1 and usually falls between 0 and 1:

$$R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{total}}} = 1 - \frac{\sum(y_i - \hat{y}_i)^2}{\sum(y_i - \bar{y})^2}$$

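Here $\bar{y}$ is the mean of the actual values, so the denominator is the squared error you would get by always predicting the mean. $R^2 = 1$ means a perfect fit. $R^2 = 0$ means the model does no better than always predicting the mean. $R^2 < 0$ is also possible: it means the model is actively worse than the mean baseline.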
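The three cases can be checked with a short sketch that follows the formula directly. The function name `r_squared` and the data are illustrative assumptions, and the guard for a zero denominator is a design choice of this sketch, not something the formula itself specifies.

```python
def r_squared(y_true, y_pred):
    """1 - SS_res / SS_total, following the formula term by term."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_total = sum((y - mean_y) ** 2 for y in y_true)
    if ss_total == 0:
        # All actual values are identical, so the ratio is undefined.
        raise ValueError("R^2 is undefined when the actual values have no variance")
    return 1 - ss_res / ss_total


# Hypothetical actual values: mean is 3.0, SS_total is 4 + 1 + 0 + 1 + 4 = 10.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]

perfect = [1.0, 2.0, 3.0, 4.0, 5.0]          # SS_res = 0
mean_baseline = [3.0, 3.0, 3.0, 3.0, 3.0]    # SS_res = 10, same as SS_total
reversed_pred = [5.0, 4.0, 3.0, 2.0, 1.0]    # SS_res = 16 + 4 + 0 + 4 + 16 = 40

print(r_squared(y_true, perfect))        # 1.0
print(r_squared(y_true, mean_baseline))  # 0.0
print(r_squared(y_true, reversed_pred))  # -3.0
```

The reversed predictions get the trend backwards, so their squared error is four times that of the mean baseline, which pushes $R^2$ well below zero.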