Code&Data Insights
[Statistics] R-squared
[ R-squared ]
R-squared - a measure of the goodness-of-fit of a regression model.
- It represents the proportion of the variance in the dependent variable that is predictable from the independent variables.
- In simple linear regression, it is the percentage of variation in one variable that is explained by its linear relationship with the other variable.
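=> range : 0 to 1 for a least-squares fit with an intercept, evaluated on the data it was fitted to (it can be negative otherwise, for example on new data)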
=> R² = 1 - (SSR/SST)
SSR = the sum of squared residuals (the sum of the squared differences between the observed and predicted values of the dependent variable)
SST = the total sum of squares (the sum of the squared differences between the observed dependent variable and its mean)
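To make the formula concrete, here is a minimal sketch in plain Python that computes R² directly from SSR and SST. The function name `r_squared` and the sample numbers are made up for illustration; they are not from any particular library or dataset.

```python
def r_squared(observed, predicted):
    """Return R^2 = 1 - SSR/SST for paired observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_observed = sum(observed) / len(observed)
    # SSR: squared differences between observed and predicted values
    ssr = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    # SST: squared differences between observed values and their mean
    sst = sum((y - mean_observed) ** 2 for y in observed)
    if sst == 0:
        raise ValueError("SST is zero: all observed values are identical")
    return 1 - ssr / sst


# Hypothetical values, chosen only so the arithmetic is easy to follow.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]

# SSR = 0.5 and SST = 5.0, so R^2 is about 0.9
print(round(r_squared(observed, predicted), 3))
```

Predicting the mean of the observed values for every point gives SSR = SST, so R² = 0. That is the baseline the formula compares the model against.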
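-> A high R-squared does not cause overfitting, but it can be a sign of it ! Adding more predictors never lowers R-squared on the training data, so a very high value alone does not show that the model is good.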
-> Overfitting: When a model is overfitted to the training data, it can have poor predictive performance on new data, and R-squared computed on that new data can even be negative
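-> A negative R-squared means SSR is larger than SST, i.e. the model predicts worse than simply using the mean of the observed values. It doesn't necessarily mean that the model is meaningless or useless. However, it does indicate that the model is not a good fit for the data and should be interpreted with caution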
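The points about overfitting and negative R-squared can be seen in a small sketch. It assumes a deliberately overfitted "model" that just memorizes the training points (a 1-nearest-neighbor lookup, used here as a stand-in for an overfitted regression), and all data values are made up for illustration.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SSR/SST."""
    mean_observed = sum(observed) / len(observed)
    ssr = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    sst = sum((y - mean_observed) ** 2 for y in observed)
    return 1 - ssr / sst


def nearest_neighbor_predict(train_x, train_y, x):
    """Return the training target whose x is closest to the query x."""
    closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[closest]


# Made-up training data: a very noisy version of y = x.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [0.0, 4.0, 1.0, 6.0, 3.0]

# Made-up new data that follows y = x without noise.
test_x = [1.4, 2.4, 3.4, 4.4]
test_y = [1.4, 2.4, 3.4, 4.4]

train_pred = [nearest_neighbor_predict(train_x, train_y, x) for x in train_x]
test_pred = [nearest_neighbor_predict(train_x, train_y, x) for x in test_x]

train_r2 = r_squared(train_y, train_pred)
test_r2 = r_squared(test_y, test_pred)

print(f"train R^2 = {train_r2:.3f}")  # the memorized points are reproduced exactly
print(f"test  R^2 = {test_r2:.3f}")   # negative: worse than predicting the mean
```

The model reproduces the training noise perfectly (R² = 1 on the training data) but carries that noise over to the new points, where SSR ends up larger than SST and R² drops below zero.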