Evaluation metrics for Regression | Part - 1
Six commonly used evaluation metrics for regression are:
- Mean Absolute Error
- Mean Square Error
- Root Mean Square Error
- Root Mean Square Log Error
- R-Square
- Adjusted R-Square
Let us discuss the first four evaluation metrics in this article.
Before diving into the evaluation metrics, let us discuss what is meant by error.
Error:
In the above image, the vertical distance between an actual point and the corresponding predicted point on the straight line is called the error.
Error = Actual Value - Predicted Value
As the above table shows, an error can be either positive or negative.
If we take the average of all the errors in the dataset, we get the mean error. However, as the above figure shows, errors can be both positive and negative, so they are likely to cancel each other out when averaged, which can make a poor model look better than it is.
Hence we take the absolute value of each difference.
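To make the cancellation concrete, here is a minimal Python sketch. The `actual` and `predicted` lists are made-up numbers chosen for illustration; they are not the values from the table above.

```python
# Hypothetical values for illustration only (not taken from the article's table).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

# Error = Actual Value - Predicted Value, computed per data point.
errors = [a - p for a, p in zip(actual, predicted)]

# Positive and negative errors cancel, so the mean error hides the mistakes.
mean_error = sum(errors) / len(errors)

# Taking absolute values first prevents the cancellation.
mean_abs_error = sum(abs(e) for e in errors) / len(errors)

print(errors)          # [-2.0, 2.0, -3.0, 3.0]
print(mean_error)      # 0.0
print(mean_abs_error)  # 2.5
```

Every prediction here is off by 2 or 3, yet the mean error is 0, which is why the plain average of errors is not a useful metric on its own.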
Mean Absolute Error:
MAE is the average of the absolute differences between the actual and predicted values.
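The definition can be sketched in a few lines of plain Python. The function name `mean_absolute_error` and the sample values are assumptions made for this example.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all data points."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical sample values, for illustration only.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

mae = mean_absolute_error(actual, predicted)
print(mae)  # 0.875  (absolute errors: 0.5, 0.0, 2.0, 1.0)
```

Because each error contributes in proportion to its size, MAE is expressed in the same units as the target variable.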
Another way to handle the negative values in the error is to square them, which gives MSE.
Mean Square Error:
MSE is the average of the squared differences between the actual and predicted values.
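A minimal sketch of this calculation in plain Python follows. The function name `mean_squared_error` and the sample values are assumptions made for this example.

```python
def mean_squared_error(actual, predicted):
    """Average of (actual - predicted)^2 over all data points."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical sample values, for illustration only.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

mse = mean_squared_error(actual, predicted)
print(mse)  # 1.3125  (squared errors: 0.25, 0.0, 4.0, 1.0)
```

Note how the error of 2.0 contributes 4.0 after squaring, while the error of 0.5 contributes only 0.25: squaring weights large errors more heavily than small ones.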
This metric has a downside: squaring the errors also squares the units.
meter * meter = meter²
To handle this, there is another metric, RMSE.
Root Mean Square Error:
Here we take the square root of the MSE, so that the result is expressed in the same units as the target variable again.
RMSE is widely used for evaluating regression models.
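As a small sketch, RMSE is just the square root of the mean of the squared errors. The function name `root_mean_squared_error` and the sample values are assumptions made for this example.

```python
import math


def root_mean_squared_error(actual, predicted):
    """Square root of the mean of the squared errors."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


# Hypothetical sample values, for illustration only.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

rmse = root_mean_squared_error(actual, predicted)
print(round(rmse, 4))  # 1.1456  (square root of the MSE, 1.3125)
```

If the target were measured in meters, the MSE of 1.3125 would be in square meters, while the RMSE of about 1.1456 is in meters again.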
It has a drawback in some scenarios. Let us see what those are.
Here there is a large difference between the actual and predicted values, which points to a serious problem in the model.
Whereas here the value is predicted close to the actual value.
In the above 2 scenarios, even though the first model is very bad and the second one is good, the RMSE is the same, because RMSE depends only on the absolute difference between the two values, not on how large that difference is relative to the actual value. To solve this issue, we can apply a log operation to the values before doing the calculation.
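The sketch below reproduces this situation with two single-point scenarios. The numbers are hypothetical and chosen only to show the effect; they are not the values from the article's figures.

```python
import math


def root_mean_squared_error(actual, predicted):
    """Square root of the mean of the squared errors."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical scenario A: the prediction is far too low relative to the actual value.
rmse_a = root_mean_squared_error([100.0], [10.0])

# Hypothetical scenario B: the prediction is within 1% of the actual value.
rmse_b = root_mean_squared_error([10000.0], [9910.0])

print(rmse_a)  # 90.0
print(rmse_b)  # 90.0
```

Both scenarios are off by 90 in absolute terms, so RMSE cannot tell them apart, even though the first prediction is wrong by 90% and the second by less than 1%.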
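Root Mean Square Log Error:
RMSLE applies a log operation to the actual and predicted values before doing the calculation, so it reflects the relative difference between them rather than the absolute difference.
If we look at the RMSLE values now, the first data point has 5.3, whereas the second has 0.035, which is negligible.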
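Here is a minimal sketch of RMSLE in plain Python. The article does not spell out the exact formula, so this assumes the common convention of taking log(1 + x) of each value (which keeps the log defined at zero). The function name and the sample values are also assumptions, so the results will not match the 5.3 and 0.035 quoted above.

```python
import math


def root_mean_squared_log_error(actual, predicted):
    """RMSE computed on log(1 + x) of the values (assumed convention)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(v < 0 for v in list(actual) + list(predicted)):
        raise ValueError("values must be non-negative")
    squared = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical scenarios: both are off by 90 in absolute terms.
rmsle_a = root_mean_squared_log_error([100.0], [10.0])       # bad prediction
rmsle_b = root_mean_squared_log_error([10000.0], [9910.0])   # good prediction

print(rmsle_a > rmsle_b)  # True
```

With these made-up numbers, the badly predicted point gets an RMSLE above 2, while the well predicted point gets one below 0.01, so the metric now separates the two cases.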
Now we understand how to evaluate the models. But what is the benchmark value that you need to compare against to check whether your model is good or not? We will discuss that in the next part.