Evaluation metrics for Regression | Part - 1

There are six evaluation metrics for regression. These are:

  • Mean Absolute Error
  • Mean Square Error
  • Root Mean Square Error
  • Root Mean Square Log Error
  • R-Square
  • Adjusted R-Square

Let us discuss the first four evaluation metrics in this article.

Before diving into the evaluation metrics, let us discuss what is meant by error.


Error: 

In a regression plot, the vertical distance between an actual data point and the corresponding predicted point on the fitted straight line is called the error.

Error = Actual Value - Predicted Value 

Looking at individual data points, the error can be either positive or negative, since a prediction can land above or below the actual value.

If we take the average of all the errors in the dataset, we get the mean error. However, since the errors can be both positive and negative, there is a high chance that they cancel each other out while averaging.

Hence we take the absolute value of the difference between the actual and predicted values before averaging.
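As a minimal sketch of why plain averaging is misleading (the error values below are hypothetical), in Python:

    errors = [2.0, -2.0, 3.0, -3.0]  # hypothetical errors (actual - predicted)

    mean_error = sum(errors) / len(errors)                           # 0.0: positives and negatives cancel out
    mean_absolute_error = sum(abs(e) for e in errors) / len(errors)  # 2.5: the real average deviation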


Mean Absolute Error :

MAE is nothing but the average of the absolute differences between the actual and predicted values.
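As a minimal sketch using NumPy (the actual and predicted values below are hypothetical), MAE can be computed like this:

    import numpy as np

    actual = np.array([3.0, 5.0, 2.5, 7.0])     # hypothetical actual values
    predicted = np.array([2.5, 5.0, 4.0, 8.0])  # hypothetical predictions

    # MAE = average of |actual - predicted|
    mae = np.mean(np.abs(actual - predicted))
    print(mae)  # (0.5 + 0.0 + 1.5 + 1.0) / 4 = 0.75

scikit-learn offers the same metric as mean_absolute_error in sklearn.metrics.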

There is another way to handle the negative error values: squaring them, which gives us the MSE.


Mean Square Error : 

MSE is nothing but the average of the squared differences between the actual and predicted values.
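As a minimal sketch with the same hypothetical values, MSE just squares the differences before averaging:

    import numpy as np

    actual = np.array([3.0, 5.0, 2.5, 7.0])     # hypothetical actual values
    predicted = np.array([2.5, 5.0, 4.0, 8.0])  # hypothetical predictions

    # MSE = average of (actual - predicted)^2
    mse = np.mean((actual - predicted) ** 2)
    print(mse)  # (0.25 + 0.0 + 2.25 + 1.0) / 4 = 0.875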

This metric has a downside: squaring the errors also squares the units of the target variable. If the target is measured in meters, the squared error is expressed in square meters.

 meter * meter = meter²

To handle this scenario, there is another metric: the RMSE.


Root Mean Square Error : 

Here we take the square root of the MSE, so that the unit squaring and the value inflation introduced by the square operation are undone, bringing the metric back to the same unit as the target.
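As a minimal sketch with the same hypothetical values, RMSE is simply the square root of the MSE:

    import numpy as np

    actual = np.array([3.0, 5.0, 2.5, 7.0])     # hypothetical actual values
    predicted = np.array([2.5, 5.0, 4.0, 8.0])  # hypothetical predictions

    rmse = np.sqrt(np.mean((actual - predicted) ** 2))
    print(rmse)  # sqrt(0.875) is about 0.935, back in the same unit as the target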

RMSE is widely used for evaluating regression models.

However, it has a drawback in some scenarios. Let us see what those are.

Suppose that for one data point the predicted value is far away from the actual value relative to its scale, which points to a serious problem in the model.

Whereas for another data point the value is predicted close to the actual value relative to its scale.

In these two scenarios, even though the first model is very bad and the second one is good, the RMSE can come out the same, because it simply looks at the absolute difference between the values. In order to solve this issue, we can perform a log operation on the values before doing the calculation, as illustrated in the sketch below.
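Here is a minimal sketch of the problem, using two hypothetical single-point scenarios whose absolute error happens to be the same:

    import numpy as np

    # Scenario 1: small actual value, wildly wrong prediction
    actual_1, predicted_1 = np.array([10.0]), np.array([70.0])
    # Scenario 2: large actual value, prediction close in relative terms
    actual_2, predicted_2 = np.array([10000.0]), np.array([10060.0])

    rmse_1 = np.sqrt(np.mean((actual_1 - predicted_1) ** 2))  # 60.0
    rmse_2 = np.sqrt(np.mean((actual_2 - predicted_2) ** 2))  # 60.0: identical, despite very different quality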


Root Mean Square Log Error :

RMSLE performs a log operation on the actual and predicted values before doing the RMSE calculation.


If we see the RMSLE values now, the first data point has 5.3 whereas the second has 0.035, which is very negligible, so the bad model and the good model are clearly separated.
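As a minimal sketch of the same idea (reusing the hypothetical scenarios from the RMSE sketch above, not the data points of this example), RMSLE can be computed as:

    import numpy as np

    def rmsle(actual, predicted):
        # log1p(x) = log(1 + x), which also keeps zero values safe
        return np.sqrt(np.mean((np.log1p(predicted) - np.log1p(actual)) ** 2))

    print(rmsle(np.array([10.0]), np.array([70.0])))        # about 1.86: the bad prediction stands out
    print(rmsle(np.array([10000.0]), np.array([10060.0])))  # about 0.006: the good prediction is near zero

scikit-learn's mean_squared_log_error gives the squared version of this value; taking its square root yields the RMSLE.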


Now we understand how to evaluate the models. But what is the benchmark value against which you should compare these metrics to check whether your model is good or not? We will discuss that in the next part.
