Understanding Cost Functions
In machine learning — especially in regression problems — we build models to predict values. But how do we know if our model is doing well or poorly? The answer lies in the cost function, which tells us how far our predictions are from reality.
In this post, we will walk through:
- What is a cost function?
- Different types of error measurements
- Why we needed new ones
- Which one is better in which case
What is a Cost Function?
A cost function is a formula used to measure how wrong the predictions of a model are.
It compares the predicted values ($\hat{y}$) to the actual values ($y$) from the dataset.
The goal of training a model is to minimize the cost function — i.e., make predictions as close to actual results as possible.
Step-by-Step Evolution of Error Metrics
Let’s now go step-by-step through the order in which error/cost functions evolved, and why.
Error
Formula: $e_i = \hat{y}_i - y_i$
This is the raw difference between prediction and actual.
Problem: Some errors are positive, some negative. If you add them, they cancel out!
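A tiny sketch of the cancellation problem (the numbers here are made up for illustration):

```python
# Signed errors can cancel each other out.
actual = [10, 20, 30]
predicted = [15, 20, 25]  # one prediction 5 too high, one 5 too low

errors = [p - a for p, a in zip(predicted, actual)]  # [5, 0, -5]
total_error = sum(errors)
print(total_error)  # 0 -- the model looks "perfect" despite two wrong predictions
```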
Absolute Error (AE)
Formula: $|\hat{y}_i - y_i|$
By taking the absolute value, we remove the cancellation problem.
But we’re still dealing with individual errors.
Mean Absolute Error (MAE)
Formula: $\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} |\hat{y}_i - y_i|$
Now, we average all the absolute errors to get an overall measure of error.
Pros: Easy to understand, doesn't penalize big errors too harshly
Cons: Not smooth mathematically (harder to use with gradient descent because the absolute value is non-differentiable at zero)
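MAE is simple to compute by hand. A minimal sketch (the function name and sample numbers are our own, reusing the toy data from above):

```python
def mae(actual, predicted):
    """Mean Absolute Error: the average of |prediction - actual|."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# The +5 / -5 errors no longer cancel:
print(mae([10, 20, 30], [15, 20, 25]))  # (5 + 0 + 5) / 3 = 3.333...
```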
Squared Error (SE)
Formula: $(\hat{y}_i - y_i)^2$
We square the error to remove negatives and highlight bigger mistakes more.
E.g., error of 2 becomes 4, but error of 5 becomes 25.
Mean Squared Error (MSE)
Formula: $\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - y_i)^2$
This is one of the most commonly used cost functions.
Pros:
- Always positive
- Smooth and differentiable (perfect for gradient descent)
- Penalizes larger errors heavily

Cons:
- Sensitive to outliers (because of squaring)
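A quick sketch showing how squaring amplifies larger misses (toy numbers, our own helper name):

```python
def mse(actual, predicted):
    """Mean Squared Error: the average of (prediction - actual)^2."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# A miss of 2 contributes 4, but a miss of 5 contributes 25:
print(mse([10, 20, 30], [12, 20, 30]))  # 2^2 / 3 = 1.333...
print(mse([10, 20, 30], [15, 20, 30]))  # 5^2 / 3 = 8.333...
```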
Root Mean Squared Error (RMSE)
Formula: $\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}$
Just like MSE, but takes the square root so the final value is in the same unit as the target variable.
Pros:
- Easy to interpret (same unit as $y$)
- Keeps the benefits of MSE

Cons:
- Still sensitive to outliers
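Taking the square root brings the error back into the units of $y$. A minimal sketch (function name and sample data are our own):

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error: sqrt of MSE, in the same units as y."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

# If y is measured in dollars, this result is also in dollars:
print(rmse([10, 20, 30], [15, 20, 25]))  # sqrt(50 / 3) = 4.08...
```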
Which One Should You Use?
- Use MAE when all errors matter equally and you're okay with small optimization challenges.
- Use MSE when you want to penalize large errors more (good for gradient-based learning).
- Use RMSE when you want interpretability (in real units).
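The differences above are easiest to see side by side. A rough illustration with made-up numbers (helper names are our own): one outlier barely moves MAE, but blows up MSE, while RMSE stays in the units of $y$.

```python
import math

def mae(actual, predicted):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def mse(actual, predicted):
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

actual       = [10, 20, 30, 40]
clean        = [11, 19, 31, 39]   # every prediction off by 1
with_outlier = [11, 19, 31, 80]   # one prediction off by 40

print(mae(actual, clean), mae(actual, with_outlier))   # 1.0 vs 10.75
print(mse(actual, clean), mse(actual, with_outlier))   # 1.0 vs 400.75
print(math.sqrt(mse(actual, with_outlier)))            # ~20.02, back in y's units
```

The outlier grows MAE linearly (by its absolute size) but MSE quadratically, which is exactly why MSE-trained models chase outliers harder.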