Course Description
You may have come across the terms Precision, Recall, and F1 when reading about Classification Models and machine learning. This tutorial describes the concepts in detail with relevant examples.
What You'll Learn
- Introduction to Precision, Recall, and F1 in data science
- When can you use each for measuring the accuracy of your model
Welcome to this short introduction to Precision, Recall, and F1.
You might have come across these terms when reading about classification models and machine learning but, basically, they’re all ways to measure the accuracy of a model. When you build a model to predict a certain class or category, you need a way to measure how accurate the predictions are. This is what precision, recall, and F1 do: they measure the classification model’s accuracy.
In our video on the confusion matrix, we learned about true positives and negatives, and false positives and negatives. This is how many times a model correctly or incorrectly predicts a class. Precision, recall, and F1 use these to measure a model as making many mistakes when predicting class, or if it’s doing a pretty good job at being spot on in its predictions. But precision, recall, and F1 measure different things.
So, let's break it down into each of their parts and the role each plays in measuring a model’s accuracy. Let’s say your classification model predicts apples and bananas. If your model avoids a lot of mistakes in predicting bananas and apples, then your model has a high precision. Likewise, if your model avoids a lot of mistakes in predicting apples as bananas, then your model has a high recall. You want your model to aim high in both precision and recall, where your model avoids as many mistakes as possible, doing a good job at correctly predicting both apples and bananas. But what if your model aces the ability to predict one class and sucks at predicting the other? Wouldn’t it be misleading to look at precision or recall in isolation?
This is where F1 comes in. It takes into account both precision and recall. A balance of the two is what F1 scores on. If your model does a good job at accurately predicting both apples and bananas, then it will have a high F1 score. There are some cases where you might want to focus on precision more so than recall, and vice versa.
For example, Class A might be an aggressive type of cancer, and Class B might be no cancer. The stakes of misleading cancer as no cancer or overlooking cancer can be extremely high. Therefore, you want your model to avoid mistaking cancer for no cancer or mistaking a for b. This means you want to focus on recall. You don’t want your model to say, “Whoops, I missed the cancer, sorry!” You want your model to say “I got the cancer, maybe I was overly cautious and had mistaken a few no cancer patients for cancer patients," but isn’t it better to have a false alarm of “you don’t have cancer after all!” rather than “sorry, you actually do, but we missed it.” And that sums up Precision, Recall, and F1.
Thanks for watching. Give us a like if you found this useful, or you can check out our other videos at online.datasciencedojo.com.
Rebecca Merrett - Rebecca holds a bachelor’s degree of information and media from the University of Technology Sydney and a post graduate diploma in mathematics and statistics from the University of Southern Queensland. She has a background in technical writing for games dev and has written for tech publications.