Jun 27, 2023

Object Detection: A 2-Minute Introduction

Object Detection is one of the most useful applications of Machine Learning. Here is a 2-minute introduction to Object Detection, and how it can help us find objects in pictures.


Santiago

Machine Learning. I run https://t.co/iZifcK7n47 and write @0xbnomial.


While everyone is looking at Large Language Models, Object Detection is one of the most useful applications of Machine Learning.

Here is a 2-minute introduction to Object Detection:
— Santiago (@svpino) June 27, 2023
Object Detection helps us find objects in pictures.

We can do that by training a Machine Learning model with lots of example pictures until it can spot objects by itself.

There are two main ways a computer detects objects:
The first approach is to find potential objects, then guess what each object is. This is called a "two-stage" detector. It is generally slower but more accurate.

The second is called a "single-stage" detector, and it attempts to do both things at once. It is generally faster but less accurate.

There's a big issue with Object Detection:

Teaching a computer from scratch requires too much time and money, and too many pictures.

Instead of starting from scratch, we use models that were pre-trained on large datasets and "fine-tune" them with our pictures.

Imagine you want to detect birds, and you have a dataset of 500 photos.

Instead of starting from scratch, you can find a model that was pre-trained on millions of pictures and fine-tune it with your photos.

It will be cheaper and will usually get you better results.
I'm sure you have seen these annotations before.

That's the result we get from an Object Detection model: a set of bounding boxes containing every object we care about.

Here is a GIF showing how @Cometml displays annotations around every person on the screen. pic.twitter.com/ybOlqGBCii
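
To make that output concrete, here is a minimal sketch of how a set of detections might be represented in code. The thread does not specify a format; the corner-coordinate layout with a label and a confidence score is a common convention, and the `Detection` class, the `keep_confident` helper, and all the numbers below are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    # Corner coordinates in pixels: (x_min, y_min) is the top-left corner,
    # (x_max, y_max) is the bottom-right corner.
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str
    score: float  # model confidence between 0 and 1


def keep_confident(detections, min_score=0.5):
    """Drop detections whose confidence is below `min_score`."""
    return [d for d in detections if d.score >= min_score]


detections = [
    Detection(34, 50, 120, 310, "person", 0.97),
    Detection(200, 40, 290, 300, "person", 0.88),
    Detection(5, 5, 30, 40, "person", 0.21),
]
confident = keep_confident(detections)
print(len(confident))  # 2
```

Tools that draw annotations typically render one rectangle per kept detection, often with the label and score next to it.
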
To evaluate an Object Detection model, we compare the bounding boxes it predicts to the actual boxes from our annotated dataset.

If they overlap a lot, then our model is doing a good job.

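We measure that overlap with a metric called "Intersection over Union", or IoU for short: the area where the two boxes overlap divided by the total area they cover together.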
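
IoU for a single pair of boxes can be computed in a few lines. The sketch below assumes boxes are given as `(x_min, y_min, x_max, y_max)` tuples; the function name `iou` and the example coordinates are illustrative, not taken from the thread.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping rectangle
    # (zero if the boxes do not overlap at all).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


predicted = (50, 50, 150, 150)
actual = (100, 100, 200, 200)
print(round(iou(predicted, actual), 3))  # 0.143
```

An IoU of 1.0 means the boxes match exactly and 0.0 means they do not overlap. A cutoff such as 0.5 is often used to decide whether a prediction counts as a match, though the right value depends on the task.
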
We can also look at the Precision and Recall of the model.

Precision tells us how many of the objects our model reports are correct, and Recall tells us how many of the actual objects our model finds.

There's a trade-off between Precision and Recall. Ideally, we get a model that balances them appropriately.
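
The trade-off shows up as soon as we change how confident the model must be before a detection counts. The sketch below is a simplified illustration: it matches predictions to actual boxes greedily, treats a match as an IoU of at least 0.5, and computes Precision as correct detections divided by all detections and Recall as correct detections divided by all actual objects. The function names, thresholds, and boxes are assumptions made for the example, and real evaluation tools use more careful matching rules.

```python
def iou(a, b):
    """IoU of two (x_min, y_min, x_max, y_max) boxes."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, actual_boxes, min_score=0.5, min_iou=0.5):
    """predictions: list of (box, score) pairs; actual_boxes: list of boxes."""
    # Keep only confident predictions, most confident first.
    kept = sorted((p for p in predictions if p[1] >= min_score),
                  key=lambda p: p[1], reverse=True)
    unmatched = list(actual_boxes)
    true_positives = 0
    for box, _score in kept:
        # Greedily pair each prediction with the best unmatched actual box.
        best = max(unmatched, key=lambda gt: iou(box, gt), default=None)
        if best is not None and iou(box, best) >= min_iou:
            true_positives += 1
            unmatched.remove(best)
    # Returning 1.0 for an empty denominator is a convention chosen here.
    precision = true_positives / len(kept) if kept else 1.0
    recall = true_positives / len(actual_boxes) if actual_boxes else 1.0
    return precision, recall


actual = [(0, 0, 100, 100), (200, 200, 300, 300)]
predictions = [
    ((5, 5, 100, 100), 0.9),       # overlaps the first actual box well
    ((400, 400, 450, 450), 0.6),   # overlaps nothing: a false alarm
    ((210, 210, 300, 300), 0.4),   # overlaps the second actual box well
]

print(precision_recall(predictions, actual, min_score=0.8))  # (1.0, 0.5)
print(precision_recall(predictions, actual, min_score=0.3))
```

With the strict threshold, every reported object is correct but one bird is missed. With the relaxed threshold, both objects are found, but one of the three detections is a false alarm, so Precision drops to two thirds.
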
Here is a fantastic article that will show you step-by-step how to build and compare different Object Detection models using TorchVision and @Cometml.

The best part: Open the Colab notebook that comes with the article and make sure you follow along! https://t.co/vgN97Tycis