Introduction

In this course block and the chapters that follow, we will build a YOLO model in practice. By the end, you will be able to build a "ready-to-go" model that can detect objects in images, videos, and live streams.

Along the way we will explore the four major steps of building a YOLO model:

graph LR
  A[Data Acquisition] --> B[Annotation];
  B --> C[Training];
  C --> D[Inference];
  click A "../acquisition" _self
  click B "../annotation" _self
  click C "../training" _self
  click D "../inference" _self
  classDef active fill:#950f42

Let's get started! 🚀

In addition to the theoretical foundation, we will work through the following chapters using a practical example. This will help us better understand and apply the theoretical concepts.

Prerequisites

0. What's our goal?

We will start by defining a goal for our practical example.

Build a YOLO model that can detect and classify different types of banknotes (5€ and 10€).

To build this kind of model, we will first need to find and generate a suitable dataset.

1. Project structure

Start by creating a new folder for our project:

📁 yolo_training/

2. Virtual environment

Create a virtual environment named .venv inside the project folder. You should now have the following structure:

📁 yolo_training/
├── 📁 .venv/

Be sure to activate the environment!
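
Steps 1 and 2 can also be scripted. The following is a minimal sketch using only Python's standard library (`pathlib` and `venv`). The helper name `create_project` and its `with_pip` parameter are illustrative assumptions and not part of the course material. Creating the environment by hand with your usual tooling works just as well.

```python
import venv
from pathlib import Path


def create_project(base: Path, with_pip: bool = True) -> Path:
    """Create yolo_training/ with a .venv/ inside it and return the project path.

    Hypothetical helper for illustration only.
    """
    project = base / "yolo_training"
    # exist_ok=True makes the call safe to repeat.
    project.mkdir(parents=True, exist_ok=True)
    # Create the virtual environment; with_pip=True bootstraps pip into it.
    venv.create(project / ".venv", with_pip=with_pip)
    return project
```

Calling `create_project(Path.cwd())` produces the folder structure shown above. Activation still has to happen in your shell: typically `source .venv/bin/activate` on Linux and macOS, or `.venv\Scripts\activate` on Windows.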

3. Install packages

Install the necessary packages in this specific order: label-studio, ImageEngine, opencv-python, ultralytics. The installation order matters here because of some dependencies between the packages.
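
One way to enforce the order is to install the packages one at a time instead of in a single `pip install` call. The sketch below assumes it is run with the interpreter of the activated environment, so that `sys.executable` points into `.venv`. The package names are copied verbatim from the list above; the function names `pip_commands` and `install_in_order` are hypothetical.

```python
import subprocess
import sys

# Package names in the order given in the text.
PACKAGES = ("label-studio", "ImageEngine", "opencv-python", "ultralytics")


def pip_commands(packages=PACKAGES, python=sys.executable):
    """Build one 'pip install' command per package, preserving the order."""
    return [[python, "-m", "pip", "install", pkg] for pkg in packages]


def install_in_order(packages=PACKAGES):
    """Run the install commands sequentially and stop at the first failure."""
    for cmd in pip_commands(packages):
        # check=True raises CalledProcessError if pip exits with an error.
        subprocess.run(cmd, check=True)
```

Calling `install_in_order()` runs the four installs in sequence. Using `sys.executable -m pip` rather than a bare `pip` ensures the packages land in the environment whose interpreter is running the script.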