Segmentation
After learning how to detect objects in images, we can now go one step further: Instead of just detecting where objects are located using bounding boxes, we can identify exactly which pixels belong to each object. This is called segmentation.
Project Setup
We'll continue with the project structure from before and create a new jupyter notebook yolo_segment.ipynb
:
We'll start with the picture pic2.jpg
that we used for detection:

Inference
Pretrained Models
Just like with detection, YOLO provides pre-trained models specifically for segmentation. These models have been trained on the COCO dataset but with segmentation masks instead of just bounding boxes.
Running the Segmentation
The code for segmentation is very similar to detection - we just need to use a segmentation model instead:
# Import required libraries
from ultralytics import YOLO
# Define the path to the source picture
picpath = "pics/pic2.jpg"
# Load a pretrained YOLO11 Segmentation Model (Size: Nano)
model_seg = YOLO("yolo11n-seg.pt") # (1)!
# Apply the model to our source picture
results = model_seg(picpath)
- As with detection, there are also different models for segmentation:
YOLO11n-seg
,YOLO11s-seg
,YOLO11m-seg
,YOLO11l-seg
,YOLO11x-seg
image 1/1 c:\path\to\pics\pic2.jpg: 448x640 4 persons, 1 car, 5 motorcycles, 3 traffic lights, 1 stop sign, 52.7ms
Speed: 2.0ms preprocess, 52.7ms inference, 4.0ms postprocess per image at shape (1, 3, 448, 640)
The output looks similar to detection, but behind the scenes YOLO has created detailed segmentation masks for each object!
Task: Analyze the Segmentation Results
Now it's time to analyze the segmentation results and compare them to the results from our detection. Take a closer look at the results
and answer the following questions:
- What's the difference between the detection results and the segmentation results now?
- What's the difference between
boxes
andmasks
? What information is stored in these variables? - What's the shape of a mask and what does each dimension represent?
- How are the coordinates in masks different from bounding boxes?
- Visualize the results by saving the resulting image.
Visualizing Segmentation Results
A graphical representation of the results can also be useful for segmentation. For this, the same commands are available as for detection.

Another visualization option is the result.plot
command. With this, you can customize how the segmentation (or detection) results are displayed to better suit your analysis or presentation needs, allowing you to highlight specific features like bounding boxes, segmentation masks, confidence scores, or class labels.
fname = "output_segmentation.jpg"
result.plot(
show = True, # Display the plot immediately
save = True, # Save the plotted image to a file
filename = fname, # Specify the filename for the saved image
boxes = True, # Include bounding boxes around detected objects
masks = True, # Overlay segmentation masks on the image
conf = False, # Do not display confidence scores for the predictions
labels = True, # Display class labels for each detected object
)
Inference Arguments
Many of the same inference arguments from detection also work with segmentation, plus some additional ones specific to masks. Therefore check the documentation.
Task: Segmentation Practice
Try these exercises to better understand image segmentation:
- Mask Quality (Inference Argument)
- Run segmentation with
retina_masks=True
- Compare the output with default masks
- What differences do you notice in quality and speed?
- Run segmentation with
- Compare different model sizes (nano vs. small vs. medium) for segmentation
- Experiment with different confidence thresholds
- Try segmenting different types of images
🎉 Congratulations
You've learned the basics of image segmentation! Try applying these concepts to your own projects and explore more advanced techniques.