Semantic Segmentation with Pre-trained Models

This tutorial builds on the previous tutorial that introduced the Detector class for object and instance detection. In contrast to object detection, which focuses on predicting bounding boxes, and instance segmentation, which identifies individual object masks, semantic segmentation classifies every pixel in an image based on its semantic meaning. It does not differentiate between separate instances of the same class but instead assigns a class label to each pixel. In this module, this functionality is encapsulated by the Segmenter abstract class, which provides a unified interface for performing semantic segmentation on image data.

Each subclass of the Segmenter class provides its own implementation of the segment method, tailored to the specific framework and architecture of the underlying neural network. This allows for flexibility in supporting various segmentation models while maintaining a consistent interface.
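While the exact definition lives in the gwel source, the interface can be pictured conceptually as the minimal sketch below. The method signature and the use of NumPy arrays are illustrative assumptions, not the library's actual code.

from abc import ABC, abstractmethod
import numpy as np

class Segmenter(ABC):
    # Illustrative sketch only; the real gwel class may differ in detail.
    @abstractmethod
    def segment(self, image: np.ndarray) -> np.ndarray:
        # Return an array of per-pixel class labels with the same
        # height and width as the input image.
        ...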

The currently implemented Segmenter subclasses include UNET, which wraps a U-Net style image segmentation architecture. Additionally, the LociSegmenter subclass segments an image as the loci of pixels centered on detections obtained through a Detector instance.

An architecture can be trained to detect any class of object or instance by adjusting its weights. These weights are learned by optimizing a loss function on training data, specifically images paired with corresponding labels. The resulting weights are typically saved to a file, which can later be loaded for inference. This tutorial assumes that the model weights have already been trained and are available for the architecture being used. If you do not have model weights, you may wish to follow the tutorial on model training for guidance on how to obtain them.
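For reference, in PyTorch (which, as an assumption for illustration, the networks below are taken to build on), saving weights after training and restoring them for inference typically looks like this. The tiny stand-in model and the file name are placeholders:

import torch
import torch.nn as nn

model = nn.Conv2d(3, 2, kernel_size=3, padding=1)  # stand-in for a real network

# After training, persist the learned weights to a file.
torch.save(model.state_dict(), 'weights.pt')

# Later, rebuild the same architecture and restore the weights for inference.
model.load_state_dict(torch.load('weights.pt', map_location='cpu'))
model.eval()  # disable training-only behavior such as dropout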

The examples below demonstrate how to use the UNET and LociSegmenter classes for semantic segmentation.

# UNET
from gwel.networks.UNET import UNET, UNet

# One entry per class the model was trained on; out_channels must match.
classes = ['class_name_1', 'class_name_2', ..., 'class_name_n']
unet = UNet(in_channels=3, out_channels=len(classes))
weights = "path/to/model/weights"
segmenter = UNET(unet, weights, patch_size=256, channels=classes)

dataset.segment(segmenter)  # dataset is an ImageDataset, as in earlier tutorials

# LociSegmenter with a YOLOv8 Detector
from gwel.networks.loci import LociSegmenter
from gwel.networks.YOLOv8 import YOLOv8

model_weights_path = 'path/to/model/weights'
detector = YOLOv8(weights=model_weights_path)

h = 10  # kernel bandwidth
n = 10  # kernel size
segmenter = LociSegmenter(detector, bandwidth=h, kernel_size=n)

dataset.segment(segmenter)

To visualize the segmentation, create a Viewer instance and set its mode attribute to 'segmentation'. You may also want to adjust the contour_thickness attribute.

from gwel.viewer import Viewer
viewer = Viewer(dataset, max_pixels=1500)
viewer.mode = "segmentation"
viewer.contour_thickness = 4
viewer.open()
# To navigate to the next or previous images, use the 'n' and 'p' keys respectively.
# Press the 'q' key to quit.
# Pressing the 'f' key will flag images; see earlier tutorials for a recap on flagging.

By default, the ImageDataset.segment method automatically caches the segmentation by storing run-length-encoded binary arrays in COCO JSON format at '.gwel/masks.json' inside the images directory. When the segment method is called a second time, it reads this file instead of executing the model, unless the use_saved optional argument is set to False. Additionally, if you do not wish to cache the segmentation or overwrite an existing masks.json, set the write optional argument to False when calling the segment method.
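For example, the two optional arguments can be used as follows:

# Re-run the model even if a cached .gwel/masks.json exists.
dataset.segment(segmenter, use_saved=False)

# Run the model without writing to (or overwriting) .gwel/masks.json.
dataset.segment(segmenter, write=False)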