This tutorial builds on the previous tutorial that introduced the Detector class for object and instance detection. In contrast to object detection, which focuses on predicting bounding boxes, and instance segmentation, which identifies individual object masks, semantic segmentation classifies every pixel in an image based on its semantic meaning. It does not differentiate between separate instances of the same class but instead assigns a class label to each pixel. In this module, this functionality is encapsulated by the Segmenter abstract class, which provides a unified interface for performing semantic segmentation on image data.
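Concretely, a semantic segmentation can be represented as a 2D array with the same height and width as the image, where each entry is a class index. The labels in the snippet below are invented purely for illustration:

import numpy as np

# 0 = background, 1 = road, 2 = vehicle (illustrative labels only).
mask = np.array([
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 1],
    [0, 0, 1, 1],
])

# Two touching vehicles would share label 2: semantic segmentation
# does not separate instances of the same class.
print((mask == 2).sum(), "pixels are labelled 'vehicle'")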
Each subclass of the Segmenter class provides its own implementation of the segment method, tailored to the specific framework and architecture of the underlying neural network. This allows for flexibility in supporting various segmentation models while maintaining a consistent interface.
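To make the interface concrete, the following is a minimal sketch of this pattern. It is illustrative rather than the module's actual source; the segment signature and return type are assumptions:

from abc import ABC, abstractmethod

import numpy as np

class Segmenter(ABC):
    # Unified interface for semantic segmentation (illustrative sketch).
    @abstractmethod
    def segment(self, image: np.ndarray) -> np.ndarray:
        # Return an array of per-pixel class labels for the image.
        ...

class ConstantSegmenter(Segmenter):
    # Toy subclass: labels every pixel as class 0 (background).
    def segment(self, image: np.ndarray) -> np.ndarray:
        return np.zeros(image.shape[:2], dtype=np.uint8)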
Currently implemented Segmenter subclasses include UNET, which is based on the U-Net semantic segmentation architecture. Additionally, the LociSegmenter subclass segments an image as the loci of pixels centered on detections obtained through a Detector instance.
An architecture can be trained to detect any class of object or instance by adjusting its weights. These model weights are learned by optimizing a loss function on training data, specifically images paired with corresponding labels. The resulting weights are typically saved to a file, which can later be loaded for inference. This tutorial assumes that the model weights have already been calculated and are available for the architecture being used. If you do not have model weights, you may wish to follow the tutorial on model training for guidance on how to obtain them.
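For orientation, if the underlying network is a standard PyTorch module (which the UNet constructor below suggests but this tutorial does not confirm), saving and loading weights typically looks like the sketch below. Note that the UNET wrapper in the next example loads the weights file for you, so this is purely illustrative:

from gwel.networks.UNET import UNet
import torch

# Construct the network, then restore previously learned parameters
# (assumes UNet is a torch.nn.Module; see the model training tutorial).
unet = UNet(in_channels=3, out_channels=2)
unet.load_state_dict(torch.load("path/to/model/weights"))
unet.eval()  # switch to inference mode

# After training, the weights would have been persisted with:
# torch.save(unet.state_dict(), "path/to/model/weights")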
The examples below demonstrate how to use the UNET and LociSegmenter classes for semantic segmentation.
# UNET
from gwel.networks.UNET import UNET, UNet

classes = ['class_name_1', 'class_name_2', ..., 'class_name_n']

# Build the underlying network with one output channel per class.
unet = UNet(in_channels=3, out_channels=len(classes))

weights = "path/to/model/weights"
segmenter = UNET(unet, weights, patch_size=256, channels=classes)

# 'dataset' is an ImageDataset, as introduced in earlier tutorials.
dataset.segment(segmenter)
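A note on the arguments: patch_size=256 presumably controls the size of the square patches each image is split into before being fed through the network, while channels associates each output channel with a class name. Both readings are assumptions; consult the API reference for the precise semantics.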
# LociSegmenter with YOLOv8 Detector
from gwel.networks.loci import LociSegmenter
from gwel.networks.YOLOv8 import YOLOv8

model_weights_path = 'path/to/model/weights'
detector = YOLOv8(weights=model_weights_path)

h = 10  # bandwidth
n = 10  # kernel size
segmenter = LociSegmenter(detector, bandwidth=h, kernel_size=n)
dataset.segment(segmenter)
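To build intuition for the loci idea, the toy sketch below marks every pixel within a fixed radius of each detection center. The actual LociSegmenter works from a bandwidth and kernel size rather than a hard radius, so treat this as a simplification of the concept, not its implementation:

import numpy as np

def loci_mask(shape, centers, radius):
    # Toy loci: mark pixels within `radius` of any detection center.
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    mask = np.zeros(shape, dtype=bool)
    for cy, cx in centers:
        mask |= (ys - cy) ** 2 + (xs - cx) ** 2 <= radius ** 2
    return mask

# Two hypothetical detection centers on a 100x100 image.
mask = loci_mask((100, 100), centers=[(20, 30), (70, 60)], radius=10)
print(mask.sum(), "pixels fall inside the loci")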
To visualize the segmentation, create a Viewer instance and set its mode attribute to 'segmentation'. You may also want to adjust the contour_thickness attribute.
from gwel.viewer import Viewer

viewer = Viewer(dataset, max_pixels=1500)
viewer.mode = "segmentation"
viewer.contour_thickness = 4
viewer.open()

# To navigate to the next or previous image, use the 'n' and 'p' keys respectively.
# Press the 'q' key to quit.
# Pressing the 'f' key flags the current image; see earlier tutorials for a recap on flagging.
By default, the ImageDataset.segment method automatically caches the segmentation by storing run-length-encoded binary arrays in COCO JSON format at '.gwel/masks.json' inside the images directory. When the segment method is called a second time, it automatically reads this file without executing the model, unless the use_saved optional argument is set to False. Additionally, if you do not wish to cache or overwrite an existing masks.json, set the write optional argument to False when calling the segment method.
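For example, to force the model to run again while leaving the cached file untouched:

# Ignore the cached '.gwel/masks.json' and do not write or overwrite it.
dataset.segment(segmenter, use_saved=False, write=False)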