This tutorial follows on from the previous tutorial, Handling Image Data with the ImageDataset Class, and assumes you have covered its content.
When an image dataset is generated as a result of an experiment, the corresponding experimental variables are referred to as factors within the gwel package. An experiment can have multiple factors, and each image is assumed to be assigned a single value for each factor. Factors may correspond to independent, dependent or control variables. Examples of factors include time, treatment and replicate.
To add a factor to an ImageDataset instance named dataset, use the factor method. This method takes two arguments: a string giving the factor name, and a list giving the values of the factor in the order of the images in dataset.images. See below for an example.
factor_name = 'factor1'
# Values must follow the initial order of the images in `dataset.images`.
# In future versions this will be updated to a dictionary with image-name keys and factor values.
factor_values = [1, 2, ..., N]  # placeholder: one value per image
dataset.factor(factor_name, factor_values)
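Because the values must line up with the image order, it can be safer to start from a mapping of image name to value and derive the list from it. The following is a minimal sketch in plain Python and is not part of gwel: the helper values_in_image_order and the example image names are hypothetical, and image_names stands in for the contents of dataset.images.

```python
def values_in_image_order(image_names, value_by_image):
    """Return factor values as a list aligned with image_names.

    Hypothetical helper (not part of gwel). Raises if any image has no
    value, so a misaligned factor list is caught early.
    """
    missing = [name for name in image_names if name not in value_by_image]
    if missing:
        raise KeyError(f"No factor value for images: {missing}")
    return [value_by_image[name] for name in image_names]


# Stand-in for dataset.images; the file names are invented for illustration.
image_names = ["plot_b.jpg", "plot_a.jpg", "plot_c.jpg"]
treatment_by_image = {
    "plot_a.jpg": "control",
    "plot_b.jpg": "drought",
    "plot_c.jpg": "control",
}

factor_values = values_in_image_order(image_names, treatment_by_image)
print(factor_values)  # ['drought', 'control', 'control']
```

The resulting list could then be passed as the second argument to dataset.factor.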
Any number of factors can be added to the dataset object in this way. To check the number of factors defined in dataset, use the status method.
dataset.status()
To access the factors as a dictionary, use the factors attribute:
factors = dataset.factors
Images can be reordered based on the nesting order of the factors by using the sort method and providing a list of factor names in the desired order, from most to least nested. Numerical factors are sorted in ascending numerical order, while categorical (string) factors are sorted in alphabetical order.
dataset.sort(["factor_name_1","factor_name_2","...","factor_name_n"])
# ordered from most to least nested
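To make the idea of sorting on several factors concrete, here is a rough sketch in plain Python. It does not use gwel and is not its implementation. It assumes, purely for illustration, that the factors are held as a dictionary mapping each factor name to a list of values aligned with the image list, and that the first name in the list acts as the primary sort key; the real sort method may treat the order differently. The function sort_by_factors and the example data are hypothetical.

```python
def sort_by_factors(image_names, factors, factor_order):
    """Return image_names sorted by the factors listed in factor_order.

    Assumptions (not confirmed by gwel): factors maps each factor name to
    a list of values aligned with image_names, and the first entry of
    factor_order is the primary sort key.
    """
    order = sorted(
        range(len(image_names)),
        key=lambda i: tuple(factors[name][i] for name in factor_order),
    )
    return [image_names[i] for i in order]


# Invented example data.
image_names = ["img0", "img1", "img2", "img3"]
factors = {
    "treatment": ["b", "a", "b", "a"],  # categorical: compared as strings
    "time": [2, 2, 1, 1],               # numerical: ascending
}

sorted_names = sort_by_factors(image_names, factors, ["treatment", "time"])
print(sorted_names)  # ['img3', 'img1', 'img2', 'img0']
```

Python's default string comparison is case-sensitive, so this sketch matches alphabetical order only when the categorical values share the same letter case.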
To filter the images based on specific factor values, use the filter method. Pass a dictionary in which each key is a factor name and the corresponding value is a list of acceptable values for that factor. For each factor, the listed values are combined with an OR operation, so an image matches that factor if it has any of the listed values.
The method returns a list of the names of the images that match any of the given values for each specified factor.
filtered_images = dataset.filter({'factor_name':factor_values})
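The matching rule can be illustrated with a small sketch in plain Python. This is not gwel's implementation: the function filter_images and the data layout (a dictionary of factor name to a list of values aligned with the image list) are hypothetical, and the sketch assumes that an image must satisfy every factor named in the dictionary while matching any one of the listed values within each factor.

```python
def filter_images(image_names, factors, criteria):
    """Return the names of images whose factor values satisfy criteria.

    criteria maps a factor name to a list of acceptable values. Within a
    factor any listed value is accepted (OR); this sketch assumes every
    factor in criteria must be satisfied (AND across factors).
    """
    selected = []
    for i, name in enumerate(image_names):
        if all(factors[f][i] in allowed for f, allowed in criteria.items()):
            selected.append(name)
    return selected


# Invented example data.
image_names = ["img0", "img1", "img2", "img3"]
factors = {
    "treatment": ["b", "a", "b", "a"],
    "time": [2, 2, 1, 1],
}

print(filter_images(image_names, factors, {"time": [1]}))
# ['img2', 'img3']
print(filter_images(image_names, factors, {"treatment": ["a"], "time": [1, 2]}))
# ['img1', 'img3']
```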
After filtering, the images in the resulting list can be copied into a new directory. Since this is a common workflow, it is implemented in the sample method.
sample_dir = 'path/to/directory/where/images/will/be/copied/into'
filtered_images = dataset.sample(directory=sample_dir, factors={'factor_name': factor_values})
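For readers curious about what the copy step of this workflow involves, the following sketch does the equivalent by hand using only the Python standard library. It is an illustration under stated assumptions, not the code behind the sample method: the function copy_images is hypothetical, and the demonstration uses empty stand-in files in a temporary directory rather than a real dataset.

```python
import shutil
import tempfile
from pathlib import Path


def copy_images(image_paths, destination):
    """Copy each file in image_paths into destination, creating it if needed."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in image_paths:
        target = destination / Path(path).name
        shutil.copy2(path, target)  # copies file contents and metadata
        copied.append(target.name)
    return copied


# Demonstration with throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    source_dir = Path(tmp) / "images"
    source_dir.mkdir()
    for name in ["img0.jpg", "img1.jpg", "img2.jpg"]:
        (source_dir / name).write_bytes(b"")  # empty stand-in files

    # Pretend these two images passed the filter.
    filtered = [source_dir / "img0.jpg", source_dir / "img2.jpg"]
    copied = copy_images(filtered, Path(tmp) / "sample")
    copied_names = sorted(p.name for p in (Path(tmp) / "sample").iterdir())

print(copied_names)  # ['img0.jpg', 'img2.jpg']
```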