lit_ecology_classifier.helpers package

Submodules

lit_ecology_classifier.helpers.argparser module

lit_ecology_classifier.helpers.argparser.argparser()[source]

Creates an argument parser for configuring, training, and running the machine learning model for image classification.

Arguments:

--datapath: str: Path to the tar file containing the training data. Default is "/store/empa/em09/aquascope/phyto.tar".
--train_outpath: str: Output path for training artifacts. Default is "./train_out".
--main_param_path: str: Main directory where the training parameters are saved. Default is "./params/".
--dataset: str: Name of the dataset. Default is "phyto".
--use_wandb: flag: Use Weights and Biases for logging. Default is False.
--priority_classes: str: Path to the JSON file specifying priority classes for training. Default is an empty string.
--rest_classes: str: Path to the JSON file specifying rest classes for training. Default is an empty string.
--balance_classes: flag: Balance the classes for training. Default is False.
--batch_size: int: Batch size for training. Default is 64.
--max_epochs: int: Number of epochs to train. Default is 20.
--lr: float: Learning rate for training. Default is 1e-2.
--lr_factor: float: Learning rate factor for training of full body. Default is 0.01.
--no_gpu: flag: Disable GPU use for training. Default is False.
--testing: flag: Set this flag to run in testing mode instead of training mode. Default is False.

Returns: The argument parser with defined arguments.
Return type: argparse.ArgumentParser
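
As a rough illustration of how a parser with these defaults behaves, the following sketch rebuilds a small subset of the documented flags with the standard argparse module. The function name build_training_parser is hypothetical and stands in for the package's own argparser(); only the flag names and defaults are taken from the argument list above.

```python
import argparse


def build_training_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in that mirrors a few of the documented training flags."""
    parser = argparse.ArgumentParser(description="Train an image classifier")
    parser.add_argument("--datapath", type=str,
                        default="/store/empa/em09/aquascope/phyto.tar")
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--max_epochs", type=int, default=20)
    parser.add_argument("--lr", type=float, default=1e-2)
    # Flags are False unless given on the command line.
    parser.add_argument("--use_wandb", action="store_true")
    parser.add_argument("--no_gpu", action="store_true")
    return parser


# Override one value and set one flag; everything else keeps its default.
args = build_training_parser().parse_args(["--batch_size", "32", "--no_gpu"])
print(args.batch_size, args.max_epochs, args.no_gpu, args.use_wandb)
```

Passing an explicit argument list to parse_args, as done here, is also a convenient way to exercise a parser in tests without touching sys.argv.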

lit_ecology_classifier.helpers.argparser.inference_argparser()[source]

Creates an argument parser for using the classifier on unlabeled data.

Arguments:

--batch_size: int: Batch size for inference. Default is 180.
--outpath: str: Directory where predictions will be saved. Default is "./preds/".
--model_path: str: Path to the model checkpoint file. Default is "./checkpoints/model.ckpt".
--datapath: str: Path to the tar file containing the data to classify. Default is "/store/empa/em09/aquascope/phyto.tar".
--no_gpu: flag: Disable GPU use for inference. Default is False.
--no_TTA: flag: Disable test-time augmentation. Default is False.
--gpu_id: int: GPU ID to use for inference. Default is 0.
--limit_pred_batches: int: Limit the number of batches to predict. Default is 0, meaning no limit; set a low number for debugging.
--prog_bar: flag: Enable progress bar. Default is False.

Returns: The argument parser with defined arguments.
Return type: argparse.ArgumentParser

lit_ecology_classifier.helpers.calc_class_weights module

lit_ecology_classifier.helpers.calc_class_weights.calculate_class_weights(datamodule)[source]

Calculate and save class weights and the mean and standard deviation of the dataset.

Parameters: datamodule – Source of the dataset. The docstring lists this argument as dataloader (DataLoader), which does not match the signature.
Returns: (mean, std), where mean and std are tensors representing the mean and standard deviation of the dataset.
Return type: tuple
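
The formula used for the class weights is not given on this page. A common choice is inverse-frequency weighting, sketched below in plain Python. The function name and the example class names are illustrative assumptions, and the package itself may use a different scheme.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Hypothetical helper: rare classes get weights above 1, frequent
    classes get weights below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Illustrative labels only; three samples of one class, one of another.
labels = ["diatom", "diatom", "diatom", "rotifer"]
weights = inverse_frequency_weights(labels)
print(weights)
```

With this scheme a perfectly balanced dataset yields a weight of 1 for every class, which makes the weights easy to sanity-check.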

lit_ecology_classifier.helpers.helpers module

class lit_ecology_classifier.helpers.helpers.CosineWarmupScheduler(optimizer, warmup, max_iters)[source]

Bases: _LRScheduler

Learning rate scheduler with cosine annealing and warmup.

Parameters:

optimizer (torch.optim.Optimizer) – Wrapped optimizer.
warmup (int) – Number of warmup steps.
max_iters (int) – Total number of iterations.

get_lr()[source]: Compute the learning rate at the current step.

get_lr_factor()[source]: Compute the learning rate factor at the current step.

get_lr()[source]

get_lr_factor(epoch)[source]
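
The exact formula behind the learning rate factor is not reproduced on this page. A widely used formulation multiplies a half-cosine decay by a linear warmup ramp. The sketch below assumes that form and uses a hypothetical standalone function rather than the scheduler class, so it may differ from the package's implementation.

```python
import math


def cosine_warmup_factor(step, warmup, max_iters):
    """Multiplier applied to the base learning rate at a given step.

    Assumed form: half-cosine decay over max_iters, scaled linearly
    from 0 to 1 during the first `warmup` steps.
    """
    factor = 0.5 * (1.0 + math.cos(math.pi * step / max_iters))
    if warmup > 0 and step <= warmup:
        factor *= step / warmup
    return factor


# The factor rises during warmup, then decays towards zero at max_iters.
for step in (0, 5, 10, 50, 100):
    print(step, round(cosine_warmup_factor(step, warmup=10, max_iters=100), 4))
```

Under this assumption the effective learning rate at each step is the optimizer's base learning rate times the factor.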

class lit_ecology_classifier.helpers.helpers.FocalLoss(gamma=0, alpha=None, size_average=True)[source]

Bases: Module

forward(input, target)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
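
The inherited docstring does not describe the loss itself. Focal loss is usually defined as FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The plain-Python sketch below evaluates that definition for a single sample. It is an assumption about what forward computes, not a copy of the package's code, and it ignores the size_average reduction.

```python
import math


def focal_loss(logits, target, gamma=0.0, alpha=None):
    """Focal loss for one sample, given raw logits and the true class index.

    Hypothetical helper. `alpha` is an optional list of per-class weights.
    """
    # Numerically stable log-softmax for the target class.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    log_pt = logits[target] - log_norm
    pt = math.exp(log_pt)
    weight = 1.0 if alpha is None else alpha[target]
    # (1 - pt)^gamma shrinks the loss of well-classified samples.
    return -weight * (1.0 - pt) ** gamma * log_pt


logits = [2.0, 0.0]
print(focal_loss(logits, target=0, gamma=0.0))  # plain cross-entropy
print(focal_loss(logits, target=0, gamma=2.0))  # down-weighted easy example
```

With the default gamma=0 and alpha=None, the definition reduces to ordinary cross-entropy, so the focusing effect only appears once gamma is set above zero.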

lit_ecology_classifier.helpers.helpers.TTA_collate_fn(batch: dict, train=False)[source]

Collate function for test time augmentation (TTA).

Parameters: batch (dict) – Dict of tuples containing images and labels.
Returns: batch_images: All rotations stacked row-wise. batch_labels: Labels of the images.
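
To make the idea of rotation-based test-time augmentation concrete, the sketch below builds the four right-angle rotations of each image in a batch. It uses nested lists instead of tensors, and both the batch layout (a list of (image, label) pairs) and the choice of 0, 90, 180, and 270 degrees are assumptions, since the page does not specify them.

```python
def rot90(image):
    """Rotate a 2-D image (list of rows) by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*image)][::-1]


def tta_views(batch):
    """Return {angle: [rotated images]} and the labels for a batch.

    Hypothetical helper: `batch` is assumed to be a list of
    (image, label) pairs; the real collate function may differ.
    """
    views = {0: [], 90: [], 180: [], 270: []}
    labels = []
    for image, label in batch:
        rotated = image
        for angle in (0, 90, 180, 270):
            views[angle].append(rotated)
            rotated = rot90(rotated)
        labels.append(label)
    return views, labels


views, labels = tta_views([([[1, 2], [3, 4]], 7)])
print(views[90][0], labels)
```

At inference time a model would score every view, and the per-view scores would then be combined into one prediction per image.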

lit_ecology_classifier.helpers.helpers.cvd_colormap()[source]: A color map accessible for people with color vision deficiency (CVD).

lit_ecology_classifier.helpers.helpers.define_priority_classes(priority_classes)[source]

lit_ecology_classifier.helpers.helpers.define_rest_classes(priority_classes)[source]

lit_ecology_classifier.helpers.helpers.gmean(input_x, dim)[source]

Compute the geometric mean of the input tensor along the specified dimension.

Parameters:

input_x (torch.Tensor) – Input tensor.
dim (int) – Dimension along which to compute the geometric mean.

Returns:

Geometric mean of the input tensor.

Return type:

torch.Tensor
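
The geometric mean of n positive values is the n-th root of their product, which is usually computed as exp(mean(log x)) for numerical stability. The sketch below applies that identity to a nested list along a chosen dimension. It is a dependency-free illustration with a hypothetical function name, whereas gmean itself operates on torch tensors.

```python
import math


def geometric_mean(rows, dim):
    """Geometric mean of a 2-D nested list of positive numbers.

    dim=0 reduces down the columns, dim=1 reduces along each row.
    """
    if dim == 0:
        rows = [list(col) for col in zip(*rows)]
    elif dim != 1:
        raise ValueError("this sketch only handles dim 0 or 1")
    # exp(mean(log x)) avoids overflow from multiplying many values directly.
    return [math.exp(sum(math.log(v) for v in row) / len(row)) for row in rows]


scores = [[1.0, 4.0], [9.0, 16.0]]
print(geometric_mean(scores, dim=0))  # approximately [3.0, 8.0]
print(geometric_mean(scores, dim=1))  # approximately [2.0, 12.0]
```

Because the logarithm is undefined at zero, inputs that may contain zeros need clamping or a small offset before this identity is applied.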

lit_ecology_classifier.helpers.helpers.output_results(outpath, im_names, labels, scores, priority_classes=False, rest_classes=False, tar_file=False)[source]

Output the prediction results to a file.

Parameters:

outpath (str) – Output directory path.
im_names (list) – List of image filenames.
labels (list) – List of predicted labels.

lit_ecology_classifier.helpers.helpers.plot_confusion_matrix(all_labels, all_preds, class_names)[source]

Plot and return confusion matrices (absolute and normalized).

Parameters:

all_labels (torch.Tensor) – True labels.
all_preds (torch.Tensor) – Predicted labels.
class_names (list) – List of class names.

Returns:

(figure for absolute confusion matrix, figure for normalized confusion matrix)

Return type:

tuple
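
The two figures correspond to two views of the same counts. The sketch below computes an absolute confusion matrix and a row-normalized one from integer class indices, without any plotting. Normalizing each row by the number of true samples in that class is an assumption here, and the function name is hypothetical.

```python
def confusion_matrices(true_labels, pred_labels, n_classes):
    """Return (absolute, normalized) confusion matrices as nested lists.

    Rows index the true class and columns the predicted class.
    """
    absolute = [[0] * n_classes for _ in range(n_classes)]
    for true, pred in zip(true_labels, pred_labels):
        absolute[true][pred] += 1
    normalized = []
    for row in absolute:
        total = sum(row)
        # Classes with no true samples get an all-zero row.
        normalized.append([count / total if total else 0.0 for count in row])
    return absolute, normalized


absolute, normalized = confusion_matrices([0, 0, 1, 1], [0, 1, 1, 1], n_classes=2)
print(absolute)
print(normalized)
```

With row normalization, the diagonal of the normalized matrix holds the per-class recall, which is often easier to read than raw counts when class sizes differ widely.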

lit_ecology_classifier.helpers.helpers.plot_loss_acc(logger)[source]

Plots the training and validation loss and accuracy from the logger’s metrics file.

Parameters: logger (Logger) – The logger object containing the save directory, name, and version.

Saves: loss_accuracy.png: A plot of the training and validation loss and accuracy over steps.

lit_ecology_classifier.helpers.helpers.plot_reduced_classes(model, priority_classes)[source]

Plots the confusion matrix for reduced classes.

Parameters:

model (LightningModule) – The trained model.
priority_classes (list) – List of priority classes.

Saves:

reduced_confusion_matrix.png: A confusion matrix of the reduced classes.
reduced_confusion_matrix_norm.png: A normalized confusion matrix of the reduced classes.

lit_ecology_classifier.helpers.helpers.plot_score_distributions(all_scores, all_preds, class_names, true_label)[source]

Plot the distribution of prediction scores for each class in separate plots.

Parameters:

all_scores (torch.Tensor) – Confidence scores of the predictions.
all_preds (torch.Tensor) – Predicted class indices.
class_names (list) – List of class names.

Returns:

A list of figures, each representing the score distribution for a class.

Return type:

list

lit_ecology_classifier.helpers.helpers.setup_callbacks(priority_classes, ckpt_name)[source]

Sets up callbacks for the training process.

Parameters:

priority_classes (list) – List of priority classes to monitor for false positives.
ckpt_name (str) – The name of the checkpoint file.

Returns:

A list of configured callbacks including EarlyStopping, ModelCheckpoint, and ModelSummary.

Return type:

list

lit_ecology_classifier.helpers.helpers.setup_classmap(datapath='', priority_classes=[], rest_classes=[])[source]
