lit_ecology_classifier.helpers package

Submodules

lit_ecology_classifier.helpers.argparser module

lit_ecology_classifier.helpers.argparser.argparser()[source]

Creates an argument parser for configuring, training, and running the machine learning model for image classification.

Arguments:

--datapath: str

Path to the tar file containing the training data. Default is “/store/empa/em09/aquascope/phyto.tar”.

--train_outpath: str

Output path for training artifacts. Default is “./train_out”.

--main_param_path: str

Main directory where the training parameters are saved. Default is “./params/”.

--dataset: str

Name of the dataset. Default is “phyto”.

--use_wandb: flag

Use Weights and Biases for logging. Default is False.

--priority_classes: str

Path to the JSON file specifying priority classes for training. Default is an empty string.

--rest_classes: str

Path to the JSON file specifying rest classes for training. Default is an empty string.

--balance_classes: flag

Balance the classes for training. Default is False.

--batch_size: int

Batch size for training. Default is 64.

--max_epochs: int

Number of epochs to train. Default is 20.

--lr: float

Learning rate for training. Default is 1e-2.

--lr_factor: float

Learning rate factor for training the full model body. Default is 0.01.

--no_gpu: flag

Do not use a GPU for training. Default is False.

--testing: flag

Run in testing mode instead of training mode. Default is False.

Returns:

The argument parser with defined arguments.

Return type:

argparse.ArgumentParser
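
To illustrate how a few of the arguments above behave, the following sketch rebuilds a small subset of them with the standard-library argparse module. The function name build_training_parser is hypothetical and the option set is deliberately incomplete. It is not the package's own argparser() implementation, only a stand-in that mirrors the documented defaults.

```python
import argparse


def build_training_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in mirroring a subset of the documented options."""
    parser = argparse.ArgumentParser(description="Train an image classifier.")
    parser.add_argument("--train_outpath", type=str, default="./train_out")
    parser.add_argument("--dataset", type=str, default="phyto")
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--max_epochs", type=int, default=20)
    parser.add_argument("--lr", type=float, default=1e-2)
    # Flags take no value: absent means False, present means True.
    parser.add_argument("--use_wandb", action="store_true")
    parser.add_argument("--no_gpu", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_training_parser().parse_args(["--batch_size", "32", "--no_gpu"])
print(args.batch_size, args.max_epochs, args.no_gpu, args.use_wandb)
```

Options that are not passed keep their defaults, so only --batch_size and --no_gpu differ from the documented values in this call.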

lit_ecology_classifier.helpers.argparser.inference_argparser()[source]

Creates an argument parser for using the classifier on unlabeled data.

Arguments:

--batch_size: int

Batch size for inference. Default is 180.

--outpath: str

Directory where predictions will be saved. Default is “./preds/”.

--model_path: str

Path to the model checkpoint file. Default is “./checkpoints/model.ckpt”.

--datapath: str

Path to the tar file containing the data to classify. Default is “/store/empa/em09/aquascope/phyto.tar”.

--no_gpu: flag

Do not use a GPU for inference. Default is False.

--no_TTA: flag

Disable test-time augmentation. Default is False.

--gpu_id: int

GPU ID to use for inference. Default is 0.

--limit_pred_batches: int

Limit the number of batches to predict. Default is 0, meaning no limit; set a low number for debugging.

--prog_bar: flag

Enable progress bar. Default is False.

Returns:

The argument parser with defined arguments.

Return type:

argparse.ArgumentParser

lit_ecology_classifier.helpers.calc_class_weights module

lit_ecology_classifier.helpers.calc_class_weights.calculate_class_weights(datamodule)[source]

Calculate and save class weights and the mean and standard deviation of the dataset.

Parameters:

datamodule – Data module providing the DataLoader for the dataset.

Returns:

(mean, std) where mean and std are tensors representing the mean and standard deviation of the dataset.

Return type:

tuple

lit_ecology_classifier.helpers.helpers module

class lit_ecology_classifier.helpers.helpers.CosineWarmupScheduler(optimizer, warmup, max_iters)[source]

Bases: _LRScheduler

Learning rate scheduler with cosine annealing and warmup.

Parameters:
  • optimizer (torch.optim.Optimizer) – Wrapped optimizer.

  • warmup (int) – Number of warmup steps.

  • max_iters (int) – Total number of iterations.

get_lr()[source]

Compute the learning rate at the current step.

get_lr_factor(epoch)[source]

Compute the learning rate factor at the current step.
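
A common way to combine linear warmup with cosine annealing is to scale a half-cosine decay by a linear ramp during the first warmup steps. The sketch below shows that formulation in plain Python; whether CosineWarmupScheduler uses exactly this formula is an assumption, and the function name cosine_warmup_factor is hypothetical.

```python
import math


def cosine_warmup_factor(step: int, warmup: int, max_iters: int) -> float:
    """Multiplicative learning rate factor at a given step (assumed formula)."""
    # Half-cosine decay from 1.0 at step 0 down to 0.0 at max_iters.
    factor = 0.5 * (1.0 + math.cos(math.pi * step / max_iters))
    # Linear ramp from 0.0 to 1.0 over the warmup steps.
    if warmup > 0 and step <= warmup:
        factor *= step / warmup
    return factor


base_lr = 1e-2
schedule = [
    base_lr * cosine_warmup_factor(step, warmup=10, max_iters=100)
    for step in range(101)
]
print(schedule[0], schedule[10], schedule[100])
```

With these settings the learning rate starts at zero, peaks at the end of warmup, and decays back towards zero at the final step.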

class lit_ecology_classifier.helpers.helpers.FocalLoss(gamma=0, alpha=None, size_average=True)[source]

Bases: Module

forward(input, target)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
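
The forward docstring above is inherited and does not describe the loss itself. For orientation, the standard focal loss for a single sample is -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The plain-Python sketch below shows that formula. It assumes the class follows the standard definition, and the function name focal_loss_single is hypothetical.

```python
import math


def focal_loss_single(logits, target, gamma=0.0, alpha=None):
    """Standard focal loss for one sample (assumed to match FocalLoss)."""
    # Numerically stable softmax probability of the target class.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    p_t = exps[target] / sum(exps)
    # Optional per-class weight; None means every class has weight 1.
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# With gamma=0 the loss is ordinary cross-entropy.
ce = focal_loss_single([0.0, 0.0], target=0, gamma=0.0)
# A larger gamma down-weights the same sample.
fl = focal_loss_single([0.0, 0.0], target=0, gamma=2.0)
print(ce, fl)
```

With the default gamma=0 and alpha=None shown in the signature, this formula reduces to the usual cross-entropy loss.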

lit_ecology_classifier.helpers.helpers.TTA_collate_fn(batch: dict, train=False)[source]

Collate function for test time augmentation (TTA).

Parameters:

batch (dict) – Dict of tuples containing images and labels.

Returns:
  • batch_images – All rotations stacked row-wise.

  • batch_labels – Labels of the images.
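
The source only says that "all rotations" are stacked. As a sketch, assuming the rotations are the four multiples of 90 degrees, the idea can be shown on small nested lists standing in for image tensors. The helper names rot90, tta_rotations and stack_rotations are hypothetical, and the ordering of the stacked views is an assumption.

```python
def rot90(image):
    """Rotate a 2D nested list by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*image)][::-1]


def tta_rotations(image):
    """Return the image rotated by 0, 90, 180 and 270 degrees."""
    views = [image]
    for _ in range(3):
        views.append(rot90(views[-1]))
    return views


def stack_rotations(images):
    """Stack all rotations of all images into one flat list of views."""
    stacked = []
    for image in images:
        stacked.extend(tta_rotations(image))
    return stacked


views = tta_rotations([[1, 2], [3, 4]])
for view in views:
    print(view)
```

At inference time the model would be applied to each view and the per-view predictions combined into one prediction per image.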

lit_ecology_classifier.helpers.helpers.cvd_colormap()[source]

A colormap that is accessible to people with color vision deficiency (CVD).

lit_ecology_classifier.helpers.helpers.define_priority_classes(priority_classes)[source]

lit_ecology_classifier.helpers.helpers.define_rest_classes(priority_classes)[source]

lit_ecology_classifier.helpers.helpers.gmean(input_x, dim)[source]

Compute the geometric mean of the input tensor along the specified dimension.

Parameters:
  • input_x (torch.Tensor) – Input tensor.

  • dim (int) – Dimension along which to compute the geometric mean.

Returns:

Geometric mean of the input tensor.

Return type:

torch.Tensor
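
The geometric mean of positive values can be computed in log space as exp(mean(log(x))), which avoids overflow from multiplying many values. The sketch below applies that identity to nested lists instead of tensors. The names log_space_gmean and gmean_along are hypothetical, and whether the package's gmean is computed this way is an assumption.

```python
import math


def log_space_gmean(values):
    """Geometric mean of positive numbers, computed as exp(mean(log(x)))."""
    if not values:
        raise ValueError("at least one value is required")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def gmean_along(rows, dim):
    """Geometric mean of a 2D nested list along dim 0 (down columns) or 1 (across rows)."""
    if dim == 0:
        return [log_space_gmean(list(col)) for col in zip(*rows)]
    if dim == 1:
        return [log_space_gmean(row) for row in rows]
    raise ValueError("dim must be 0 or 1 for a 2D input")


column_means = gmean_along([[1.0, 2.0], [4.0, 8.0]], dim=0)
print(column_means)
```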

lit_ecology_classifier.helpers.helpers.output_results(outpath, im_names, labels, scores, priority_classes=False, rest_classes=False, tar_file=False)[source]

Output the prediction results to a file.

Parameters:
  • outpath (str) – Output directory path.

  • im_names (list) – List of image filenames.

  • labels (list) – List of predicted labels.

lit_ecology_classifier.helpers.helpers.plot_confusion_matrix(all_labels, all_preds, class_names)[source]

Plot and return confusion matrices (absolute and normalized).

Parameters:
  • all_labels (torch.Tensor) – True labels.

  • all_preds (torch.Tensor) – Predicted labels.

  • class_names (list) – List of class names.

Returns:

(figure for absolute confusion matrix, figure for normalized confusion matrix)

Return type:

tuple
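
To make the distinction between the two matrices concrete, the sketch below computes an absolute confusion matrix and a row-normalized one from label lists. Normalizing over the true-label rows is an assumption about what "normalized" means here, and the function name confusion_matrices is hypothetical. The plotting step is omitted.

```python
def confusion_matrices(true_labels, pred_labels, num_classes):
    """Return (absolute, normalized) matrices; rows are true classes."""
    absolute = [[0] * num_classes for _ in range(num_classes)]
    for true, pred in zip(true_labels, pred_labels):
        absolute[true][pred] += 1
    normalized = []
    for row in absolute:
        total = sum(row)
        # Rows for classes with no samples are left as zeros.
        normalized.append([count / total if total else 0.0 for count in row])
    return absolute, normalized


absolute, normalized = confusion_matrices([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(absolute)
print(normalized)
```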

lit_ecology_classifier.helpers.helpers.plot_loss_acc(logger)[source]

Plots the training and validation loss and accuracy from the logger’s metrics file.

Parameters:

logger (Logger) – The logger object containing the save directory, name, and version.

Saves:

loss_accuracy.png: A plot of the training and validation loss and accuracy over steps.

lit_ecology_classifier.helpers.helpers.plot_reduced_classes(model, priority_classes)[source]

Plots the confusion matrix for reduced classes.

Parameters:
  • model (LightningModule) – The trained model.

  • priority_classes (list) – List of priority classes.

Saves:

  • reduced_confusion_matrix.png: A confusion matrix of the reduced classes.

  • reduced_confusion_matrix_norm.png: A normalized confusion matrix of the reduced classes.

lit_ecology_classifier.helpers.helpers.plot_score_distributions(all_scores, all_preds, class_names, true_label)[source]

Plot the distribution of prediction scores for each class in separate plots.

Parameters:
  • all_scores (torch.Tensor) – Confidence scores of the predictions.

  • all_preds (torch.Tensor) – Predicted class indices.

  • class_names (list) – List of class names.

Returns:

A list of figures, each representing the score distribution for a class.

Return type:

list

lit_ecology_classifier.helpers.helpers.setup_callbacks(priority_classes, ckpt_name)[source]

Sets up callbacks for the training process.

Parameters:
  • priority_classes (list) – List of priority classes to monitor for false positives.

  • ckpt_name (str) – The name of the checkpoint file.

Returns:

A list of configured callbacks including EarlyStopping, ModelCheckpoint, and ModelSummary.

Return type:

list

lit_ecology_classifier.helpers.helpers.setup_classmap(datapath='', priority_classes=[], rest_classes=[])[source]

Module contents