lit_ecology_classifier.helpers package
Submodules
lit_ecology_classifier.helpers.argparser module
- lit_ecology_classifier.helpers.argparser.argparser()[source]
Creates an argument parser for configuring, training, and running the machine learning model for image classification.
Arguments:
- --datapath: str
Path to the tar file containing the training data. Default is “/store/empa/em09/aquascope/phyto.tar”.
- --train_outpath: str
Output path for training artifacts. Default is “./train_out”.
- --main_param_path: str
Main directory where the training parameters are saved. Default is “./params/”.
- --dataset: str
Name of the dataset. Default is “phyto”.
- --use_wandb: flag
Use Weights and Biases for logging. Default is False.
- --priority_classes: str
Path to the JSON file specifying priority classes for training. Default is an empty string.
- --rest_classes: str
Path to the JSON file specifying rest classes for training. Default is an empty string.
- --balance_classes: flag
Balance the classes for training. Default is False.
- --batch_size: int
Batch size for training. Default is 64.
- --max_epochs: int
Number of epochs to train. Default is 20.
- --lr: float
Learning rate for training. Default is 1e-2.
- --lr_factor: float
Learning rate factor used when training the full model body. Default is 0.01.
- --no_gpu: flag
Disable GPU use for training. Default is False.
- --testing: flag
Run in testing mode instead of training mode. Default is False.
- Returns:
The argument parser with defined arguments.
- Return type:
argparse.ArgumentParser
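As an illustration of how flags like these behave, here is a minimal sketch built with the standard `argparse` module. The function name `build_training_parser` is hypothetical, it covers only a subset of the flags listed above, and it is not the package's own implementation; the defaults are copied from the documentation.

```python
import argparse


def build_training_parser():
    """Hypothetical stand-in covering a subset of the documented training flags."""
    parser = argparse.ArgumentParser(description="Train an image classifier (sketch).")
    parser.add_argument("--datapath", default="/store/empa/em09/aquascope/phyto.tar",
                        help="Tar file containing the training data.")
    parser.add_argument("--train_outpath", default="./train_out",
                        help="Output path for training artifacts.")
    parser.add_argument("--dataset", default="phyto", help="Name of the dataset.")
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--max_epochs", type=int, default=20)
    parser.add_argument("--lr", type=float, default=1e-2)
    # "flag" arguments take no value: they are False unless present.
    parser.add_argument("--use_wandb", action="store_true")
    parser.add_argument("--no_gpu", action="store_true")
    return parser


# Passing an explicit list instead of reading sys.argv keeps the sketch testable.
args = build_training_parser().parse_args(["--batch_size", "32", "--no_gpu"])
defaults = build_training_parser().parse_args([])
print(args.batch_size, args.lr, args.no_gpu)
```

Because the documented return type is `argparse.ArgumentParser`, the real `argparser()` should be usable the same way, by calling `parse_args()` on the returned object.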
- lit_ecology_classifier.helpers.argparser.inference_argparser()[source]
Creates an argument parser for using the classifier on unlabeled data.
Arguments:
- --batch_size: int
Batch size for inference. Default is 180.
- --outpath: str
Directory where predictions will be saved. Default is “./preds/”.
- --model_path: str
Path to the model checkpoint file. Default is “./checkpoints/model.ckpt”.
- --datapath: str
Path to the tar file containing the data to classify. Default is “/store/empa/em09/aquascope/phyto.tar”.
- --no_gpu: flag
Disable GPU use for inference. Default is False.
- --no_TTA: flag
Disable test-time augmentation. Default is False.
- --gpu_id: int
GPU ID to use for inference. Default is 0.
- --limit_pred_batches: int
Limit the number of batches to predict. Default is 0, meaning no limit. Set a low number when debugging.
- --prog_bar: flag
Enable progress bar. Default is False.
- Returns:
The argument parser with defined arguments.
- Return type:
argparse.ArgumentParser
lit_ecology_classifier.helpers.calc_class_weights module
- lit_ecology_classifier.helpers.calc_class_weights.calculate_class_weights(datamodule)[source]
Calculate and save class weights and the mean and standard deviation of the dataset.
- Parameters:
datamodule – Source of the dataset (the docstring names this parameter dataloader and describes it as a DataLoader for the dataset).
- Returns:
(mean, std) where mean and std are tensors representing the mean and standard deviation of the dataset.
- Return type:
tuple
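The documentation does not state which weighting formula is used. As a hedged illustration only, the sketch below shows one common scheme, inverse-frequency weighting, where each class weight is `total / (n_classes * count)`. The function name `inverse_frequency_weights` and the example labels are assumptions, not part of the package.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed scheme: weight = total / (n_classes * count), so rare classes weigh more."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {cls: total / (n_classes * n) for cls, n in counts.items()}


# Three samples of class "a" and one of class "b": "b" receives the larger weight.
weights = inverse_frequency_weights(["a", "a", "a", "b"])
print(weights)
```

With this scheme a perfectly balanced dataset yields a weight of 1.0 for every class.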
lit_ecology_classifier.helpers.helpers module
- class lit_ecology_classifier.helpers.helpers.CosineWarmupScheduler(optimizer, warmup, max_iters)[source]
Bases: _LRScheduler
Learning rate scheduler with cosine annealing and warmup.
- Parameters:
optimizer (torch.optim.Optimizer) – Wrapped optimizer.
warmup (int) – Number of warmup steps.
max_iters (int) – Total number of iterations.
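The shape of such a schedule can be sketched without any framework. The function below computes a multiplicative learning rate factor using a widely used formulation: a half-cosine decay over `max_iters`, scaled linearly from 0 to 1 during the first `warmup` steps. Whether `CosineWarmupScheduler` uses exactly this formula is an assumption; the function name `cosine_warmup_factor` is hypothetical.

```python
import math


def cosine_warmup_factor(step, warmup, max_iters):
    """Factor to multiply the base learning rate by at a given step (assumed formula)."""
    # Half-cosine decay from 1 at step 0 down to 0 at step max_iters.
    factor = 0.5 * (1.0 + math.cos(math.pi * step / max_iters))
    # Linear ramp during warmup; skipped when warmup is 0 to avoid division by zero.
    if warmup > 0 and step <= warmup:
        factor *= step / warmup
    return factor


base_lr = 1e-2
schedule = [base_lr * cosine_warmup_factor(s, warmup=10, max_iters=100) for s in range(101)]
print(schedule[0], max(schedule), schedule[100])
```

The learning rate starts at zero, peaks around the end of warmup, and then decays toward zero at `max_iters`.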
- class lit_ecology_classifier.helpers.helpers.FocalLoss(gamma=0, alpha=None, size_average=True)[source]
Bases: Module
- forward(input, target)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
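The entry above carries no description of the loss itself. For orientation, the standard focal loss is `-alpha_t * (1 - p_t)**gamma * log(p_t)`, where `p_t` is the predicted probability of the true class. The sketch below implements that textbook formula in plain Python; it mirrors the parameter names `gamma`, `alpha`, and `size_average` from the signature, but it is an assumption that the package's `FocalLoss` computes exactly this.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax for one list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def focal_loss(batch_logits, targets, gamma=0.0, alpha=None, size_average=True):
    """Textbook focal loss; alpha is an optional list of per-class weights."""
    losses = []
    for logits, target in zip(batch_logits, targets):
        log_pt = log_softmax(logits)[target]
        pt = math.exp(log_pt)
        weight = 1.0 if alpha is None else alpha[target]
        # (1 - pt)**gamma shrinks the loss of samples that are already classified well.
        losses.append(-weight * (1.0 - pt) ** gamma * log_pt)
    total = sum(losses)
    return total / len(losses) if size_average else total


logits = [[2.0, 0.5, 0.1], [0.2, 0.3, 1.5]]
targets = [0, 2]
plain_ce = focal_loss(logits, targets, gamma=0.0)
focused = focal_loss(logits, targets, gamma=2.0)
print(plain_ce, focused)
```

With `gamma=0` and no `alpha`, the formula reduces to ordinary cross-entropy; larger `gamma` values put relatively more weight on hard examples.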
- lit_ecology_classifier.helpers.helpers.TTA_collate_fn(batch: dict, train=False)[source]
Collate function for test time augmentation (TTA).
- Parameters:
batch (dict) – Dict of tuples containing images and labels.
- Returns:
batch_images: All rotations stacked row-wise.
batch_labels: Labels of the images.
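To make "all rotations stacked row-wise" concrete, here is a framework-free sketch. It assumes the batch is a list of `(image, label)` pairs with images as nested lists, and that the augmentations are the four 90-degree rotations; both are assumptions, since the documented input is a dict whose exact layout is not described. The names `rot90_clockwise` and `tta_collate_sketch` are hypothetical.

```python
def rot90_clockwise(image):
    """Rotate a 2D nested-list image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def tta_collate_sketch(batch):
    """Stack the 0, 90, 180 and 270 degree views of every image, one rotation block after another."""
    stacked = []
    current = [image for image, _ in batch]
    for _ in range(4):
        stacked.extend(current)  # append this rotation of the whole batch
        current = [rot90_clockwise(img) for img in current]
    labels = [label for _, label in batch]
    return stacked, labels


batch = [([[1, 2], [3, 4]], 0), ([[5, 6], [7, 8]], 1)]
images, labels = tta_collate_sketch(batch)
print(len(images), labels)
```

A model would then score every view, and the per-view predictions for one image can be combined afterwards, for example by averaging.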
- lit_ecology_classifier.helpers.helpers.cvd_colormap()[source]
A color map accessible for people with color vision deficiency (CVD).
- lit_ecology_classifier.helpers.helpers.gmean(input_x, dim)[source]
Compute the geometric mean of the input tensor along the specified dimension.
- Parameters:
input_x (torch.Tensor) – Input tensor.
dim (int) – Dimension along which to compute the geometric mean.
- Returns:
Geometric mean of the input tensor.
- Return type:
torch.Tensor
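The geometric mean can be computed as the exponential of the mean of the logarithms, which avoids overflow from multiplying many values. The sketch below applies that identity to a 2D nested list in plain Python; the name `gmean_2d` is hypothetical, positive inputs are assumed, and the package's tensor-based implementation is not shown in this documentation.

```python
import math


def gmean_2d(rows, dim):
    """Geometric mean of a 2D nested list along dim 0 (columns) or dim 1 (rows)."""
    if dim not in (0, 1):
        raise ValueError("dim must be 0 or 1 for a 2D input")
    # Reducing along dim 0 groups values by column; along dim 1, by row.
    groups = zip(*rows) if dim == 0 else rows
    return [math.exp(sum(math.log(v) for v in group) / len(group)) for group in groups]


scores = [[1.0, 2.0], [4.0, 8.0]]
per_column = gmean_2d(scores, dim=0)
per_row = gmean_2d(scores, dim=1)
print(per_column, per_row)
```

For tensors, the same identity is commonly written as `torch.exp(torch.mean(torch.log(x), dim=dim))`.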
- lit_ecology_classifier.helpers.helpers.output_results(outpath, im_names, labels, scores, priority_classes=False, rest_classes=False, tar_file=False)[source]
Output the prediction results to a file.
- Parameters:
outpath (str) – Output directory path.
im_names (list) – List of image filenames.
labels (list) – List of predicted labels.
- lit_ecology_classifier.helpers.helpers.plot_confusion_matrix(all_labels, all_preds, class_names)[source]
Plot and return confusion matrices (absolute and normalized).
- Parameters:
all_labels (torch.Tensor) – True labels.
all_preds (torch.Tensor) – Predicted labels.
class_names (list) – List of class names.
- Returns:
(figure for absolute confusion matrix, figure for normalized confusion matrix)
- Return type:
tuple
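The difference between the absolute and normalized matrices can be shown with a small, plot-free sketch. It assumes integer class indices and row-wise normalization (each true-label row sums to 1); the normalization axis used by `plot_confusion_matrix` is not stated here, so that choice is an assumption, as is the name `confusion_matrices`.

```python
def confusion_matrices(labels, preds, n_classes):
    """Return (absolute counts, row-normalized rates); rows are true labels, columns predictions."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for true, pred in zip(labels, preds):
        counts[true][pred] += 1
    normalized = []
    for row in counts:
        total = sum(row)
        # Classes with no samples keep an all-zero row instead of dividing by zero.
        normalized.append([c / total if total else 0.0 for c in row])
    return counts, normalized


absolute, normalized = confusion_matrices([0, 0, 1, 1, 1], [0, 1, 1, 1, 0], n_classes=2)
print(absolute)
print(normalized)
```

Row normalization makes classes of very different sizes comparable, which is useful for imbalanced datasets.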
- lit_ecology_classifier.helpers.helpers.plot_loss_acc(logger)[source]
Plots the training and validation loss and accuracy from the logger’s metrics file.
- Parameters:
logger (Logger) – The logger object containing the save directory, name, and version.
- Saves:
loss_accuracy.png: A plot of the training and validation loss and accuracy over steps.
- lit_ecology_classifier.helpers.helpers.plot_reduced_classes(model, priority_classes)[source]
Plots the confusion matrix for reduced classes.
- Parameters:
model (LightningModule) – The trained model.
priority_classes (list) – List of priority classes.
- Saves:
reduced_confusion_matrix.png: A confusion matrix of the reduced classes.
reduced_confusion_matrix_norm.png: A normalized confusion matrix of the reduced classes.
- lit_ecology_classifier.helpers.helpers.plot_score_distributions(all_scores, all_preds, class_names, true_label)[source]
Plot the distribution of prediction scores for each class in separate plots.
- Parameters:
all_scores (torch.Tensor) – Confidence scores of the predictions.
all_preds (torch.Tensor) – Predicted class indices.
class_names (list) – List of class names.
- Returns:
A list of figures, each representing the score distribution for a class.
- Return type:
list
- lit_ecology_classifier.helpers.helpers.setup_callbacks(priority_classes, ckpt_name)[source]
Sets up callbacks for the training process.
- Parameters:
priority_classes (list) – List of priority classes to monitor for false positives.
ckpt_name (str) – The name of the checkpoint file.
- Returns:
A list of configured callbacks including EarlyStopping, ModelCheckpoint, and ModelSummary.
- Return type:
list