lit_ecology_classifier.data package
Submodules
lit_ecology_classifier.data.datamodule module
- class lit_ecology_classifier.data.datamodule.DataModule(datapath: str, batch_size: int, dataset: str, TTA: bool = False, class_map: dict = {}, priority_classes: list = [], rest_classes: list = [], splits: Iterable = [0.7, 0.15], **kwargs)[source]
Bases:
LightningDataModule
A LightningDataModule for handling image datasets stored in a tar file. This module is responsible for preparing and loading data in a way that is compatible with PyTorch training routines using the PyTorch Lightning framework.
- tarpath
Path to the tar file containing the dataset.
- Type:
str
- batch_size
Number of images to load per batch.
- Type:
int
- dataset
Identifier for the dataset being used.
- Type:
str
- testing
Flag to enable testing mode, which includes TTA (Test Time Augmentation).
- Type:
bool
- priority_classes
Path to the JSON file containing a list of the priority classes.
- Type:
str
- splits
Proportions to split the dataset into training, validation, and testing.
- Type:
Iterable
- predict_dataloader()[source]
Constructs the DataLoader for inference on data.
- Returns:
DataLoader object for the inference dataset.
- Return type:
DataLoader
- setup(stage=None)[source]
Prepares the datasets for training, validation, and testing by applying appropriate splits. This method also handles the TTA mode adjustments.
- Parameters:
stage (Optional[str]) – Current stage of the model training/testing. Not used explicitly in the method.
- test_dataloader()[source]
Constructs the DataLoader for testing data.
- Returns:
DataLoader object for the testing dataset.
- Return type:
DataLoader
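The splits attribute is described as the proportions for training, validation, and testing, yet its default holds only two values, [0.7, 0.15]. The sketch below shows one plausible reading, in which the two values are the training and validation fractions and the remainder goes to testing. That interpretation is an assumption, and the helper name split_lengths is hypothetical; it is not part of lit_ecology_classifier.

```python
from typing import Iterable, Tuple


def split_lengths(n_items: int, splits: Iterable[float]) -> Tuple[int, int, int]:
    """Turn (train, val) fractions into (train, val, test) item counts.

    Assumption: whatever is left after train and val is used for testing.
    """
    train_frac, val_frac = splits
    if train_frac < 0 or val_frac < 0 or train_frac + val_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    n_train = round(train_frac * n_items)
    n_val = round(val_frac * n_items)
    # The test count is computed by subtraction so the three always sum to n_items.
    n_test = n_items - n_train - n_val
    return n_train, n_val, n_test


lengths = split_lengths(1000, [0.7, 0.15])
print(lengths)
```

Counts computed this way could then be handed to a splitting utility such as torch.utils.data.random_split, although whether DataModule.setup does so is not stated here.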
lit_ecology_classifier.data.tardataset module
- class lit_ecology_classifier.data.tardataset.TarImageDataset(tar_path: str, class_map: dict, priority_classes: list, rest_classes: list, TTA: bool = False, train: bool = False)[source]
Bases:
Dataset
A Dataset subclass for managing and accessing image data stored in tar files. This class supports optional image transformations and Test Time Augmentation (TTA) for enhancing model evaluation during testing.
- tar_path
Path to the tar file containing image data.
- Type:
str
- class_map_path
Path to the JSON file mapping class names to labels.
- Type:
str
- priority_classes
Path to a JSON file specifying priority classes for targeted training or evaluation.
- Type:
str
- train
Specifies whether the dataset will be used for training. Determines the type of transformations applied.
- Type:
bool
- TTA
Indicates if Test Time Augmentation should be applied during testing.
- Type:
bool
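As a rough illustration of the idea behind a tar-backed dataset, the sketch below indexes the members of a tar archive once and returns (raw bytes, label) pairs on lookup, using only the Python standard library. It is not the implementation of TarImageDataset. The class TarBytesDataset, the helper make_demo_tar, and the archive layout in which the top-level directory name is the class name are all assumptions made for this example; a real dataset would also decode the bytes into an image and apply transformations.

```python
import io
import tarfile


class TarBytesDataset:
    """Hypothetical stand-in for a tar-backed dataset (not the library's class)."""

    def __init__(self, tar_bytes: bytes, class_map: dict):
        self._tar_bytes = tar_bytes
        self.class_map = class_map
        # Index regular files once; keep only members whose top-level
        # directory is a known class name (assumed archive layout).
        with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
            self.members = [
                m.name
                for m in tar.getmembers()
                if m.isfile() and m.name.split("/")[0] in class_map
            ]

    def __len__(self) -> int:
        return len(self.members)

    def __getitem__(self, idx: int):
        name = self.members[idx]
        # Reopen the archive per lookup so no file handle is shared across calls.
        with tarfile.open(fileobj=io.BytesIO(self._tar_bytes), mode="r") as tar:
            data = tar.extractfile(name).read()
        label = self.class_map[name.split("/")[0]]
        return data, label


def make_demo_tar(files: dict) -> bytes:
    """Build a small in-memory tar archive from {member_name: payload_bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


demo = make_demo_tar({"copepod/a.png": b"aaa", "diatom/b.png": b"bb"})
ds = TarBytesDataset(demo, class_map={"copepod": 0, "diatom": 1})
```

Indexing once in the constructor keeps each lookup to a single member read, which is the main reason to build the member list up front rather than scanning the archive on every access.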