fishy.data.datasets

Dataset definitions and types for the deep learning pipeline.

Classes

class fishy.data.datasets.BalancedBatchSampler(pair_labels: Tensor | ndarray, batch_size: int)

Bases: Sampler

Batch sampler that ensures a 1:1 ratio of positive to negative pairs in each batch. Restored from the original implementation.

__init__(pair_labels: Tensor | ndarray, batch_size: int) → None
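
The 1:1 balancing idea can be illustrated with a minimal standalone sketch. This is not the fishy implementation: the function name balanced_batches, the seed argument, and the convention that 1 marks a positive pair and 0 a negative pair are all assumptions made for illustration.

```python
import random


def balanced_batches(pair_labels, batch_size, seed=0):
    """Yield index batches holding equal numbers of positive and negative pairs.

    Hypothetical helper: assumes label 1 = positive pair, 0 = negative pair.
    """
    half = batch_size // 2
    if half < 1:
        raise ValueError("batch_size must be at least 2")

    rng = random.Random(seed)
    # Split pair indices by label, then shuffle each group independently.
    pos = [i for i, y in enumerate(pair_labels) if int(y) == 1]
    neg = [i for i, y in enumerate(pair_labels) if int(y) == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)

    # The smaller group limits how many full balanced batches can be built.
    n_batches = min(len(pos), len(neg)) // half
    for b in range(n_batches):
        batch = pos[b * half:(b + 1) * half] + neg[b * half:(b + 1) * half]
        rng.shuffle(batch)  # avoid a fixed positive-then-negative order
        yield batch


labels = [1, 0, 1, 0, 1, 0, 0, 0]
for batch in balanced_batches(labels, batch_size=4):
    n_pos = sum(labels[i] for i in batch)
    print(batch, "positives:", n_pos, "negatives:", len(batch) - n_pos)
```

In this sketch, surplus indices from the larger group are dropped for the epoch, which is one of several possible ways to handle class imbalance; the actual sampler may oversample instead.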
class fishy.data.datasets.BaseDataset(samples: ndarray, labels: ndarray, random_projection: bool = False, quantize: bool = False, turbo_quant: bool = False, polar: bool = False, normalize: bool = False, snv: bool = False, minmax: bool = False, log_transform: bool = False, savgol: bool = False, seed: int = 42)

Bases: Dataset

__init__(samples: ndarray, labels: ndarray, random_projection: bool = False, quantize: bool = False, turbo_quant: bool = False, polar: bool = False, normalize: bool = False, snv: bool = False, minmax: bool = False, log_transform: bool = False, savgol: bool = False, seed: int = 42) → None
class fishy.data.datasets.CustomDataset(samples: ndarray, labels: ndarray, random_projection: bool = False, quantize: bool = False, turbo_quant: bool = False, polar: bool = False, normalize: bool = False, snv: bool = False, minmax: bool = False, log_transform: bool = False, savgol: bool = False, seed: int = 42)

Bases: BaseDataset

Standard PyTorch Dataset.

class fishy.data.datasets.SiameseDataset(samples: ndarray, labels: ndarray)

Bases: BaseDataset

Dataset for contrastive learning that generates pairs of samples. Each item is a tuple (x1, x2, pair_label, y1, y2); this return format was restored to support balanced sampling.

__init__(samples: ndarray, labels: ndarray) → None
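
The (x1, x2, pair_label, y1, y2) item layout can be sketched without any dependencies. The helper make_pairs below is hypothetical and is not part of fishy; it assumes that pair_label is 1 when both samples share a class and 0 otherwise, and that every unordered pair is enumerated, neither of which is confirmed by this page.

```python
from itertools import combinations


def make_pairs(samples, labels):
    """Build (x1, x2, pair_label, y1, y2) tuples for every unordered pair.

    Hypothetical helper: pair_label = 1 for same-class pairs, 0 otherwise.
    """
    pairs = []
    for i, j in combinations(range(len(samples)), 2):
        pair_label = 1 if labels[i] == labels[j] else 0
        pairs.append((samples[i], samples[j], pair_label, labels[i], labels[j]))
    return pairs


samples = [[0.1], [0.2], [0.9]]
labels = [0, 0, 1]
for x1, x2, pair_label, y1, y2 in make_pairs(samples, labels):
    print(x1, x2, pair_label, y1, y2)
```

Keeping the class labels y1 and y2 alongside pair_label lets downstream code recover the original classes of both samples, while a list of the pair_label values alone is the kind of input a balanced batch sampler would take.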
