🐟 Fishy Business

Machine Learning for Rapid Evaporative Ionization Mass Spectrometry

A Doctoral Thesis by Jesse Wood, Victoria University of Wellington

A configuration-driven framework for analyzing mass spectrometry data using Deep Learning, Classic Machine Learning, and Evolutionary Algorithms.

Important

The framework's source code is released under the MIT license, but the accompanying REIMS dataset is private. Authorized users must fetch the data files with the provided download command.

Quickstart

Train a Transformer model on the species dataset and display a summary of the results in four lines:

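from fishy import TrainingConfig, run_unified_training, display_final_summary

# Select the model and dataset by name.
config = TrainingConfig(model="transformer", dataset="species")
# Run training and collect the results.
results = run_unified_training(config)
# Display a summary of the results.
display_final_summary(results)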

Key Features

  • Universal API: Use the same interface for PyTorch, Scikit-Learn, and DEAP models.

  • Auto-Validation: Built-in K-Fold cross-validation and statistical significance testing.

  • Research Ready: Specialized support for pre-training, transfer learning, and contrastive suites.

  • XAI Integrated: Visual explanations using Grad-CAM and LIME out of the box.
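
As an illustration of the K-Fold cross-validation mentioned under Auto-Validation, the following is a minimal standard-library sketch of the general technique, not the framework's own implementation. The helper names (`k_fold_indices`, `majority_class`, `cross_validate`) and the labels are hypothetical and chosen only for this example; the toy "model" is a majority-class baseline standing in for a real classifier.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        test_idx = indices[start:stop]
        train_idx = indices[:start] + indices[stop:]
        yield train_idx, test_idx
        start = stop


def majority_class(labels):
    """Toy stand-in for a model: predict the most common training label."""
    return max(set(labels), key=labels.count)


def cross_validate(labels, k=5, seed=0):
    """Return the per-fold accuracy of the majority-class baseline."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k, seed):
        prediction = majority_class([labels[i] for i in train_idx])
        correct = sum(labels[i] == prediction for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


# Hypothetical labels for illustration; not drawn from the REIMS dataset.
labels = ["species_a"] * 12 + ["species_b"] * 8
fold_scores = cross_validate(labels, k=5)
print(f"mean accuracy over {len(fold_scores)} folds: {mean(fold_scores):.2f}")
```

Each sample lands in exactly one test fold, so every score is computed on data the baseline never saw during "training". Per-fold scores collected this way are also the usual input to a paired significance test when two models are evaluated on the same folds.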

Citation

If you use this framework in your research, please cite the following paper:

@article{wood2025hook,
  title={Hook, line, and spectra: machine learning for fish species identification and body part classification using rapid evaporative ionization mass spectrometry},
  author={Wood, Jesse and Nguyen, Bach and Xue, Bing and Zhang, Mengjie and Killeen, Daniel},
  journal={Intelligent Marine Technology and Systems},
  volume={3},
  number={1},
  pages={16},
  year={2025},
  publisher={Springer}
}

For a full list of related research and publications, see the author’s Google Scholar page.
