# NOSE: Neural Olfactory Sensing and Evaluation
Repository for Introduction to Machine Learning Course Project, 2025 Spring.
This project provides code for neural-based olfactory (smell) sensing and evaluation using various machine learning models.
## Folder Structure

### `train/`
Contains scripts for model training and experiments:
- `classification.py`: Classification tasks on olfactory data.
- `regression.py`: Regression tasks for scent-related properties.
- `fine-tuned MolFormer.py`: Fine-tuning the MoLFormer model.
- `finetune_multitask.py`: Multitask fine-tuning.
- `OpenPoM.py`: OpenPoM-related training.
### `utils/`
Helper functions, dataset preparation, and visualization:
- `prepare_datasets.py`: Dataset preparation utilities.
- `gs_lf.py`: Latent factor helper.
- `helper_methods.py`: Miscellaneous helper functions.
- `mol_loss.py`: Molecular loss calculations.
- `util_alignment.py`: Alignment utilities.
- `visualization_helper.py`: Visualization tools.
- `test_gs_lf.ipynb`: Notebook for testing the latent factor code.
### `custom_utils/`
Custom utilities for argument parsing, data handling, encoding, and configuration:
- `args.py`, `args_finetune.py`: Argument parsing.
- `data_utils.py`: Data loading and preprocessing.
- `pubchem_encoder.py`: PubChem encoding utilities.
- `hparams.yaml`: Hyperparameter configuration.
- `train_pubchem_light.py`: Training on PubChem-light.
- `pubchem_canon_zinc_final_vocab_sorted.pth`: Precomputed vocabulary (PyTorch format).
- Folders: `rotate_attention/`, `tokenizer/`

(For the full file listing, see the `custom_utils` folder in the repository.)
## How to Run

### Install Requirements
Setting up the environment for this project can be tricky. We encourage you to follow the exact steps from IBM's MoLFormer repository; you can find the instructions here.
**Warning:** The environment includes `apex`, which may fail to build under certain CUDA versions. If you run into issues, try a different CUDA version or switch the optimizer to Adam in the training scripts.
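As a minimal sketch of the Adam fallback (the model and learning rate below are placeholders, not the project's actual values), swapping out the apex optimizer might look like:

```python
import torch

# Placeholder model standing in for the fine-tuning network.
model = torch.nn.Linear(768, 1)

# If apex fails to build, replace its fused optimizer with plain Adam, e.g.:
# optimizer = apex.optimizers.FusedLAMB(model.parameters(), lr=1e-4)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```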
### Prepare Datasets
We use the curated GS-LF dataset. You can download it from here.
For the Keller 2016 dataset, which we use as the test set, you can download it from here. We also apply an extra binarization step to this dataset.
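In spirit, the binarization step thresholds continuous perceptual ratings into 0/1 labels. A minimal pure-Python sketch (the threshold value and rating format here are assumptions, not the project's exact procedure):

```python
def binarize(ratings, threshold=0.5):
    """Map continuous ratings to binary labels: 1 if rating > threshold, else 0."""
    return [1 if r > threshold else 0 for r in ratings]

# Example: hypothetical perceptual ratings for one odor descriptor.
print(binarize([0.1, 0.7, 0.5, 0.9]))  # -> [0, 1, 0, 1]
```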
### Train Models
Before running the training scripts, make sure the datasets are prepared and placed in the correct directories. You will also need to download the MoLFormer_Pretrained model from here. Note that the checkpoint files are vital for the fine-tuning process; make sure you have them before running the training scripts.
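Since fine-tuning fails without the pretrained checkpoint, a quick sanity check before launching can save a wasted run. A small sketch (the checkpoint path is a placeholder; use wherever you downloaded the files):

```python
from pathlib import Path

def check_checkpoint(path):
    """Return True if the pretrained checkpoint file exists at the given path."""
    ckpt = Path(path)
    if not ckpt.is_file():
        print(f"Missing checkpoint: {ckpt} - download MoLFormer_Pretrained first.")
        return False
    return True

# Hypothetical location for the downloaded checkpoint.
check_checkpoint("checkpoints/MoLFormer_Pretrained.ckpt")
```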
For fine-tuning specific models:

```shell
python train/finetune_multitask.py
python train/fine-tuned\ MolFormer.py
```

For training classification or regression models:

```shell
python train/classification.py
python train/regression.py
```

### Customize Arguments and Hyperparameters
- Edit the YAML config in `custom_utils/hparams.yaml` for hyperparameters.
- Use `custom_utils/args.py` or `custom_utils/args_finetune.py` for advanced argument parsing.
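For quick experiments you can mirror the repository's argument-parsing pattern on the command line. A hypothetical minimal version (the flag names below are illustrative; check `custom_utils/args.py` for the real ones):

```python
import argparse

# Hypothetical subset of the parser in custom_utils/args.py.
parser = argparse.ArgumentParser(description="NOSE fine-tuning arguments (sketch)")
parser.add_argument("--lr", type=float, default=1e-4, help="learning rate")
parser.add_argument("--batch_size", type=int, default=64, help="mini-batch size")
parser.add_argument("--config", default="custom_utils/hparams.yaml",
                    help="path to the YAML hyperparameter file")

# Parse an example command line instead of sys.argv for demonstration.
args = parser.parse_args(["--lr", "3e-5", "--batch_size", "32"])
print(args.lr, args.batch_size)  # -> 3e-05 32
```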