This repository contains a full tutorial for nuclei instance segmentation with Mask R-CNN, covering image pre-processing, Mask R-CNN training with augmentation, test-stage ensembling, and post-processing.
The Mask R-CNN model code is adapted from the Matterport implementation.
Example data in data/ is from Kaggle DSB18 and Hand-segmented 2D Nuclear Images.
- TensorFlow 1.4.0
- Keras 2.1.3
- NumPy
- SciPy
- OpenCV
- scikit-image
- scikit-learn
- Pandas
- Put your training and test images under data/train and data/test.
(Skip this step if you do not need mosaics.)
- Some small training images (e.g. in Kaggle DSB18) may come from the same larger image.
- Run nuclei_mosaic.py to recover the original image. This helps with data augmentation and with segmenting objects on image boundaries.
python nuclei_mosaic.py --TRAIN_DIR data/train --MOSAIC_TRAIN_DIR data/mosaic_train
- Split the data into training and validation sets.
python nuclei_trainvalsplit.py
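
A 9:1 split of this kind can be sketched with the standard library alone. This is only an illustration, not the contents of nuclei_trainvalsplit.py: the function name `split_train_val`, its `val_fraction` and `seed` parameters, and the placeholder image IDs are all hypothetical.

```python
import random

def split_train_val(image_ids, val_fraction=0.1, seed=0):
    """Shuffle image IDs reproducibly and split them into (train, validation) lists."""
    ids = sorted(image_ids)      # sort first so the result does not depend on input order
    rng = random.Random(seed)    # local RNG, leaves the global random state untouched
    rng.shuffle(ids)
    n_val = max(1, round(len(ids) * val_fraction))
    return ids[n_val:], ids[:n_val]

# Hypothetical IDs standing in for the image folder names under data/train
image_ids = ["image_{:03d}".format(i) for i in range(670)]
train_ids, val_ids = split_train_val(image_ids)
print(len(train_ids), len(val_ids))  # prints: 603 67
```

Fixing the seed keeps the validation set identical across runs, which matters when several trained models are later compared or ensembled on the same validation images.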
- Begin training
python nuclei_train.py --dir_log logs
- Run inference on validation and test images.
- For validation images, mAP is also computed.
- model_path = the model you want to use; check the file name under logs/.
python nuclei_inf.py --dir_log logs --model_path logs/nuclei_train20180101T0000/mask_rcnn_nuclei_train_0000.h5
- Ensemble segmentation results
- model_names = list of models you want to ensemble
- Set test_flag = False to ensemble the validation results instead of the test results.
python nuclei_ensemble.py --test_flag True --model_names nuclei_train20180101T0000_0000 nuclei_train20180102T0000_0001
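
One simple way to combine predictions from several models is a pixel-wise majority vote over binary foreground masks, sketched below. This is an assumption made for illustration: nuclei_ensemble.py may well combine results at the instance level instead, and the function name `majority_vote` is hypothetical.

```python
import numpy as np

def majority_vote(masks):
    """Combine binary foreground masks from several models by pixel-wise majority vote."""
    stack = np.stack([np.asarray(m, dtype=bool) for m in masks])  # shape: (n_models, H, W)
    votes = stack.sum(axis=0)                                     # models voting "foreground" per pixel
    return votes * 2 > len(masks)                                 # strict majority; ties become background

# Toy masks from three hypothetical models
mask_a = np.array([[1, 1], [0, 0]])
mask_b = np.array([[1, 0], [0, 1]])
mask_c = np.array([[1, 1], [0, 1]])
combined = majority_vote([mask_a, mask_b, mask_c])
print(combined.astype(int))
```

A vote like this only yields a foreground mask. Separating touching nuclei into instances would need a further step, such as connected-component labeling.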
- Post-process the results and generate the run-length encoding.
python nuclei_postprocess.py
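
For reference, run-length encoding of a single binary mask can be sketched as follows. It assumes the Kaggle DSB18 submission convention (pixels numbered from 1, going down each column first and then across columns) and is not taken from nuclei_postprocess.py; the function name `rle_encode` is hypothetical.

```python
import numpy as np

def rle_encode(mask):
    """Encode a 2D binary mask as [start, length, start, length, ...].

    Assumed convention (Kaggle DSB18): pixel positions are 1-based and
    counted top to bottom, then left to right.
    """
    pixels = np.asarray(mask, dtype=np.uint8).flatten(order="F")  # column-major order
    padded = np.concatenate([[0], pixels, [0]])                   # pad so runs touching the edges are found
    changes = np.flatnonzero(padded[1:] != padded[:-1]) + 1       # 1-based positions where the value flips
    starts, ends = changes[0::2], changes[1::2]                   # flips alternate: run start, run end
    lengths = ends - starts
    return [int(v) for pair in zip(starts, lengths) for v in pair]

mask = np.array([[0, 1, 1],
                 [1, 1, 0],
                 [0, 0, 0]])
print(rle_encode(mask))  # prints: [2, 1, 4, 2, 7, 1]
```

In a submission file the pairs for each nucleus are usually written as one space-separated string, e.g. `" ".join(map(str, rle_encode(mask)))`.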
We tested on Kaggle DSB18 stage 1 training data.
We split all N = 670 training images into training and validation sets at a 9:1 ratio.
Best single-model performance on the validation set: mAP = 0.624; after ensembling and post-processing: mAP = 0.645.
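
The sketch below assumes the mAP reported here follows the Kaggle DSB18 definition: for each image, the precision TP / (TP + FP + FN) is averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05, and the per-image scores are then averaged over all images. The function name `average_precision` and the label-map input format are hypothetical and not taken from the repository code.

```python
import numpy as np

def average_precision(true_labels, pred_labels):
    """DSB18-style average precision for one image.

    Both inputs are 2D integer label maps: 0 is background and each
    nucleus carries its own positive ID.
    """
    true_ids = [i for i in np.unique(true_labels) if i != 0]
    pred_ids = [i for i in np.unique(pred_labels) if i != 0]
    if not true_ids and not pred_ids:
        return 1.0  # convention chosen for this sketch: nothing to find, nothing predicted

    # IoU between every ground-truth object (rows) and every predicted object (columns)
    iou = np.zeros((len(true_ids), len(pred_ids)))
    for r, t in enumerate(true_ids):
        t_mask = true_labels == t
        for c, p in enumerate(pred_ids):
            p_mask = pred_labels == p
            union = np.logical_or(t_mask, p_mask).sum()
            iou[r, c] = np.logical_and(t_mask, p_mask).sum() / union

    scores = []
    for thr in np.linspace(0.5, 0.95, 10):
        matches = iou > thr                                   # above 0.5 IoU, matches are one-to-one
        tp = int(matches.any(axis=1).sum())                   # ground-truth objects that were hit
        fp = len(pred_ids) - int(matches.any(axis=0).sum())   # predictions that hit nothing
        fn = len(true_ids) - tp                               # ground-truth objects that were missed
        scores.append(tp / (tp + fp + fn))
    return float(np.mean(scores))

# Toy example: one true nucleus of 4 pixels, predicted with 6 pixels (IoU = 4/6)
truth = np.zeros((4, 4), dtype=int)
truth[0:2, 0:2] = 1
pred = np.zeros((4, 4), dtype=int)
pred[0:2, 0:3] = 1
ap = average_precision(truth, pred)
print(round(ap, 3))  # prints: 0.4
```

The dataset-level mAP is then the mean of `average_precision` over all validation images.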