SPIN

This repository contains the official implementation for the ECCV'22 paper, "SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks".

Code Overview

We provide an implementation of the weight-sharing version of the ConvMixer model. The main code is in the models directory. The model is configured through the files in configs. We provide three example configs.

configs/ConvMixer.yaml for the vanilla ConvMixer model.
configs/WS-ConvMixer.yaml for the Weight-shared ConvMixer (WS-ConvMixer) model.
configs/WFWS-ConvMixer.yaml for the Weight-fusion Weight-shared ConvMixer (WFWS-ConvMixer) model.

Note that to run configs/WFWS-ConvMixer.yaml, you must have a corresponding pretrained ConvMixer model. Please refer to our paper for details on each technique.
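A pretrained ConvMixer is required because weight fusion initializes each shared layer from the pretrained, non-shared layers it replaces. The sketch below illustrates a mean fusion strategy in plain Python. It is a toy illustration under stated assumptions, not the code in models: the function name `fuse_mean` is hypothetical, weights are represented as flat lists of floats rather than tensors, and the grouping of consecutive layer pairs is assumed for the example.

```python
from statistics import fmean


def fuse_mean(layer_weights: list[list[float]]) -> list[float]:
    """Element-wise mean of several layers' weights (flattened to lists).

    Hypothetical helper for illustration only.
    """
    if not layer_weights:
        raise ValueError("need at least one layer to fuse")
    size = len(layer_weights[0])
    if any(len(w) != size for w in layer_weights):
        raise ValueError("all layers must have the same number of weights")
    # zip(*...) walks the layers position by position.
    return [fmean(column) for column in zip(*layer_weights)]


# Toy "pretrained" model: 4 layers with 3 weights each.
pretrained = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [0.0, 0.0, 0.0],
    [2.0, 2.0, 2.0],
]

# Assumed grouping: each consecutive pair of layers is fused into one shared layer.
groups = [pretrained[0:2], pretrained[2:4]]
shared_init = [fuse_mean(group) for group in groups]
print(shared_init)  # [[2.0, 3.0, 4.0], [1.0, 1.0, 1.0]]
```

In this toy case, four pretrained layers yield two initial weight sets for the shared layers. The paper describes the fusion strategies that were actually evaluated.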

Installation

First, clone this repo with

git clone https://github.com/apple/ml-spin.git

The implementation of SPIN reuses the infrastructure of Meta Research's open source project SlowFast. Our modifications to the SlowFast code are stored in spin-slowfast.patch. To download the SlowFast code and apply our changes, run

bash setup.sh

Once the codebase is ready, follow the installation instructions in the SlowFast repo to set up your environment and install the other dependencies.

Training

After the environment is set up, you can run the following example training script to train a weight-sharing ConvMixer model. The script assumes you have a machine with 4 GPUs.

bash run.sh

Pre-trained ConvMixer Models on ImageNet1K

We provide pretrained ConvMixer, WS-ConvMixer, and WFWS-ConvMixer models in the following table. For WFWS-ConvMixer, we first initialized the model using the proposed weight fusion technique with the mean strategy and, after training, ran models/fuse_weights.py to export the fused model. To re-run this model, please use the WS-ConvMixer configuration. Note that we did some light hyperparameter tuning, so the accuracy is slightly higher than the numbers reported in the paper.

C/D/P/K	Weight Sharing?	Weight Fusion?	Sharing Rate	Share Distribution	Sharing Mapping	Accuracy	Model Size
768/32/14/3	No	No	-	-	-	76.32%	79MB
768/32/14/3	Yes	No	2	Uniform	Sequential	74.27%	43MB
768/32/14/3	Yes	Mean	2	Uniform	Sequential	75.21%	43MB
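As a reading aid for the Sharing Rate, Share Distribution, and Sharing Mapping columns, the sketch below shows one way a layer-to-parameter-set mapping could be computed. It is a minimal illustration, not the code in models. The function name `sequential_uniform_mapping` is hypothetical, and the sketch assumes that D in the C/D/P/K column is the depth (32 blocks), that "Uniform" means every parameter set is reused the same number of times, and that "Sequential" means the layers reusing a set are adjacent. Please refer to the paper for the exact definitions.

```python
def sequential_uniform_mapping(depth: int, sharing_rate: int) -> list[int]:
    """Return, for each layer index, the index of the parameter set it uses.

    Assumptions (not taken from the repository code): every parameter set is
    reused exactly `sharing_rate` times, and the layers reusing it are adjacent.
    """
    if depth <= 0 or sharing_rate <= 0:
        raise ValueError("depth and sharing_rate must be positive")
    if depth % sharing_rate != 0:
        raise ValueError("depth must be divisible by sharing_rate")
    return [layer // sharing_rate for layer in range(depth)]


mapping = sequential_uniform_mapping(depth=32, sharing_rate=2)
print(mapping[:6])        # [0, 0, 1, 1, 2, 2]
print(len(set(mapping)))  # 16
```

Under these assumptions, a depth-32 model with a sharing rate of 2 stores 16 unique blocks instead of 32, which is consistent with the smaller model size reported for the weight-shared rows.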

Citation

If you find our code or paper helpful, please consider citing:

@inproceedings{spin_eccv22,
    author    = {Lin, Chien-Yu and Prabhu, Anish and Merth, Thomas and Mehta, Sachin and Ranjan, Anurag and Horton, Maxwell and Rastegari, Mohammad},
    title     = {SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks},
    booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
    year      = {2022}
}
