I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference

This repository contains the official implementation for the paper "I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference". To the best of our knowledge, this is the first work on integer-only quantization for vision transformers.

The instructions below cover the PyTorch code for reproducing the accuracy results of quantization-aware training (QAT). The TVM_benchmark directory contains the TVM deployment project for reproducing the latency results.

Installation

The recommended TVM version is 0.9.dev0.
The recommended timm version is 0.4.12.
To install I-ViT for local development:

git clone https://github.com/zkkli/I-ViT.git
cd I-ViT
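
Before training, it can help to confirm that the installed packages match the recommended versions. The following is a minimal sketch and is not part of the repository. The distribution names `timm` and `tvm` are assumptions (a TVM build may be registered under a different name, or not at all if it is only on `PYTHONPATH`), and the helper `check_versions` is hypothetical.

```python
from importlib import metadata

# Recommended versions taken from the installation notes above.
RECOMMENDED = {"timm": "0.4.12", "tvm": "0.9.dev0"}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(recommended=RECOMMENDED, lookup=installed_version):
    """Map each package name to (installed, recommended, matches)."""
    report = {}
    for name, wanted in recommended.items():
        found = lookup(name)
        # Prefix match so that local build suffixes such as "0.9.dev0+abc" pass.
        matches = found is not None and found.startswith(wanted)
        report[name] = (found, wanted, matches)
    return report


if __name__ == "__main__":
    for name, (found, wanted, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={found} recommended={wanted} [{status}]")
```

A mismatch here is only a warning: the versions are recommendations, and other versions may or may not work.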

QAT Experiments

You can quantize and fine-tune a single model using the following command:

python quant_train.py [--model] [--data] [--epochs] [--lr]

Optional arguments:
--model: Model architecture. Choices are:
         deit_tiny, deit_small, deit_base, swin_tiny, swin_small, swin_base.
--data: Path to the ImageNet dataset.
--epochs: Number of fine-tuning epochs. Recommended values: 30, 60, 90 (default: 90).
--lr: Learning rate. Recommended values: 2e-7, 5e-7, 1e-6, 2e-6 (default: 1e-6).
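
For readers who want to wrap or validate these options in their own tooling, the documented flags could be mirrored with `argparse` roughly as follows. This is an illustrative sketch only and does not reproduce the parser inside `quant_train.py`. The function name `build_parser` is hypothetical, and because no defaults are documented for `--model` and `--data`, they are left as `None` here.

```python
import argparse

# Model names listed in the argument description above.
MODELS = [
    "deit_tiny", "deit_small", "deit_base",
    "swin_tiny", "swin_small", "swin_base",
]


def build_parser():
    """Build a parser mirroring the documented flags (illustrative, not the repo's)."""
    parser = argparse.ArgumentParser(description="Sketch of the documented QAT options")
    # No default is documented for --model or --data, so none is assumed here.
    parser.add_argument("--model", choices=MODELS, default=None,
                        help="model architecture")
    parser.add_argument("--data", default=None,
                        help="path to the ImageNet dataset")
    # Defaults below are the ones stated in the README.
    parser.add_argument("--epochs", type=int, default=90,
                        help="recommended: 30, 60, 90")
    parser.add_argument("--lr", type=float, default=1e-6,
                        help="recommended: 2e-7, 5e-7, 1e-6, 2e-6")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```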

Example: Quantize and fine-tune DeiT-T:

python quant_train.py --model deit_tiny --data <YOUR_DATA_DIR> --epochs 30 --lr 5e-7
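
To try several of the recommended settings, the command can be generated for each combination of epochs and learning rate. The sketch below only builds and prints the commands. The helper `build_commands` and the placeholder path `/path/to/imagenet` are assumptions for illustration, and the README does not say that a full grid search is required.

```python
import itertools
import shlex

# Recommended values from the argument list above.
EPOCHS = [30, 60, 90]
LRS = ["2e-7", "5e-7", "1e-6", "2e-6"]


def build_commands(model, data_dir, epochs=EPOCHS, lrs=LRS):
    """Return one quant_train.py argument list per (epochs, lr) pair."""
    commands = []
    for n_epochs, lr in itertools.product(epochs, lrs):
        commands.append([
            "python", "quant_train.py",
            "--model", model,
            "--data", data_dir,
            "--epochs", str(n_epochs),
            "--lr", lr,
        ])
    return commands


if __name__ == "__main__":
    # Placeholder dataset path; replace it with your own ImageNet directory.
    for cmd in build_commands("deit_tiny", "/path/to/imagenet"):
        print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the runs one after another.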

Results

Below are the Top-1 accuracy (%) results of I-ViT that you should obtain on the ImageNet dataset.

Model	FP32	INT8 (I-ViT)	Diff.
ViT-S	81.39	81.27	-0.12
ViT-B	84.53	84.76	+0.23
DeiT-T	72.21	72.24	+0.03
DeiT-S	79.85	80.12	+0.27
DeiT-B	81.85	81.74	-0.11
Swin-T	81.35	81.50	+0.15
Swin-S	83.20	83.01	-0.19
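
The Diff. column is the INT8 accuracy minus the FP32 accuracy. As a quick consistency check, the sketch below recomputes it from the numbers in the table. The helper names `recompute_diffs` and `mismatches` are hypothetical and are not part of the repository.

```python
# (model, FP32 top-1, INT8 top-1, reported diff), copied from the table above.
RESULTS = [
    ("ViT-S", 81.39, 81.27, -0.12),
    ("ViT-B", 84.53, 84.76, +0.23),
    ("DeiT-T", 72.21, 72.24, +0.03),
    ("DeiT-S", 79.85, 80.12, +0.27),
    ("DeiT-B", 81.85, 81.74, -0.11),
    ("Swin-T", 81.35, 81.50, +0.15),
    ("Swin-S", 83.20, 83.01, -0.19),
]


def recompute_diffs(rows=RESULTS):
    """Return {model: INT8 - FP32}, rounded to two decimals."""
    return {model: round(int8 - fp32, 2) for model, fp32, int8, _ in rows}


def mismatches(rows=RESULTS, tol=1e-6):
    """List models whose reported diff disagrees with the recomputed one."""
    diffs = recompute_diffs(rows)
    return [model for model, _, _, reported in rows
            if abs(diffs[model] - reported) > tol]


if __name__ == "__main__":
    for model, diff in recompute_diffs().items():
        print(f"{model}: {diff:+.2f}")
    print("mismatches:", mismatches())
```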

Citation

If you find this implementation useful for your work, please cite the following paper:

@inproceedings{li2023vit,
  title={I-vit: Integer-only quantization for efficient vision transformer inference},
  author={Li, Zhikai and Gu, Qingyi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={17065--17075},
  year={2023}
}
