Category & Year | Name | Authors | Implementation |
---|---|---|---|
Vision | |||
2014 | VAE | Kingma and Welling | [✓] Training on MNIST [✓] Encoder output visualization [✓] Decoder output visualization |
2015 | CAM | Zhou et al. | [✓] Application to GoogLeNet [✓] Bounding box generation from Class Activation Map |
2016 | Gatys et al., 2016 (image style transfer) | Gatys et al. | [✓] Application to VGGNet-19 |
| | YOLO | Redmon et al. | [✗] Training on VOC 2012 [✗] Class probability map [✗] Ground truth visualization on grid |
| | DCGAN | Radford et al. | [✓] Training on CelebA at 64 × 64 [✓] Sampling [✓] Latent space interpolation |
| | Noroozi et al., 2016 | Noroozi et al. | [✓] Model architecture [✓] Chromatic aberration [✓] Permutation set |
| | Zhang et al., 2016 (image colorization) | Zhang et al. | [✓] Empirical probability distribution visualization [✗] Color space |
2014, 2017 | Conditional GAN, WGAN-GP | Mirza et al., Gulrajani et al. | [✓] Training on MNIST |
2016, 2017 | VQ-VAE & PixelCNN | Oord et al., Oord et al. | [✓] Training on Fashion MNIST [✓] Training on CIFAR-10 |
2017 | Pix2Pix | Isola et al. | [✓] Training on Google Maps [✓] Training on Facades [✗] Inference on larger resolution |
| | CycleGAN | Zhu et al. | [✓] Training on monet2photo [✓] Training on vangogh2photo [✓] Training on cezanne2photo [✓] Training on ukiyoe2photo [✓] Training on horse2zebra [✓] Training on summer2winter_yosemite |
| | Noroozi et al., 2017 | Noroozi et al. | [✓] Contrastive loss |
2018 | PGGAN | Karras et al. | [✓] Training on CelebA-HQ at 512 × 512 |
| | DeepLab v3 | Chen et al. | [✓] Training on VOC 2012 [✓] Prediction on VOC 2012 validation set [✓] Average mIoU [✓] Model output visualization |
| | RotNet | Gidaris et al. | [✓] Attention map visualization |
| | StarGAN | Choi et al. | [✓] Model architecture |
2020 | STEFANN | Roy et al. | [✓] FANnet architecture [✓] Training FANnet on Google Fonts [✓] Custom Google Fonts dataset [✓] Average SSIM |
| | DDPM | Ho et al. | [✓] Training on CelebA at 32 × 32 [✓] Training on CelebA at 64 × 64 [✓] Denoising process visualization [✓] Sampling using linear interpolation [✓] Sampling using coarse-to-fine interpolation |
| | DDIM | Song et al. | [✓] Normal sampling [✓] Sampling using spherical linear interpolation [✓] Sampling using grid interpolation [✓] Truncated normal |
| | ViT | Dosovitskiy et al. | [✓] Training on CIFAR-10 [✓] Training on CIFAR-100 [✓] Attention map visualization using Attention Rollout [✓] Position embedding similarity visualization [✓] Position embedding interpolation [✓] CutOut [✓] CutMix [✓] Hide-and-Seek |
| | SimCLR | Chen et al. | [✓] Normalized temperature-scaled cross entropy loss [✓] Data augmentation [✓] Pixel intensity histogram |
| | DETR | Carion et al. | [✓] Model architecture [✗] Bipartite matching & loss [✗] Batch normalization freezing [✗] Data preparation [✗] Training on COCO 2017 |
2021 | Improved DDPM | Nichol and Dhariwal | [✓] Cosine diffusion schedule |
| | Classifier-Guidance | Dhariwal and Nichol | [✓] Training on CIFAR-10 [✗] AdaGN [✗] BigGAN Upsample/Downsample [✗] Improved DDPM sampling [✗] Conditional/Unconditional models [✗] Super-resolution model [✗] Interpolation |
| | ILVR | Choi et al. | [✓] Sampling using single reference [✓] Sampling using various downsampling factors [✓] Sampling using various conditioning ranges |
| | SDEdit | Meng et al. | [✓] User input stroke simulation [✓] Application to CelebA at 64 × 64 |
| | MAE | He et al. | [✓] Model architecture for pre-training [✗] Model architecture for self-supervised learning [✗] Training on ImageNet-1K [✗] Fine-tuning [✗] Linear probing |
| | Copy-Paste | Ghiasi et al. | [✓] COCO dataset processing [✓] Large scale jittering [✓] Copy-Paste (within mini-batch) [✓] Data visualization [✗] Gaussian filter |
| | ViViT | Arnab et al. | [✓] 'Spatio-temporal attention' architecture [✓] 'Factorised encoder' architecture [✓] 'Factorised self-attention' architecture |
2022 | CFG | Ho et al. | |
Language | |||
2017 | Transformer | Vaswani et al. | [✓] Model architecture [✓] Position encoding visualization |
2019 | BERT | Devlin et al. | [✓] Model architecture [✓] Masked language modeling [✓] BookCorpus data pre-processing [✓] SQuAD data pre-processing [✓] SWAG data pre-processing |
| | Sentence-BERT | Reimers and Gurevych | [✓] Classification loss [✓] Regression loss [✓] Contrastive loss [✓] STSb data pre-processing [✓] WikiSection data pre-processing [✗] NLI data pre-processing |
| | RoBERTa | Liu et al. | [✓] BookCorpus data pre-processing [✓] Masked language modeling [✗] BookCorpus data pre-processing (SEGMENT-PAIR + NSP) [✗] BookCorpus data pre-processing (SENTENCE-PAIR + NSP) [✓] BookCorpus data pre-processing (FULL-SENTENCES) [✗] BookCorpus data pre-processing (DOC-SENTENCES) |
2021 | Swin Transformer | Liu et al. | [✓] Patch partition [✓] Patch merging [✓] Relative position bias [✓] Feature map padding [✓] Self-attention in non-overlapped windows [✗] Shifted Window based Self-Attention |
2024 | RoPE | Su et al. | [✓] Rotary Positional Embedding |
Vision-Language | |||
2021 | CLIP | Radford et al. | [✓] Training on Flickr8k + Flickr30k [✓] Zero-shot classification on ImageNet1k (mini) [✓] Linear classification on ImageNet1k (mini) |
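Some of the checked items above reduce to short, self-contained formulas. As one illustration, here is a minimal sketch of the cosine diffusion schedule checked under Improved DDPM (Nichol and Dhariwal); the function names are illustrative and not taken from the actual repositories.

```python
import math

def cosine_alpha_bar(t: int, T: int, s: float = 0.008) -> float:
    """Cumulative signal rate alpha_bar(t) = f(t) / f(0),
    where f(u) = cos^2(((u/T + s) / (1 + s)) * pi/2)."""
    f = lambda u: math.cos((u / T + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)

def cosine_betas(T: int, s: float = 0.008, max_beta: float = 0.999) -> list:
    """Per-step noise rates beta_t = 1 - alpha_bar(t) / alpha_bar(t-1),
    clipped at 0.999 as in the paper to avoid singularities near t = T."""
    return [
        min(1.0 - cosine_alpha_bar(t, T, s) / cosine_alpha_bar(t - 1, T, s), max_beta)
        for t in range(1, T + 1)
    ]
```

Compared with a linear schedule, `alpha_bar` here decays more gently near the two ends of the trajectory, which is the paper's stated motivation for the cosine shape and for the clipping of `beta_t`.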