
A high-throughput and memory-efficient inference and serving engine for LLMs

Python 30,366 4,597 Updated Nov 18, 2024

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 1,995 159 Updated Oct 31, 2024

Machine Learning Engineering Open Book

Python 11,650 711 Updated Nov 12, 2024

[ICCV 2023] Tracking Anything with Decoupled Video Segmentation

Python 1,268 129 Updated Aug 1, 2024

High-Resolution Image Synthesis with Latent Diffusion Models

Python 39,195 5,054 Updated Oct 10, 2024

Official Code for DragGAN (SIGGRAPH 2023)

Python 35,728 3,453 Updated May 18, 2024

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

Python 5,500 939 Updated Nov 15, 2024

Code base for MinD-Vis

Python 749 94 Updated May 24, 2023

Baseline model for nocaps benchmark, ICCV 2019 paper "nocaps: novel object captioning at scale".

Python 75 12 Updated Oct 3, 2023

A beautiful, simple, clean, and responsive Jekyll theme for academics

HTML 11,211 11,244 Updated Nov 10, 2024

Code release for ConvNeXt model

Python 5,773 695 Updated Jan 8, 2023

[CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language

Python 1,289 135 Updated Oct 5, 2023

Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone

Python 127 11 Updated Oct 10, 2023

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Python 2,419 248 Updated Apr 24, 2024

EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]

Python 991 92 Updated Aug 13, 2023

Optimized code based on M2 for faster image captioning training

Python 20 3 Updated Nov 18, 2022

Using pretrained encoder and language models to generate captions from multimedia inputs.

Python 95 13 Updated Mar 11, 2023

A collection of resources and papers on Diffusion Models

HTML 11,101 950 Updated Aug 1, 2024

[ICCV2023 Best Paper Finalist] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)

Python 2,097 162 Updated Dec 22, 2022

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

Python 35,894 5,953 Updated Jul 26, 2024

yolov5 + csl_label. (Oriented Object Detection) (Rotation Detection) (Rotated BBox) Rotated object detection based on yolov5

Python 1,832 427 Updated Oct 13, 2023

Minimal PyTorch implementation of YOLOv3

Python 6 2 Updated May 8, 2019

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Python 50,930 16,396 Updated Nov 18, 2024

Code for ALBEF: a new vision-language pre-training method

Python 1,564 198 Updated Sep 20, 2022

PyTorch original implementation of Cross-lingual Language Model Pretraining.

Python 2,892 498 Updated Feb 14, 2023
Jupyter Notebook 561 90 Updated Oct 18, 2024

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

Jupyter Notebook 13,384 4,226 Updated Aug 19, 2024

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

Python 1,888 507 Updated Jun 8, 2023

Pure Javascript OCR for more than 100 Languages 📖🎉🖥

JavaScript 35,306 2,231 Updated Oct 19, 2024