Stars
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction…
[ECCV 2024 Oral] EDTalk - Official PyTorch Implementation
Production First and Production Ready End-to-End Speech Recognition Toolkit
HuBERT content encoders for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
SoftVC VITS Singing Voice Conversion
Chinese speech pretrained models
Real-time interactive streaming digital human
DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
High-performance multiple object tracking based on YOLO, Deep SORT, and KLT 🚀
MOT using Deep SORT and YOLOv3 with PyTorch
SOTA Re-identification Methods and Toolbox
BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
Deformable Convolutional Networks v2 with PyTorch
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Official inference repo for FLUX.1 models
This repository is an official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence". (https://arxiv.org/abs/2108.06152)
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[CGI 2020] Official PyTorch Implementation for "Deep Color Transfer using Histogram Analogy"
A Deep Learning based project for colorizing and restoring old images (and video!)
CVNets: A library for training computer vision networks
👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing
Official Implementation of Fast End-to-End Trainable Guided Filter, CVPR 2018
[CVPR 2024 Highlight] Putting the Object Back Into Video Object Segmentation
[Preprint] VMFormer: End-to-End Video Matting with Transformer
This is the repo for our new project Highly Accurate Dichotomous Image Segmentation
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning