AI
[ICCV2023 Best Paper Finalist] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference,…
Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in PyTorch
Hackable and optimized Transformers building blocks, supporting a composable construction.
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Code and documentation to train Stanford's Alpaca models, and generate the data.
Reflexion: an autonomous agent with dynamic memory and self-reflection
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A collection of scientific methods, processes, algorithms, and systems to build stories & models.
Kandinsky 2 — multilingual text2image latent diffusion model
Generative Models by Stability AI
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
Just 1 minute of voice data is enough to train a good TTS model (few-shot voice cloning)
Official Code for Stable Cascade
Llama 3 implementation, one matrix multiplication at a time