Stars
[ICLR2023] Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation (CDCD).
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
ImageBind: One Embedding Space to Bind Them All
MU-LLaMA: Music Understanding Large Language Model
Evaluation functions for music/audio information retrieval/signal processing algorithms.
A curated list of video-to-audio generation resources
Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model
Manually annotated chord dataset of US pop songs and the Popular Music Collection of the RWC Music Database
LoRA and Dreambooth training scripts and GUI for diffusion models, using kohya-ss's trainer.
A large-scale dataset of caption-annotated MIDI files.
This repository contains metadata for the WavCaps dataset and code for downstream tasks.
The Song Describer dataset is an evaluation dataset made of ~1.1k captions for 706 permissively licensed music recordings.
Stable Diffusion web UI
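Extracts WeChat chat history and exports it to HTML, Word, and Excel documents for permanent storage; analyzes the chat history to generate an annual chat report; and uses the chat data to train a personal AI chat assistant.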
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
A curated list of awesome 3d generation papers
Responsive resume/CV website using HTML, CSS, and JavaScript
A modern static resume template and theme. Powered by Jekyll and GitHub pages.
Code for our ACL 2021 paper: Language Model as an Annotator: Exploring DialoGPT for Dialogue Summarization
Unsupervised Extractive Summarization based on Position-Augmented Centrality
Official code repo for the paper "CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers"
ChatGLM2-6B: An Open Bilingual Chat LLM