A unified multimodal understanding and generation model
Generate images with SD3.5
Generate images quickly with SD3.5 Turbo
Generate spatial audio from images (and optionally text)