Introducing Stable Cascade
A new text-to-image model built on the Würstchen architecture, whose three-stage approach makes it easy to train and fine-tune on consumer hardware
Introduction
A research preview of Stable Cascade is now available. This innovative text-to-image model sets new benchmarks for quality, flexibility, fine-tuning, and efficiency, introducing a three-stage approach focused on further eliminating hardware barriers. We will publish the training and inference code on the Stability AI GitHub page as soon as it is ready. The model will also support inference with the diffusers library, which we will likewise release as soon as it is ready.
Technical details
Stable Cascade is unique within the Stable Diffusion family of models because it is built as a pipeline of three distinct models (Stages A, B, and C). This architecture compresses images hierarchically, allowing us to work in a highly compressed latent space while still achieving superior results.
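To make the three-stage split concrete, here is a minimal sketch of what inference could look like once the diffusers integration ships: a prior pipeline covering Stage C, which turns the text prompt into highly compressed image embeddings, followed by a decoder pipeline covering Stages B and A, which expands those embeddings into the final image. The class names (StableCascadePriorPipeline, StableCascadeDecoderPipeline), model IDs, and parameter values below are assumptions for illustration, not the published API.

```python
# Minimal sketch of two-pipeline inference for Stable Cascade (assumed API and model IDs).
import torch
from diffusers import StableCascadePriorPipeline, StableCascadeDecoderPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Assumed checkpoint names; the official releases may differ.
prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", torch_dtype=torch.bfloat16
).to(device)
decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", torch_dtype=torch.float16
).to(device)

prompt = "an astronaut riding a horse, detailed, cinematic lighting"

# Stage C: sample image embeddings in the highly compressed latent space.
prior_output = prior(
    prompt=prompt,
    height=1024,
    width=1024,
    guidance_scale=4.0,
    num_inference_steps=20,
)

# Stages B + A: decode the compressed embeddings into a full-resolution image.
image = decoder(
    image_embeddings=prior_output.image_embeddings.to(torch.float16),
    prompt=prompt,
    guidance_scale=0.0,
    num_inference_steps=10,
).images[0]

image.save("stable_cascade_sample.png")
```

Splitting generation this way keeps the expensive text-conditioned sampling (Stage C) inside a small latent space, which is what makes training and fine-tuning feasible on consumer hardware.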