ElasticDiffusion: Training-free Arbitrary Size Image Generation
A training-free decoding method for pretrained text-to-image diffusion models that decouples local and global content generation to achieve coherent images across arbitrary sizes and aspect ratios.
ElasticDiffusion sets out to advance image generation by offering a training-free decoding method that frees pretrained text-to-image diffusion models from their fixed training sizes and aspect ratios. The method estimates local content from smaller patches, giving fine-grained control over low-level pixel detail, while global content, which preserves the overall structure, is computed from a reference latent obtained by downsampling. To maintain the aspect ratio of the input latent, a padding strategy with a constant-color background is employed.
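To make the decoupling concrete, here is a minimal sketch of one denoising step in this spirit: a global direction computed on a downsampled reference latent and a local prediction stitched from overlapping native-size patches. The function names (`unet`, `elastic_step`), the patch/stride/guidance values, and the way the two signals are mixed are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F


def unet(latent, t, text_emb=None):
    # Dummy epsilon-predictor standing in for the pretrained UNet; it only
    # mirrors the input shape so the sketch runs end to end.
    return torch.randn_like(latent)


def elastic_step(latent, t, text_emb, native=64, patch=64, stride=48, guidance=7.5):
    b, c, h, w = latent.shape
    assert h >= patch and w >= patch, "sketch assumes the latent is at least patch-sized"

    # Global content: downsample to the native training resolution, take a
    # classifier-free-guidance-style direction (cond - uncond), upsample back.
    ref = F.interpolate(latent, size=(native, native), mode="bilinear", align_corners=False)
    global_dir = unet(ref, t, text_emb) - unet(ref, t, None)
    global_dir = F.interpolate(global_dir, size=(h, w), mode="bilinear", align_corners=False)

    # Local content: unconditional predictions on overlapping native-size
    # patches, averaged back into a full-size noise map.
    eps_local = torch.zeros_like(latent)
    weight = torch.zeros(1, 1, h, w)
    ys = sorted({*range(0, h - patch + 1, stride), h - patch})
    xs = sorted({*range(0, w - patch + 1, stride), w - patch})
    for y in ys:
        for x in xs:
            tile = latent[:, :, y:y + patch, x:x + patch]
            eps_local[:, :, y:y + patch, x:x + patch] += unet(tile, t, None)
            weight[:, :, y:y + patch, x:x + patch] += 1.0
    eps_local = eps_local / weight

    # Combine the fine-grained local prediction with the scaled global direction.
    return eps_local + guidance * global_dir


# Example: one step on a 96x128 latent (a non-square, non-native size).
latent = torch.randn(1, 4, 96, 128)
eps = elastic_step(latent, t=torch.tensor([500]), text_emb=torch.randn(1, 77, 768))
print(eps.shape)  # torch.Size([1, 4, 96, 128])
```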
ElasticDiffusion was evaluated on the CelebA-HQ and LAION-COCO datasets, yielding better image coherence than counterparts such as MultiDiffusion and the standard decoding strategy of Stable Diffusion. Resampling techniques and a Reduced-Resolution Guidance strategy further strengthen the method, enhancing global content resolution while mitigating potential artifacts.
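One way to read the Reduced-Resolution Guidance idea is that the global signal is kept at a lower resolution and resampled up with a weight that decays over the denoising trajectory, so upsampling artifacts are limited. The sketch below illustrates only that generic reading; the function name, the `reduced_hw` parameter, and the linear decay schedule are assumptions for illustration and are not taken from the paper.

```python
import torch
import torch.nn.functional as F


def reduced_resolution_guidance(global_dir, step, total_steps, target_hw, reduced_hw):
    # Hold the global direction at a reduced resolution, then resample it up to
    # the target size; the coarse signal limits high-frequency upsampling artifacts.
    coarse = F.interpolate(global_dir, size=reduced_hw, mode="bilinear", align_corners=False)
    resampled = F.interpolate(coarse, size=target_hw, mode="bilinear", align_corners=False)
    # A linearly decaying weight is one simple annealing choice (an assumption here),
    # letting the coarse global guidance dominate early and fade out late.
    weight = 1.0 - step / max(total_steps - 1, 1)
    return weight * resampled


# Example: guidance for a 96x128 latent, computed at a reduced 48x64 resolution.
g = torch.randn(1, 4, 96, 128)
out = reduced_resolution_guidance(g, step=10, total_steps=50, target_hw=(96, 128), reduced_hw=(48, 64))
print(out.shape)  # torch.Size([1, 4, 96, 128])
```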