PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator
Fast image generation, image enhancement, and efficient multiview generation
Rectified flow is a promising way to accelerate pre-trained diffusion models. However, the generation quality of prior fast flow-based models built on Stable Diffusion (such as InstaFlow) is unsatisfactory. In this work, we make several improvements to the original reflow pipeline that significantly boost the performance of flow-based fast SD. Our new model, termed piecewise rectified flow (PeRFlow), learns a piecewise linear probability flow that can efficiently generate high-quality images in just 4 steps. Moreover, we find that the difference in model weights (the fine-tuned PeRFlow weights minus the pretrained SD weights) can be used as a plug-and-play accelerator module on a wide range of SD-based models.
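The weight-difference idea can be illustrated with a minimal sketch. It assumes each checkpoint is a mapping from parameter name to a flat list of floats; real checkpoints hold tensors, and the names `weight_delta`, `apply_delta`, and the `scale` argument are hypothetical, chosen only for illustration rather than taken from the PeRFlow code.

```python
def weight_delta(accelerated, base):
    """Per-parameter difference: accelerated weights minus base weights."""
    return {
        name: [a - b for a, b in zip(accelerated[name], base[name])]
        for name in base
    }


def apply_delta(target, delta, scale=1.0):
    """Add a (optionally scaled) delta to a model with the same architecture.

    `scale` is an assumption for illustration; scale=1.0 adds the delta as-is.
    """
    return {
        name: [t + scale * d for t, d in zip(target[name], delta[name])]
        for name in target
    }


# Toy example: one parameter with two entries per model.
base_sd = {"unet.w": [1.0, 2.0]}        # pretrained SD (toy values)
fast_sd = {"unet.w": [1.5, 1.0]}        # accelerated fine-tune of base_sd
custom_sd = {"unet.w": [3.0, 4.0]}      # some other SD-based model

delta = weight_delta(fast_sd, base_sd)
accelerated_custom = apply_delta(custom_sd, delta)
```

Adding the delta back to the base model recovers the accelerated weights exactly; applying it to a different SD-based model is what makes the module plug-and-play, provided the two models share parameter names and shapes.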
Specifically, PeRFlow has several features:
- Fast Generation: PeRFlow can generate high-fidelity images in just 4 steps. Images generated by PeRFlow are more diverse than those from other fast-sampling models (such as LCM). Moreover, because PeRFlow is a continuous probability flow, it supports 8, 16, or even more sampling steps, and generation quality increases monotonically with the step count.
- Efficient Training: Fine-tuning PeRFlow from SD 1.5 converges in just 4,000 training iterations (with a batch size of 1024). In comparison, the previous fast flow-based text-to-image model, InstaFlow, requires 25,000 training iterations of fine-tuning with the same batch size. Besides, PeRFlow does not require heavy data generation for reflow.
- Compatible with SD Workflows: PeRFlow works with various stylized LoRAs and with the generation/editing pipelines of the pretrained SD model. As a plug-and-play module, it can be directly combined with other conditional generation pipelines, such as ControlNet, IP-Adapter, and multi-view generation.
- Classifier-Free Guidance: PeRFlow is fully compatible with classifier-free guidance (CFG) and supports negative prompts, which are important for pushing generation quality to an even higher level. Empirically, the suitable CFG scale is similar to that of the original diffusion model.
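To make the few-step sampling and classifier-free guidance features above concrete, here is a minimal sketch of an Euler sampler over a probability flow. It is not the PeRFlow scheduler: the time convention (t = 1 is noise, t = 0 is the image), the uniform step size, and the names `toy_velocity`, `guided_velocity`, and `sample` are all assumptions for illustration. The guidance formula is the standard CFG combination of conditional and unconditional predictions.

```python
def toy_velocity(x, t, cond):
    """Hypothetical stand-in for the network: a constant velocity set by `cond`."""
    return list(cond)


def guided_velocity(velocity_fn, x, t, cond, uncond, cfg_scale):
    """Standard CFG: start from the unconditional prediction and move
    toward the conditional one by a factor of cfg_scale."""
    v_cond = velocity_fn(x, t, cond)
    v_uncond = velocity_fn(x, t, uncond)
    return [u + cfg_scale * (c - u) for c, u in zip(v_cond, v_uncond)]


def sample(velocity_fn, x_noise, cond, uncond, num_steps=4, cfg_scale=1.0):
    """Euler integration from t=1 (noise) down to t=0 with uniform steps."""
    x = list(x_noise)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = 1.0 - i * dt
        v = guided_velocity(velocity_fn, x, t, cond, uncond, cfg_scale)
        x = [xi - dt * vi for xi, vi in zip(x, v)]
    return x


# The same flow can be sampled with more steps by changing num_steps only.
x4 = sample(toy_velocity, [1.0, 1.0], cond=[2.0, -1.0], uncond=[0.0, 0.0], num_steps=4)
x8 = sample(toy_velocity, [1.0, 1.0], cond=[2.0, -1.0], uncond=[0.0, 0.0], num_steps=8)
```

Because the toy velocity is constant, the trajectory is a straight line and 4 and 8 steps land on the same point. With a learned flow that is only piecewise linear, additional steps reduce the discretization error within each segment. In this sketch a negative prompt would simply be passed as `uncond`.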