Ranni v1 Code Released
Taming Text-to-Image Diffusion for Accurate Prompt Following
github.com
https://github.com/ali-vilab/RanniOfficial implementation of CVPR 2024 paper "Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following". It contains two main components: 1) a LLM-based planning model that maps text instructions into visual elements in image, 2) a diffusion-based painting model that draws image following the visual elements in first stage. Ranni achieves better semantic understanding thanks to the powerful ability of LLM. Currently, we release the model weights including a LoRA-finetuned LLaMa-2-7B, and a fully-finetuned SDv2.1 model.
Comments
None