Ranni v1 Code Released

Author:
Published: 4/16/2024, 4:00:38 AM
Category: Resource

Taming Text-to-Image Diffusion for Accurate Prompt Following

github.com

Official implementation of the CVPR 2024 paper "Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following". It contains two main components: 1) an LLM-based planning model that maps text instructions to visual elements in the image, and 2) a diffusion-based painting model that draws the image following the visual elements produced in the first stage. Ranni achieves better semantic understanding by relying on the language ability of the LLM. The currently released model weights include a LoRA-finetuned LLaMa-2-7B and a fully-finetuned SDv2.1 model.
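The two-stage design described above can be pictured with a toy sketch. This is not Ranni's code or API: `VisualElement`, `plan`, `paint`, and `generate` are hypothetical names, the bounding-box attribute and the 64-pixel canvas are assumptions, the planner is a string split standing in for the LLM, and the painter writes a label map standing in for the diffusion model. It only illustrates how data flows from instruction to visual elements to image.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class VisualElement:
    """Hypothetical intermediate representation produced by the planner."""
    description: str
    box: tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels, an assumed attribute


def plan(instruction: str, size: int = 64) -> list[VisualElement]:
    """Stage 1 stand-in: a real planner would query the finetuned LLM.

    Toy rule: split the instruction on " and " and place the parts left to right.
    """
    parts = [p.strip() for p in instruction.split(" and ") if p.strip()]
    width = size // max(len(parts), 1)
    return [
        VisualElement(part, (i * width, size // 4, (i + 1) * width, 3 * size // 4))
        for i, part in enumerate(parts)
    ]


def paint(elements: list[VisualElement], size: int = 64) -> list[list[int]]:
    """Stage 2 stand-in: a real painter would run the diffusion model.

    Returns a label map where each cell holds the 1-based index of the
    element covering it, or 0 for background.
    """
    canvas = [[0] * size for _ in range(size)]
    for index, element in enumerate(elements, start=1):
        x0, y0, x1, y1 = element.box
        for y in range(y0, y1):
            for x in range(x0, x1):
                canvas[y][x] = index
    return canvas


def generate(instruction: str) -> tuple[list[VisualElement], list[list[int]]]:
    """Run both stages: instruction -> visual elements -> painted canvas."""
    elements = plan(instruction)
    return elements, paint(elements)
```

Calling `generate("a red apple and a green pear")` in this sketch yields two elements placed side by side and a canvas whose regions are labelled by element index. Keeping the intermediate elements explicit is what lets the second stage be conditioned on a structured layout instead of raw text.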

