CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models
Generates new characters with consistent identities using GANs and word embeddings of celebrity names
Recent developments in text-to-image models have expanded possibilities in human-centric generation. However, these models struggle to produce images with consistent identities. CharacterFactory, our proposed framework, addresses this challenge with a Generative Adversarial Network (GAN) that operates in the word-embedding space of diffusion models. Specifically, we treat the word embeddings of celebrity names as ground-truth anchors for identity-consistent generation, and train the GAN to map a latent space onto this celebrity embedding space.
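The latent-to-embedding mapping can be illustrated with a minimal adversarial sketch. Everything here is a hypothetical stand-in: the dimensions, the random `celeb_embeds` tensor (the real method would use actual word embeddings of celebrity names from the diffusion model's text encoder), and the tiny MLP generator/discriminator are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for celebrity-name word embeddings (8 names, dim 16).
celeb_embeds = torch.randn(8, 16)

# Generator: latent z -> identity embedding; Discriminator: embedding -> score.
G = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 16))
D = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(100):
    z = torch.randn(8, 4)
    fake = G(z)
    # Discriminator learns to separate real name embeddings from generated ones.
    d_loss = bce(D(celeb_embeds), torch.ones(8, 1)) \
           + bce(D(fake.detach()), torch.zeros(8, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator learns to produce embeddings the discriminator accepts as real.
    g_loss = bce(D(G(z)), torch.ones(8, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# A freshly sampled latent now yields a new identity embedding.
new_identity = G(torch.randn(1, 4))
print(tuple(new_identity.shape))  # (1, 16)
```

Once trained, sampling a new character reduces to drawing a fresh latent vector, which is why inference can produce unlimited identities.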
In addition to this mapping, we introduce a context-consistent loss, which ensures that the generated identity embeddings yield images whose identity remains consistent across diverse contexts. Notably, the entire model trains in only about 10 minutes, and at inference it can sample an effectively unlimited number of new characters end-to-end.
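One plausible reading of a context-consistent objective is a penalty on how much a single identity embedding drifts once it is embedded in different prompt contexts. The sketch below is an assumption, not the paper's exact formulation: the additive "contextualizer" stands in for the diffusion model's text encoder applied to prompts containing the identity tokens.

```python
import torch
import torch.nn.functional as F

def context_consistent_loss(contextualized: torch.Tensor) -> torch.Tensor:
    """Penalize deviation of each contextualized embedding from their mean,
    encouraging one identity to look the same across contexts (sketch only)."""
    mean = contextualized.mean(dim=0, keepdim=True)
    return F.mse_loss(contextualized, mean.expand_as(contextualized))

torch.manual_seed(0)
identity = torch.randn(16)        # one generated identity embedding
contexts = torch.randn(3, 16)     # three hypothetical prompt contexts

# Stand-in contextualizer: real pipelines would run the text encoder here.
contextualized = identity.unsqueeze(0) + 0.1 * contexts
loss = context_consistent_loss(contextualized)
print(loss.item() >= 0.0)  # True: a non-negative penalty on identity drift
```

Minimizing this term during the 10-minute GAN training would push the generator toward embeddings whose identity survives arbitrary surrounding prompts.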