Customizing Text-to-Image Models with a Single Image Pair
Captures an artistic style from a single image pair and applies the learned style to new generations without overfitting to the pair's specific content.
Pair Customization is a method for customizing pre-trained text-to-image models while preserving content fidelity. Unlike existing customization methods that mimic a single concept from a collection of images, Pair Customization learns the stylistic difference between the two images of a single pair, enabling the style to be applied without overfitting to the specific image content.
The method adapts the pre-trained model with LoRA weight modifications: style LoRA weights are trained to produce the stylized output from a given noise seed and a style-specific text prompt. To disentangle style from content, separate content LoRA weights are trained to reconstruct the content image, and orthogonality between the style and content LoRA weights is enforced to encourage the two sets of weights to represent distinct concepts.
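One simple way to encourage orthogonality between two LoRA updates is to penalize the overlap of their column spaces. The sketch below is illustrative, not the paper's exact loss: it defines a hypothetical `orthogonality_penalty` as the squared Frobenius norm of the cross-product between the style and content LoRA down-projection matrices, which is zero exactly when their column spaces are orthogonal.

```python
import numpy as np

def lora_delta(A, B):
    # Low-rank weight update: delta_W = B @ A,
    # with A of shape (r, d_in) and B of shape (d_out, r).
    return B @ A

def orthogonality_penalty(B_style, B_content):
    # Squared Frobenius norm of the cross-products between the style and
    # content LoRA columns; zero when the two subspaces are orthogonal.
    # (Illustrative penalty, not the paper's exact formulation.)
    return float(np.linalg.norm(B_style.T @ B_content, ord="fro") ** 2)

rng = np.random.default_rng(0)
d_out, r = 8, 2
# Build content columns, then style columns orthogonal to them via QR.
Q, _ = np.linalg.qr(rng.standard_normal((d_out, 2 * r)))
B_content, B_style = Q[:, :r], Q[:, r:2 * r]

print(orthogonality_penalty(B_style, B_content))  # ~0 by construction
```

In training, such a penalty would be added to the reconstruction losses so that gradient updates keep the two adapters in separate subspaces.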
Furthermore, Pair Customization introduces style guidance, which amplifies the style direction at inference time for stronger stylization and finer control over how much style is applied. Evaluation against several baselines, including DreamBooth LoRA, Concept Sliders, IP-Adapter, and StyleDrop, demonstrates superior performance in both style similarity and content preservation.
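The guidance idea can be sketched as an extension of classifier-free guidance: in addition to amplifying the text-conditioning direction, a second weight amplifies the direction introduced by enabling the style LoRA. This is a minimal illustrative sketch, not the paper's exact formula; the names `eps_uncond`, `eps_cond`, `eps_style`, `w_text`, and `w_style` are assumptions for the example.

```python
import numpy as np

def style_guided_noise(eps_uncond, eps_cond, eps_style,
                       w_text=7.5, w_style=3.0):
    # Classifier-free-guidance-style combination (illustrative):
    #   eps_uncond: prediction with no text conditioning
    #   eps_cond:   prediction with text conditioning, style LoRA disabled
    #   eps_style:  prediction with text conditioning, style LoRA enabled
    # w_text scales the text direction, w_style scales the style direction.
    return (eps_uncond
            + w_text * (eps_cond - eps_uncond)
            + w_style * (eps_style - eps_cond))

eps_u = np.zeros(4)
eps_c = np.ones(4)
eps_s = 2.0 * np.ones(4)
print(style_guided_noise(eps_u, eps_c, eps_s, w_text=7.5, w_style=3.0))
```

Setting `w_style=0` recovers ordinary classifier-free guidance, so the style strength can be dialed independently of the text guidance scale.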