ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars
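The paper presents ThemeStation, an approach for synthesizing theme-aware 3D assets from one or a few input exemplars. It targets two goals: unity, meaning the generated assets share the exemplars' theme, and diversity, meaning the generated assets vary from one another.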
ThemeStation is a two-stage framework for theme-aware 3D-to-3D generation. In the first stage, a pre-trained text-to-image (T2I) diffusion model is fine-tuned to generate concept images based on the input exemplars. These concept images share a consistent theme with the exemplars, mimicking the concept art design process. In the second stage, an optimization-based method converts the concept images into final 3D models. This optimization draws on two diffusion priors: a concept prior from the concept images and a reference prior from the input exemplars. A dual score distillation (DSD) loss function is introduced to combine the two priors and guide the generation process.
In more detail, the first stage customizes a pre-trained T2I diffusion model so that it produces concept images aligned with the theme of the input exemplars. These concept images serve as rough guidance for the subsequent 3D modeling. The second stage performs reference-informed 3D asset modeling: an initial 3D model is generated from a concept image, and the concept image and initial model are then used together to develop the final 3D model. During this optimization, the DSD loss function draws on the priors from both the concept images and the exemplars.
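The data flow of the two stages can be sketched as follows. This is only an illustrative skeleton, not the released ThemeStation code: `ThemePipeline`, `finetune_t2i`, `lift_to_3d`, `refine_with_dsd`, and `demo` are hypothetical names, and images and 3D models are represented by plain strings so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical stand-ins: real images and 3D models would be tensors or meshes.
Image = str
Model3D = str


@dataclass
class ThemePipeline:
    # Each field is a placeholder for a component described in the text.
    finetune_t2i: Callable[[Sequence[Model3D]], Callable[[], Image]]
    lift_to_3d: Callable[[Image], Model3D]
    refine_with_dsd: Callable[[Model3D, Image, Sequence[Model3D]], Model3D]

    def run(self, exemplars: Sequence[Model3D], num_concepts: int) -> List[Model3D]:
        # Stage 1: customize the T2I model on the exemplars, then sample concept images.
        sample_concept = self.finetune_t2i(exemplars)
        concepts = [sample_concept() for _ in range(num_concepts)]

        # Stage 2: build an initial model from each concept image, then refine it
        # using both the concept image and the exemplars.
        assets = []
        for concept in concepts:
            initial = self.lift_to_3d(concept)
            assets.append(self.refine_with_dsd(initial, concept, exemplars))
        return assets


def demo() -> List[Model3D]:
    # Toy components that only record which step produced which string.
    counter = iter(range(100))
    pipeline = ThemePipeline(
        finetune_t2i=lambda exemplars: (lambda: f"concept{next(counter)}"),
        lift_to_3d=lambda image: f"init({image})",
        refine_with_dsd=lambda init, image, exemplars: f"final({init})",
    )
    return pipeline.run(["exemplar_a"], num_concepts=2)
```

Passing the components in as callables keeps the skeleton independent of any particular diffusion or 3D library; each one can be swapped for a real implementation without changing `run`.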
Dual score distillation (DSD) is the key component of the approach: it enables the joint use of two diffusion priors, the concept prior and the reference prior, both obtained by fine-tuning a pre-trained T2I diffusion model. The DSD loss function is designed to disentangle the two priors so that they do not conflict during optimization. It does this by applying the concept prior (from the concept images) and the reference prior (from the input exemplars) at different noise levels, which helps produce 3D models that are both diverse and theme-consistent. The result is theme-aware 3D-to-3D generation that yields diverse, detailed 3D assets from as little as one or a few input exemplars.
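The idea of applying two priors at different noise levels can be illustrated with a minimal sketch. It makes several assumptions that the summary above does not confirm: the concept prior is used at high noise levels and the reference prior at low ones, the two are separated by a hard threshold `t_split`, and each prior is a toy noise predictor built by the hypothetical helper `make_point_prior`. The weighting `1 - alpha_bar` is a common score-distillation choice, not necessarily the one used in the paper.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
# A "prior" here is any function that predicts the noise inside a noisy sample.
NoisePredictor = Callable[[Sequence[float], float], Vector]


def add_noise(x: Sequence[float], eps: Sequence[float], alpha_bar: float) -> Vector:
    # Forward diffusion: x_t = sqrt(alpha_bar) * x + sqrt(1 - alpha_bar) * eps
    a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * xi + s * ei for xi, ei in zip(x, eps)]


def score_distillation_grad(x, eps, alpha_bar, predict_noise: NoisePredictor) -> Vector:
    # Score-distillation direction: w(t) * (predicted noise - injected noise)
    x_t = add_noise(x, eps, alpha_bar)
    pred = predict_noise(x_t, alpha_bar)
    weight = 1.0 - alpha_bar  # assumed weighting
    return [weight * (p - e) for p, e in zip(pred, eps)]


def dual_score_distillation_grad(x, eps, t, alpha_bars, concept_prior,
                                 reference_prior, t_split) -> Vector:
    # ASSUMPTION: concept prior at high noise (t >= t_split), reference prior below it.
    prior = concept_prior if t >= t_split else reference_prior
    return score_distillation_grad(x, eps, alpha_bars[t], prior)


def make_point_prior(target: Sequence[float]) -> NoisePredictor:
    # Toy prior whose data distribution is the single point `target`.
    def predict_noise(x_t, alpha_bar):
        a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
        return [(xi - a * ti) / s for xi, ti in zip(x_t, target)]
    return predict_noise


# Toy setup: alpha_bar decreases as t grows, so larger t means more noise.
alpha_bars = [1.0 - (t + 1) / 1001 for t in range(1000)]
concept = make_point_prior([1.0, 1.0])
reference = make_point_prior([-1.0, -1.0])
x, eps = [0.0, 0.0], [0.3, -0.7]

g_high = dual_score_distillation_grad(x, eps, 900, alpha_bars, concept, reference, 500)
g_low = dual_score_distillation_grad(x, eps, 100, alpha_bars, concept, reference, 500)
```

With these toy priors, a gradient-descent step on `x` moves it toward the concept target when `t` is above the threshold and toward the reference target when `t` is below it, so each prior shapes the result only in its own noise range. In a real system `x` would be a rendering of the 3D model, and the gradient would be backpropagated to the 3D parameters.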


Comments
Project code was released: ThemeStation