DreamReward: Text-to-3D Generation with Human Preference
Improves text-to-3D generation with human preference feedback, producing high-quality 3D content that more accurately reflects human intent.
DreamReward is a text-to-3D generation framework designed to align generated assets with human preferences. It consists of two main components: Reward3D, a 3D-aware reward model, and Reward3D Feedback Learning (DreamFL). Building Reward3D involves constructing a 3D dataset, designing a data annotation pipeline, and training the Reward Model (RM). The dataset is built from a diverse selection of prompts from Cap3D, with a graph-based algorithm used to ensure prompt diversity, and is then filtered to address mode collapse among the generated 3D assets. The Reward Model is trained to score 3D assets by how well they align with their prompts.
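To make the training objective concrete, here is a minimal PyTorch sketch of a preference-trained reward model. The architecture, feature dimensions, and the `Reward3D`/`preference_loss` names are illustrative assumptions, not the paper's exact design; a pairwise ranking loss of this general form is simply a common way to train a reward model from annotated comparisons of multi-view renders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Reward3D(nn.Module):
    """Toy reward model: scores multi-view renders of a 3D asset against a prompt.

    Feature extractors are assumed to run upstream; dimensions are placeholders.
    """
    def __init__(self, img_dim=512, txt_dim=512, hidden=256):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(img_dim + txt_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, view_feats, txt_feat):
        # view_feats: (B, V, img_dim) features of V rendered views of each asset
        # txt_feat:   (B, txt_dim)    embedding of the text prompt
        txt = txt_feat.unsqueeze(1).expand(-1, view_feats.size(1), -1)
        per_view = self.head(torch.cat([view_feats, txt], dim=-1))  # (B, V, 1)
        return per_view.mean(dim=1).squeeze(-1)  # (B,) score averaged over views

def preference_loss(model, txt_feat, feats_preferred, feats_rejected):
    """Pairwise ranking loss: the human-preferred asset should score higher."""
    r_pos = model(feats_preferred, txt_feat)
    r_neg = model(feats_rejected, txt_feat)
    return -F.logsigmoid(r_pos - r_neg).mean()

# Smoke test with random features (4 prompts, 6 views per asset).
model = Reward3D()
txt = torch.randn(4, 512)
loss = preference_loss(model, txt, torch.randn(4, 6, 512), torch.randn(4, 6, 512))
loss.backward()
```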
The DreamFL component bridges the gap between the distribution that existing diffusion models sample from when distilling 3D assets and the desired distribution, one that is both aligned with human preferences and 3D-aware. It leverages the existing distribution to approximate this hard-to-reach target distribution, using the Reward3D model to approximate the corresponding predicted noise. The paper supports this construction with a detailed mathematical derivation and demonstrates its effect empirically.
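As a rough illustration of how a learned reward can steer score distillation, the sketch below adds a reward-gradient term to an SDS-style update. The function name, the `lambda_r` weighting, and the assumption that renders and noise share a shape are all simplifications of my own, not the paper's exact formulation.

```python
import torch

def reward_guided_sds_grad(pred_noise, noise, renders, prompt_emb,
                           reward_fn, w_t, lambda_r=1.0):
    """Sketch of a reward-weighted score-distillation gradient.

    Plain SDS uses w(t) * (eps_pred - eps). A DreamFL-style variant adds a
    term that nudges renders toward higher Reward3D scores; how that term is
    weighted and where it enters the update is simplified here. Assumes
    renders and noise live in the same space (e.g., both latent).
    """
    sds_term = w_t * (pred_noise - noise)

    # Gradient of the reward with respect to the rendered views.
    renders = renders.detach().requires_grad_(True)
    reward = reward_fn(renders, prompt_emb).sum()
    reward_grad = torch.autograd.grad(reward, renders)[0]

    # Follow the denoising direction while ascending the reward.
    return sds_term - lambda_r * reward_grad
```

In a full pipeline this gradient would be backpropagated through the differentiable renderer to the parameters of the 3D representation, as in standard SDS-based text-to-3D optimization.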