3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation
Generates 3D scenes by refining local views with global 3D information
The paper presents a text-driven 3D scene generation approach that addresses the limited 3D consistency and visual quality of existing methods. The method uses a tri-plane feature-based Neural Radiance Field (NeRF) as a unified representation of the 3D scene to enforce global 3D consistency. A generative refinement network then synthesizes new content at higher quality by combining the natural image prior of a 2D diffusion model with the global 3D information of the current scene. Together, these components allow 3D scenes to be generated progressively while maintaining both semantic and geometric consistency.
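To make the representation concrete, the following is a minimal sketch of how a tri-plane feature NeRF query could look: features are bilinearly sampled from three axis-aligned planes and decoded by a small MLP into density and color. All module names, dimensions, and the summation-based aggregation are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch of a tri-plane feature NeRF query (sizes and names are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriPlaneNeRF(nn.Module):
    def __init__(self, resolution=128, feat_dim=32, hidden_dim=64):
        super().__init__()
        # Three axis-aligned feature planes: XY, XZ, YZ.
        self.planes = nn.Parameter(
            0.01 * torch.randn(3, feat_dim, resolution, resolution))
        # Small MLP decodes aggregated plane features into density + RGB.
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1 + 3))

    def sample_plane(self, plane, coords_2d):
        # coords_2d: (N, 2) in [-1, 1]; grid_sample expects (B, H, W, 2).
        grid = coords_2d.view(1, -1, 1, 2)
        feats = F.grid_sample(plane.unsqueeze(0), grid, align_corners=True)
        return feats.squeeze(0).squeeze(-1).permute(1, 0)  # (N, feat_dim)

    def forward(self, xyz):
        # xyz: (N, 3) query points in [-1, 1]^3.
        f_xy = self.sample_plane(self.planes[0], xyz[:, [0, 1]])
        f_xz = self.sample_plane(self.planes[1], xyz[:, [0, 2]])
        f_yz = self.sample_plane(self.planes[2], xyz[:, [1, 2]])
        out = self.decoder(f_xy + f_xz + f_yz)   # aggregate by summation
        sigma = F.softplus(out[:, :1])           # volume density
        rgb = torch.sigmoid(out[:, 1:])          # color
        return sigma, rgb
```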
The method supports a wide variety of scene types and arbitrary camera trajectories, with improved visual quality and 3D consistency over previous methods. Employing a hybrid NeRF as the scene representation lets the approach handle complex scenes and mitigates the accumulated errors caused by inaccurate prior signals. Scenes are built up progressively, as outlined in the sketch below: each new viewpoint is rendered from the current model, and its unobserved regions are completed by the generative refinement network using both the 2D diffusion prior and the global 3D information, yielding high-fidelity results with stable 3D consistency across indoor, outdoor, and unreal-style scenes.
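The following skeleton illustrates that progressive loop at a high level. The function names (render_rgbd, refine_view, fuse_into_nerf) and signatures are hypothetical placeholders for the paper's actual components; only the control flow is meant to reflect the described pipeline.

```python
# Sketch of the progressive generation loop; helper callables are assumed,
# not the paper's actual implementation.
from typing import Callable, Sequence
import torch

def progressive_generation(
    nerf: torch.nn.Module,
    camera_trajectory: Sequence[torch.Tensor],
    render_rgbd: Callable,     # renders RGB, depth, visibility from the NeRF
    refine_view: Callable,     # 2D-diffusion-based generative refinement
    fuse_into_nerf: Callable,  # updates the NeRF with the refined view
) -> torch.nn.Module:
    for pose in camera_trajectory:
        # 1. Render the current scene from the new camera pose.
        rgb, depth, visibility = render_rgbd(nerf, pose)
        # 2. Fill in regions never observed before: the refinement network
        #    hallucinates them with the 2D diffusion prior while staying
        #    consistent with the already-rendered content.
        refined_rgb = refine_view(rgb, depth, mask=~visibility)
        # 3. Fuse the refined view back into the scene representation so
        #    subsequent views remain globally 3D-consistent.
        nerf = fuse_into_nerf(nerf, refined_rgb, depth, pose)
    return nerf
```

Keeping a single global NeRF updated at every step is what limits error accumulation: each new view is conditioned on the full 3D scene rather than only on the previous frame.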

