ART3D: 3D Gaussian Splatting for Text-Guided Artistic Scenes Generation

Author: VRAMrod
Published: 5/21/2024, 5:14:21 AM
Category: Research

Generates high-quality 3D artistic scenes by combining diffusion models and 3D Gaussian splatting techniques.

arxiv.org

ART3D is a method for generating high-quality 3D artistic scenes from text descriptions or reference images. It combines Stable Diffusion models with 3D Gaussian splatting to bridge the gap between artistic and realistic images. An image semantic transfer algorithm extracts feature maps and keeps the semantics of the artistic and realistic images consistent. Depth is then predicted and converted into point clouds, which addresses the difficulty of obtaining precise 3D information for artistic scenes.

ART3D has several key components. First, the image semantic transfer algorithm uses the internal features of the Stable Diffusion model to align semantic information between artistic and realistic images, which makes depth estimated from artistic images more accurate. Next, the depth information is converted into a point cloud and stored in a point cloud map, which is reprojected through new camera poses to generate new images. Finally, a depth consistency module improves consistency across views so that new point clouds integrate seamlessly into the existing map.
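The depth-to-point-cloud and camera reprojection steps can be sketched with a standard pinhole camera model. This is a minimal illustration, not ART3D's implementation: the function names `unproject_depth` and `reproject`, the intrinsics matrix `K`, and the toy flat depth map are all assumptions made for the example. The returned hole mask marks the pixels that no existing point lands on in the new view.

```python
import numpy as np

def unproject_depth(depth, K):
    """Lift an (H, W) depth map to an (H*W, 3) point cloud in camera coordinates."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column and row grids
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

def reproject(points, K, R, t, shape):
    """Project points into a camera whose coordinates are x_cam = R @ x + t.

    Returns a z-buffered depth map and a boolean mask of empty pixels (holes).
    """
    h, w = shape
    cam = points @ R.T + t
    z = cam[:, 2]
    front = z > 1e-6                      # drop points behind the camera
    cam, z = cam[front], z[front]
    u = np.round(K[0, 0] * cam[:, 0] / z + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * cam[:, 1] / z + K[1, 2]).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    u, v, z = u[inside], v[inside], z[inside]
    depth = np.full((h, w), np.inf)
    np.minimum.at(depth, (v, u), z)       # keep the nearest point per pixel
    holes = np.isinf(depth)
    return depth, holes

# Toy example (assumed values): a flat wall 2 units away, seen by a 64x64 camera.
K = np.array([[50.0, 0.0, 32.0], [0.0, 50.0, 32.0], [0.0, 0.0, 1.0]])
depth = np.full((64, 64), 2.0)
points = unproject_depth(depth, K)

# Same pose: every pixel is covered again.
same_depth, same_holes = reproject(points, K, np.eye(3), np.zeros(3), depth.shape)

# Sideways camera offset: the content shifts by 10 pixels and a strip of holes appears.
shifted_depth, shifted_holes = reproject(
    points, K, np.eye(3), np.array([0.4, 0.0, 0.0]), depth.shape
)
print(int(shifted_holes.sum()), "hole pixels in the new view")
```

In this toy setup the holes form a vertical strip on one side of the image. In a pipeline like the one described, such uncovered regions are the ones left for an inpainting model to fill.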

The method then uses 3D Gaussian splatting to render high-quality 3D artistic scenes and novel views. Optimizing the point cloud map and enforcing multi-view consistency lets the approach generate structurally consistent and diverse 3D artistic scenes. The pipeline iterates over point cloud generation, camera reprojection, and inpainting, so that holes (regions with no projected points) in the novel-view images are filled in, yielding complete 3D scenes.
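For readers unfamiliar with how Gaussian splatting turns many overlapping Gaussians into a pixel color, the core blending rule is front-to-back alpha compositing: each splat contributes its color weighted by its opacity and by how much light the nearer splats let through. The sketch below shows only that per-pixel rule. The function name `composite_front_to_back` is hypothetical, the opacities are assumed to be given already, and projecting 3D Gaussians to 2D and evaluating their footprints are left out.

```python
import numpy as np

def composite_front_to_back(colors, alphas):
    """Blend depth-sorted splat contributions for a single pixel.

    colors: (N, 3) RGB values, nearest splat first.
    alphas: (N,) opacities in [0, 1], in the same order.
    """
    colors = np.asarray(colors, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    # Transmittance in front of splat i: product of (1 - alpha) over all nearer splats.
    transmittance = np.concatenate([[1.0], np.cumprod(1.0 - alphas)])[:-1]
    weights = alphas * transmittance
    pixel = (weights[:, None] * colors).sum(axis=0)
    return pixel, weights

# A half-transparent red splat in front of an opaque blue one.
pixel, weights = composite_front_to_back(
    colors=[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    alphas=[0.5, 1.0],
)
print(pixel, weights)
```

Because the weights depend on the depth order, the splats have to be sorted per view before blending, which is why consistent geometry across views matters for the rendered result.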