Generic 3D Diffusion Adapter Using Controlled Multi-View Editing
3D object synthesis that ensures 3D consistency, visual quality, and efficiency through a training-free 3D Adapter and ancestral sampling.
MVEdit is a generic method for adapting 2D diffusion models into 3D diffusion pipelines, addressing the challenge of achieving 3D-consistent multi-view ancestral sampling without sacrificing sharp details. The key innovation is a training-free 3D Adapter that leverages off-the-shelf ControlNets and a robust NeRF/mesh optimization scheme: at each denoising step, the intermediate multi-view images are fused into a NeRF or mesh, and renderings of that 3D representation condition the next step through the ControlNets, enforcing precise 3D consistency while the pretrained image diffusion model acts as a multi-view generator. The paper also introduces StableSSDNeRF, a domain-specific text-to-3D initialization method that follows previously unseen prompts and produces diverse, high-quality, photorealistic 3D objects in a short time.
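To make the adapter's control loop concrete, the following is a minimal, self-contained sketch of 3D-Adapter-style multi-view ancestral sampling. Every function here (denoise_views, fit_radiance_field, render_views) and the noise schedule are illustrative stand-ins for the paper's components (2D UNet + ControlNet, NeRF/mesh optimization, rendering); this is not MVEdit's actual implementation or API.

```python
# A toy, runnable skeleton of 3D-consistent multi-view ancestral sampling.
# All components below are hypothetical stand-ins, not the MVEdit codebase.
import torch

def denoise_views(x_t, t, cond_images):
    """Stand-in for the 2D diffusion UNet + ControlNet: predicts clean views,
    conditioned on renderings of the current 3D state (the ControlNet input)."""
    return 0.5 * x_t + 0.5 * cond_images  # placeholder prediction

def fit_radiance_field(views):
    """Stand-in for the robust NeRF/mesh optimization that fuses the
    per-view predictions into one 3D representation."""
    return views.mean(dim=0, keepdim=True)  # placeholder "3D state"

def render_views(state, n_views):
    """Stand-in for re-rendering the fused 3D representation at each camera."""
    return state.expand(n_views, -1, -1, -1)

n_views, steps = 8, 50
x_t = torch.randn(n_views, 3, 64, 64)  # per-view noisy images/latents
cond = torch.zeros_like(x_t)           # initial ControlNet conditioning

for t in reversed(range(steps)):
    x0_views = denoise_views(x_t, t, cond)  # independent 2D denoising per view
    state = fit_radiance_field(x0_views)    # fuse the views in 3D
    cond = render_views(state, n_views)     # 3D-consistent renderings
    noise = torch.randn_like(x_t) if t > 0 else 0.0
    alpha = t / steps                       # toy schedule, not the paper's
    x_t = (1 - alpha) * cond + alpha * noise  # ancestral-style update
```

The design choice this sketch mirrors is that consistency is enforced between denoising steps rather than by retraining the diffusion model, which is what makes the adapter training-free.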
Built around this ControlNet-based 3D Adapter, which is key to producing crisp textures and refined geometric details, MVEdit achieves state-of-the-art performance in both image-to-3D and texture generation. The same adapter also drives text-guided 3D-to-3D and instruct 3D-to-3D pipelines, yielding prompt-accurate appearances, intricate textures, and detailed geometry, underscoring the versatility of the approach.
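As an illustration of how a single adapter loop might serve these different tasks, here is a hypothetical dispatch sketch in which only the initial 3D state and the conditioning vary; every function in it is a stub invented for this example, not part of the MVEdit codebase.

```python
# Hypothetical pipeline dispatch: one shared 3D-Adapter sampling loop,
# different initializations per task. All functions below are stubs.

def mvedit_sample(init_state, prompt):
    return f"3D asset from {init_state!r} guided by {prompt!r}"  # stub loop

def lift_single_image(image):
    return ("coarse-3d-from-image", image)  # stub single-view lifting

def stable_ssdnerf(prompt):
    return ("ssdnerf-init", prompt)  # stub domain-specific text-to-3D init

def run_pipeline(task, prompt=None, image=None, mesh=None):
    if task == "image_to_3d":
        init = lift_single_image(image)   # start from a single input view
    elif task == "text_to_3d":
        init = stable_ssdnerf(prompt)     # fast domain-specific initialization
    else:  # "3d_to_3d", "instruct_3d_to_3d", "texture"
        init = mesh                       # start from an existing 3D asset
    return mvedit_sample(init, prompt)    # shared 3D-Adapter sampling loop

print(run_pipeline("text_to_3d", prompt="a photorealistic sports car"))
```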