DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing
Edits 3D objects and scenes efficiently based on open-ended language instructions
The Direct Gaussian Editor (DGE) is an approach to 3D editing with three main goals: high fidelity, high efficiency, and selective editing. To meet them, the method changes both the representation of the 3D model and the way that model is updated. It uses Gaussian Splatting (GS) as an alternative representation of a radiance field, which offers advantages in speed and efficiency over NeRF models. Because GS is built from local 3D primitives (Gaussians), it also supports efficient local edits: the Gaussians to be edited are identified with the help of a 2D instance segmenter.
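To make the idea of selecting Gaussians from a 2D segmentation concrete, here is a minimal sketch. It is not the authors' pipeline. It assumes a simple pinhole camera, Gaussian centers already expressed in camera coordinates, and a boolean mask produced by some segmenter. The function name `select_gaussians_by_mask` and all of its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def select_gaussians_by_mask(
    centers: Sequence[Point3],
    mask: Sequence[Sequence[bool]],
    focal: float,
    cx: float,
    cy: float,
) -> List[int]:
    """Return indices of Gaussians whose centers project inside the mask.

    Assumptions (illustrative only): centers are in camera coordinates
    with +z pointing forward, and the camera is an ideal pinhole with
    focal length `focal` and principal point (cx, cy), all in pixels.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    selected = []
    for idx, (x, y, z) in enumerate(centers):
        if z <= 0:
            # Behind the camera, so it cannot land on the image.
            continue
        u = int(round(focal * x / z + cx))  # pixel column
        v = int(round(focal * y / z + cy))  # pixel row
        if 0 <= u < width and 0 <= v < height and mask[v][u]:
            selected.append(idx)
    return selected
```

A single view is ambiguous along the viewing ray, so a practical system would probably combine masks from several views, for example by keeping only Gaussians that are selected in most of them.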
In the implementation, the image editor is InstructPix2Pix, a Stable Diffusion-based model fine-tuned for image-to-image translation. Scenes from several datasets are used to demonstrate the method's editing capabilities on 3D models. 3D GS serves as the 3D representation, and a segmentation pipeline is adopted for partial editing. The editing process edits multiple views of the scene, trains the edited model G′ for a number of iterations that depends on scene complexity, and uses classifier-free guidance to control the strength of the editing effect.
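As a rough illustration of how classifier-free guidance gives control over the edit, the sketch below combines three noise predictions using the two-scale form described for InstructPix2Pix: one scale for fidelity to the input image and one for adherence to the text instruction. The function name, the plain-list representation of the predictions, and the example scale values are assumptions made for illustration. The summary above does not state which scales DGE uses.

```python
from typing import List, Sequence


def guided_noise(
    eps_uncond: Sequence[float],
    eps_image: Sequence[float],
    eps_full: Sequence[float],
    image_scale: float,
    text_scale: float,
) -> List[float]:
    """Combine noise predictions with two guidance scales.

    eps_uncond: prediction with neither image nor text conditioning
    eps_image:  prediction conditioned on the input image only
    eps_full:   prediction conditioned on both image and instruction
    """
    return [
        u + image_scale * (i - u) + text_scale * (f - i)
        for u, i, f in zip(eps_uncond, eps_image, eps_full)
    ]


# Hypothetical scale values, chosen only to show the call.
combined = guided_noise([0.0, 1.0], [1.0, 1.0], [2.0, 3.0], 1.5, 7.5)
```

With both scales set to 1 the result equals the fully conditioned prediction. Raising `text_scale` pushes the output further toward the instruction, while raising `image_scale` keeps it closer to the input view.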