3D Gaussian as a New Vision Era: A Survey
Comprehensive review of 3D Gaussian Splatting that covers its optimization methods, applications in perception and human-centric research, and future directions
The paper surveys 3D Gaussian Splatting (3DGS), a scene representation built on explicit radiance fields in which a scene is modeled with anisotropic 3D Gaussian distributions. It covers optimization methods, mesh reconstruction, manipulation techniques, 3D generation, applications in perception and human-centric research, and future directions. Optimization methods are categorized by the factor they address: rendering efficiency, realism, costs, and physics. For perception, human body research, and manipulation tasks, the survey highlights the advantages and challenges of each application. It also identifies promising future directions, such as personalized generation and better language understanding for text-to-3D methods.
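To make "anisotropic 3D Gaussian" concrete, here is a minimal sketch of one such primitive. It assumes the parameterization commonly used in 3DGS work, where the covariance is built from a rotation (stored as a quaternion) and per-axis scales as Sigma = R S S^T R^T; the summary above does not spell this out, so treat it as an assumption. The function names `quat_to_rotmat`, `covariance`, and `density` are illustrative, not taken from the survey or any library.

```python
import math

def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix (nested lists)."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n  # normalize so R is a pure rotation
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def covariance(q, scales):
    """Assumed parameterization: Sigma = R S S^T R^T with S = diag(scales)."""
    R = quat_to_rotmat(q)
    # M = R S: column j of R multiplied by scales[j]
    M = [[R[i][j] * scales[j] for j in range(3)] for i in range(3)]
    # Sigma = M M^T, symmetric and positive semi-definite by construction
    return [[sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

def density(x, mean, q, scales):
    """Unnormalized Gaussian falloff exp(-0.5 * d^T Sigma^{-1} d)."""
    R = quat_to_rotmat(q)
    d = [x[i] - mean[i] for i in range(3)]
    # Sigma^{-1} = R S^{-2} R^T, so rotate d into the Gaussian's local frame
    local = [sum(R[i][j] * d[i] for i in range(3)) for j in range(3)]
    m2 = sum((local[j] / scales[j]) ** 2 for j in range(3))
    return math.exp(-0.5 * m2)

if __name__ == "__main__":
    identity = (1.0, 0.0, 0.0, 0.0)
    scales = (1.0, 2.0, 3.0)  # different extent per axis, hence anisotropic
    origin = (0.0, 0.0, 0.0)
    # The same offset along different axes gives different falloff values.
    print(density((1.0, 0.0, 0.0), origin, identity, scales))
    print(density((0.0, 0.0, 1.0), origin, identity, scales))
```

Building the covariance from a rotation and scales, rather than storing six free matrix entries, keeps it a valid covariance no matter what values an optimizer assigns to the parameters.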
On the technical side, the survey discusses the need to constrain Gaussian kernels to adhere to the object's surface while balancing that constraint against rendering accuracy. It points to opportunities for future work, such as integrating 3DGS with diffusion models for few-shot reconstruction and introducing lighting decomposition to obtain more realistic surface textures in surface mesh extraction. The paper also emphasizes the advantages of 3DGS for editing tasks, particularly text-guided and non-rigid manipulation, and discusses the challenges and opportunities in these areas.
In perception, the survey explores the role of 3DGS in open-vocabulary semantic object detection and localization, 3D segmentation, tracking of moving objects, and Simultaneous Localization and Mapping (SLAM) systems. It discusses recent advances in language-embedded 3DGS for open-vocabulary query tasks, including efficient language embedding in room-scale scenes. For human-centric applications, particularly human body modeling and avatar reconstruction, it highlights the need for improved modeling of loose garments and environment lighting, as well as the potential for personalized generation and mesh extraction from learned 3D Gaussians.
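The open-vocabulary query idea can be illustrated with a toy sketch. It assumes each Gaussian carries a language feature vector and that a text query has been embedded into the same space by some vision-language model; the summary does not describe how specific methods do this, so the scoring rule (cosine similarity with a fixed threshold) and the names `cosine` and `query_gaussians` are hypothetical, and the vectors below are made-up toy values rather than real embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def query_gaussians(features, text_embedding, threshold=0.8):
    """Return indices of Gaussians whose feature matches the query embedding.

    `features` holds one (hypothetical) language feature per Gaussian;
    `threshold` is an illustrative cutoff, not a value from the survey.
    """
    return [i for i, f in enumerate(features)
            if cosine(f, text_embedding) >= threshold]

if __name__ == "__main__":
    # Toy 2-D features for three Gaussians and a toy query embedding.
    features = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
    print(query_gaussians(features, [1.0, 0.0]))
```

The selected indices could then drive downstream tasks named in the survey, such as segmentation or localization, by operating only on the matching Gaussians.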