CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model

Author: VRAMrod
Published: 3/11/2024, 3:10:18 PM
Category: Resource

High-fidelity feed-forward single image-to-3D generative model that leverages geometric priors for efficient generation of textured meshes.

Paper

https://arxiv.org/abs/2403.05034

Code

https://github.com/thu-ml/CRM

Project

https://ml.cs.tsinghua.edu.cn/~zhengyi/CRM/

Convolutional Reconstruction Model (CRM) is a feed-forward framework for generating high-quality 3D models from a single image. It exploits the spatial correspondence between the input images and the output triplane, which improves the resulting textured meshes. Unlike previous transformer-based methods, CRM is trained end to end and directly outputs textured meshes. The model can produce a detailed textured mesh in just 10 seconds, and the authors also report significantly lower training costs.
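
For readers new to the term, a triplane stores a 3D feature field as three axis-aligned 2D feature grids (XY, XZ, YZ). To get the feature at a 3D point, the point is projected onto each plane, each plane is sampled, and the three samples are merged. The sketch below is a generic illustration of that lookup, not CRM's code: the function names, the bilinear sampling, and the choice to merge by summation are all assumptions made for the example.

```python
import math

def bilinear(plane, u, v):
    """Bilinearly sample plane[row][col] (a list of floats per cell) at u, v in [-1, 1]."""
    h, w = len(plane), len(plane[0])
    # Clamp to the valid range, then map [-1, 1] to continuous pixel coordinates.
    u = min(max(u, -1.0), 1.0)
    v = min(max(v, -1.0), 1.0)
    x = (u + 1.0) * 0.5 * (w - 1)
    y = (v + 1.0) * 0.5 * (h - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    tx, ty = x - x0, y - y0
    out = []
    for c in range(len(plane[0][0])):
        top = (1 - tx) * plane[y0][x0][c] + tx * plane[y0][x1][c]
        bottom = (1 - tx) * plane[y1][x0][c] + tx * plane[y1][x1][c]
        out.append((1 - ty) * top + ty * bottom)
    return out

def query_triplane(plane_xy, plane_xz, plane_yz, point):
    """Return the merged feature for a 3D point with coordinates in [-1, 1]."""
    x, y, z = point
    f_xy = bilinear(plane_xy, x, y)  # drop z
    f_xz = bilinear(plane_xz, x, z)  # drop y
    f_yz = bilinear(plane_yz, y, z)  # drop x
    # Assumption: merge by summation; concatenation is another common choice.
    return [a + b + c for a, b, c in zip(f_xy, f_xz, f_yz)]

def constant_plane(size, value):
    """Hypothetical helper: a size x size plane with one feature channel."""
    return [[[value] for _ in range(size)] for _ in range(size)]

planes = (constant_plane(4, 1.0), constant_plane(4, 2.0), constant_plane(4, 3.0))
feature = query_triplane(*planes, (0.2, -0.5, 0.9))
print(feature)
```

Because each plane is an ordinary 2D image of features, a convolutional network can produce it directly from images. That is the kind of image-to-triplane spatial relationship the paragraph above refers to.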

The paper also describes the design of the multi-view diffusion models, which are essential components of CRM. The authors add their proposed techniques to the training process one at a time and examine the results on a subset of the dataset, comparing the generated novel-view images with the ground-truth images using several similarity metrics. The results show that the Zero-SNR trick and random resizing are beneficial, while contour augmentation does not improve the quantitative metrics. However, the authors find that contour augmentation makes the model more robust to diverse input images.
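
The post does not say what the Zero-SNR trick involves. One common reading in the diffusion literature is zero-terminal-SNR rescaling: the noise schedule is adjusted so that the final timestep is pure noise, which removes a mismatch between training and sampling. The sketch below shows that rescaling on a linear beta schedule. Whether CRM applies exactly this procedure is an assumption, and the schedule values and function names are illustrative only.

```python
import math

def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (illustrative values, not taken from CRM)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, running = [], 1.0
    for b in betas:
        running *= 1.0 - b
        out.append(running)
    return out

def rescale_zero_terminal_snr(betas):
    """Rescale betas so the last step has alpha_bar = 0, i.e. SNR = 0."""
    sqrt_ab = [math.sqrt(a) for a in cumulative_alphas(betas)]
    first, last = sqrt_ab[0], sqrt_ab[-1]
    # Shift so the final value is 0, then scale so the first value is unchanged.
    sqrt_ab = [(s - last) * first / (first - last) for s in sqrt_ab]
    alpha_bar = [s * s for s in sqrt_ab]
    # Recover per-step alphas from ratios of consecutive alpha_bar values.
    alphas = [alpha_bar[0]] + [
        alpha_bar[t] / alpha_bar[t - 1] for t in range(1, len(alpha_bar))
    ]
    return [1.0 - a for a in alphas]

betas = linear_betas(1000)
rescaled = rescale_zero_terminal_snr(betas)
print("terminal alpha_bar before:", cumulative_alphas(betas)[-1])
print("terminal alpha_bar after: ", cumulative_alphas(rescaled)[-1])
```

With the original schedule the last step still carries a small amount of signal. After rescaling, the terminal `alpha_bar` is zero, so the model is trained on the same pure-noise input it starts from at sampling time.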
