Rethinking Inductive Biases for Surface Normal Estimation
A surface normal estimation method that uses a fully convolutional design and decomposes rotation matrices into angles and axes, reducing reliance on large datasets and synthetic scenes.
DSINE is a method for estimating surface normals, a quantity essential for understanding the geometry of objects in images. The approach re-examines the inductive biases built into the model so that it generalizes better across diverse scenes and objects.
At its core is a fully convolutional design, which enables translational weight sharing and improves sample efficiency. The model also rethinks how rotation matrices are estimated, decomposing them into angles and axes, which reduces the need for extensive datasets and synthetic scene rendering.
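To make the angle-and-axis decomposition concrete, the sketch below applies a rotation given as an axis and an angle to a unit normal using Rodrigues' formula. It is a minimal illustration of the parameterization, not the DSINE implementation: the function name `rotate_axis_angle` and the example vectors are assumptions introduced here, and how the network predicts the axis and angle is not covered.

```python
import math

def rotate_axis_angle(v, axis, angle):
    """Rotate 3-vector v about `axis` by `angle` radians (Rodrigues' formula).

    Hypothetical helper for illustration; not taken from the DSINE code.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0.0:
        raise ValueError("axis must be non-zero")
    k = [a / norm for a in axis]  # unit rotation axis
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    # v_rot = v cos(t) + (k x v) sin(t) + k (k . v)(1 - cos(t))
    return [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]

# Example: rotate a normal pointing along +z by 90 degrees about the x-axis.
normal = [0.0, 0.0, 1.0]
rotated = rotate_axis_angle(normal, axis=[1.0, 0.0, 0.0], angle=math.pi / 2)
```

An axis and an angle describe a rotation with three degrees of freedom, so a rotation built this way is valid by construction and keeps a unit normal at unit length, whereas nine freely predicted matrix entries would need to be constrained.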
To train the model, the researchers curated a compact meta-dataset of images drawn from various RGB-D datasets, prioritizing scene diversity over image count. Data preprocessing included aggressive 2D augmentation, and experimental evaluation based on angular error measurements supported the effectiveness of the proposed approach.
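The angular error mentioned above is the angle between a predicted normal and the ground-truth normal. The sketch below computes it per normal and summarizes it as a mean, a median, and the fraction of normals under a threshold. The function names, the sample vectors, and the 30-degree threshold are assumptions made for illustration; the source does not say which summary statistics were reported.

```python
import math
import statistics

def angular_error_deg(pred, gt):
    """Angle in degrees between a predicted and a ground-truth normal."""
    dot = sum(p * g for p, g in zip(pred, gt))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(g * g for g in gt))
    if norm == 0.0:
        raise ValueError("normals must be non-zero")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_sim = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_sim))

def summarize_errors(preds, gts, threshold_deg=30.0):
    """Mean, median, and fraction of errors below threshold_deg (assumed value)."""
    errors = [angular_error_deg(p, g) for p, g in zip(preds, gts)]
    within = sum(e < threshold_deg for e in errors) / len(errors)
    return {
        "mean": statistics.mean(errors),
        "median": statistics.median(errors),
        "within": within,
    }

# Made-up sample normals, for illustration only.
preds = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
gts = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
summary = summarize_errors(preds, gts)
```

In practice the errors would be collected over every valid pixel of every test image rather than a handful of vectors.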
The network architecture uses a convolutional neural network (CNN) to extract surface normals and contextual features, supplemented by a ConvGRU cell and a convex upsampling layer. The model is implemented in PyTorch, trained with data augmentation, and optimized with the AdamW optimizer.
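As a rough illustration of convex upsampling, the sketch below computes one fine-resolution normal as a convex combination of its 3x3 coarse-resolution neighbors, with weights obtained from a softmax over nine logits. It assumes the commonly used formulation of convex upsampling and is not taken from the DSINE code. The function names, the plain-list layout, and the final renormalization step are assumptions, and in a real network the logits would be predicted by a learned layer.

```python
import math

def softmax(logits):
    """Numerically stable softmax: non-negative weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def convex_upsample_pixel(neighbors, logits):
    """Blend nine coarse normals (a 3x3 window) into one fine-resolution normal.

    Hypothetical helper: `neighbors` is a list of nine 3-vectors and `logits`
    a list of nine scores for a single output pixel.
    """
    if len(neighbors) != 9 or len(logits) != 9:
        raise ValueError("expected a 3x3 window: nine neighbors and nine logits")
    weights = softmax(logits)
    blended = [sum(w * n[i] for w, n in zip(weights, neighbors)) for i in range(3)]
    length = math.sqrt(sum(c * c for c in blended))
    if length == 0.0:
        raise ValueError("blended normal has zero length")
    # Assumption: renormalize so the result is again a unit normal.
    return [c / length for c in blended]

# Example: a flat 3x3 window of identical normals stays unchanged.
window = [[0.0, 0.0, 1.0] for _ in range(9)]
upsampled = convex_upsample_pixel(window, logits=[0.1 * i for i in range(9)])
```

Because the weights are non-negative and sum to one, the blended vector always lies within the convex hull of the neighboring normals, which is what gives the layer its name.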