
Researchers at Imperial College London challenge traditional methods by using per-pixel ray direction and relative pixel rotations to estimate surface normals more efficiently, slashing data and compute needs.
In a recent paper presented at CVPR 2024, researchers from the Dyson Robotics Lab at Imperial College London introduce a new approach to surface normal estimation that rethinks the inductive biases built into the model. The work makes two key contributions: it makes use of the per-pixel ray direction, and it estimates surface normals by learning the relative rotation between nearby pixels. Together, these choices sharply reduce data and compute requirements, making the method a compelling alternative to existing ones.
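To make the two ideas concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes an ideal pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy` are illustrative values), and the function names `pixel_ray_directions` and `rotate_axis_angle` are hypothetical. The first function computes a unit ray direction for every pixel; the second applies an axis-angle rotation (Rodrigues' formula), which is one way a normal at one pixel could be related to the normal at a neighbouring pixel.

```python
import numpy as np


def pixel_ray_directions(height, width, fx, fy, cx, cy):
    """Unit ray direction for every pixel of an ideal pinhole camera."""
    # Pixel centres sit at half-integer image coordinates.
    u, v = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    # Back-project each pixel onto the z = 1 plane in camera coordinates.
    rays = np.stack([(u - cx) / fx, (v - cy) / fy, np.ones_like(u)], axis=-1)
    # Normalise so every ray has unit length.
    return rays / np.linalg.norm(rays, axis=-1, keepdims=True)


def rotate_axis_angle(vec, axis, angle):
    """Rotate vec about axis by angle (radians) using Rodrigues' formula."""
    axis = axis / np.linalg.norm(axis)
    return (vec * np.cos(angle)
            + np.cross(axis, vec) * np.sin(angle)
            + axis * np.dot(axis, vec) * (1.0 - np.cos(angle)))


# Hypothetical intrinsics for a 640x480 image.
rays = pixel_ray_directions(480, 640, fx=500.0, fy=500.0, cx=320.0, cy=240.0)

# A normal facing the camera, and a neighbour's normal obtained by a
# 10-degree rotation about the y-axis (illustrative values only).
n_i = np.array([0.0, 0.0, -1.0])
n_j = rotate_axis_angle(n_i, axis=np.array([0.0, 1.0, 0.0]),
                        angle=np.deg2rad(10.0))

print(rays.shape)                                # per-pixel ray map
print(np.degrees(np.arccos(np.dot(n_i, n_j))))   # angle between the normals
```

In this framing, a network would predict the rotation (axis and angle) between neighbouring pixels rather than each normal independently; how the paper parameterises and learns that rotation is not covered by this sketch.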
Traditional methods for surface normal estimation often require large datasets and extensive computational resources. For instance, Omnidata V2, which is based on the DPT architecture, is trained on 12 million images over two weeks using four NVIDIA V100 GPUs. In contrast, the model proposed in this paper is trained on just 160,000 images for 12 hours on a single NVIDIA RTX 4090 GPU. This efficiency makes it more accessible and practical for real-world applications.
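For a rough sense of scale, the figures above can be turned into ratios. This is back-of-the-envelope arithmetic only: it assumes "two weeks" means 14 full days of training, and it treats a V100 GPU-hour and a 4090 GPU-hour as comparable, which they are not in practice.

```python
# Figures quoted above; variable names are illustrative.
omnidata_images = 12_000_000
proposed_images = 160_000

# Assumption: "two weeks" = 14 days of continuous training on 4 GPUs.
omnidata_gpu_hours = 4 * 14 * 24
# 12 hours on a single GPU.
proposed_gpu_hours = 1 * 12

data_ratio = omnidata_images / proposed_images
gpu_hour_ratio = omnidata_gpu_hours / proposed_gpu_hours

print(f"Training images: {data_ratio:.0f}x fewer")
print(f"GPU-hours (not hardware-normalised): {gpu_hour_ratio:.0f}x fewer")
```

Under these assumptions the proposed model uses 75 times fewer images and roughly 112 times fewer GPU-hours.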

Surface normal estimation is a fundamental task in computer vision, with applications across many domains.
Despite its importance, there has been limited discussion on the right inductive biases needed for surface normal estimation. This paper addresses that gap by proposing practical and efficient solutions.
The researchers have provided a video demonstration of their model's performance on input videos from the DAVIS dataset. The predictions are made per-frame, and the results can be viewed in 4K resolution. Here is the link to the demo:
[CVPR 2024] Rethinking Inductive Biases for Surface Normal Estimation - YouTube
By introducing per-pixel ray direction and relative rotation estimation as key inductive biases, this work significantly advances the field of surface normal estimation. The efficiency gains make it a promising approach for both research and practical applications.
Original Sources
↗ https://baegwangbin.github.io/DSINE/?utm_source=tldrai
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
4 March 2024