Abstract

In the past few years, convolutional neural networks (CNNs) have shown incredible promise for learning visual representations. In this paper, we use CNNs for the task of predicting surface normals from a single image. But what is the right architecture? We propose to build upon the decades of hard work in 3D scene understanding to design a new CNN architecture for the task of surface normal estimation. We show that incorporating several constraints (man-made, Manhattan world) and meaningful intermediate representations (room layout, edge labels) in the architecture leads to state-of-the-art performance on surface normal estimation. We also show that our network is quite robust, achieving state-of-the-art results on other datasets without any fine-tuning.

Keywords

Convolutional neural network; Computer science; Architecture; Task (project management); Artificial intelligence; Enhanced Data Rates for GSM Evolution; Surface (topology); Image (mathematics); Deep learning; State (computer science); Network architecture; Pattern recognition (psychology); Estimation; Machine learning; Computer vision; Algorithm; Mathematics; Engineering

Publication Info

Year
2015
Type
preprint
Pages
539-547
Citations
345
Access
Closed

Citation Metrics

345 (OpenAlex)

Cite This

Xiaolong Wang, David F. Fouhey, Abhinav Gupta (2015). Designing deep networks for surface normal estimation. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 539-547. https://doi.org/10.1109/cvpr.2015.7298652

Identifiers

DOI
10.1109/cvpr.2015.7298652