Abstract
In the past few years, convolutional neural networks (CNNs) have shown incredible promise for learning visual representations. In this paper, we use CNNs for the task of predicting surface normals from a single image. But what is the right architecture? We propose to build upon the decades of hard work in 3D scene understanding to design a new CNN architecture for the task of surface normal estimation. We show that incorporating several constraints (man-made, Manhattan world) and meaningful intermediate representations (room layout, edge labels) in the architecture leads to state-of-the-art performance on surface normal estimation. We also show that our network is quite robust, achieving state-of-the-art results on other datasets as well without any fine-tuning.
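To make the Manhattan-world constraint mentioned in the abstract concrete, here is a minimal sketch in plain Python. It is not the paper's architecture or code. It only illustrates the underlying assumption that, in man-made scenes, most surface normals line up with one of three mutually orthogonal directions. The axis set `MANHATTAN_AXES` and the helpers `snap_to_manhattan` and `angular_error_deg` are hypothetical names chosen for this example, and the angular error shown is a commonly used way to compare two normals, assumed here rather than taken from the abstract.

```python
import math

# Assumption: a hypothetical Manhattan frame already aligned with the
# coordinate axes. In practice the frame would be estimated per scene.
MANHATTAN_AXES = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("zero-length vector has no direction")
    return tuple(c / length for c in v)


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def snap_to_manhattan(normal, axes=MANHATTAN_AXES):
    """Return the signed Manhattan axis closest in direction to `normal`."""
    n = normalize(normal)
    # Pick the axis with the largest absolute cosine, then keep the sign
    # so the snapped normal points into the same half-space as the input.
    best = max(axes, key=lambda a: abs(dot(n, a)))
    sign = 1.0 if dot(n, best) >= 0.0 else -1.0
    # Adding 0.0 turns any -0.0 component into 0.0 for cleaner output.
    return tuple(sign * c + 0.0 for c in best)


def angular_error_deg(pred, target):
    """Angle in degrees between two normals (assumed comparison metric)."""
    cosine = dot(normalize(pred), normalize(target))
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding drift
    return math.degrees(math.acos(cosine))
```

For example, `snap_to_manhattan((0.1, -0.9, 0.2))` selects the negative y-axis, and `angular_error_deg` can then report how far the original normal was from that idealized direction.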
Related Publications
AttentionNet: Aggregating Weak Directions for Accurate Object Detection
We present a novel detection method using a deep convolutional neural network (CNN), named AttentionNet. We cast an object detection problem as an iterative classification probl...
RGB-D Object Recognition via Incorporating Latent Data Structure and Prior Knowledge
For the task of RGB-D object recognition, it is important to identify suitable representations of images, which can boost the performance of object recognition. In this work, we...
Very Deep Convolutional Networks for Large-Scale Image Recognition
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evalu...
CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes
We propose a network for Congested Scene Recognition called CSRNet to provide a data-driven and deep learning method that can understand highly congested scenes and perform accu...
VoxNet: A 3D Convolutional Neural Network for real-time object recognition
Robust object recognition is a crucial skill for robots operating autonomously in real world environments. Range sensors such as LiDAR and RGBD cameras are increasingly found in...
Publication Info
- Year: 2015
- Type: preprint
- Pages: 539-547
- Citations: 345
- Access: Closed
Identifiers
- DOI: 10.1109/cvpr.2015.7298652