A scalable 3D HOG model for fast object detection and viewpoint estimation

Pedersoli, Marco and Tuytelaars, Tinne

Proceedings – 2014 International Conference on 3D Vision (3DV 2014), 2015

Abstract: In this paper we present a scalable way to learn and detect objects using a 3D representation based on HOG patches placed on a 3D cuboid. The model consists of a single 3D representation that is shared among views. Similarly to the work of Fidler et al. [5], at detection time this representation is projected onto the image plane for the desired viewpoints. However, whereas in [5] the projection is done at image level, so the computational cost is linear in the number of views, in our model every view is approximated at feature level as a linear combination of the pre-computed fronto-parallel views. As a result, once the fronto-parallel views have been computed, the cost of computing new views is almost negligible. This allows the model to be evaluated on many more viewpoints. In the experimental results we show that the proposed model achieves detection and pose estimation performance comparable to standard multi-view HOG detectors, but is faster, scales well with the number of views, and generalizes better to unseen views. Finally, we also show that, with a procedure similar to label propagation, it is possible to train the model even without pose annotations at training time.
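To make the feature-level approximation concrete, the sketch below blends the two pre-computed fronto-parallel HOG maps nearest to a requested azimuth using inverse angular-distance weights. This is only an illustrative toy under stated assumptions: the function and variable names (`approximate_view`, `hog_cuboid`, `face_azimuths_deg`) are hypothetical, and the paper's actual combination weights are not reproduced here.

```python
import numpy as np

def approximate_view(hog_cuboid, azimuth_deg, face_azimuths_deg):
    """Approximate the HOG feature map of an arbitrary azimuth as a linear
    combination of the two nearest pre-computed fronto-parallel views.

    hog_cuboid        : array (n_views, H, W, D) of stored fronto-parallel HOG maps
    azimuth_deg       : desired viewpoint azimuth in degrees
    face_azimuths_deg : azimuths (degrees) of the stored fronto-parallel views
    """
    # Angular distance to every stored view, with wrap-around at 360 degrees.
    diffs = np.abs((np.asarray(face_azimuths_deg, dtype=float)
                    - azimuth_deg + 180.0) % 360.0 - 180.0)
    nearest = np.argsort(diffs)[:2]          # indices of the two closest views
    d0, d1 = diffs[nearest[0]], diffs[nearest[1]]
    # Weights inversely proportional to angular distance (assumed scheme).
    if d0 + d1 == 0:
        w0, w1 = 1.0, 0.0
    else:
        w0, w1 = d1 / (d0 + d1), d0 / (d0 + d1)
    return w0 * hog_cuboid[nearest[0]] + w1 * hog_cuboid[nearest[1]]

# Toy usage: four fronto-parallel views stored at 0, 90, 180 and 270 degrees.
cuboid = np.random.rand(4, 8, 8, 31)         # random stand-ins for HOG maps
feat_45 = approximate_view(cuboid, 45.0, [0, 90, 180, 270])
print(feat_45.shape)                         # (8, 8, 31)
```

Because each new view reduces to a weighted sum of feature maps that are computed once, evaluating many additional viewpoints adds only this cheap blending step, which is the source of the near-negligible per-view cost claimed in the abstract.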