Projects (New)

Learning to Localize Little Landmarks
Saurabh Singh, Derek Hoiem, David Forsyth
Accepted in CVPR 2016

We interact every day with tiny objects such as the door handle of a car or the light switch in a room. These little landmarks are barely visible and hard to localize in images. We describe a method to find such landmarks by finding a sequence of latent landmarks, each with a prediction model. Each latent landmark predicts the next in the sequence, and the last localizes the target landmark. For example, to find the door handle of a car, our method learns to start with a latent landmark near the wheel, as it is globally distinctive; subsequent latent landmarks use the context from the earlier ones to get closer to the target. Our method is supervised solely by the location of the little landmark and displays strong performance on more difficult variants of established tasks and on two new tasks.

Where To Look: Focus Regions for Visual Question Answering
Kevin J. Shih, Saurabh Singh, Derek Hoiem
Accepted in CVPR 2016

Part Localization using Multi-Proposal Consensus for Fine-Grained Categorization
Kevin J. Shih, Arun Mallya, Saurabh Singh, Derek Hoiem
Accepted in BMVC 2015

We present a simple deep learning framework that simultaneously predicts keypoint locations and their respective visibilities, and we use those predictions to achieve state-of-the-art performance for fine-grained classification. We show that by conditioning the predictions on object proposals with sufficient image support, our method can do well without complicated spatial reasoning. Instead, inference methods that are robust to outliers yield state-of-the-art results for keypoint localization. We demonstrate the effectiveness of our accurate keypoint localization and visibility prediction on the fine-grained bird recognition task with and without ground truth bird bounding boxes, and outperform existing state-of-the-art methods by over 2%.
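
As a rough illustration of the consensus idea only (not the inference procedure from the paper), the sketch below combines per-proposal keypoint predictions with a coordinate-wise median, which is one simple outlier-robust estimator. The function name `consensus_keypoint`, the `(x, y, visibility)` tuple layout, and the `vis_threshold` parameter are all hypothetical and chosen for this example.

```python
from statistics import median

def consensus_keypoint(predictions, vis_threshold=0.5):
    """Combine one keypoint's predictions from several object proposals.

    predictions: list of (x, y, visibility) tuples, one per proposal.
    Assumption: a coordinate-wise median stands in for the robust
    inference step; the paper's actual method may differ.
    """
    if not predictions:
        raise ValueError("need at least one proposal prediction")
    xs = [p[0] for p in predictions]
    ys = [p[1] for p in predictions]
    vs = [p[2] for p in predictions]
    # The median ignores a minority of wildly wrong proposals.
    location = (median(xs), median(ys))
    visible = median(vs) >= vis_threshold
    return location, visible

# Three proposals agree; the fourth is an outlier.
preds = [(100.0, 50.0, 0.9), (102.0, 51.0, 0.8),
         (98.0, 49.0, 0.85), (300.0, 400.0, 0.2)]
location, visible = consensus_keypoint(preds)
```

In this toy input the outlying proposal at (300, 400) barely shifts the result, whereas a plain mean would be pulled far from the three agreeing proposals.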

Learning a Sequential Search for Landmarks
Saurabh Singh, Derek Hoiem, David Forsyth
Accepted in CVPR 2015

We propose a general method to find landmarks in images of objects using both appearance and spatial context. This method is applied without changes to two problems: parsing human body layouts, and finding landmarks in images of birds. Our method learns a sequential search for localizing landmarks, iteratively detecting new landmarks given the appearance and contextual information from the already detected ones. Our method represents a novel spatial model for the kinematics of groups of landmarks, and displays strong performance on two different model problems.

Unsupervised Discovery of Mid-Level Discriminative Patches
Saurabh Singh, Abhinav Gupta, Alexei A. Efros
Accepted in ECCV 2012

The goal of this paper is to discover a set of discriminative patches which can serve as a fully unsupervised mid-level visual representation. We pose this as an unsupervised discriminative clustering problem on a huge dataset of image patches. We use an iterative procedure which alternates between clustering and training discriminative classifiers, while applying careful cross-validation at each step to prevent overfitting.
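
The alternation described above can be sketched on toy data as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: patches are plain 2-D points, a cluster mean stands in for a discriminative classifier, and cross-validation is approximated by always re-forming clusters on the data split that the current detectors were not fit to. The names `train_detector`, `top_firings`, and `discover` are hypothetical.

```python
import math

def train_detector(members):
    """Stand-in 'classifier': the mean of the cluster members.
    Assumption: replaces a real discriminative classifier."""
    dim = len(members[0])
    return tuple(sum(m[d] for m in members) / len(members) for d in range(dim))

def top_firings(detector, patches, k):
    """Return the k patches closest to the detector (its top 'detections')."""
    return sorted(patches, key=lambda p: math.dist(p, detector))[:k]

def discover(split_a, split_b, seeds, k=3, iterations=4):
    # Initial clusters: each seed with its nearest patches in split_a.
    clusters = [top_firings(s, split_a, k) for s in seeds]
    for i in range(iterations):
        # Training step: fit one detector per current cluster.
        detectors = [train_detector(c) for c in clusters]
        # Clustering step: re-form clusters on the other split, so that
        # detectors are never scored on the patches they were fit to.
        held_out = split_b if i % 2 == 0 else split_a
        clusters = [top_firings(d, held_out, k) for d in detectors]
    return [train_detector(c) for c in clusters]

split_a = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
split_b = [(0.5, 0.5), (1, 1), (0, 0.5), (10.5, 10.5), (11, 11), (10, 10.5)]
detectors = discover(split_a, split_b, seeds=[(0, 0), (10, 10)])
```

Swapping the two splits on every iteration is the design choice that matters here: a cluster survives only if its detector also fires on patches it has not seen.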

What Makes Paris Look like Paris?
Carl Doersch, Saurabh Singh, Abhinav Gupta, Josef Sivic, Alexei A. Efros
ACM Transactions on Graphics (SIGGRAPH 2012)

Given a large repository of geotagged imagery, the goal is to automatically find visual elements, e.g. windows, balconies, and street signs, that are most distinctive for a certain geo-spatial area, for example the city of Paris.

Constrained Semi-Supervised Learning using Attributes and Comparative Attributes
Abhinav Shrivastava, Saurabh Singh, Abhinav Gupta
Accepted in ECCV 2012 (Oral)

Existing semi-supervised approaches are typically unreliable and face semantic drift because the learning task is under-constrained. This is primarily because they ignore the strong interactions that often exist between categories, such as the common attributes shared across categories as well as the attributes that make one different from another. The goal of this paper is to exploit these relationships to constrain the semi-supervised learning problem, leading to better learned classifiers.

Here is a list of some undergraduate and master's projects that I did.