Unsupervised Discovery of Mid-Level Discriminative Patches
http://graphics.cs.cmu.edu/projects/discriminativePatches/
Discriminative Patches, Unsupervised Discovery
An algorithm for automatically discovering discriminative patterns in large corpora of images. These patterns can be used as features to represent images. Visit the project website for more information.
Abstract
The goal of this paper is to discover a set of discriminative patches which can
serve as a fully unsupervised mid-level visual representation. The desired
patches need to satisfy two requirements: 1) to be representative, they need to
occur frequently enough in the visual world; 2) to be discriminative, they need
to be different enough from the rest of the visual world. The patches could
correspond to parts, objects, “visual phrases”, etc. but are not restricted to
be any one of them. We pose this as an unsupervised discriminative clustering
problem on a huge dataset of image patches. We use an iterative procedure which
alternates between clustering and training discriminative classifiers, while
applying careful cross-validation at each step to prevent overfitting. The paper
experimentally demonstrates the effectiveness of discriminative patches as an
unsupervised mid-level visual representation, suggesting that it could be used
in place of visual words for many tasks. Furthermore, discriminative patches
can also be used in a supervised regime, such as scene classification, where
they demonstrate state-of-the-art performance on the MIT Indoor-67 dataset.
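The alternation between clustering and classifier training described above can be sketched on toy data. This is only an illustrative sketch, not the released code: the two-way split of the data, the nearest-neighbour initialisation, the difference-of-means scorer standing in for a trained discriminative classifier, and every name and parameter (`discover`, `train_detector`, `top_firings`, `m`, `min_size`) are assumptions made for the example.

```python
import random

def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_detector(positives, negatives):
    """Stand-in linear classifier (an assumption, not an SVM):
    w = mean(pos) - mean(neg), boundary midway between the two means."""
    mp, mn = mean(positives), mean(negatives)
    w = [p - q for p, q in zip(mp, mn)]
    b = -sum(wi * (p + q) / 2 for wi, p, q in zip(w, mp, mn))
    return w, b

def score(detector, x):
    w, b = detector
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def top_firings(detector, patches, m):
    """The (at most) m highest-scoring patches whose score is positive."""
    ranked = sorted(patches, key=lambda x: score(detector, x), reverse=True)
    return [x for x in ranked[:m] if score(detector, x) > 0]

def discover(d1, d2, n1, n2, seeds, m=5, iterations=4, min_size=3):
    """Alternate between training on one split and re-clustering on the other."""
    # Initial clustering (assumption): the m nearest neighbours of each seed.
    clusters = [sorted(d1, key=lambda x: sq_dist(x, s))[:m] for s in seeds]
    detectors = []
    for _ in range(iterations):
        # Train one detector per cluster against "rest of the world" patches.
        detectors = [train_detector(c, n1) for c in clusters]
        # Re-form each cluster from detections on the held-out split.
        fired = [top_firings(det, d2, m) for det in detectors]
        # Drop clusters that fire too rarely to count as representative.
        kept = [(c, det) for c, det in zip(fired, detectors) if len(c) >= min_size]
        clusters = [c for c, _ in kept]
        detectors = [det for _, det in kept]
        # Swap the roles of the two splits for the next round.
        d1, d2 = d2, d1
        n1, n2 = n2, n1
    return clusters, detectors

def make_blob(rng, center, n, spread):
    return [[c + rng.gauss(0, spread) for c in center] for _ in range(n)]

def make_split(rng):
    # Two recurring "patterns" plus background clutter, and a separate
    # sample of background standing in for the rest of the visual world.
    discovery = (make_blob(rng, (5, 5), 10, 0.5)
                 + make_blob(rng, (-5, 5), 10, 0.5)
                 + make_blob(rng, (0, 0), 30, 1.0))
    world = make_blob(rng, (0, 0), 50, 1.0)
    return discovery, world

rng = random.Random(0)
d1, n1 = make_split(rng)
d2, n2 = make_split(rng)
clusters, detectors = discover(d1, d2, n1, n2, seeds=[d1[0], d1[10]])
print([len(c) for c in clusters])
```

Training each detector on one split and collecting its detections on the other is a simple form of cross-validation: a cluster survives only if its detector also fires on patches it was never trained on.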
Paper
Paper: ECCV 2012 Pdf (7.5 MB)
Poster: ECCV 2012 Pdf (28.2 MB)
Citation
Saurabh Singh, Abhinav Gupta and Alexei A. Efros. Unsupervised Discovery of Mid-Level Discriminative Patches. In European Conference on Computer Vision (2012).
arXiv entry: http://arxiv.org/abs/1205.3137
BibTeX
@inproceedings{Singh2012DiscPat,
  author = {Saurabh Singh and Abhinav Gupta and Alexei A. Efros},
  title = {Unsupervised Discovery of Mid-level Discriminative Patches},
  booktitle = {European Conference on Computer Vision},
  year = {2012},
  eprint = {1205.3137},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV},
  url = {http://arxiv.org/abs/1205.3137},
}
Code
Code is available on GitHub.
Data
Discovered discriminative patches for the Pascal 2007 subset used in the paper and for the MIT
Indoor-67 dataset are available from the project page (331 MB).
Related Papers
What Makes Paris Look like Paris?
Carl Doersch, Saurabh Singh, Abhinav Gupta, Josef Sivic, and Alexei A. Efros.
In ACM Transactions on Graphics, SIGGRAPH 2012
Funding
This research is supported by:
- ONR Grant N000141010766