(16-824) Visual Learning and Recognition: Schedule


The bolded paper is required reading. The other papers will be summarized in class.

Date Presenter Topic Papers Slides
Jan 11 Abhinav Gupta
David Fouhey
Introduction - slides (pdf)
Jan 13 Abhinav Gupta Theories of Vision J. Mundy. Object Recognition in the Geometric Era: a Retrospective Springer Berlin Heidelberg 2006. slides (pdf)
Jan 18 No Class Martin Luther King Day No Class -
Jan 21 Abhinav Gupta Theories of Vision
slides (pdf)
Jan 25 Abhinav Gupta Introduction to Data A. Halevy, P. Norvig, and F. Pereira. The Unreasonable Effectiveness of Data. IEEE Intelligent Systems, 24(2):8–12, 2009.

A. Torralba and A. Efros. Unbiased Look at Dataset Bias. CVPR 2011.
slides (pdf)
Jan 27 David Fouhey
Abhinav Gupta
Introduction to Deep Learning
and Image Classification
A. Krizhevsky, I. Sutskever, and G.E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012

K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR 2015
slides (pdf)
Feb 1 David Fouhey
Abhinav Gupta
Introduction to Deep Learning
and Image Classification Continued
A. Mahendran, A. Vedaldi. Understanding Deep Image Representations by Inverting Them. CVPR 2015

M.D. Zeiler, R. Fergus. Visualizing and Understanding Convolutional Networks. ECCV 2014
Feb 3 Abhinav Gupta
David Fouhey
Intro to Deep Learning Cont'd No Reading -
Visualization Link
Feb 8 Rohit Girdhar Introduction to Caffe No Reading Caffe Tutorial Slides (pdf)
Visualizing Slides (pdf)
Feb 10 David Fouhey
Zhihao Li
Yuxin Wu
Chun-Liang Li
Edges and Regions P. Arbelaez, M. Maire, C. Fowlkes, J. Malik. Contour Detection and Hierarchical Image Segmentation. TPAMI 2010

P. Dollar, C.L. Zitnick. Fast Edge Detection Using Structured Forests. TPAMI 2015.

J.R.R. Uijlings, K.E.A. van de Sande, T. Gevers, A.W.M. Smeulders. Selective Search for Object Recognition. IJCV 2013.

S. Xie, Z. Tu. Holistically-Nested Edge Detection. ICCV 2015
(David slides)
(Zhihao slides)
(Yuxin slides)
(Chun-Liang Li slides)
Feb 15 Chenchen Zhu
Satwik Kottur
Object Detection R. Girshick, J. Donahue, T. Darrell, J. Malik.
Region-based Convolutional Networks for Accurate Object Detection and Semantic Segmentation.
TPAMI 2015

R. Girshick. Fast R-CNN. ICCV 2015

S. Ren, K. He, R. Girshick, J. Sun. Faster R-CNN. NIPS 2015.
Feb 17 Ravi Teja Mullapudi
Mengxin Li
Yilin Yang
Semantic Segmentation J. Shotton, J. Winn, C. Rother, A. Criminisi.
TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context.
IJCV 2007.

J. Long, E. Shelhamer, T. Darrell. Fully Convolutional Networks for Semantic Segmentation. CVPR 2015

B. Hariharan, P. Arbelaez, R. Girshick, J. Malik. Hypercolumns for Object Segmentation and Fine-grained Localization. CVPR 2015.

L-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. Yuille. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. ICLR 2015.
Feb 22 Paloma Sodhi
Yu Fang Chang
Dan Spagnolo
3D Understanding – Primitives H. Barrow, J. Tenenbaum. Recovering Intrinsic Scene Characteristics From Images. In Computer Vision Systems, 1978.

D. Fouhey, A. Gupta, M. Hebert. Data-Driven 3D Primitives for Single Image Understanding. ICCV 2013.

D. Eigen, R. Fergus. Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture. ICCV 2015.

D. Zoran, P. Isola, D. Krishnan, W.T. Freeman. Learning Ordinal Relationships for Mid-Level Vision. ICCV 2015.
Feb 24 Luong Nguyen
Shichao Yang
Yi Hua
Yujia Huang
3D Understanding – Reasoning A. Gupta, A.A. Efros, M. Hebert. Blocks World Revisited. ECCV 2010

L. Del Pero, J. Bowdish, D. Fried, B. Kermgard, E. Hartley, K. Barnard. Bayesian geometric modeling of indoor scenes. CVPR 2012.

D.F. Fouhey, A. Gupta, M. Hebert. Unfolding an Indoor Origami World. ECCV 2014.

X. Wang, D.F. Fouhey, A. Gupta. Designing Deep Networks for Surface Normal Estimation. CVPR 2015.

Feb 29 Rui Zhu
Zhe Cao
Vivek Krishnan
Object Detection with 3D Models S. Satkin, J. Lin, M. Hebert. Data-Driven Scene Understanding from 3D Models. BMVC 2012.

H. Su, C.R. Qi, Y. Li, L.J. Guibas.
Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views. ICCV 2015.

Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang and J. Xiao.
3D ShapeNets: A Deep Representation for Volumetric Shape Modeling. CVPR 2015.

A. Bansal, B. Russell, A. Gupta. Marr Revisited: 2D-3D Alignment via Surface Normal Prediction. Arxiv 2016
Mar 7 No Class Spring Break No Class Spring Break
Mar 9 No Class Spring Break No Class Spring Break
Mar 14 No Class ECCV Deadline No Class ECCV Deadline
Mar 16 Aishanou Rait
Chen Ma
Chenyang Li
Utkarsh Sinha
Divyaa Ravichandran
Classic Problems A. Dosovitskiy, P. Fischer, E. Ilg, P. Hausser, C. Hazirbas, V. Golkov, P. Smagt, D. Cremers, T. Brox.
FlowNet: Learning Optical Flow with Convolutional Networks.
ICCV 2015.

M. Cimpoi, S. Maji, A. Vedaldi. Deep filter banks for texture recognition and segmentation. CVPR 2015.

J. Long, N. Zhang, T. Darrell. Do Convnets Learn Correspondence? NIPS 2014.

J. Zbontar, Y. LeCun. Computing the Stereo Matching Cost with a Convolutional Neural Network. CVPR 2015.

J. Flynn, I. Neulander, J. Philbin, N. Snavely. DeepStereo: Learning to Predict New Views from the World's Imagery. Arxiv 2015.
Mar 21 Jennifer Lake
Alex Poms
Ligong Han
Context-based Reasoning Z. Tu, X. Bai. Auto-context and Its Application to High-level Vision Tasks and 3D Brain Image Segmentation. TPAMI 2010.

S.K. Divvala, D. Hoiem, J.H. Hays, A.A. Efros, M. Hebert. An Empirical Study of Context in Object Detection. CVPR 2009.

D. Hoiem, A.A. Efros, M. Hebert. Putting Objects in Perspective. CVPR 2006.
Mar 23 Harsha Vardhan Pokkalla
Noranart Vesdapunt
Jonathan Shen
Pose Estimation Y. Yang, D. Ramanan. Articulated Human Detection with Flexible Mixtures-of-Parts. TPAMI 2013.

J. Tompson, R. Goroshin, A. Jain, Y. LeCun, C. Bregler. Efficient Object Localization Using Convolutional Networks. Arxiv 2015.

J. Carreira, P. Agrawal, K. Fragkiadaki, J. Malik. Human Pose Estimation with Iterative Error Feedback. Arxiv 2015.

M. Oberweger, P. Wohlhart, V. Lepetit. Training a Feedback Loop for Hand Pose Estimation. ICCV 2015.

Mar 28 Guanhang Wu
Yiming Wu
Jakob Bauer
Action Recognition K. Simonyan, A. Zisserman. Two-Stream Convolutional Networks for Action Recognition in Videos. NIPS 2014.

D. Tran, L. Bourdev, R. Fergus, L. Torresani, M. Paluri. Learning Spatiotemporal Features with 3D Convolutional Networks. ICCV 2015.

X. Wang, A. Farhadi, A. Gupta. Actions ~ Transformations. Arxiv 2015.
Mar 30 Shreyas Joshi
Lekha Walajapet Mohan
James Laney
Jai Prakash
Humans and Objects B. Yao, F. Li. Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities. CVPR 2010.

A. Gupta, S. Satkin, A.A. Efros, M. Hebert. From 3D Scene Geometry to Human Workspace. CVPR 2011.

Y. Jiang, H. Koppula, A. Saxena. Hallucinated Humans as the Hidden Context for Labeling 3D Scenes. CVPR 2013.

D.F. Fouhey, V. Delaitre, A. Gupta, A.A. Efros, J. Sivic, I. Laptev. People Watching: Human Actions as a Cue for Single View Geometry. ECCV 2012
Apr 4 Shumian Xin
Xiaofang Wang
Ching-Hang Chen
Discriminative Unsupervised Learning C. Doersch, A. Gupta, A.A. Efros. Unsupervised Visual Representation Learning by Context Prediction. ICCV 2015.

X. Wang, A. Gupta. Unsupervised Learning of Visual Representations using Videos. ICCV 2015.

P. Agrawal, J. Carreira, J. Malik. Learning to See by Moving. ICCV 2015.

R. Goroshin, J. Bruna, J. Tompson, D. Eigen, Y. LeCun. Unsupervised Learning of Spatiotemporally Coherent Metrics. ICCV 2015.
Apr 6 Rahul Nallamothu
Sai Ganesh Bandiatmakuri
Fish Tung
Chen-Hsuan Lin
Generative Unsupervised Learning D. Kingma, M. Welling. Auto-Encoding Variational Bayes. ICLR 2014.

E. Denton, S. Chintala, A. Szlam, R. Fergus. Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks. NIPS 2015.

A. Radford, L. Metz, S. Chintala. Unsupervised Representation Learning With Deep Convolutional Generative Adversarial Networks. Arxiv 2015.

A. Dosovitskiy, J. Springenberg, M. Tatarchenko, T. Brox. Learning to Generate Chairs, Tables and Cars with Convolutional Networks. CVPR 2015.
Apr 11 Senthil Purushwalkam
Tanmay Batra
Michael Jaison G
Weakly and Semi-supervised Learning D. Pathak, P. Krahenbuhl, T. Darrell. Constrained Convolutional Neural Networks for Weakly Supervised Segmentation. ICCV 2015.

A. Shrivastava, S. Singh, A. Gupta. Constrained Semi-Supervised Learning using Attributes and Comparative Attributes. ECCV 2012.

J. Hoffman, D. Pathak, T. Darrell, K. Saenko. Detector Discovery in the Wild: Joint Multiple Instance and Representation Learning. NIPS 2014.
Apr 13 Debidatta Dwibedi
Yi Shi
Cheng-An Hou
Learning from the Web X. Chen, A. Shrivastava, A. Gupta. NEIL: Extracting Visual Knowledge from Web Data. ICCV 2013

S. Divvala, A. Farhadi, C. Guestrin. Learning Everything about Anything: Webly-Supervised Visual Concept Learning. CVPR 2014.

X. Chen, A. Gupta. Webly Supervised Learning of Convolutional Networks. ICCV 2015.

A. Joulin, L. van der Maaten, A. Jabri, N. Vasilache. Learning Visual Features from Large Weakly Supervised Data. Arxiv 2015.
Apr 18 TBA Text and Images G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. Berg and T. Berg. Baby Talk: Understanding and Generating Image Descriptions. CVPR 2011.

O. Vinyals, A. Toshev, S. Bengio, D. Erhan. Show and Tell: A Neural Image Caption Generator. CVPR 2015.

J. Devlin, S. Gupta, R. Girshick, M. Mitchell, C.L. Zitnick. Exploring Nearest Neighbor Approaches for Image Captioning. Arxiv 2015.
Apr 20 TBA Visual Question/Answer S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C.L. Zitnick, D. Parikh. VQA: Visual Question Answering. ICCV 2015.

B. Zhou, Y. Tian, S. Sukhbaatar, A. Szlam, R. Fergus. Simple Baseline for Visual Question Answering. Arxiv 2015

L. Yu, E. Park, A.C. Berg, T.L. Berg. Visual Madlibs. ICCV 2015
Apr 25 - Final Project Presentations -
Apr 27 - Final Project Presentations -