(16-824) Visual Learning and Recognition: Schedule
Schedule
The bolded paper is required reading. The other papers will be summarized in class.
Date | Presenter | Topic | Papers | Slides |
Jan 11 | Abhinav Gupta David Fouhey | Introduction | - | slides (pdf) |
Jan 13 | Abhinav Gupta | Theories of Vision | J. Mundy. Object Recognition in the Geometric Era: a Retrospective. Springer Berlin Heidelberg, 2006. | slides (pdf) |
Jan 18 | No Class | Martin Luther King Day | No Class | - |
Jan 21 | Abhinav Gupta | Theories of Vision Continued | | slides (pdf) |
Jan 25 | Abhinav Gupta | Introduction to Data | A. Halevy, P. Norvig, and F. Pereira. The Unreasonable Effectiveness of Data. IEEE Intelligent Systems, 24(2):8–12, 2009.
A. Torralba and A. Efros. Unbiased Look at Dataset Bias. CVPR 2011. | slides (pdf) |
Jan 27 | David Fouhey Abhinav Gupta | Introduction to Deep Learning and Image Classification | A. Krizhevsky, I. Sutskever, and G.E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012
K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR 2015 | slides (pdf) |
Feb 1 | David Fouhey Abhinav Gupta | Introduction to Deep Learning and Image Classification Continued | A. Mahendran, A. Vedaldi. Understanding Deep Image Representations by Inverting Them. CVPR 2015
M.D. Zeiler, R. Fergus. Visualizing and Understanding Convolutional Networks. ECCV 2014 | - |
Feb 3 | Abhinav Gupta David Fouhey | Intro to Deep Learning Cont'd | No Reading | - Visualization Link |
Feb 8 | Rohit Girdhar | Introduction to Caffe | No Reading | Caffe Tutorial Slides (pdf) Visualizing Slides (pdf) |
Feb 10 | David Fouhey Zhihao Li Yuxin Wu Chun-Liang Li | Edges and Regions | P. Arbelaez, M. Maire, C. Fowlkes, J. Malik. Contour Detection and Hierarchical Image Segmentation. TPAMI 2010
P. Dollar, C.L. Zitnick. Fast Edge Detection Using Structured Forests. TPAMI 2015.
J.R.R. Uijlings, K.E.A. van de Sande, T. Gevers, A.W.M. Smeulders. Selective Search for Object Recognition. IJCV 2013.
S. Xie, Z. Tu. Holistically-Nested Edge Detection. ICCV 2015 | (David slides) (Zhihao slides) (Yuxin slides) (Chun-Liang Li slides) |
Feb 15 | Chenchen Zhu Satwik Kottur | Object Detection |
R. Girshick, J. Donahue, T. Darrell, J. Malik. Region-based Convolutional Networks for Accurate Object Detection and Segmentation. TPAMI 2015.
R. Girshick. Fast R-CNN. ICCV 2015
S. Ren, K. He, R. Girshick, J. Sun. Faster R-CNN. NIPS 2015. | - |
Feb 17 | Ravi Teja Mullapudi Mengxin Li Yilin Yang | Semantic Segmentation | J. Shotton, J. Winn, C. Rother, A. Criminisi. TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context. IJCV 2007.
J. Long, E. Shelhamer, T. Darrell. Fully Convolutional Networks for Semantic Segmentation. CVPR 2015.
B. Hariharan, P. Arbelaez, R. Girshick, J. Malik. Hypercolumns for Object Segmentation and Fine-grained Localization. CVPR 2015.
L-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. Yuille. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. ICLR 2015. | - |
Feb 22 | Paloma Sodhi Yu Fang Chang Dan Spagnolo | 3D Understanding – Primitives | H. Barrow, J. Tenenbaum. Recovering Intrinsic Scene Characteristics From Images. In Computer Vision Systems, 1978.
D. Fouhey, A. Gupta, M. Hebert. Data-Driven 3D Primitives for Single Image Understanding. ICCV 2013.
D. Eigen, R. Fergus. Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture. ICCV 2015.
D. Zoran, P. Isola, D. Krishnan, W.T. Freeman. Learning Ordinal Relationships for Mid-Level Vision. ICCV 2015. | - |
Feb 24 | Luong Nguyen Shichao Yang Yi Hua Yujia Huang | 3D Understanding – Reasoning | A. Gupta, A.A. Efros, M. Hebert. Blocks World Revisited. ECCV 2010
L. Del Pero, J. Bowdish, D. Fried, B. Kermgard, E. Hartley, K. Barnard. Bayesian Geometric Modeling of Indoor Scenes. CVPR 2012.
D.F. Fouhey, A. Gupta, M. Hebert. Unfolding an Indoor Origami World. ECCV 2014.
X. Wang, D.F. Fouhey, A. Gupta. Designing Deep Networks for Surface Normal Estimation. CVPR 2015.
| - |
Feb 29 | Rui Zhu Zhe Cao Vivek Krishnan | Object Detection with 3D Models | S. Satkin, J. Lin, M. Hebert. Data-Driven Scene Understanding from 3D Models. BMVC 2012.
H. Su, C.R. Qi, Y. Li, L.J. Guibas. Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views. ICCV 2015.
Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang and J. Xiao. 3D ShapeNets: A Deep Representation for Volumetric Shape Modeling. CVPR 2015.
A. Bansal, B. Russell, A. Gupta. Marr Revisited: 2D-3D Alignment via Surface Normal Prediction. Arxiv 2016 | - |
Mar 7 | No Class | Spring Break | No Class | Spring Break |
Mar 9 | No Class | Spring Break | No Class | Spring Break |
Mar 14 | No Class | ECCV Deadline | No Class | ECCV Deadline |
Mar 16 | Aishanou Rait Chen Ma Chenyang Li Utkarsh Sinha Divyaa Ravichandran | Classic Problems | A. Dosovitskiy, P. Fischer, E. Ilg, P. Hausser, C. Hazirbas, V. Golkov, P. van der Smagt, D. Cremers, T. Brox. FlowNet: Learning Optical Flow with Convolutional Networks. ICCV 2015.
M. Cimpoi, S. Maji, A. Vedaldi. Deep Filter Banks for Texture Recognition and Segmentation. CVPR 2015.
J. Long, N. Zhang, T. Darrell. Do Convnets Learn Correspondence? NIPS 2014.
J. Zbontar, Y. LeCun. Computing the Stereo Matching Cost with a Convolutional Neural Network. CVPR 2015.
J. Flynn, I. Neulander, J. Philbin, N. Snavely. DeepStereo: Learning to Predict New Views from the World's Imagery. Arxiv 2015.
| - |
Mar 21 | Jennifer Lake Alex Poms Ligong Han | Context-based Reasoning | Z. Tu, X. Bai. Auto-context and Its Application to High-level Vision Tasks and 3D Brain Image Segmentation. TPAMI 2010.
S.K. Divvala, D. Hoiem, J.H. Hays, A.A. Efros, M. Hebert. An Empirical Study of Context in Object Detection. CVPR 2009.
D. Hoiem, A.A. Efros, M. Hebert. Putting Objects in Perspective. CVPR 2006. | - |
Mar 23 | Harsha Vardhan Pokkalla Noranart Vesdapunt Jonathan Shen | Pose Estimation | Y. Yang, D. Ramanan. Articulated Human Detection with Flexible Mixtures-of-Parts. TPAMI 2013.
J. Tompson, R. Goroshin, A. Jain, Y. LeCun, C. Bregler. Efficient Object Localization Using Convolutional Networks. Arxiv 2015.
J. Carreira, P. Agrawal, K. Fragkiadaki, J. Malik. Human Pose Estimation with Iterative Error Feedback. Arxiv 2015.
M. Oberweger, P. Wohlhart, V. Lepetit. Training a Feedback Loop for Hand Pose Estimation. ICCV 2015.
| - |
Mar 28 | Guanhang Wu Yiming Wu Jakob Bauer | Action Recognition | K. Simonyan, A. Zisserman. Two-Stream Convolutional Networks for Action Recognition in Videos. NIPS 2014.
D. Tran, L. Bourdev, R. Fergus, L. Torresani, M. Paluri. Learning Spatiotemporal Features with 3D Convolutional Networks. ICCV 2015.
X. Wang, A. Farhadi, A. Gupta. Actions ~ Transformations. Arxiv 2015. | - |
Mar 30 | Shreyas Joshi Lekha Walajapet Mohan James Laney Jai Prakash | Humans and Objects |
B. Yao, F. Li. Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities. CVPR 2010.
A. Gupta, S. Satkin, A.A. Efros, M. Hebert. From 3D Scene Geometry to Human Workspace. CVPR 2011.
Y. Jiang, H. Koppula, A. Saxena. Hallucinated Humans as the Hidden Context for Labeling 3D Scenes. CVPR 2013.
D.F. Fouhey, V. Delaitre, A. Gupta, A.A. Efros, J. Sivic, I. Laptev. People Watching: Human Actions as a Cue for Single View Geometry. ECCV 2012. | - |
Apr 4 | Shumian Xin Xiaofang Wang Ching-Hang Chen | Discriminative Unsupervised Learning | C. Doersch, A. Gupta, A.A. Efros. Unsupervised Visual Representation Learning by Context Prediction. ICCV 2015.
X. Wang, A. Gupta. Unsupervised Learning of Visual Representations using Videos. ICCV 2015.
P. Agrawal, J. Carreira, J. Malik. Learning to See by Moving. ICCV 2015.
R. Goroshin, J. Bruna, J. Tompson, D. Eigen, Y. LeCun. Unsupervised Learning of Spatiotemporally Coherent Metrics. ICCV 2015.
| - |
Apr 6 | Rahul Nallamothu Sai Ganesh Bandiatmakuri Fish Tung Chen-Hsuan Lin | Generative Unsupervised Learning | D. Kingma, M. Welling. Auto-Encoding Variational Bayes. ICLR 2014.
E. Denton, S. Chintala, A. Szlam, R. Fergus. Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks. NIPS 2015.
A. Radford, L. Metz, S. Chintala. Unsupervised Representation Learning With Deep Convolutional Generative Adversarial Networks. Arxiv 2015.
A. Dosovitskiy, J. Springenberg, M. Tatarchenko, T. Brox. Learning to Generate Chairs, Tables and Cars with Convolutional Networks. CVPR 2015. | - |
Apr 11 | Senthil Purushwalkam Tanmay Batra Michael Jaison G | Weakly and Semi-supervised Learning | D. Pathak, P. Krahenbuhl, T. Darrell. Constrained Convolutional Neural Networks for Weakly Supervised Segmentation. ICCV 2015.
A. Shrivastava, S. Singh, A. Gupta. Constrained Semi-Supervised Learning using Attributes and Comparative Attributes. ECCV 2012.
J. Hoffman, D. Pathak, T. Darrell, K. Saenko. Detector Discovery in the Wild: Joint Multiple Instance and Representation Learning. NIPS 2014. | - |
Apr 13 | Debidatta Dwibedi Yi Shi Cheng-An Hou | Learning from the Web | X. Chen, A. Shrivastava, A. Gupta. NEIL: Extracting Visual Knowledge from Web Data. ICCV 2013
S. Divvala, A. Farhadi, C. Guestrin. Learning Everything about Anything: Webly-Supervised Visual Concept Learning. CVPR 2014.
X. Chen, A. Gupta. Webly Supervised Learning of Convolutional Networks. ICCV 2015.
A. Joulin, L. van der Maaten, A. Jabri, N. Vasilache. Learning Visual Features from Large Weakly Supervised Data. Arxiv 2015. | - |
Apr 18 | TBA | Text and Images | G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. Berg and T. Berg. Baby Talk: Understanding and Generating Image Descriptions. CVPR 2011.
O. Vinyals, A. Toshev, S. Bengio, D. Erhan. Show and Tell: A Neural Image Caption Generator. CVPR 2015.
J. Devlin, S. Gupta, R. Girshick, M. Mitchell, C.L. Zitnick. Exploring Nearest Neighbor Approaches for Image Captioning. Arxiv 2015. | - |
Apr 20 | TBA | Visual Question/Answer | S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C.L. Zitnick, D. Parikh. VQA: Visual Question Answering. ICCV 2015.
B. Zhou, Y. Tian, S. Sukhbaatar, A. Szlam, R. Fergus. Simple Baseline for Visual Question Answering. Arxiv 2015.
L. Yu, E. Park, A.C. Berg, T.L. Berg. Visual Madlibs. ICCV 2015. | - |
Apr 25 | - | Final Project Presentations | | - |
Apr 27 | - | Final Project Presentations | | - |