(16-824) Visual Learning and Recognition: Schedule

Schedule

The bolded paper is required reading. The other papers will be summarized in class.

Date	Presenter	Topic	Papers	Slides
Jan 11	Abhinav Gupta David Fouhey	Introduction	-	slides (pdf)
Jan 13	Abhinav Gupta	Theories of Vision	J. Mundy. Object Recognition in the Geometric Era: A Retrospective. Springer Berlin Heidelberg, 2006.	slides (pdf)
Jan 18	No Class	Martin Luther King Day	No Class	-
Jan 21	Abhinav Gupta	Theories of Vision Continued		slides (pdf)
Jan 25	Abhinav Gupta	Introduction to Data	A. Halevy, P. Norvig, and F. Pereira. The Unreasonable Effectiveness of Data. IEEE Intelligent Systems, 24:8–12, 2009. A. Torralba and A. Efros. Unbiased Look at Dataset Bias. CVPR 2011.	slides (pdf)
Jan 27	David Fouhey Abhinav Gupta	Introduction to Deep Learning and Image Classification	A. Krizhevsky, I. Sutskever, and G.E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012. K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR 2015.	slides (pdf)
Feb 1	David Fouhey Abhinav Gupta	Introduction to Deep Learning and Image Classification Continued	A. Mahendran, A. Vedaldi. Understanding Deep Image Representations by Inverting Them. CVPR 2015. M.D. Zeiler, R. Fergus. Visualizing and Understanding Convolutional Networks. ECCV 2014.	-
Feb 3	Abhinav Gupta David Fouhey	Intro to Deep Learning Cont'd	No Reading	- Visualization Link
Feb 8	Rohit Girdhar	Introduction to Caffe	No Reading	Caffe Tutorial Slides (pdf) Visualizing Slides (pdf)
Feb 10	David Fouhey Zhihao Li Yuxin Wu Chun-Liang Li	Edges and Regions	P. Arbelaez, M. Maire, C. Fowlkes, J. Malik. Contour Detection and Hierarchical Image Segmentation. TPAMI 2010. P. Dollar, C.L. Zitnick. Fast Edge Detection Using Structured Forests. TPAMI 2015. J.R.R. Uijlings, K.E.A. van de Sande, T. Gevers, A.W.M. Smeulders. Selective Search for Object Recognition. IJCV 2013. S. Xie, Z. Tu. Holistically-Nested Edge Detection. ICCV 2015.	(David slides) (Zhihao slides) (Yuxin slides) (Chun-Liang Li slides)
Feb 15	Chenchen Zhu Satwik Kottur	Object Detection	R. Girshick, J. Donahue, T. Darrell, J. Malik. Region-based Convolutional Networks for Accurate Object Detection and Semantic Segmentation. TPAMI 2015. R. Girshick. Fast R-CNN. ICCV 2015. S. Ren, K. He, R. Girshick, J. Sun. Faster R-CNN. NIPS 2015.	-
Feb 17	Ravi Teja Mullapudi Mengxin Li Yilin Yang	Semantic Segmentation	J. Shotton, J. Winn, C. Rother, A. Criminisi. TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context. IJCV 2007. J. Long, E. Shelhamer, T. Darrell. Fully Convolutional Networks for Semantic Segmentation. CVPR 2015. B. Hariharan, P. Arbelaez, R. Girshick, J. Malik. Hypercolumns for Object Segmentation and Fine-grained Localization. CVPR 2015. L-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. Yuille. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. ICLR 2015.	-
Feb 22	Paloma Sodhi Yu Fang Chang Dan Spagnolo	3D Understanding – Primitives	H. Barrow, J. Tenenbaum. Recovering Intrinsic Scene Characteristics From Images. In Computer Vision Systems, 1978. D. Fouhey, A. Gupta, M. Hebert. Data-Driven 3D Primitives for Single Image Understanding. ICCV 2013. D. Eigen, R. Fergus. Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture. ICCV 2015. D. Zoran, P. Isola, D. Krishnan, W.T. Freeman. Learning Ordinal Relationships for Mid-Level Vision. ICCV 2015.	-
Feb 24	Luong Nguyen Shichao Yang Yi Hua Yujia Huang	3D Understanding – Reasoning	A. Gupta, A.A. Efros, M. Hebert. Blocks World Revisited. ECCV 2010. L. Del Pero, J. Bowdish, D. Fried, B. Kermgard, E. Hartley, K. Barnard. Bayesian Geometric Modeling of Indoor Scenes. CVPR 2012. D.F. Fouhey, A. Gupta, M. Hebert. Unfolding an Indoor Origami World. ECCV 2014. X. Wang, D.F. Fouhey, A. Gupta. Designing Deep Networks for Surface Normal Estimation. CVPR 2015.	-
Feb 29	Rui Zhu Zhe Cao Vivek Krishnan	Object Detection with 3D Models	S. Satkin, J. Lin, M. Hebert. Data-Driven Scene Understanding from 3D Models. BMVC 2012. H. Su, C.R. Qi, Y. Li, L.J. Guibas. Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views. ICCV 2015. Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao. 3D ShapeNets: A Deep Representation for Volumetric Shape Modeling. CVPR 2015. A. Bansal, B. Russell, A. Gupta. Marr Revisited: 2D-3D Alignment via Surface Normal Prediction. Arxiv 2016.	-
Mar 7	No Class	Spring Break	No Class	Spring Break
Mar 9	No Class	Spring Break	No Class	Spring Break
Mar 14	No Class	ECCV Deadline	No Class	ECCV Deadline
Mar 16	Aishanou Rait Chen Ma Chenyang Li Utkarsh Sinha Divyaa Ravichandran	Classic Problems	A. Dosovitskiy, P. Fischer, E. Ilg, P. Hausser, C. Hazirbas, V. Golkov, P. van der Smagt, D. Cremers, T. Brox. FlowNet: Learning Optical Flow with Convolutional Networks. ICCV 2015. M. Cimpoi, S. Maji, A. Vedaldi. Deep Filter Banks for Texture Recognition and Segmentation. CVPR 2015. J. Long, N. Zhang, T. Darrell. Do Convnets Learn Correspondence? NIPS 2014. J. Zbontar, Y. LeCun. Computing the Stereo Matching Cost with a Convolutional Neural Network. CVPR 2015. J. Flynn, I. Neulander, J. Philbin, N. Snavely. DeepStereo: Learning to Predict New Views from the World's Imagery. Arxiv 2015.	-
Mar 21	Jennifer Lake Alex Poms Ligong Han	Context-based Reasoning	Z. Tu, X. Bai. Auto-context and Its Application to High-level Vision Tasks and 3D Brain Image Segmentation. TPAMI 2010. S.K. Divvala, D. Hoiem, J.H. Hays, A.A. Efros, M. Hebert. An Empirical Study of Context in Object Detection. CVPR 2009. D. Hoiem, A.A. Efros, M. Hebert. Putting Objects in Perspective. CVPR 2006.	-
Mar 23	Harsha Vardhan Pokkalla Noranart Vesdapunt Jonathan Shen	Pose Estimation	Y. Yang, D. Ramanan. Articulated Human Detection with Flexible Mixtures-of-Parts. TPAMI 2013. J. Tompson, R. Goroshin, A. Jain, Y. LeCun, C. Bregler. Efficient Object Localization Using Convolutional Networks. Arxiv 2015. J. Carreira, P. Agrawal, K. Fragkiadaki, J. Malik. Human Pose Estimation with Iterative Error Feedback. Arxiv 2015. M. Oberweger, P. Wohlhart, V. Lepetit. Training a Feedback Loop for Hand Pose Estimation. ICCV 2015.	-
Mar 28	Guanhang Wu Yiming Wu Jakob Bauer	Action Recognition	K. Simonyan, A. Zisserman. Two-Stream Convolutional Networks for Action Recognition in Videos. NIPS 2014. D. Tran, L. Bourdev, R. Fergus, L. Torresani, M. Paluri. Learning Spatiotemporal Features with 3D Convolutional Networks. ICCV 2015. X. Wang, A. Farhadi, A. Gupta. Actions Transformations. Arxiv 2015.	-
Mar 30	Shreyas Joshi Lekha Walajapet Mohan James Laney Jai Prakash	Humans and Objects	B. Yao, F. Li. Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities. CVPR 2010. A. Gupta, S. Satkin, A.A. Efros, M. Hebert. From 3D Scene Geometry to Human Workspace. CVPR 2011. Y. Jiang, H. Koppula, A. Saxena. Hallucinated Humans as the Hidden Context for Labeling 3D Scenes. CVPR 2013. D.F. Fouhey, V. Delaitre, A. Gupta, A.A. Efros, J. Sivic, I. Laptev. People Watching: Human Actions as a Cue for Single View Geometry. ECCV 2012.	-
Apr 4	Shumian Xin Xiaofang Wang Ching-Hang Chen	Discriminative Unsupervised Learning	C. Doersch, A. Gupta, A.A. Efros. Unsupervised Visual Representation Learning by Context Prediction. ICCV 2015. X. Wang, A. Gupta. Unsupervised Learning of Visual Representations using Videos. ICCV 2015. P. Agrawal, J. Carreira, J. Malik. Learning to See by Moving. ICCV 2015. R. Goroshin, J. Bruna, J. Tompson, D. Eigen, Y. LeCun. Unsupervised Learning of Spatiotemporally Coherent Metrics. ICCV 2015.	-
Apr 6	Rahul Nallamothu Sai Ganesh Bandiatmakuri Fish Tung Chen-Hsuan Lin	Generative Unsupervised Learning	D. Kingma, M. Welling. Auto-Encoding Variational Bayes. ICLR 2014. E. Denton, S. Chintala, A. Szlam, R. Fergus. Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks. NIPS 2015. A. Radford, L. Metz, S. Chintala. Unsupervised Representation Learning With Deep Convolutional Generative Adversarial Networks. Arxiv 2015. A. Dosovitskiy, J. Springenberg, M. Tatarchenko, T. Brox. Learning to Generate Chairs, Tables and Cars with Convolutional Networks. CVPR 2015.	-
Apr 11	Senthil Purushwalkam Tanmay Batra Michael Jaison G	Weakly and Semi-supervised Learning	D. Pathak, P. Krahenbuhl, T. Darrell. Constrained Convolutional Neural Networks for Weakly Supervised Segmentation. ICCV 2015. A. Shrivastava, S. Singh, A. Gupta. Constrained Semi-Supervised Learning using Attributes and Comparative Attributes. ECCV 2012. J. Hoffman, D. Pathak, T. Darrell, K. Saenko. Detector Discovery in the Wild: Joint Multiple Instance and Representation Learning. NIPS 2014.	-
Apr 13	Debidatta Dwibedi Yi Shi Cheng-An Hou	Learning from the Web	X. Chen, A. Shrivastava, A. Gupta. NEIL: Extracting Visual Knowledge from Web Data. ICCV 2013. S. Divvala, A. Farhadi, C. Guestrin. Learning Everything about Anything: Webly-Supervised Visual Concept Learning. CVPR 2014. X. Chen, A. Gupta. Webly Supervised Learning of Convolutional Networks. ICCV 2015. A. Joulin, L. van der Maaten, A. Jabri, N. Vasilache. Learning Visual Features from Large Weakly Supervised Data. Arxiv 2015.	-
Apr 18	TBA	Text and Images	G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. Berg and T. Berg. Baby Talk: Understanding and Generating Image Descriptions. CVPR 2011. O. Vinyals, A. Toshev, S. Bengio, D. Erhan. Show and Tell: A Neural Image Caption Generator. CVPR 2015. J. Devlin, S. Gupta, R. Girshick, M. Mitchell, C.L. Zitnick. Exploring Nearest Neighbor Approaches for Image Captioning. Arxiv 2015.	-
Apr 20	TBA	Visual Question/Answer	S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C.L. Zitnick, D. Parikh. VQA: Visual Question Answering. ICCV 2015. B. Zhou, Y. Tian, S. Sukhbaatar, A. Szlam, R. Fergus. Simple Baseline for Visual Question Answering. Arxiv 2015. L. Yu, E. Park, A.C. Berg, T.L. Berg. Visual Madlibs. ICCV 2015.	-
Apr 25	-	Final Project Presentations		-
Apr 27	-	Final Project Presentations		-