This is a graduate seminar course in Computer Vision with an emphasis on representation and reasoning for large amounts of data (images, videos, and associated tags, text, GPS locations, etc.) toward the ultimate goal of Image Understanding. We will read an eclectic mix of classic and recent papers on topics including Theories of Perception, Mid-level Vision (Grouping, Segmentation, Poselets), Object and Scene Recognition, 3D Scene Understanding, Action Recognition, Contextual Reasoning, Image Parsing, and Joint Language and Vision Models. For each of these topics, we will cover a wide range of supervised, semi-supervised, and unsupervised approaches.
While there are no formal prerequisites, this course assumes familiarity with computer vision (16-720 or similar) and machine learning (10-601 or similar). If you have not taken courses covering this material, please consult the instructor.
- 16-824 Fall 2013 (Abhinav Gupta)
- 16-824 Spring 2012 (Alexei Efros)
- Visual Object and Activity Recognition (Trevor Darrell and Alexei Efros, Fall 2014)
- Visual Recognition (Kristen Grauman, Texas-Austin, Fall 2012)
- Advances in Computer Vision (Antonio Torralba and Bill Freeman, MIT, Fall 2014)
- Computer Vision (Ali Farhadi, University of Washington, Winter 2014)
- Visual Scene Understanding (Derek Hoiem, UIUC, Spring 2012)
- Computer Vision: Foundations and Applications (Fei-Fei Li, Stanford, Fall 2014)
- Statistical Models for Visual Recognition (Deva Ramanan, UCI, Winter 2009)