Programming Meta Project #5
15-463: Computational Photography

Due Date: April 30th, 2010 (Last Day of Classes), 11:59 pm
What to submit: Code + webpage write up
Project 5 (a.k.a. Meta Project 5) is an alternative to doing a Final Project. Since it is META, it requires you to implement 2 (TWO) of the options from the list below:



SEAM CARVING
One of the (not so) recent (anymore) papers at SIGGRAPH that made news was Seam Carving for Content-Aware Image Resizing by Shai Avidan and Ariel Shamir. In this assignment, you'll be implementing the basic algorithm presented therein. To wit, you'll be designing a program which can shrink an image (either horizontally or vertically) to a given dimension. For a brief overview of the algorithm and some inspiring results, check out the video. Project 2 from CS 15463 fall 2008 has more details.
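
For concreteness, here is a minimal Matlab sketch of removing one vertical seam with the gradient-magnitude energy and dynamic programming described in the paper (the input file name is a placeholder; to shrink by k columns, repeat k times):

im = im2double(imread('input.jpg'));   % placeholder file name
gray = rgb2gray(im);
[gx, gy] = gradient(gray);
E = abs(gx) + abs(gy);                 % simple gradient-magnitude energy

% Dynamic programming: cumulative minimum energy of any seam ending here.
[h, w] = size(E);
M = E;
for i = 2:h
    left  = [Inf, M(i-1, 1:w-1)];
    up    =  M(i-1, :);
    right = [M(i-1, 2:w), Inf];
    M(i, :) = E(i, :) + min([left; up; right]);
end

% Backtrack the minimum-energy seam from bottom to top.
seam = zeros(h, 1);
[~, seam(h)] = min(M(h, :));
for i = h-1:-1:1
    j = seam(i+1);
    lo = max(j-1, 1); hi = min(j+1, w);
    [~, k] = min(M(i, lo:hi));
    seam(i) = lo + k - 1;
end

% Remove the seam from each color channel.
out = zeros(h, w-1, 3);
for i = 1:h
    cols = [1:seam(i)-1, seam(i)+1:w];
    out(i, :, :) = im(i, cols, :);
end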


SINGLE VIEW MODELING
The goal of this project is to create a simple, planar 3D scene from a single photograph.  The project will follow the description in Tour into the Picture by Horry et al. in modeling the scene as a 3D axis-parallel box.  First, we will let the user specify simple constraints on that box (the back wall plus the vanishing point).  Then, it's just a matter of extracting the coordinates of the box in 3D and texture-mapping the faces of the box.  The paper has a rather poor description of the process, so consult the lecture notes or section 2.2 of this thesis for details.  Some sample starting code can be found here.  It includes a (rather messy) interface for specifying the user constraints.  It also shows how to set up texture-mapped surfaces in Matlab.

Your part of the project is actually pretty straightforward: you will need to compute the 3D coordinates of each vertex of each of the five planes.  Then you will define 3D geometry corresponding to these planes.  Finally, you will use your homography warping code from previous projects to rectify the textures for the planes and then texture-map them onto the 3D model.  You will then be able to move and rotate the camera and look at the scene from different viewpoints.

The only other parameter that you will need to worry about is the focal length f.  You can just guess it (a wrong f will simply mean that your scene is too deep or too shallow).  Alternatively, you can get f from the EXIF data (it's not exactly the correct focal length, but it's in the right ballpark).  Project 5 from CS 15463 fall 2008 has more details.
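
If you are unsure where to start on the rendering side, a single plane can be texture-mapped in Matlab roughly as follows (this is a generic sketch, not the starter code; the file name and vertex coordinates are made up):

tex = im2double(imread('floor_tex.jpg'));  % rectified texture for this plane

% Four 3D corners of the plane (here, a floor quad), given as 2x2 grids.
X = [0 10; 0 10];      % x runs left to right
Y = [0 0; 0 0];        % the floor sits at y = 0
Z = [0 0; 20 20];      % z runs into the scene

surface(X, Y, Z, tex, 'FaceColor', 'texturemap', 'EdgeColor', 'none');
axis equal; axis vis3d; view(3);   % rotate the camera with the figure tools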

TRIANGULATION MATTING AND COMPOSITING

Pick an interesting, semitransparent object and produce an alpha matte by shooting the object against two different backgrounds.  That is, given the four images C1, C2 (object against two backgrounds) and Ck1, Ck2 (just the backgrounds), compute the image of the object Co and the corresponding alpha matte.  This can be done simply by solving the system of linear equations on Slide 15 of the Matting lecture.  The only trick to remember is that, when solving for the unknowns, it's very useful to consider the colors of Co as premultiplied by the alpha, i.e. Co = [aR, aG, aB].
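
Concretely, with premultiplied colors each background gives three linear equations per pixel (C = Co + (1 - a)*Ck).  A minimal (slow, but clear) per-pixel least-squares sketch, assuming C1, C2, Ck1, Ck2 are h-by-w-by-3 doubles in [0,1]:

[h, w, c] = size(C1);
Co = zeros(h, w, 3);
alpha = zeros(h, w);
I3 = eye(3);
for i = 1:h
    for j = 1:w
        b1 = squeeze(Ck1(i, j, :));          % background 1 color here
        b2 = squeeze(Ck2(i, j, :));          % background 2 color here
        A = [I3, -b1; I3, -b2];              % unknowns: [Co_R Co_G Co_B a]'
        b = [squeeze(C1(i, j, :)) - b1; squeeze(C2(i, j, :)) - b2];
        x = A \ b;                           % least squares: 6 eqs, 4 unknowns
        Co(i, j, :) = x(1:3);
        alpha(i, j) = min(max(x(4), 0), 1);  % clamp alpha to [0, 1]
    end
end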
  While the backgrounds can be any images at all, we advise using uniform colors, since refraction (which we are not modeling) in some transparent objects can change the pixel correspondence between the foreground and background shots.  Another consideration is having enough light in the scene to avoid noise in the resulting images (but be careful not to create too many shadows).  Make sure that you take all the images under the same settings (set the camera to manual mode).
  After successfully acquiring an object with its matte, you will composite the object into a novel (and interesting!) scene.  For example, you can put a giant Coke bottle at the entrance to Wean Hall.  Since the orientation of the camera will most likely be different for your object and for the novel scene, you will need to compute a homography and warp one of the images using code from the Mosaicing project.  Put a square somewhere when you capture your object, and another one on the ground of the target composite scene (or use one of the square-like tiles on the ground).  Use these 4-to-4 point correspondences to estimate the homography.
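
Once everything is aligned, the composite itself is just the matting equation again.  With premultiplied Co and the warped target scene in a variable bg (both assumptions of this sketch), it is a one-liner:

comp = Co + (1 - repmat(alpha, [1 1 3])) .* bg;   % C = Co + (1 - a) * B
imshow(comp);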

 

VIDEO TEXTURES

Implement a simple version of the Video Textures paper. Acquire a video sequence of some repeatable phenomenon, and compute transitions between frames of this sequence so that it can be run forever.  Don’t worry about the fancy stuff like blending/warping and Q-learning for detecting dead-ends.  Just precompute a set of good transitions and jump between them randomly (but always take the last good transition of the sequence to avoid dead ends).
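
A minimal sketch of the above, assuming the frames already sit in an h-by-w-by-3-by-N uint8 array frames (loaded, e.g., as in the snippet further below); the threshold and jump probability are made-up placeholders:

N = size(frames, 4);
F = reshape(im2double(frames), [], N);     % one column per (downsampled!) frame

% D(i, j) = L2 distance between frames i and j.
D = zeros(N);
for i = 1:N
    d = F - repmat(F(:, i), 1, N);
    D(:, i) = sqrt(sum(d .^ 2, 1))';
end

% A jump i -> j is seamless when frame i+1 looks like frame j.
thresh = 0.3 * mean(D(:));                 % placeholder threshold
good = false(N);
good(1:N-1, :) = D(2:N, :) < thresh;
good(sub2ind([N N], 1:N-1, 2:N)) = false;  % "jumping" to i+1 is just advancing

% Random playback: mostly advance, sometimes jump; never run off the end.
cur = 1;
for step = 1:1000
    imshow(frames(:, :, :, cur)); drawnow;
    jumps = find(good(cur, :));
    if cur == N && isempty(jumps)
        [~, jumps] = min(D(cur, 1:N-1));   % dead-end fallback: closest frame
    end
    if ~isempty(jumps) && (cur == N || rand < 0.3)
        cur = jumps(ceil(rand * numel(jumps)));   % take a random good jump
    else
        cur = cur + 1;
    end
end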
   The most challenging part of this project is dealing with video, which is not very easy.  Generally, pick a simple video texture for which you don't need a lot of video data (30 sec. at most).  Processing video in Matlab is a bit tricky.  Theoretically, there is aviread, but under Linux it will only read uncompressed AVIs.  Most current digital cameras produce video in DV AVI format.  One way to deal with this is to split the video into individual frames and then read them into Matlab one by one.  On the graphics cluster, you can do (some variant of) the following to produce the frames from a video:

mplayer -vo jpeg -jpeg quality=100 -fps 30 mymovie.avi
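
Then a minimal loading loop might look like this (assuming mplayer dumped zero-padded JPEGs such as 00000001.jpg into the current directory; the downsampling factor is just a memory-saving suggestion):

files = dir('*.jpg');
N = numel(files);
first = imresize(imread(files(1).name), 0.25);   % quarter size saves a lot of RAM
frames = zeros([size(first), N], 'uint8');       % h-by-w-by-3-by-N
frames(:, :, :, 1) = first;
for i = 2:N
    frames(:, :, :, i) = imresize(imread(files(i).name), 0.25);
end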

 Also note that handling video is a time-consuming thing (not just for you, but for the computer as well).  If you shoot a minute of video, that’s already 60*30=1800 images!  So, start early and don’t be afraid to let Matlab crunch numbers overnight.

 

RECOVERING HIGH DYNAMIC RANGE IMAGES

  Follow the procedure outlined in the Debevec paper to recover an HDR image from a set of images with varying exposure.  You will need to carefully read and completely understand the paper first, but then the implementation is straightforward (you can use the Matlab code in the paper and shown in lecture).  Your result should be a radiance map of the image.  When acquiring the images, make sure that you only vary the shutter speed and not the aperture (switch the camera to fully manual mode).  Also, try to experiment with some of the simple tone-mapping approaches: global scaling, simple clipping of values into 0…255, or the global operator presented in class, L/(1+L).  Which one works best for you?
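
Once you have the response curve g (e.g. from the paper's gsolve routine), assembling the radiance map is a per-pixel weighted average.  A minimal single-channel sketch, where Z is an assumed h-by-w-by-P uint8 stack over P exposures and lnT holds the log shutter times (repeat per color channel; the display gamma is just a viewing choice):

w_hat = min(0:255, 255 - (0:255))';   % the paper's hat-shaped weighting
[h, wd, P] = size(Z);
num = zeros(h, wd); den = zeros(h, wd);
for p = 1:P
    Zp = double(Z(:, :, p)) + 1;      % 1-based lookup into g and w_hat
    num = num + w_hat(Zp) .* (g(Zp) - lnT(p));
    den = den + w_hat(Zp);
end
E = exp(num ./ max(den, eps));        % the radiance map for this channel

% The global operator from class, plus a gamma purely for display.
L = E / max(E(:));
imshow((L ./ (1 + L)) .^ (1 / 2.2));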