What is done


Process_all
process_all.m lists all the images whose filenames start with 0 and applies colorize_skel to each of them. A result.txt is generated at the end of the run. See an example of result.txt at the bottom of the Result section.

60%: single scale implementation

See naive_align.m
The code is the basis for the base case of the pyramid implementation.

tweak:
Usually the border areas of the 3 channels have low correlation: the signal could be very strong in R and very weak in G. Therefore, only 81% of the pixels (the center 90% of each dimension) are considered for the sum of squared differences measurement.
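The border tweak can be sketched as follows. This is a minimal Python illustration, not the code in naive_align.m; the function names center_crop, ssd and cropped_ssd are made up for this example, and images are assumed to be plain 2-D lists of numbers.

```python
def center_crop(img, keep=0.9):
    """Return the central `keep` fraction of each dimension of a 2-D list."""
    rows, cols = len(img), len(img[0])
    dr = int(round(rows * (1 - keep) / 2))  # rows trimmed from top and from bottom
    dc = int(round(cols * (1 - keep) / 2))  # columns trimmed from left and from right
    return [row[dc:cols - dc] for row in img[dr:rows - dr]]


def ssd(a, b):
    """Sum of squared differences between two equally sized 2-D lists."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def cropped_ssd(a, b, keep=0.9):
    """SSD restricted to the central region, so noisy borders are ignored."""
    return ssd(center_crop(a, keep), center_crop(b, keep))
```

With keep=0.9 a 20 x 20 image is cropped to 18 x 18, which is 81% of the pixels, so a mismatch that only exists in the outermost rows or columns no longer affects the score.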

40%: pyramid implementation

The code is in pyramid_align_help.m. (There used to be a wrapper function that called the helper, but it is no longer required.)

The base case of the pyramid_align_helper is determined by the number of pixels of the inputs (<= 65536, i.e. 256*256). In the base case, the naive alignment is executed with a search radius of 10 pixels and the function returns a 1 x 2 array holding the best offset. Also, an edge filter is applied to the image in question and to the template image before taking the sum of squared differences. The edge filter is the average of a horizontal and a vertical filter. Without this step, weird results occur on 00600v.jpg and 00398v.jpg. This is the middle 1/3 of the pyramid_align_help.m file.
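A rough Python sketch of such a base case is given below. It is not the MATLAB code from pyramid_align_help.m. Several things are assumptions made for illustration: the edge filter is taken to be the average of simple horizontal and vertical forward differences, shifts are circular (in the spirit of MATLAB's circshift), and the names edge_filter, circshift, ssd and naive_align are hypothetical.

```python
def edge_filter(img):
    """Average of horizontal and vertical forward differences (assumed filter)."""
    rows, cols = len(img), len(img[0])
    return [[((img[r][c + 1] - img[r][c]) + (img[r + 1][c] - img[r][c])) / 2
             for c in range(cols - 1)]
            for r in range(rows - 1)]


def circshift(img, dy, dx):
    """Circularly shift a 2-D list down by dy rows and right by dx columns."""
    rows, cols = len(img), len(img[0])
    return [[img[(r - dy) % rows][(c - dx) % cols] for c in range(cols)]
            for r in range(rows)]


def ssd(a, b):
    """Sum of squared differences between two equally sized 2-D lists."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def naive_align(image, template, radius=10):
    """Try every offset within `radius` and return the (dy, dx) with lowest SSD."""
    e_img, e_tpl = edge_filter(image), edge_filter(template)
    best, best_score = (0, 0), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            score = ssd(circshift(e_img, dy, dx), e_tpl)
            if best_score is None or score < best_score:
                best, best_score = (dy, dx), score
    return best
```

Comparing edge maps rather than raw intensities makes the score depend on where structures are instead of how bright each channel is, which is presumably why it helps on images whose channels differ strongly in brightness.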

In the recursive case, the input images are first resized to half size in each dimension (0.5 * 0.5), and the best offset for the smaller versions is determined recursively. That offset then has to be adjusted for the original scale: it is multiplied by 2, and the neighborhood from -2 to +1 around the doubled offset is searched. The reason for the (-2, 1) range is the following: imagine an array of 8 elements (1-based), and the recursive call says the best alignment in the half-sized array is 3. In the original scale, only 5 or 6 could map to 3 in the smaller scale. To be safe, 4, 5, 6 and 7 are all searched in the original scale. This is the bottom 1/3 of the pyramid_align_help.m file.
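The 8-element argument can be checked with a few lines of Python. This is only an illustration of the index arithmetic, assuming that halving merges each pair of adjacent elements (1 and 2 become 1, 3 and 4 become 2, and so on); half_scale_index and refinement_candidates are hypothetical names.

```python
def half_scale_index(i):
    """1-based index in the half-sized array that full-scale index i falls into."""
    return (i + 1) // 2


def refinement_candidates(coarse_best, lo=-2, hi=1):
    """Full-scale positions searched around the doubled coarse result."""
    center = 2 * coarse_best
    return [center + d for d in range(lo, hi + 1)]


# Full-scale indices of an 8-element array that map to coarse index 3.
exact = [i for i in range(1, 9) if half_scale_index(i) == 3]

# Positions actually searched when the coarse result is 3.
candidates = refinement_candidates(3)

print(exact)       # [5, 6]
print(candidates)  # [4, 5, 6, 7]
```

The searched range covers both exact preimages plus one extra position on each side, which leaves some slack for errors introduced by the resizing.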

There is another tweak. The running time depends not only on the search radius but also on the size of the image over which the sum of squared errors is computed. For images larger than a threshold, only a subset of the pixels in the center is selected for the difference measurement. This is the top 1/3 of the pyramid_align_help.m file.
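One way to picture this tweak is a function that cuts a centered window out of a large image before scoring. The sketch below is in Python and is not taken from pyramid_align_help.m; the name center_window and the cap max_side=512 are placeholders, since the actual threshold and window size are not stated here.

```python
def center_window(img, max_side=512):
    """Return a centered window of at most max_side x max_side pixels.

    Images already within the cap are returned unchanged in size.
    """
    rows, cols = len(img), len(img[0])
    h, w = min(rows, max_side), min(cols, max_side)
    top, left = (rows - h) // 2, (cols - w) // 2
    return [row[left:left + w] for row in img[top:top + h]]
```

Because the cost of one difference measurement grows with the number of pixels compared, capping the window bounds the cost per candidate offset no matter how large the input is.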

Result



Large image 1 (this is one of the extra images I picked)
Large image 2 (this is one of the extra images I picked)
Large image 3 (this is one of the extra images I picked)
Large image 4
Large image 5

name		G offset	R offset	running time
00001u.jpg	36 4		97 -7		46.607000
00082u.jpg	29 7		76 9		47.592000
00149v.jpg	4 2		9 2		2.734000
00153v.jpg	7 3		13 6		2.579000
00163v.jpg	-3 1		-4 1		2.468000
00194v.jpg	4 2		8 4		2.547000
00398v.jpg	5 3		11 4		2.531000
00458u.jpg	42 6		85 32		48.061000
00458v.jpg	4 1		9 3		2.781000
00600v.jpg	6 4		13 6		2.640000
01167v.jpg	5 0		12 -2		2.594000
01657u.jpg	54 6		119 9		50.233000

Misc

The work was done on a public Windows machine in Wean Hall.