What is done
Process_all
process_all.m lists all the images whose filenames start with 0 and runs colorize_skel on each of them. A result.txt file is generated at the end of the run. See an example of result.txt at the bottom of the Result section.
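The driver loop can be sketched as follows. This is a Python illustration, not the MATLAB in process_all.m. The names select_inputs, format_row and the colorize callable (with its return values) are hypothetical, and the row format is inferred from the result.txt example.

```python
import os

def select_inputs(names):
    """Keep only the filenames that start with '0', in sorted order."""
    return sorted(n for n in names if n.startswith("0"))

def format_row(name, g_offset, r_offset, run_time):
    """Format one result line: name, G offset, R offset, running time."""
    return "%s %d %d %d %d %f" % (name, g_offset[0], g_offset[1],
                                  r_offset[0], r_offset[1], run_time)

def process_all(folder, colorize, out_path="result.txt"):
    """Run `colorize` on every selected image and write one line per image.

    `colorize` is a hypothetical callable that takes a file path and
    returns (g_offset, r_offset, run_time).
    """
    with open(out_path, "w") as out:
        out.write("name G offset R offset running time\n")
        for name in select_inputs(os.listdir(folder)):
            g, r, t = colorize(os.path.join(folder, name))
            out.write(format_row(name, g, r, t) + "\n")
```

For example, format_row("00149v.jpg", (4, 2), (9, 2), 2.734) gives a line in the same shape as the rows of result.txt.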
60%: single scale implementation
See naive_align.m
The code is the basis for the base case of the pyramid implementation.
tweak:
Usually the border areas of the 3 channels have low correlation. The signal could be very strong in R and very weak in G. Therefore, only 81% of the pixels (the center 90% of each dimension) are considered for the sum of squared differences measurement.
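A minimal sketch of this center-only measurement, written in Python rather than the MATLAB of naive_align.m. Images are assumed to be 2-D lists of numbers. The function names and the margin_pct parameter are made up for illustration.

```python
def center_crop(img, margin_pct=5):
    """Drop margin_pct percent of the rows and columns on every side.

    With the default of 5, the central 90% of each dimension is kept,
    which is 81% of the pixels.
    """
    rows, cols = len(img), len(img[0])
    dr = rows * margin_pct // 100
    dc = cols * margin_pct // 100
    return [row[dc:cols - dc] for row in img[dr:rows - dr]]

def ssd(a, b):
    """Sum of squared differences between two equally sized 2-D lists."""
    return sum((x - y) ** 2
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

def center_ssd(a, b, margin_pct=5):
    """SSD computed on the central region only, ignoring the borders."""
    return ssd(center_crop(a, margin_pct), center_crop(b, margin_pct))
```

With a 20 x 20 input, the crop keeps an 18 x 18 block, so a mismatch confined to the outermost row or column no longer affects the score.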
40%: pyramid implementation
The code is in pyramid_align_help.m. (There used to be a wrapper function that called the helper, but it is no longer required.)
The base case of pyramid_align_helper is determined by the number of pixels in the inputs (<= 65536, i.e. 256*256). In the base case, the naive alignment is executed with a search radius of 10 pixels, and the function returns a 1 x 2 array holding the best offset. In addition, an edge filter is applied to both the image in question and the template image before taking the sum of squared differences. The edge filter is the average of a horizontal and a vertical filter. Without this step, weird results occur on 00600v.jpg and 00398v.jpg. This is the middle 1/3 of the pyramid_align_help.m file.
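The base case could look roughly like the sketch below. It is Python rather than the original MATLAB, and several details are assumptions that the text does not confirm: the edge filter is taken to be the average of simple forward differences, shifts wrap around circularly, and the names are hypothetical.

```python
def edge_map(img):
    """Average of a horizontal and a vertical forward-difference filter.

    Neighbours wrap around at the borders (an assumption of this sketch).
    """
    rows, cols = len(img), len(img[0])
    return [[((img[r][(c + 1) % cols] - img[r][c]) +
              (img[(r + 1) % rows][c] - img[r][c])) / 2
             for c in range(cols)]
            for r in range(rows)]

def shift(img, dr, dc):
    """Circularly shift a 2-D list down by dr rows and right by dc columns."""
    rows, cols = len(img), len(img[0])
    return [[img[(r - dr) % rows][(c - dc) % cols] for c in range(cols)]
            for r in range(rows)]

def ssd(a, b):
    """Sum of squared differences between two equally sized 2-D lists."""
    return sum((x - y) ** 2
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

def naive_align(img, template, radius=10):
    """Try every shift within `radius` and return the best (dr, dc) pair."""
    e_img, e_tpl = edge_map(img), edge_map(template)
    best_score, best_offset = None, (0, 0)
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            score = ssd(shift(e_img, dr, dc), e_tpl)
            if best_score is None or score < best_score:
                best_score, best_offset = score, (dr, dc)
    return best_offset
```

The search cost grows with (2 * radius + 1)^2 times the number of pixels, which is why this exhaustive version is only used on small inputs.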
In the recursive case, the input images are first resized by 0.5 in each dimension, and the best offset for the smaller versions is determined. Once that offset is obtained, it has to be adjusted for the original scale: the offset is multiplied by 2, and the positions from -2 to +1 around the doubled offset are searched. The reason for the (-2, 1) window is the following: imagine an array of 8 elements (1-based), and the recursive call says the best alignment in the half-sized array is 3. In the original scale, only 5 or 6 could be mapped to 3 in the smaller scale. To be safe, 4, 5, 6 and 7 are all searched in the original scale. This is the bottom 1/3 of the pyramid_align_help.m file.
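The refinement step and the 8-element argument can be checked with a few lines of Python. This is a sketch, not the code in pyramid_align_help.m. The names candidate_offsets and refine are hypothetical, and score stands for whatever cost function is used at full resolution.

```python
def candidate_offsets(coarse):
    """Full-resolution offsets to test around a half-resolution best offset.

    The coarse offset is doubled, then every value from -2 to +1 around it
    is tried along each axis, giving 4 x 4 = 16 candidates.
    """
    base_r, base_c = 2 * coarse[0], 2 * coarse[1]
    return [(base_r + dr, base_c + dc)
            for dr in range(-2, 2)
            for dc in range(-2, 2)]

def refine(coarse, score):
    """Return the candidate with the lowest cost; `score` maps offset -> cost."""
    return min(candidate_offsets(coarse), key=score)

# 1-D check of the argument in the text: with 1-based indexing, full-size
# index i lands on half-size index ceil(i / 2), computed here as (i + 1) // 2.
coarse_best = 3
sources = [i for i in range(1, 9) if (i + 1) // 2 == coarse_best]
window = [2 * coarse_best + d for d in range(-2, 2)]
print(sources)  # [5, 6]
print(window)   # [4, 5, 6, 7]
```

Only 16 candidates are evaluated per level instead of the 21 x 21 = 441 of a radius-10 search, so the expensive full-resolution comparison runs far fewer times.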
There is another tweak. The running time depends not only on the search radius but also on the size of the image over which the sum of squared errors is computed. For images larger than a threshold, only a subset of the pixels in the center is used for the difference measurement. This is the top 1/3 of the pyramid_align_help.m file.
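One way to limit the comparison region is sketched below in Python. The threshold and window size used in the actual file are not stated here, so max_side and its default of 512 are placeholders, and center_window is a hypothetical name.

```python
def center_window(img, max_side=512):
    """Return the central block of at most max_side x max_side pixels.

    Images that are already small enough come back unchanged in content.
    `max_side` is a placeholder value, not the threshold used in the
    original code.
    """
    rows, cols = len(img), len(img[0])
    h, w = min(rows, max_side), min(cols, max_side)
    r0, c0 = (rows - h) // 2, (cols - w) // 2
    return [row[c0:c0 + w] for row in img[r0:r0 + h]]
```

Both channels would be cut with the same window before the difference is computed, so the cost of each comparison is bounded by max_side * max_side terms regardless of the input size.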
Result
Large image 1 (this is one of the extra images I picked)
Large image 2 (this is one of the extra images I picked)
Large image 3 (this is one of the extra images I picked)
Large image 4
Large image 5
name        G offset   R offset   running time
00001u.jpg    36   4     97  -7      46.607000
00082u.jpg    29   7     76   9      47.592000
00149v.jpg     4   2      9   2       2.734000
00153v.jpg     7   3     13   6       2.579000
00163v.jpg    -3   1     -4   1       2.468000
00194v.jpg     4   2      8   4       2.547000
00398v.jpg     5   3     11   4       2.531000
00458u.jpg    42   6     85  32      48.061000
00458v.jpg     4   1      9   3       2.781000
00600v.jpg     6   4     13   6       2.640000
01167v.jpg     5   0     12  -2       2.594000
01657u.jpg    54   6    119   9      50.233000
Misc
The work was done on a public Windows machine in Wean Hall.