Computer Vision Project 1: Image Alignment and Color Compositing

Goal:

Take the digitized Prokudin-Gorskii glass plate images and automatically produce a color image with as few visual artifacts as possible. Extract the three color channel images and align them so that they form a single RGB color image.

Baseline (Naive) Implementation:

To align the color channels, I used normalized cross-correlation (NCC) as the baseline metric, because it outperformed the sum of squared differences (SSD) on a small sample of test images.
My program divides each plate image into three equal parts and aligns the second and third parts (G and R) to the first (B). For each image, I record the displacement vectors that were used to align the parts.
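
A minimal sketch of this baseline in Python with NumPy, not the project's actual code. The function names, the search radius of 15 pixels, the 10% margin that is ignored when scoring, and the use of np.roll (which wraps pixels around the border) are all assumptions made for illustration.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def split_plate(plate):
    """Split a stacked plate into three equal parts: B, G, R (top to bottom)."""
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def align_single_scale(channel, reference, radius=15, margin=0.1):
    """Try every shift in [-radius, radius]^2; return the (dy, dx) with the best NCC.

    Assumption: scores are computed on the interior only, so the noisy
    plate borders (and pixels wrapped around by np.roll) count less.
    """
    h, w = reference.shape
    my, mx = int(h * margin), int(w * margin)
    ref = reference[my:h - my, mx:w - mx]
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted[my:h - my, mx:w - mx], ref)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With these helpers, a plate would be processed as b, g, r = split_plate(plate), followed by align_single_scale(g, b) and align_single_scale(r, b) to obtain the two displacement vectors.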

Sample Single-Scale Processing Results

[Figure: NCC final image, shown with the G, R, and B channel images]

However, because exhaustive search becomes expensive when the displacement search range or the image resolution is large, I implemented a second method for the high-resolution images: a multi-scale alignment algorithm that searches over a user-specified window of displacements.

Image Pyramids

To handle alignment for larger images, I used image pyramids to implement a faster search algorithm. Given two images, my pyramid alignment algorithm recursively downsamples both by a factor of 2 per level, taking a uniform average of neighboring pixels at each step, and then aligns them from the coarsest scale to the finest scale (the full-resolution image). The displacement found at each level, doubled, is the starting point for the search at the next finer level.
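
The coarse-to-fine idea can be sketched as follows. This is an illustrative version under stated assumptions, not the project's implementation: the helper names, the 2x2 block average used for downsampling, the stopping size of 64 pixels, and the two search radii are all hypothetical choices, and np.roll wraps pixels around the border.

```python
import numpy as np

def downsample2(img):
    """Halve the resolution by averaging each 2x2 block (a uniform average)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def search(channel, reference, center, radius):
    """Exhaustive NCC search over a (2*radius+1)^2 window around center."""
    cy, cx = center
    best_score, best_shift = -np.inf, center
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, min_size=64, coarse_radius=8, refine_radius=2):
    """Recursive coarse-to-fine alignment; returns (dy, dx) at full resolution."""
    if min(reference.shape) <= min_size:
        # Coarsest level: the image is small, so a wide search is cheap.
        return search(channel, reference, (0, 0), coarse_radius)
    dy, dx = align_pyramid(downsample2(channel), downsample2(reference),
                           min_size, coarse_radius, refine_radius)
    # One coarse pixel covers two fine pixels, so double the estimate and refine.
    return search(channel, reference, (2 * dy, 2 * dx), refine_radius)
```

Each finer level evaluates only (2 * refine_radius + 1)^2 candidate shifts, so the cost grows with the number of pixels rather than with the size of the full-resolution search window.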

Image Pyramid Sample Results (from high res. scans)

Bells and Whistles: Cropping with Edge Detection

The goal here is to reduce boundary noise in the dataset, so the colorful, shifted borders need to be cropped. I used a combination of edge detection and averaging to compute the cropping bounds for each image. After generating the final color image with the alignment algorithms above, I crop it to those bounds using the following algorithm.

Cropping Algorithm

Input: channel image I

Run edge detection on I to create an edge response map.
Average the edge response map horizontally to get a vector V_h whose length equals the image's height.
Compute a threshold equal to 2 standard deviations above the mean of all values in V_h.
Search the first tenth of V_h sequentially from right to left and take the index of the first value above the threshold.
Search the last tenth of V_h sequentially from left to right and take the index of the first value above the threshold.
Save the two indices as the top and bottom cropping bounds.
Repeat the search with V_v, the vector obtained by averaging the edge response map vertically, to get the left and right cropping bounds.
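
The steps above can be sketched in Python with NumPy. Several details here are assumptions, because the write-up does not specify them: the edge detector is a plain gradient magnitude from np.gradient, the function names are hypothetical, and when no value exceeds the threshold the bound falls back to the image edge.

```python
import numpy as np

def edge_response(img):
    """Gradient-magnitude edge map (an assumed stand-in for the edge detector)."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gy, gx)

def crop_bounds_1d(profile, fraction=0.1, n_std=2.0):
    """Return (start, stop) cropping indices from a 1-D averaged edge profile."""
    n = len(profile)
    k = max(1, int(n * fraction))
    thresh = profile.mean() + n_std * profile.std()
    start, stop = 0, n  # fallback when nothing exceeds the threshold (assumption)
    # First tenth, scanned right to left: the innermost strong edge wins.
    for i in range(k - 1, -1, -1):
        if profile[i] > thresh:
            start = i
            break
    # Last tenth, scanned left to right: again the innermost strong edge wins.
    for i in range(n - k, n):
        if profile[i] > thresh:
            stop = i
            break
    return start, stop

def crop_bounds(img):
    """Return (top, bottom, left, right) for one channel image."""
    edges = edge_response(img)
    v_h = edges.mean(axis=1)  # horizontal average, one value per row
    v_v = edges.mean(axis=0)  # vertical average, one value per column
    top, bottom = crop_bounds_1d(v_h)
    left, right = crop_bounds_1d(v_v)
    return top, bottom, left, right
```

The cropped result would then be img[top:bottom, left:right]. Whether the detected edge row or column itself is kept is a design choice; this sketch uses the saved indices directly as slice bounds.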