As Lukas Graf
hints, you are looking for cross-correlation. It works well, if:
- The scale of your images does not change considerably.
- There is no rotation change in the images.
- There is no significant illumination change in the images.
For plain translations cross-correlation is very good.
The simplest cross-correlation tool is scipy.signal.correlate
. However, it uses the trivial method for cross-correlation, which is O(n^4) for a two-dimensional image with side length n. In practice, with your images it'll take very long.
The better too is scipy.signal.fftconvolve
as convolution and correlation are closely related.
Something like this:
import numpy as np
import scipy.signal
def cross_image(im1, im2):
# get rid of the color channels by performing a grayscale transform
# the type cast into 'float' is to avoid overflows
im1_gray = np.sum(im1.astype('float'), axis=2)
im2_gray = np.sum(im2.astype('float'), axis=2)
# get rid of the averages, otherwise the results are not good
im1_gray -= np.mean(im1_gray)
im2_gray -= np.mean(im2_gray)
# calculate the correlation image; note the flipping of onw of the images
return scipy.signal.fftconvolve(im1_gray, im2_gray[::-1,::-1], mode='same')
The funny-looking indexing of im2_gray[::-1,::-1]
rotates it by 180° (mirrors both horizontally and vertically). This is the difference between convolution and correlation, correlation is a convolution with the second signal mirrored.
Now if we just correlate the first (topmost) image with itself, we get:
This gives a measure of self-similarity of the image. The brightest spot is at (201, 200), which is in the center for the (402, 400) image.
The brightest spot coordinates can be found:
np.unravel_index(np.argmax(corr_img), corr_img.shape)
The linear position of the brightest pixel is returned by argmax
, but it has to be converted back into the 2D coordinates with unravel_index
.
Next, we try the same by correlating the first image with the second image:
The correlation image looks similar, but the best correlation has moved to (149,200), i.e. 52 pixels upwards in the image. This is the offset between the two images.
This seems to work with these simple images. However, there may be false correlation peaks, as well, and any of the problems outlined in the beginning of this answer may ruin the results.
In any case you should consider using a windowing function. The choice of the function is not that important, as long as something is used. Also, if you have problems with small rotation or scale changes, try correlating several small areas agains the surrounding image. That will give you different displacements at different positions of the image.