You can use the imagehash library to compare similar images.
from PIL import Image
import imagehash

# Compute a perceptual (average) hash for each image.
hash0 = imagehash.average_hash(Image.open('quora_photo.jpg'))
hash1 = imagehash.average_hash(Image.open('twitter_photo.jpeg'))

cutoff = 5  # hashes that differ by fewer than this many bits count as similar

# Subtracting two hash objects gives the number of differing bits.
if hash0 - hash1 < cutoff:
    print('images are similar')
else:
    print('images are not similar')
Since the images are not exactly the same, their hashes will differ slightly, so we use a cutoff on how large a difference is acceptable. The difference between two hash objects is the number of bits that differ between them (the Hamming distance). imagehash still works when the images have been resized, compressed, saved in different file formats, or had their contrast or colors adjusted.
The hash (or fingerprint, really) is derived from an 8x8 monochrome thumbnail of the image. Even with such a reduced sample, the similarity comparisons are quite accurate. Adjust the cutoff to find an acceptable balance between false positives and false negatives.
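To make the fingerprint idea concrete, here is a rough sketch of an average hash in plain Python. It assumes the image has already been shrunk to an 8x8 grid of grayscale values, and it sets a bit whenever a pixel is brighter than the mean. The function names and sample grids are made up for illustration, and this is not imagehash's actual code, just the general technique.

```python
def average_hash_bits(pixels):
    """Return a 64-bit integer fingerprint for an 8x8 grid of grayscale values."""
    flat = [value for row in pixels for value in row]
    if len(flat) != 64:
        raise ValueError("expected an 8x8 grid")
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        # One bit per pixel: 1 if brighter than the mean, else 0.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Count the bits that differ between two integer fingerprints."""
    return bin(a ^ b).count("1")


# Hypothetical thumbnail: left half dark, right half bright.
thumb = [[10] * 4 + [200] * 4 for _ in range(8)]

# The same picture after a uniform brightness increase.
brighter = [[min(255, v + 40) for v in row] for row in thumb]

# The same picture with a single dark pixel turned bright.
noisy = [row[:] for row in thumb]
noisy[0][0] = 250

h_thumb = average_hash_bits(thumb)
h_brighter = average_hash_bits(brighter)
h_noisy = average_hash_bits(noisy)

print(hamming(h_thumb, h_brighter))  # brightness shift leaves the bits unchanged
print(hamming(h_thumb, h_noisy))     # one changed pixel flips one bit
```

Because each bit only records whether a pixel is above or below the mean, a uniform brightness change moves the pixels and the mean together and the fingerprint stays the same, while a local change flips only the affected bits.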
With 64-bit hashes, a difference of 0 means the hashes are identical. A difference of around 32 is roughly what you'd expect from two unrelated hashes, so it means there's no similarity at all. A difference of 64 means that one hash is the exact negative of the other.
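Those three reference points can be checked with plain 64-bit integers standing in for hashes. This is only a sketch: the random values are stand-ins rather than real image hashes, and the helper name is made up for the example.

```python
import random

MASK = (1 << 64) - 1  # keep values within 64 bits


def bit_difference(a, b):
    """Number of bit positions where two 64-bit values differ."""
    return bin(a ^ b).count("1")


rng = random.Random(0)  # fixed seed so the run is repeatable
h = rng.getrandbits(64)

identical = bit_difference(h, h)         # same hash
negative = bit_difference(h, ~h & MASK)  # every bit flipped

# Average difference between many pairs of unrelated random values.
trials = 10000
average_unrelated = sum(
    bit_difference(rng.getrandbits(64), rng.getrandbits(64))
    for _ in range(trials)
) / trials

print(identical, negative, round(average_unrelated, 1))
```

The first two values are exactly 0 and 64, and the average for unrelated pairs lands close to 32, which is why a useful cutoff sits far below that.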