I'm trying to find shapes in an image using OpenCV. I know the shapes I want to match (there are some shapes I don't know about, but I don't need to find them) and their orientations. I don't know their sizes (scale) or their locations.
My current approach:
1. Detect contours
2. For each contour, calculate its bounding box
3. Match each bounding box to one of the known shapes separately. In my real project, I'm scaling the region to the template size and calculating differences in Sobel gradient, but for this demo, I'm just using the aspect ratio.
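For reference, the real-project matching mentioned in the last step could look roughly like the sketch below. This is only one possible reading of "scale the region to the template size and compare Sobel gradients". The function names (`resize_nearest`, `sobel`, `gradient_distance`), the nearest-neighbour scaling, and the mean-absolute-difference score are all assumptions made for illustration, and images are plain lists of rows of grayscale values so the sketch runs without OpenCV.

```python
# Hypothetical sketch: all names and the scoring rule are assumptions.
# Images are lists of rows of grayscale integers.

def resize_nearest(img, new_w, new_h):
    """Scale img to new_w x new_h using nearest-neighbour sampling."""
    old_h, old_w = len(img), len(img[0])
    return [[img[r * old_h // new_h][c * old_w // new_w]
             for c in range(new_w)]
            for r in range(new_h)]

def sobel(img):
    """Return (gx, gy) 3x3 Sobel responses; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    gx = [[0] * w for _ in range(h)]
    gy = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx[r][c] = (img[r-1][c+1] + 2 * img[r][c+1] + img[r+1][c+1]
                        - img[r-1][c-1] - 2 * img[r][c-1] - img[r+1][c-1])
            gy[r][c] = (img[r+1][c-1] + 2 * img[r+1][c] + img[r+1][c+1]
                        - img[r-1][c-1] - 2 * img[r-1][c] - img[r-1][c+1])
    return gx, gy

def gradient_distance(region, template):
    """Mean absolute Sobel-gradient difference after rescaling region."""
    h, w = len(template), len(template[0])
    scaled = resize_nearest(region, w, h)
    rgx, rgy = sobel(scaled)
    tgx, tgy = sobel(template)
    total = sum(abs(rgx[r][c] - tgx[r][c]) + abs(rgy[r][c] - tgy[r][c])
                for r in range(h) for c in range(w))
    return total / float(h * w)
```

A region would then be assigned to the template with the smallest distance, with some threshold (not shown) for rejecting unknown shapes.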
This approach breaks down when shapes touch. The contour detection picks up two adjacent shapes as a single contour (and therefore a single bounding box), so the matching step fails.
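To make the failure concrete, here is a small sketch of what happens to the aspect-ratio test when two boxes merge. The ratios (0.6 for E, 0.8 for Y), the box coordinates, and the helper names `classify` and `union_box` are made up for illustration; they are not measured from the real templates.

```python
# Hypothetical numbers and helper names, for illustration only.

def classify(width, height, ratios, epsilon=0.1):
    """Return the first known shape whose aspect ratio is within epsilon."""
    ratio = float(width) / height
    for name, known in ratios.items():
        if abs(ratio - known) < epsilon:
            return name
    return None  # unknown shape

def union_box(a, b):
    """Smallest (x, y, width, height) box that covers both boxes."""
    x0, y0 = min(a[0], b[0]), min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    return (x0, y0, x1 - x0, y1 - y0)

ratios = {'E': 0.6, 'Y': 0.8}   # assumed template aspect ratios
e_box = (10, 10, 30, 50)        # an E on its own
y_box = (40, 10, 40, 50)        # a Y touching the E's right edge
merged = union_box(e_box, y_box)

print(classify(e_box[2], e_box[3], ratios))    # E
print(classify(y_box[2], y_box[3], ratios))    # Y
print(classify(merged[2], merged[3], ratios))  # None
```

Each box is recognized on its own, but the merged box has a ratio of 1.4, which matches neither template, so it would be drawn as an unknown shape.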
Is there a way to modify my approach to handle adjacent shapes separately? Also, is there a better way to perform step 3?
For example: (Es colored green, Ys colored blue)
Failed case: (unknown shape in red)
Source code:
    import cv
    import sys

    # The templates are used here only for their aspect ratios.
    E = cv.LoadImage('e.png')
    E_ratio = float(E.width)/E.height
    Y = cv.LoadImage('y.png')
    Y_ratio = float(Y.width)/Y.height
    EPSILON = 0.1

    im = cv.LoadImage(sys.argv[1], cv.CV_LOAD_IMAGE_GRAYSCALE)
    storage = cv.CreateMemStorage(0)
    seq = cv.FindContours(im, storage, cv.CV_RETR_EXTERNAL,
                          cv.CV_CHAIN_APPROX_SIMPLE)

    # Collect the bounding box (x, y, width, height) of each outer contour.
    regions = []
    while seq:
        pts = [ pt for pt in seq ]
        x, y = zip(*pts)
        min_x, min_y = min(x), min(y)
        width, height = max(x) - min_x + 1, max(y) - min_y + 1
        regions.append((min_x, min_y, width, height))
        seq = seq.h_next()

    # Draw each box: green = E, blue = Y, red = unknown (colors are BGR).
    rgb = cv.LoadImage(sys.argv[1], cv.CV_LOAD_IMAGE_COLOR)
    for x,y,width,height in regions:
        pt1 = x,y
        pt2 = x+width,y+height
        if abs(float(width)/height - E_ratio) < EPSILON:
            color = (0,255,0,0)
        elif abs(float(width)/height - Y_ratio) < EPSILON:
            color = (255,0,0,0)
        else:
            color = (0,0,255,0)
        cv.Rectangle(rgb, pt1, pt2, color, 2)

    cv.ShowImage('rgb', rgb)
    cv.WaitKey(0)
e.png:
y.png:
good:
bad:
Before anybody asks, no, I'm not trying to break a captcha :) OCR per se isn't really relevant here: the actual shapes in my real project aren't characters -- I'm just lazy, and characters are the easiest thing to draw (and still get detected by trivial methods).