I am attempting to consistently find the darkest region in a series of depth map images generated from a video. The depth maps are generated using the PyTorch implementation here
Their sample run script generates a prediction of the same size as the input where each pixel is a floating point value, with the highest/brightest value being the closest. Standard depth estimation using ConvNets.
The depth prediction is then normalized as follows to make a png for review
bits = 2
depth_min = prediction.min()
depth_max = prediction.max()
max_val = (2**(8*bits))-1
out = max_val * (prediction - depth_min) / (depth_max - depth_min)
I am attempting to identify the darkest region in each image in the video, with the assumption that this region has the most "open space".
I've tried several methods:
Using cv2
template matching and minMaxLoc
I created a template of np.zeros(100,100), then applied the template similar to the docs
img2 = out.copy().astype("uint8")
template = np.zeros((100, 100)).astype("uint8")
w, h = template.shape[::-1]
res = cv2.matchTemplate(img2,template,cv2.TM_SQDIFF)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
top_left = min_loc
bottom_right = (top_left[0] + w, top_left[1] + h)
val = out.max()
cv2.rectangle(out,top_left, bottom_right, int(val) , 2)
As you can see, this implementation is very inconsistent with many false positives
Using np.argmin(out, axis=1)
which generates many indices. I take the first two, and write the word MIN
at those coordinates
text = "MIN"
textsize = cv2.getTextSize(text, font, 1, 2)[0]
textX, textY = np.argmin(prediction, axis=1)[:2]
cv2.putText(out, text, (textX, textY), font, 1, (int(917*max_val), int(917*max_val), int(917*max_val)), 2)
This is less inconsistent but still lacking
Using np.argwhere(prediction == np.min(preditcion)
then write the word MIN
at the coordanites. I imagined this would give me the darkest pixel on the image, but this is not the case
I've also thought of running a convolution operation with a kernel of 50x50, then taking the region with the smallest value as the darkest region
My question is why are there inconsistencies and false positives. How can I fix that? Intuitively this seems like a very simple thing to do.
UPDATE
Thanks to Hans for the idea. Please follow this link to download the output depths in png format.
question from:
https://stackoverflow.com/questions/65931512/finding-the-darkest-region-in-a-depth-map-using-numpy-and-or-cv2