I want to convert a 3-channel RGB image to an index image with Python. It's used for handling the labels when training a deep net for semantic segmentation. By index image I mean an image with a single channel, where each pixel value is the index of the original color, starting from zero. The output should of course have the same height and width as the input. The conversion is based on the following mapping in a Python dict:
color2index = {
    (255, 255, 255): 0,
    (0,     0, 255): 1,
    (0,   255, 255): 2,
    (0,   255,   0): 3,
    (255, 255,   0): 4,
    (255,   0,   0): 5
}
I've implemented a naive function:
import numpy as np

def im2index(im):
    """
    Turn a 3-channel RGB image into a 1-channel index image.
    """
    assert len(im.shape) == 3
    height, width, ch = im.shape
    assert ch == 3
    m_lable = np.zeros((height, width, 1), dtype=np.uint8)
    for w in range(width):
        for h in range(height):
            b, g, r = im[h, w, :]          # cv2.imread() returns channels in BGR order
            m_lable[h, w, :] = color2index[(r, g, b)]
    return m_lable
The input im is a NumPy array created by cv2.imread(). However, this code is really slow.
Since im is a NumPy array, I first tried NumPy's ufunc with something like this:
RGB2index = np.frompyfunc(lambda x: color2index[tuple(x)], 1, 1)
indices = RGB2index(im)
But it turns out that the ufunc takes only one element at a time, so I was unable to give the function the three RGB values of a pixel at once.
So are there any other ways to optimize this?
The mapping does not have to be done that way, if a more efficient data structure exists. I noticed that accessing a Python dict does not cost much time, but casting a NumPy array to a (hashable) tuple does.
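For example, something fully vectorized per color might look like the rough sketch below (im2index_vec is just a name I made up, and it assumes im is in BGR order as returned by cv2.imread()); I'm not sure this is the best way, though:

import numpy as np

def im2index_vec(im):
    """Sketch: build a boolean mask per color and write its index.

    Six passes over the image, but each pass runs in C inside NumPy
    instead of calling Python code per pixel.
    """
    height, width, _ = im.shape
    out = np.zeros((height, width), dtype=np.uint8)
    for (r, g, b), idx in color2index.items():
        mask = (im[:, :, 2] == r) & (im[:, :, 1] == g) & (im[:, :, 0] == b)
        out[mask] = idx
    return out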
PS:
One idea I have is to implement a kernel in CUDA, but that would be more complicated.
UPDATE 1:
Dan Mašek's answer works fine, but we first have to convert the RGB image to grayscale, which could be problematic when two colors have the same grayscale value.
I paste the working code here; I hope it helps others.
import cv2

lut = np.ones(256, dtype=np.uint8) * 255                         # unmapped gray values stay 255
lut[[255, 29, 179, 150, 226, 76]] = np.arange(6, dtype=np.uint8)  # grayscale value of each label color, in index order
im_out = cv2.LUT(cv2.cvtColor(im, cv2.COLOR_BGR2GRAY), lut)
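For reference, the positions 255, 29, 179, 150, 226, 76 are simply the grayscale values OpenCV produces for the six label colors. A small sketch like the one below could build the same LUT without hard-coding them:

import cv2
import numpy as np

# Derive each color's BGR2GRAY value by converting a 1x1 image,
# then map that gray value to the class index from color2index.
lut = np.full(256, 255, dtype=np.uint8)
for (r, g, b), idx in color2index.items():
    pixel = np.array([[[b, g, r]]], dtype=np.uint8)              # 1x1 BGR pixel
    gray = int(cv2.cvtColor(pixel, cv2.COLOR_BGR2GRAY)[0, 0])    # e.g. 29 for pure blue
    lut[gray] = idx
im_out = cv2.LUT(cv2.cvtColor(im, cv2.COLOR_BGR2GRAY), lut)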