Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
222 views
in Technique[技术] by (71.8m points)

python - Can this be done faster with numpy?

There's a color image, a numpy array of shape (h,w,3) with N=h*w pixels; there's an array labels of shape (h,w), each label an integer between 1 and M. N is 10^6-10^7, M is 10^3-10^4.

I need to produce a result image (h,w,3) where the color of each pixel labelled l is the mean color of all pixels labelled l. I.e.:

def recolor1(image, labels):
    result = np.empty(shape=(h,w,3))
    for label in np.unique(labels):
        mask = labels==label
        mean = np.mean(image[mask], axis=0)
        result[mask] = mean
    return result

The code is straightforward, but runs in O(M.N) (the computation of mask is O(N) and the loop runs M times).

An O(N) recolor2 is possible. Basically you go over the labels and image pixels twice. First to compute an auxiliary array, indexed by label, where you keep the sums of each primary and the number of pixels for that label. Then you compute the averages for each label. Then you go over labels and pixels again, computing result. The O(M) time to find the averages is noise.

With recolor2 written in Python, recolor1 and recolor2 break even for N=1000000 and M=1000 at ~4s. As expected, recolor1's time grows linearly to ~20s for M=5000, while recolor2's remains essentially the same.

4s for a relatively small image is not great and it will get much worse for larger images. I'm no expert in numpy and associated libraries. Is there an O(N) solution there?

question from:https://stackoverflow.com/questions/66065633/can-this-be-done-faster-with-numpy

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Let's try np.bincount and loop over the channels:

result = np.stack([np.bincount(labels.flat, weights=img[...,i].flat)[labels-1] 
                   for i in range(3)],
                  axis=-1)

which takes about 35ms on my system with h,w,M = 1000,1000,1000.

Note This compute the sum, but mean should be easy enough.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...