Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Is there a way to make list processing as fast as np.array?

I am currently rewriting some code that I originally wrote under the assumption that its inputs are numpy arrays, so that it accepts arbitrary lists as input. Unfortunately, the solutions I have produced so far are substantially slower than the original code. Can someone give advice on how I might get back to the speed of the original solution?

The code is supposed to produce a boolean index for the upper triangular matrix representation. Leaving out input checks and the like, this is the meat of the code:

Some imports and example input:

import numpy as np
descriptor = list(range(100))
descriptor_arr = np.array(descriptor)
value = [0, 2, 13, 14, 11, 23, 45, 16]

This is my current list-based version:

def get_idx_slow(descriptor, value):
    ix, iy = np.triu_indices(len(descriptor), 1)
    pattern_in_value = [p in value for p in descriptor]
    return [(pattern_in_value[idx_x] & pattern_in_value[idx_y]) for idx_x, idx_y in zip(ix, iy)]

This is the previous array-based version:

def get_idx_fast(descriptor, value):
    ix, iy = np.triu_indices(len(descriptor), 1)
    selection_x = np.any(np.array([descriptor[ix] == v for v in value]), axis=0)
    selection_y = np.any(np.array([descriptor[iy] == v for v in value]), axis=0)
    return selection_x & selection_y

My timing results:

%timeit get_idx_slow(descriptor, value)
1.2 ms ± 33.6 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit get_idx_fast(descriptor_arr, value)
217 μs ± 1.88 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
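
The `%timeit` lines above are IPython magics. Outside IPython, a comparable measurement could be sketched with the standard `timeit` module, as below. The two functions are copied from the question; `best_time` is a hypothetical helper that is not part of the original post, and it reports the best of several runs rather than the mean ± std. dev. that `%timeit` prints, so the numbers will not match exactly.

```python
import timeit

import numpy as np


def get_idx_slow(descriptor, value):
    # List-based version from the question.
    ix, iy = np.triu_indices(len(descriptor), 1)
    pattern_in_value = [p in value for p in descriptor]
    return [pattern_in_value[x] & pattern_in_value[y] for x, y in zip(ix, iy)]


def get_idx_fast(descriptor, value):
    # Array-based version from the question; `descriptor` must be an ndarray.
    ix, iy = np.triu_indices(len(descriptor), 1)
    selection_x = np.any(np.array([descriptor[ix] == v for v in value]), axis=0)
    selection_y = np.any(np.array([descriptor[iy] == v for v in value]), axis=0)
    return selection_x & selection_y


def best_time(func, *args, number=100, repeat=5):
    """Best per-call time in seconds (hypothetical helper, not from the post)."""
    runs = timeit.repeat(lambda: func(*args), number=number, repeat=repeat)
    return min(runs) / number


descriptor = list(range(100))
descriptor_arr = np.array(descriptor)
value = [0, 2, 13, 14, 11, 23, 45, 16]

t_slow = best_time(get_idx_slow, descriptor, value)
t_fast = best_time(get_idx_fast, descriptor_arr, value)
print(f"list version:  {t_slow * 1e6:.0f} µs per call")
print(f"array version: {t_fast * 1e6:.0f} µs per call")
```

Absolute timings depend on the machine and the NumPy version, so only the ratio between the two is worth comparing.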
Question from: https://stackoverflow.com/questions/65923287/is-there-a-way-to-make-list-processing-as-fast-as-np-array


1 Reply


It's definitely the lazy solution, but you can just convert the list to an array inside the slow function, call the fast function, and convert the result back to a list. That seemed to be reasonably effective.

Update:

def get_idx_slow(descriptor, value):
    return get_idx_fast(np.asarray(descriptor), value).tolist()

Results (get_idx_slow_orig is the original list-based get_idx_slow from the question):

%timeit get_idx_slow_orig(descriptor, value)
892 μs ± 15.3 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit get_idx_slow(descriptor, value)
182 μs ± 1.11 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit get_idx_fast(descriptor_arr, value)
150 μs ± 1.21 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
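
As a possible further variation on the same idea, the per-value comparison loop could be replaced by a single membership test with `np.isin`, and a NumPy-free variant could use a `set` together with `itertools.combinations`. This is only a sketch: the names `get_idx_isin` and `get_idx_pure` are hypothetical, neither function was timed in this thread, and the pure-Python variant assumes the values are hashable.

```python
from itertools import combinations

import numpy as np


def get_idx_isin(descriptor, value):
    """Boolean mask over the strict upper triangle; accepts lists or arrays."""
    # Vectorised membership test: True where a descriptor entry is in `value`.
    in_value = np.isin(np.asarray(descriptor), value)
    # Row and column indices of the entries above the main diagonal.
    ix, iy = np.triu_indices(len(in_value), 1)
    # A pair is selected only if both of its descriptors are in `value`.
    return in_value[ix] & in_value[iy]


def get_idx_pure(descriptor, value):
    """Same result as a plain list, without NumPy."""
    wanted = set(value)  # assumption: the values are hashable
    in_value = [p in wanted for p in descriptor]
    # combinations() yields (i, j) pairs with i < j in the same row-major
    # order as np.triu_indices(n, 1).
    return [a and b for a, b in combinations(in_value, 2)]


descriptor = list(range(100))
value = [0, 2, 13, 14, 11, 23, 45, 16]

mask = get_idx_isin(descriptor, value)
# Both variants should agree element by element.
assert mask.tolist() == get_idx_pure(descriptor, value)
# 8 matching descriptors give 8 * 7 / 2 = 28 selected pairs.
print(mask.sum())  # → 28
```

Calling `.tolist()` on the result of `get_idx_isin` gives a plain list, as in the answer above. Whether either variant is actually faster would need to be measured on the real inputs.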
