Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
262 views
in Technique[技术] by (71.8m points)

python - Is there a list of big O complexities for the numpy library?

I'm doing a time complexity analysis of an algorithm and need to know what kind of complexities certain numpy operations have.

For some, I assume they match the underlying mathematical operation. Like np.dot(array1, array2) would be O(n). For others, I am not as sure. For example, is np.array(my_array) O(1)? or is it O(n)? Does it simply reassign a pointer or is it iterating over the list and copying out each value?

I want to be sure of each operation's complexity. Is there somewhere I can find this information? Or should I just assume they match the mathematical operation?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

BigO complexity is not often used with Python and numpy. It's a measure of how the code scales with problem size. That's useful in a compiled language like C. But here the code is a mix of interpreted Python and compiled code. Both can have the same bigO, but the interpreted version will be orders of magnitude slower. That's why most of the SO questions about improving numpy speed, talk about 'removing loops' and 'vectorizing'.

Also few operations are pure O(n); most are a mix. There's a setup cost, plus a per element cost. If the per element cost is small, the setup cost dominates.

If starting with lists, it's often faster to iterate on the list, because converting a list to an array has a substantial overhead (O(n)).

If you already have arrays, then avoid (python level) iteration where possible. Iteration is part of most calculations, but numpy lets you do a lot of that in faster compiled code (faster O(n)).

At some point you have to understand how numpy stores its arrays. The distinction between view and copy is important. A view is in effect O(1), a copy O(n).

Often you'll see SO answers do timeit speed comparisons. I often add the caution that results might vary with problem size. The better answers will time various size problems, and show the results on a nice plot. The results are often a mix of straight lines (O(n)), and curves (varying blends of O(1) and O(n) components).


You asked specifically about np.array. Here are some sample timings:

In [134]: %%timeit alist = list(range(1000))
     ...: np.array(alist)
67.9 μs ± 839 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [135]: %%timeit alist = list(range(10))
     ...: np.array(alist)

2.19 μs ± 9.88 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [136]: %%timeit alist = list(range(2000))
     ...: np.array(alist)
134 μs ± 1.98 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

copy an array:

In [137]: %%timeit alist = list(range(2000)); arr=np.array(alist)
     ...: np.array(arr)
1.77 μs ± 24.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

no copy:

In [138]: %%timeit alist = list(range(2000)); arr=np.array(alist)
     ...: np.array(arr, copy=False) 
237 ns ± 1.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

from a list of strings:

In [139]: %%timeit alist = [str(i) for i in range(2000)]
     ...: np.array(alist, dtype=int)
286 μs ± 4.8 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Almost all calculations in numpy are O(n). If it involves each element of an array it, speed will depend on the size of the array. Some array manipulations are O(1), such as reshaping, because they don't actually do anything with the data; they change properties like shape and strides.

Search problems often grow faster than O(n); usually numpy is not the best choice for that kind of problem. Smart of use Python lists and dictionaries can be faster.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...