Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
172 views
in Technique[技术] by (71.8m points)

python - What is the Big O Complexity of Reversing the Order of Columns in Pandas DataFrame?

So lets say I have a DataFrame in pandas with a m rows and n columns. Let's also say that I wanted to reverse the order of the columns, which can be done with the following code:

df_reversed = df[df.columns[::-1]]

What is the Big O complexity of this operation? I'm assuming this would depend on the number of columns, but would it also depend on the number of rows?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I don't know how Pandas implements this, but I did test it empirically. I ran the following code (in a Jupyter notebook) to test the speed of the operation:

def get_dummy_df(n):
    return pd.DataFrame({'a': [1,2]*n, 'b': [4,5]*n, 'c': [7,8]*n})

df = get_dummy_df(100)
print df.shape
%timeit df_r = df[df.columns[::-1]]

df = get_dummy_df(1000)
print df.shape
%timeit df_r = df[df.columns[::-1]]

df = get_dummy_df(10000)
print df.shape
%timeit df_r = df[df.columns[::-1]]

df = get_dummy_df(100000)
print df.shape
%timeit df_r = df[df.columns[::-1]]

df = get_dummy_df(1000000)
print df.shape
%timeit df_r = df[df.columns[::-1]]

df = get_dummy_df(10000000)
print df.shape
%timeit df_r = df[df.columns[::-1]]

The output was:

(200, 3)
1000 loops, best of 3: 419 μs per loop
(2000, 3)
1000 loops, best of 3: 425 μs per loop
(20000, 3)
1000 loops, best of 3: 498 μs per loop
(200000, 3)
100 loops, best of 3: 2.66 ms per loop
(2000000, 3)
10 loops, best of 3: 25.2 ms per loop
(20000000, 3)
1 loop, best of 3: 207 ms per loop

As you can see, in the first 3 cases, the overhead of the operation is what takes most of the time (400-500μs), but from the 4th case, the time it takes starts to be proportional to the amount of data, increasing in an order of magnitude each time.

So, assuming there must also be a proportion to n, it seems that we are dealing with O(m*n)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...