We could use NumPy to get views into those sliding windows with its esoteric strided tricks. If you are using this new dimension for some reduction like matrix multiplication, this would be ideal. If for some reason you want a 2D output, we need a reshape at the end, which may result in creating a copy though (depending on the memory layout of the underlying array).
Thus, the implementation would look something like this -
from numpy.lib.stride_tricks import as_strided as strided

def get_sliding_window(df, W, return2D=0):
    a = df.values
    s0, s1 = a.strides
    m, n = a.shape
    # Reuse the row stride for the new window axis, so every window is a view
    out = strided(a, shape=(m-W+1, W, n), strides=(s0, s0, s1))
    if return2D == 1:
        # Flatten each window into a single row
        return out.reshape(a.shape[0]-W+1, -1)
    else:
        return out
Sample run for 2D/3D output -
In [68]: df
Out[68]:
      A     B
0  0.44  0.41
1  0.46  0.47
2  0.46  0.02
3  0.85  0.82
4  0.78  0.76

In [70]: get_sliding_window(df, 3,return2D=1)
Out[70]:
array([[ 0.44,  0.41,  0.46,  0.47,  0.46,  0.02],
       [ 0.46,  0.47,  0.46,  0.02,  0.85,  0.82],
       [ 0.46,  0.02,  0.85,  0.82,  0.78,  0.76]])
Here's what the 3D view output looks like -
In [69]: get_sliding_window(df, 3,return2D=0)
Out[69]:
array([[[ 0.44,  0.41],
        [ 0.46,  0.47],
        [ 0.46,  0.02]],

       [[ 0.46,  0.47],
        [ 0.46,  0.02],
        [ 0.85,  0.82]],

       [[ 0.46,  0.02],
        [ 0.85,  0.82],
        [ 0.78,  0.76]]])
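To illustrate the reduction use case mentioned at the start, here is a minimal sketch that contracts the window and column axes of the 3D view with np.einsum. The helper name `sliding_windows_3d` and the `weights` array are hypothetical, introduced only for this example; the helper works on a plain 2D array rather than a DataFrame.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def sliding_windows_3d(a, W):
    # Hypothetical helper: same striding as get_sliding_window, on a 2D array.
    s0, s1 = a.strides
    m, n = a.shape
    return as_strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))

a = np.arange(10, dtype=float).reshape(5, 2)
W = 3
windows = sliding_windows_3d(a, W)      # shape (3, 3, 2)

# Hypothetical weights, one per (row-in-window, column) position.
weights = np.ones((W, a.shape[1]))

# Sum over the window axis j and the column axis k: one value per window.
window_sums = np.einsum('ijk,jk->i', windows, weights)
print(window_sums)
```

With all-ones weights this is simply the sum of each window, so the 2D reshape is not needed for this kind of reduction.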
Let's time the 3D view output for various window sizes -
In [331]: df = pd.DataFrame(np.random.rand(1000, 3).round(2))
In [332]: %timeit get_3d_shfted_array(df,2) # @Yakym Pirozhenko's soln
10000 loops, best of 3: 47.9 μs per loop
In [333]: %timeit get_sliding_window(df,2)
10000 loops, best of 3: 39.2 μs per loop
In [334]: %timeit get_3d_shfted_array(df,5) # @Yakym Pirozhenko's soln
10000 loops, best of 3: 89.9 μs per loop
In [335]: %timeit get_sliding_window(df,5)
10000 loops, best of 3: 39.4 μs per loop
In [336]: %timeit get_3d_shfted_array(df,15) # @Yakym Pirozhenko's soln
1000 loops, best of 3: 258 μs per loop
In [337]: %timeit get_sliding_window(df,15)
10000 loops, best of 3: 38.8 μs per loop
Let's verify that we are indeed getting views -
In [338]: np.may_share_memory(get_sliding_window(df,2), df.values)
Out[338]: True
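The same kind of check can be applied to the 2D output. The sketch below is an assumption-laden illustration, not part of the timings above: it uses a hypothetical helper `windows_3d` on plain arrays and np.shares_memory, which does an exact overlap check, to report whether the reshape still shares memory with the input for a C-ordered and a Fortran-ordered array.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def windows_3d(a, W):
    # Hypothetical helper: same striding as get_sliding_window, on a 2D array.
    s0, s1 = a.strides
    m, n = a.shape
    return as_strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))

W = 3
base = np.arange(10.0).reshape(5, 2)
results = {}
for name, a in [("C-order", base), ("F-order", np.asfortranarray(base))]:
    out3d = windows_3d(a, W)
    out2d = out3d.reshape(a.shape[0] - W + 1, -1)
    # (3D shares memory with input?, 2D shares memory with input?)
    results[name] = (bool(np.shares_memory(out3d, a)),
                     bool(np.shares_memory(out2d, a)))
    print(name, results[name])
```

The 3D output is a view in both cases; whether the 2D reshape is a copy depends on the input's memory layout, so it is worth checking on the actual df.values.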
The almost constant timings with get_sliding_window, even across various window sizes, suggest the huge benefit of getting a view instead of copying.
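As a possible alternative on newer NumPy versions (1.20 and later), sliding_window_view builds the same kind of view without hand-written strides. This is only a sketch on a plain array; note that it appends the window axis last, so the axes are swapped here to match the (m-W+1, W, n) layout used above, and that the returned view is read-only by default.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

a = np.arange(10.0).reshape(5, 2)
W = 3

# Windows along axis 0; the window axis is appended last: shape (m-W+1, n, W).
v = sliding_window_view(a, W, axis=0)

# Swap the last two axes to get shape (m-W+1, W, n); still a view, no copy.
windows = v.swapaxes(1, 2)
print(windows.shape)
```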