Here's an approach using NumPy strides to vectorize the creation of output_x -
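import numpy as np

# Number of full windows of window_size consecutive rows
nrows = input_x.shape[0] - window_size + 1
p, q = input_x.shape      # number of rows, number of columns
m, n = input_x.strides    # bytes to step to the next row, next column
strided = np.lib.stride_tricks.as_strided
# Using the row stride m for both leading axes makes window i start at row i
out = strided(input_x, shape=(nrows, window_size, q), strides=(m, m, n))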
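As a quick sanity check, the strided result can be compared against an explicit loop. The sketch below assumes output_x is meant to hold every run of window_size consecutive rows of input_x (which is what the sample run suggests); the helper name windows_strided and the small test array are made up for illustration.

```python
import numpy as np

def windows_strided(input_x, window_size):
    # Hypothetical helper wrapping the strided approach; assumes a 2D input
    # with at least window_size rows.
    nrows = input_x.shape[0] - window_size + 1
    q = input_x.shape[1]
    m, n = input_x.strides
    return np.lib.stride_tricks.as_strided(
        input_x, shape=(nrows, window_size, q), strides=(m, m, n))

input_x = np.arange(18, dtype=float).reshape(6, 3)  # illustrative data
window_size = 4

out = windows_strided(input_x, window_size)

# Loop-based reference: stack each block of window_size consecutive rows
expected = np.array([input_x[i:i + window_size]
                     for i in range(input_x.shape[0] - window_size + 1)])

print(out.shape)
print(np.array_equal(out, expected))
```

If the two agree, the strides are doing what the loop would do, just without copying any data.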
Sample run -
In [83]: input_x
Out[83]:
array([[ 0.73089384, 0.98555845, 0.59818726],
       [ 0.08763718, 0.30853945, 0.77390923],
       [ 0.88835985, 0.90506367, 0.06204614],
       [ 0.21791334, 0.77523643, 0.47313278],
       [ 0.93324799, 0.61507976, 0.40587073],
       [ 0.49462016, 0.00400835, 0.66401908]])
In [84]: window_size = 4
In [85]: out
Out[85]:
array([[[ 0.73089384, 0.98555845, 0.59818726],
        [ 0.08763718, 0.30853945, 0.77390923],
        [ 0.88835985, 0.90506367, 0.06204614],
        [ 0.21791334, 0.77523643, 0.47313278]],

       [[ 0.08763718, 0.30853945, 0.77390923],
        [ 0.88835985, 0.90506367, 0.06204614],
        [ 0.21791334, 0.77523643, 0.47313278],
        [ 0.93324799, 0.61507976, 0.40587073]],

       [[ 0.88835985, 0.90506367, 0.06204614],
        [ 0.21791334, 0.77523643, 0.47313278],
        [ 0.93324799, 0.61507976, 0.40587073],
        [ 0.49462016, 0.00400835, 0.66401908]]])
This creates a view into the input array, so no data is copied and the approach is memory-efficient. In most cases, this should translate into performance benefits for further operations involving it too. Let's verify that it's indeed a view -
In [86]: np.may_share_memory(out,input_x)
Out[86]: True # Not a guarantee (only memory bounds are compared), but sufficient in most cases
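For a check that is exact rather than bounds-based, NumPy also offers np.shares_memory. It can be slower in hard cases, but should be fine for arrays like these. A minimal sketch with made-up data:

```python
import numpy as np

input_x = np.arange(18, dtype=float).reshape(6, 3)  # illustrative data
window_size = 4
nrows = input_x.shape[0] - window_size + 1
m, n = input_x.strides
out = np.lib.stride_tricks.as_strided(
    input_x, shape=(nrows, window_size, input_x.shape[1]), strides=(m, m, n))

# Exact overlap test: True only if some element is actually shared
view_shares = np.shares_memory(out, input_x)
# A copy owns fresh memory, so it does not overlap the input
copy_shares = np.shares_memory(out.copy(), input_x)
print(view_shares, copy_shares)
```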
Another sure-shot way to verify is to assign some values into out and check input_x -
In [87]: out[0] = 0
In [88]: input_x
Out[88]:
array([[ 0.        , 0.        , 0.        ],
       [ 0.        , 0.        , 0.        ],
       [ 0.        , 0.        , 0.        ],
       [ 0.        , 0.        , 0.        ],
       [ 0.93324799, 0.61507976, 0.40587073],
       [ 0.49462016, 0.00400835, 0.66401908]])
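Because the windows overlap, a write through out touches rows that other windows also see, which is why assigning to out[0] zeroed the first four rows of input_x. If that is not wanted, one option is to ask for a read-only view. The sketch below uses the writeable=False argument of as_strided and, assuming NumPy 1.20 or newer, sliding_window_view, which returns a read-only view by default; the sample data is made up.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

input_x = np.arange(18, dtype=float).reshape(6, 3)  # illustrative data
window_size = 4
nrows = input_x.shape[0] - window_size + 1
m, n = input_x.strides

# Same view as before, but writes through it are rejected
out_ro = as_strided(input_x, shape=(nrows, window_size, input_x.shape[1]),
                    strides=(m, m, n), writeable=False)

try:
    out_ro[0] = 0
    write_blocked = False
except ValueError:
    write_blocked = True

# sliding_window_view appends the window axis last, giving shape
# (nrows, q, window_size), so swap the last two axes to match out
out_swv = sliding_window_view(input_x, window_size, axis=0).transpose(0, 2, 1)

print(write_blocked)
print(np.array_equal(out_swv, out_ro))
```

Either variant keeps the memory savings of a view; if the windows need to be modified independently, taking an explicit copy with out.copy() is the safer route.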