Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
251 views
in Technique[技术] by (71.8m points)

python - How to understand numpy strides for layman?

I am currently going through numpy and there is a topic in numpy called "strides". I understand what it is. But how does it work? I did not find any useful information online. Can anyone let me understand in a layman's terms?

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The actual data of a numpy array is stored in a homogeneous and contiguous block of memory called data buffer. For more information see NumPy internals. Using the (default) row-major order, a 2D array looks like this:

enter image description here

To map the indices i,j,k,... of a multidimensional array to the positions in the data buffer (the offset, in bytes), NumPy uses the notion of strides. Strides are the number of bytes to jump-over in the memory in order to get from one item to the next item along each direction/dimension of the array. In other words, it's the byte-separation between consecutive items for each dimension.

For example:

>>> a = np.arange(1,10).reshape(3,3)
>>> a
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

This 2D array has two directions, axes-0 (running vertically downwards across rows), and axis-1 (running horizontally across columns), with each item having size:

>>> a.itemsize  # in bytes
4  

So to go from a[0, 0] -> a[0, 1] (moving horizontally along the 0th row, from the 0th column to the 1st column) the byte-step in the data buffer is 4. Same for a[0, 1] -> a[0, 2], a[1, 0] -> a[1, 1] etc. This means that the number of strides for the horizontal direction (axis-1) is 4 bytes.

However, to go from a[0, 0] -> a[1, 0] (moving vertically along the 0th column, from the 0th row to the 1st row), you need first to traverse all the remaining items on the 0th row to get to the 1st row, and then move through the 1st row to get to the item a[1, 0], i.e. a[0, 0] -> a[0, 1] -> a[0, 2] -> a[1, 0]. Therefore the number of strides for the vertical direction (axis-0) is 3*4 = 12 bytes. Note that going from a[0, 2] -> a[1, 0], and in general from the last item of the i-th row to the first item of the (i+1)-th row, is also 4 bytes because the array a is stored in the row-major order.

That's why

>>> a.strides  # (strides[0], strides[1])
(12, 4)  

Here's another example showing that the strides in the horizontal direction (axis-1), strides[1], of a 2D array is not necessary equal to the item size (e.g. an array with column-major order):

>>> b = np.array([[1, 4, 7],
                  [2, 5, 8],
                  [3, 6, 9]]).T
>>> b
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

>>> b.strides
(4, 12)

Here strides[1] is a multiple of the item-size. Although the array b looks identical to the array a, it's a different array: internally b is stored as |1|4|7|2|5|8|3|6|9| (because transposing doesn't affect the data buffer but only swaps the strides and the shape), whereas a as |1|2|3|4|5|6|7|8|9|. What makes them look alike is the different strides. That is, the byte-step for b[0, 0] -> b[0, 1] is 3*4=12 bytes and for b[0, 0] -> b[1, 0] is 4 bytes, whereas for a[0, 0] -> a[0, 1] is 4 bytes and for a[0, 0] -> a[1, 0] is 12 bytes.

Last but not least, NumPy allows to create views of existing arrays with the option of modifying the strides and the shape, see stride tricks. For example:

>>> np.lib.stride_tricks.as_strided(a, shape=a.shape[::-1], strides=a.strides[::-1])
array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])

which is equivalent to transposing the array a.

Let me just add, but without going into much detail, that one can even define strides that are not multiples of the item size. Here's an example:

>>> a = np.lib.stride_tricks.as_strided(np.array([1, 512, 0, 3], dtype=np.int16), 
                                        shape=(3,), strides=(3,))
>>> a
array([1, 2, 3], dtype=int16)

>>> a.strides[0]
3

>>> a.itemsize
2

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...