Computer memory is addressed linearly. Each memory cell corresponds to a number. A block of memory can be addressed in terms of a base, which is the memory address of its first element, and the item index. For example, assuming the base address is 10,000:
item index        0        1        2        3
memory address    10,000   10,001   10,002   10,003
To store multi-dimensional blocks, their geometry must somehow be made to fit into linear memory. In C and NumPy, this is done row-by-row. A 2D example would be:
  |  0   1   2   3
--+----------------
0 |  0   1   2   3
1 |  4   5   6   7
2 |  8   9  10  11
So, for example, in this 3-by-4 block the 2D index (1, 2) would correspond to the linear index 6, which is 1 x 4 + 2.
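The row-by-row rule can be sketched as a small helper. The name flat_index is hypothetical and used only for illustration; the result is cross-checked against NumPy's np.ravel_multi_index, which performs this forward conversion:

```python
import numpy as np

def flat_index(index, shape):
    """Row-major (C order) linear index of an ND index. Hypothetical helper."""
    linear = 0
    for i, n in zip(index, shape):
        # Each step scales what has been accumulated so far by the size of
        # the current axis, then adds the position along that axis.
        linear = linear * n + i
    return linear

linear = flat_index((1, 2), (3, 4))                       # 1 * 4 + 2
numpy_linear = int(np.ravel_multi_index((1, 2), (3, 4)))  # NumPy's own conversion
print(linear, numpy_linear)
```

Note that the size of the first axis (3) never enters the result; only the row length (4) does.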
unravel_index does the inverse. Given a linear index, it computes the corresponding ND index. Since this depends on the block dimensions, these also have to be passed. So, in our example, we can get the original 2D index (1, 2) back from the linear index 6:
>>> np.unravel_index(6, (3, 4))
(1, 2)
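For intuition, the same result can be reproduced by repeated division with remainder. This is only a sketch of the idea for row-by-row layout, not NumPy's actual implementation, and the function name unravel is made up for this example:

```python
import numpy as np

def unravel(linear, shape):
    """Invert the row-major mapping by peeling off the last axis first."""
    index = []
    for n in reversed(shape):
        # The remainder is the position along this axis; the quotient is
        # carried over to the remaining (slower-varying) axes.
        linear, i = divmod(linear, n)
        index.append(i)
    return tuple(reversed(index))

manual = unravel(6, (3, 4))
from_numpy = tuple(int(i) for i in np.unravel_index(6, (3, 4)))
print(manual, from_numpy)
```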
Note: The above glosses over a few details. 1) Translating the item index to a memory address also has to account for the item size. For example, an integer typically occupies 4 or 8 bytes. So, in the latter case, the memory address for item i would be base + 8 x i. 2) NumPy is a bit more flexible than suggested. It can organize ND data column-by-column if desired. It can even handle data that are not contiguous in memory, for example data that leave gaps between items.
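Both points can be observed through an array's strides, i.e. the number of bytes to step along each axis. The following is a small sketch assuming 8-byte integers (dtype int64); the variable names are chosen only for this example:

```python
import numpy as np

a = np.arange(12, dtype=np.int64).reshape(3, 4)  # row-by-row (C order)
f = np.asfortranarray(a)                         # same values, column-by-column

c_strides = a.strides  # bytes to step per axis: one row = 4 items x 8 bytes
f_strides = f.strides  # here the first axis is the fast one

# Byte offset of item (1, 2) from the base address: 8 x linear index 6
offset = 1 * a.strides[0] + 2 * a.strides[1]

# A view that skips every other column leaves gaps in memory
view = a[:, ::2]
view_strides = view.strides
view_is_contiguous = view.flags['C_CONTIGUOUS']

# unravel_index can also assume column-by-column layout
f_index = tuple(int(i) for i in np.unravel_index(6, (3, 4), order='F'))
print(c_strides, f_strides, offset, view_strides, f_index)
```

With order='F' the linear index 6 maps to (0, 2) instead of (1, 2), since columns of length 3 are now traversed first.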
Bonus reading: internal memory layout of an ndarray