Part A : Accessing and assigning NumPy arrays
Since NumPy arrays store their elements in row-major order by default, you are doing the right thing by keeping each image's elements along the last axis (or axes) on every iteration. Those elements then occupy contiguous memory locations, which makes them the most efficient to access and assign. Thus initializations like np.ndarray((512*25,512), dtype='uint16') or np.ndarray((25,512,512), dtype='uint16') would work best, as also mentioned in the comments.
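As a quick check of the contiguity argument, the sketch below inspects the region that a single image would be written into for three candidate layouts. The variable names are hypothetical and the sizes simply mirror the ones from the question; it only reports NumPy's own contiguity flag and makes no timing claim.

```python
import numpy as np

N, n = 512, 25  # image side length and number of images, as in the question

# Three candidate layouts for a stack of n images of shape (N, N)
stacked_2d = np.empty((N * n, N), dtype='uint16')
stacked_3d = np.empty((n, N, N), dtype='uint16')
last_axis = np.empty((N, N, n), dtype='uint16')

# The region that receives the first image in each layout
block_2d = stacked_2d[0:N, :]
block_3d = stacked_3d[0, :, :]
block_last = last_axis[:, :, 0]

# True means the region is one unbroken run of memory
print(block_2d.flags['C_CONTIGUOUS'])    # True
print(block_3d.flags['C_CONTIGUOUS'])    # True
print(block_last.flags['C_CONTIGUOUS'])  # False
```

Only the layout that indexes images along the last axis ends up with a scattered target region.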
Wrapping the different storage approaches in functions for timing, and feeding in a random array instead of real images -
import numpy as np

N = 512  # image side length
n = 25   # number of images
a = np.random.randint(0,255,(N,N))

def app1():
    imgs = np.empty((N,N,n), dtype='uint16')
    for i in range(n):
        imgs[:,:,i] = a  # Storing along the first two axes
    return imgs

def app2():
    imgs = np.empty((N*n,N), dtype='uint16')
    for num in range(n):
        imgs[num*N:(num+1)*N, :] = a  # Storing along the last axis
    return imgs

def app3():
    imgs = np.empty((n,N,N), dtype='uint16')
    for num in range(n):
        imgs[num,:,:] = a  # Storing along the last two axes
    return imgs

def app4():
    imgs = np.empty((N,n,N), dtype='uint16')
    for num in range(n):
        imgs[:,num,:] = a  # Storing along the first and last axes
    return imgs
Timings -
In [45]: %timeit app1()
...: %timeit app2()
...: %timeit app3()
...: %timeit app4()
...:
10 loops, best of 3: 28.2 ms per loop
100 loops, best of 3: 2.04 ms per loop
100 loops, best of 3: 2.02 ms per loop
100 loops, best of 3: 2.36 ms per loop
Those timings confirm the performance theory proposed at the start, though I expected the timing for the last setup (app4) to fall somewhere between those of app3 and app1. Perhaps the effect of moving the access and assignment from the last axis to the first isn't linear. More investigation on this one could be interesting (follow up question here).
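A possible explanation for app4 landing so close to app3, offered as an assumption rather than a measured result: in app4 every row of an image is still a contiguous run of N elements, whereas in app1 no two elements of an image are adjacent in memory. The sketch below (variable names are hypothetical) prints the byte strides of the region each approach writes into.

```python
import numpy as np

N, n = 512, 25  # same sizes as in the timing test

# The region that receives one image in each layout
app1_view = np.empty((N, N, n), dtype='uint16')[:, :, 0]
app4_view = np.empty((N, n, N), dtype='uint16')[:, 0, :]
app3_view = np.empty((n, N, N), dtype='uint16')[0, :, :]

# Strides are (bytes to next row, bytes to next element within a row).
# An inner stride equal to the item size (2 bytes for uint16) means
# each row is written as one contiguous run.
print('app1', app1_view.strides)  # (25600, 50)
print('app4', app4_view.strides)  # (25600, 2)
print('app3', app3_view.strides)  # (1024, 2)
```

With an inner stride of 2 bytes, app4 writes 1024-byte contiguous rows and only jumps between rows, which may be why it behaves much more like app3 than like app1.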
To clarify schematically, suppose we are storing two image arrays, denoted by x (image 1) and o (image 2). We would have :
App1 :
[[[x o]
  [x o]
  [x o]
  [x o]
  [x o]]
 [[x o]
  [x o]
  [x o]
  [x o]
  [x o]]
 [[x o]
  [x o]
  [x o]
  [x o]
  [x o]]]
Thus, in memory space, it would be : [x,o,x,o,x,o..] following row-major order.
App2 :
[[x x x x x]
 [x x x x x]
 [x x x x x]
 [o o o o o]
 [o o o o o]
 [o o o o o]]
Thus, in memory space, it would be : [x,x,x,x,x,x...o,o,o,o,o..].
App3 :
[[[x x x x x]
  [x x x x x]
  [x x x x x]]
 [[o o o o o]
  [o o o o o]
  [o o o o o]]]
Thus, in memory space, it would be the same as for App2.
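The schematic can be reproduced on a tiny scale. In this sketch the integers 1 and 2 stand in for x and o (an arbitrary choice for illustration), and ravel() is used to read each freshly allocated array back in its memory order.

```python
import numpy as np

x, o = 1, 2          # stand-ins for image 1 and image 2
rows, cols = 3, 5    # tiny "images", matching the schematic

# App1 layout: image index on the last axis
app1 = np.empty((rows, cols, 2), dtype='uint16')
app1[:, :, 0] = x
app1[:, :, 1] = o

# App2 layout: images stacked vertically in a 2D array
app2 = np.empty((2 * rows, cols), dtype='uint16')
app2[0:rows, :] = x
app2[rows:2 * rows, :] = o

# App3 layout: image index on the first axis
app3 = np.empty((2, rows, cols), dtype='uint16')
app3[0, :, :] = x
app3[1, :, :] = o

print(app1.ravel()[:6])  # [1 2 1 2 1 2]  -> images interleaved
print(app2.ravel()[:6])  # [1 1 1 1 1 1]  -> image 1 first, then image 2
print(np.array_equal(app2.ravel(), app3.ravel()))  # True
```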
Part B : Reading images from disk as arrays
Now, on the part about reading images, I have found OpenCV's imread to be faster than io.imread.
As a test, I downloaded an image of the Mona Lisa from its wiki page and timed image reading with both -
import cv2 # OpenCV
In [521]: %timeit io.imread('monalisa.jpg')
100 loops, best of 3: 3.24 ms per loop
In [522]: %timeit cv2.imread('monalisa.jpg')
100 loops, best of 3: 2.54 ms per loop