numpy.take can be applied in 2 dimensions with
np.take(np.take(T,ix,axis=0), iy,axis=1 )
I tested the stencil of the discret 2-dimensional Laplacian
ΔT = T[ix-1,iy] + T[ix+1, iy] + T[ix,iy-1] + T[ix,iy+1] - 4 * T[ix,iy]
with 2 take-schemes and the usual numpy.array scheme. The functions p and q are introduced for a leaner code writing and adress the axis 0 and 1 in different order. This is the code:
nx = 300; ny= 300
T = np.arange(nx*ny).reshape(nx, ny)
ix = np.linspace(1,nx-2,nx-2,dtype=int)
iy = np.linspace(1,ny-2,ny-2,dtype=int)
#------------------------------------------------------------
def p(Φ,kx,ky):
return np.take(np.take(Φ,ky,axis=1), kx,axis=0 )
#------------------------------------------------------------
def q(Φ,kx,ky):
return np.take(np.take(Φ,kx,axis=0), ky,axis=1 )
#------------------------------------------------------------
%timeit ΔT_n = T[0:nx-2,1:ny-1] + T[2:nx,1:ny-1] + T[1:nx-1,0:ny-2] + T[1:nx-1,2:ny] - 4.0 * T[1:nx-1,1:ny-1]
%timeit ΔT_t = p(T,ix-1,iy) + p(T,ix+1,iy) + p(T,ix,iy-1) + p(T,ix,iy+1) - 4.0 * p(T,ix,iy)
%timeit ΔT_t = q(T,ix-1,iy) + q(T,ix+1,iy) + q(T,ix,iy-1) + q(T,ix,iy+1) - 4.0 * q(T,ix,iy)
.
1000 loops, best of 3: 944 μs per loop
100 loops, best of 3: 3.11 ms per loop
100 loops, best of 3: 2.02 ms per loop
The results seem to be obvious:
- usual numpy index arithmeitk is fastest
- take-scheme q takes 100% longer (= C-ordering ?)
- take-scheme p takes 200% longer (= Fortran-ordering ?)
Not even the 1-dimensional
example of the scipy manual indicates that numpy.take is fast:
a = np.array([4, 3, 5, 7, 6, 8])
indices = [0, 1, 4]
%timeit np.take(a, indices)
%timeit a[indices]
.
The slowest run took 6.58 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 4.32 μs per loop
The slowest run took 7.34 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.87 μs per loop
Does anybody has experiences how to make numpy.take fast ? It would be an flexible and attractive way for lean code writing that is fast in coding and
is told to be fast in execution as well. Thank your for some hints to improve my approach !
See Question&Answers more detail:
os