In [126]: %%timeit
...: result = np.zeros([n,9])
...: for a in range(3):
...:     for b in range(3):
...:         result[:, 3*a + b] = x[:, a] * y[:, b] * func_weight
141 μs ± 255 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [128]: %%timeit
...: result_2 = np.zeros([n,9])
...: for a in range(3):
...:     result_2[:, 3*a:3*(a+1)] = np.expand_dims(x[:, a], axis=-1) * y * np.expand_dims(func_weight, axis=-1)
202 μs ± 10.8 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
A fully broadcasted version:
In [130]: %%timeit
...: result_3 = (x[:,:,None]*y[:,None,:]*func_weight[:,None,None]).reshape(
...: n,9)
88.8 μs ± 73.1 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
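To see why this matches the double loop, it helps to look at the intermediate shapes. The following is only a sketch: `n = 1000` and the random inputs are assumptions for illustration, since the arrays used in the timings are not shown here.

```python
import numpy as np

n = 1000  # assumed size, not taken from the timings above
rng = np.random.default_rng(0)
x = rng.random((n, 3))
y = rng.random((n, 3))
func_weight = rng.random(n)

# (n,3,1) * (n,1,3) * (n,1,1) broadcasts to (n,3,3)
outer = x[:, :, None] * y[:, None, :] * func_weight[:, None, None]
# Row-major reshape maps element [i, a, b] to column 3*a + b
result_3 = outer.reshape(n, 9)

# Reference: the original double loop
result = np.zeros([n, 9])
for a in range(3):
    for b in range(3):
        result[:, 3*a + b] = x[:, a] * y[:, b] * func_weight

print(outer.shape, result_3.shape)
print(np.allclose(result, result_3))
```

The reshape works because the default C ordering puts the last axis (`b`) fastest, which is the same `3*a + b` column layout that the loop fills in.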
Replacing the expand_dims with np.newaxis/None expansion:
In [131]: %%timeit
...: result_2 = np.zeros([n,9])
...: for a in range(3):
...:     result_2[:, 3*a:3*(a+1)] = x[:, a,None] * y * func_weight[:,None]
132 μs ± 315 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
So yes, expand_dims is a bit slow, I think because it tries to be general purpose and because it adds an extra layer of function calls. expand_dims is essentially just a.reshape(shape), but it takes a bit of time to translate your axis parameter into the shape tuple. As an experienced user I find the None syntax clearer (and faster): visually it stands out as a dimension-adding action.
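As a small check of that equivalence, the three spellings can be compared directly. The array contents here are made up for illustration:

```python
import numpy as np

arr = np.arange(6.0).reshape(2, 3)  # illustrative input, not from the question

e = np.expand_dims(arr[:, 0], axis=-1)  # function call; axis is translated to a shape
s = arr[:, 0, None]                     # indexing with None
r = arr[:, 0].reshape(-1, 1)            # explicit reshape

print(e.shape, s.shape, r.shape)        # all three are (2, 1)
print(np.array_equal(e, s) and np.array_equal(s, r))
print(np.newaxis is None)               # np.newaxis is just an alias for None
```

All three produce the same column vector; the difference is only in how much Python-level work happens before the reshape.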