Probably the cleanest way is to use np.repeat
:
a = np.array([[1, 2], [1, 2]])
print(a.shape)
# (2, 2)
# indexing with np.newaxis inserts a new 3rd dimension, which we then repeat the
# array along, (you can achieve the same effect by indexing with None, see below)
b = np.repeat(a[:, :, np.newaxis], 3, axis=2)
print(b.shape)
# (2, 2, 3)
print(b[:, :, 0])
# [[1 2]
# [1 2]]
print(b[:, :, 1])
# [[1 2]
# [1 2]]
print(b[:, :, 2])
# [[1 2]
# [1 2]]
Having said that, you can often avoid repeating your arrays altogether by using broadcasting. For example, let's say I wanted to add a (3,)
vector:
c = np.array([1, 2, 3])
to a
. I could copy the contents of a
3 times in the third dimension, then copy the contents of c
twice in both the first and second dimensions, so that both of my arrays were (2, 2, 3)
, then compute their sum. However, it's much simpler and quicker to do this:
d = a[..., None] + c[None, None, :]
Here, a[..., None]
has shape (2, 2, 1)
and c[None, None, :]
has shape (1, 1, 3)
*. When I compute the sum, the result gets 'broadcast' out along the dimensions of size 1, giving me a result of shape (2, 2, 3)
:
print(d.shape)
# (2, 2, 3)
print(d[..., 0]) # a + c[0]
# [[2 3]
# [2 3]]
print(d[..., 1]) # a + c[1]
# [[3 4]
# [3 4]]
print(d[..., 2]) # a + c[2]
# [[4 5]
# [4 5]]
Broadcasting is a very powerful technique because it avoids the additional overhead involved in creating repeated copies of your input arrays in memory.
* Although I included them for clarity, the None
indices into c
aren't actually necessary - you could also do a[..., None] + c
, i.e. broadcast a (2, 2, 1)
array against a (3,)
array. This is because if one of the arrays has fewer dimensions than the other then only the trailing dimensions of the two arrays need to be compatible. To give a more complicated example:
a = np.ones((6, 1, 4, 3, 1)) # 6 x 1 x 4 x 3 x 1
b = np.ones((5, 1, 3, 2)) # 5 x 1 x 3 x 2
result = a + b # 6 x 5 x 4 x 3 x 2