An array of shape (442, 1) is 2-dimensional. It has 442 rows and 1 column.
An array of shape (442,) is 1-dimensional and consists of 442 elements.
Note that their reprs should look different too. There is a difference in the number and placement of parentheses:
In [7]: np.array([1,2,3]).shape
Out[7]: (3,)
In [8]: np.array([[1],[2],[3]]).shape
Out[8]: (3, 1)
Note that you could use np.squeeze to remove axes of length 1:
In [13]: np.squeeze(np.array([[1],[2],[3]])).shape
Out[13]: (3,)
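As a minimal sketch of the same idea at the size from the question (the names `col`, `flat` and `flat_axis` are hypothetical, and the zeros are placeholder data), np.squeeze can also be told which axis to drop through its `axis` argument, which guards against accidentally removing an axis you wanted to keep:

```python
import numpy as np

# Placeholder data with the shape from the question; the values do not matter here.
col = np.zeros((442, 1))

# Without arguments, squeeze drops every axis of length 1.
flat = np.squeeze(col)
print(flat.shape)

# With axis=1, only that axis is dropped; NumPy raises ValueError
# if the named axis does not have length 1.
flat_axis = np.squeeze(col, axis=1)
print(flat_axis.shape)
```

Both calls should give a 1-dimensional array of 442 elements.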
NumPy broadcasting rules allow new axes to be automatically added on the left when needed, so (442,) can broadcast to (1, 442). Axes of length 1 can also broadcast to any length. So when you test for equality between an array of shape (442, 1) and an array of shape (442,), the second array gets promoted to shape (1, 442), and then both arrays expand their axes of length 1 so that they broadcast to shape (442, 442). This is why the result of your equality test was a boolean array of shape (442, 442).
In [15]: np.array([1,2,3]) == np.array([[1],[2],[3]])
Out[15]:
array([[ True, False, False],
       [False,  True, False],
       [False, False,  True]], dtype=bool)
In [16]: np.array([1,2,3]) == np.squeeze(np.array([[1],[2],[3]]))
Out[16]: array([ True, True, True], dtype=bool)
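The shape arithmetic can also be checked directly at the size from the question. This is only a sketch: `a` and `b` are hypothetical placeholder arrays filled with zeros, and np.broadcast is used here just to report the shape that two operands broadcast to.

```python
import numpy as np

a = np.zeros((442, 1))   # 2-dimensional: 442 rows, 1 column
b = np.zeros(442)        # 1-dimensional: 442 elements

# Shape NumPy uses for an elementwise operation on a and b.
broadcast_shape = np.broadcast(a, b).shape
print(broadcast_shape)

# The comparison is evaluated at that broadcast shape.
eq_2d = (a == b)
print(eq_2d.shape)

# After squeezing, both operands are 1-dimensional,
# so the comparison is element by element.
eq_1d = (np.squeeze(a) == b)
print(eq_1d.shape)
```

If the intent is an element-by-element comparison, making both operands the same shape first (for example with np.squeeze) avoids the unintended 2-dimensional result.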