Just apply that isnan test on a row-by-row basis:
In [135]: [row[~np.isnan(row)] for row in arr]
Out[135]: [array([1., 2., 3.]), array([4., 5.]), array([6.])]
Boolean masking as in x[~numpy.isnan(x)] produces a flattened result because, in general, the per-row results are ragged like this and can't be formed into a 2d array.
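To see what that flattening looks like, here's a minimal sketch (using the same arr as defined below):

import numpy as np

arr = np.array([[1, 2, 3, np.nan],
                [4, 5, np.nan, np.nan],
                [6, np.nan, np.nan, np.nan]])

# masking the whole 2d array selects elements in row-major order,
# returning a single flat 1d array of the non-nan values
arr[~np.isnan(arr)]    # array([1., 2., 3., 4., 5., 6.])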
The source array must be float dtype, because np.nan is a float:
In [138]: arr = np.array([[1,2,3,np.nan],[4,5,np.nan,np.nan],[6,np.nan,np.nan,np.nan]])
In [139]: arr
Out[139]:
array([[ 1.,  2.,  3., nan],
       [ 4.,  5., nan, nan],
       [ 6., nan, nan, nan]])
If the dtype is object, the numbers can remain integers, but np.isnan(arr) won't work.
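To illustrate, a quick sketch (not from the original session):

import numpy as np

obj_arr = np.array([[1, 2, 3, np.nan],
                    [4, 5, np.nan, np.nan],
                    [6, np.nan, np.nan, np.nan]], dtype=object)

# the numbers stay Python ints and nan stays a float, but the isnan
# ufunc has no loop for object dtype, so this raises roughly:
# np.isnan(obj_arr)  ->  TypeError: ufunc 'isnan' not supported for the input types...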
If the original is a list, rather than an array:
In [146]: alist = [[1,2,3,np.nan],[4,5,np.nan,np.nan],[6,np.nan,np.nan,np.nan]]
In [147]: alist
Out[147]: [[1, 2, 3, nan], [4, 5, nan, nan], [6, nan, nan, nan]]
In [148]: [[i for i in row if ~np.isnan(i)] for row in alist]
Out[148]: [[1, 2, 3], [4, 5], [6]]
The flat array could be turned into a list of arrays with np.split:
In [152]: np.split(arr[~np.isnan(arr)],(3,5))
Out[152]: [array([1., 2., 3.]), array([4., 5.]), array([6.])]
where the (3,5) split parameter could be determined by counting the non-nan values in each row, but that's more work and doesn't promise to be faster than the row iteration.
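As a rough sketch of that counting approach (the names are my own):

import numpy as np

arr = np.array([[1, 2, 3, np.nan],
                [4, 5, np.nan, np.nan],
                [6, np.nan, np.nan, np.nan]])

counts = (~np.isnan(arr)).sum(axis=1)    # non-nan count per row: array([3, 2, 1])
splits = np.cumsum(counts)[:-1]          # split indices: array([3, 5])
np.split(arr[~np.isnan(arr)], splits)
# [array([1., 2., 3.]), array([4., 5.]), array([6.])]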