python - Removing duplicate columns and rows from a NumPy 2D array

Question

posted Oct 17, 2021 in Technique by 深蓝 (71.8m points)

I'm using a 2D array to store longitude/latitude pairs. At one point, I have to merge two of these 2D arrays and then remove any duplicated entries. I've been searching for a function similar to numpy.unique, but I've had no luck. Every implementation I've come up with looks very unoptimized. For example, I tried converting the array to a list of tuples, removing duplicates with set, and then converting back to an array:

coordskeys = np.array(list(set([tuple(x) for x in coordskeys])))

Are there any existing solutions, so I do not reinvent the wheel?

To make it clear, I'm looking for:

>>> a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]])
>>> unique_rows(a)
array([[1, 1], [2, 3], [5, 4]])

BTW, I wanted to use just a list of tuples for this, but the lists were so big that they consumed my 4 GB of RAM plus 4 GB of swap (NumPy arrays are more memory efficient).

1 Reply

深蓝 · Answer 1 · 2021-10-17T00:07:03+0000

This should do the trick:

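import numpy as np

def unique_rows(a):
    a = np.ascontiguousarray(a)
    # View each row as one structured element so np.unique compares whole rows.
    unique_a = np.unique(a.view([('', a.dtype)] * a.shape[1]))
    # Convert back to the original dtype and restore the 2D shape.
    return unique_a.view(a.dtype).reshape((unique_a.shape[0], a.shape[1]))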

Example:

>>> a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]])
>>> unique_rows(a)
array([[1, 1],
       [2, 3],
       [5, 4]])
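
On NumPy 1.13 or newer, np.unique also accepts an axis argument, so the structured-view trick above may not be needed. The following is only a minimal sketch of that alternative, not part of the original answer: the helper name unique_rows_axis, its keep_order flag, and the sample arrays coords_a and coords_b are made up for illustration. By default the rows come back sorted; the optional branch uses return_index to keep the order of first appearance instead.

```python
import numpy as np

def unique_rows_axis(a, keep_order=False):
    """Drop duplicate rows using np.unique(axis=0) (NumPy 1.13 or newer)."""
    a = np.asarray(a)
    if not keep_order:
        # Rows are returned in sorted (lexicographic) order.
        return np.unique(a, axis=0)
    # return_index gives the index of the first occurrence of each unique row;
    # sorting those indices restores the original row order.
    _, first_idx = np.unique(a, axis=0, return_index=True)
    return a[np.sort(first_idx)]

# Hypothetical sample data: merge two coordinate arrays, then deduplicate.
coords_a = np.array([[1, 1], [2, 3], [1, 1]])
coords_b = np.array([[5, 4], [2, 3]])
merged = np.vstack((coords_a, coords_b))
print(unique_rows_axis(merged))
```

Stacking with np.vstack before deduplicating keeps everything in NumPy arrays, which avoids the memory cost of the intermediate list of tuples mentioned in the question.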
