Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Vectorized relabeling of NumPy array to consecutive numbers and retrieving back

I have a huge training dataset with 4 classes, and the class labels are not consecutive. To train a sequential neural network on it, the labels have to be remapped so that the unique values are consecutive integers. In addition, at the end of the script I have to map them back to their old values.

I know how to relabel them with loops:

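import numpy as np

def relabel(old_classes, new_classes):
    # Find the positions of every unique label first, so that values written
    # in the loop below cannot be mistaken for labels still to be replaced.
    uniques = np.unique(old_classes)
    indexes = [np.where(old_classes == u) for u in uniques]
    for idx, new in zip(indexes, new_classes):
        old_classes[idx] = new  # note: modifies old_classes in place
    return old_classes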

>>> old_classes = np.array([0,1,2,6,6,2,6,1,1,0])
>>> new_classes = np.arange(len(np.unique(old_classes)))
>>> relabel(old_classes,new_classes)
array([0, 1, 2, 3, 3, 2, 3, 1, 1, 0])

But this isn't clean code, and it is slow on a large dataset.

Any idea how to vectorize this relabeling?


To be clear, I also want to be able to relabel them back to their old values:

>>> relabeled_classes=np.array([0, 1, 2, 3, 3, 2, 3, 1, 1, 0])
>>> old_classes = np.array([0,1,2,6])
>>> relabel(relabeled_classes,old_classes )
array([0, 1, 2, 6, 6, 2, 6, 1, 1, 0])

1 Reply


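We can use the optional argument return_inverse with np.unique. Besides the sorted unique values, it then also returns, for each element of the input, the index of that element in the unique array, and those indices are exactly the consecutive IDs/tags we need, like so -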

unq_arr, unq_tags = np.unique(old_classes, return_inverse=True)

Index into unq_arr with unq_tags to recover the original labels -

old_classes_retrieved = unq_arr[unq_tags]

Sample run -

In [69]: old_classes = np.array([0,1,2,6,6,2,6,1,1,0])

In [70]: unq_arr, unq_tags = np.unique(old_classes, return_inverse=True)

In [71]: unq_arr
Out[71]: array([0, 1, 2, 6])

In [72]: unq_tags
Out[72]: array([0, 1, 2, 3, 3, 2, 3, 1, 1, 0])

In [73]: old_classes_retrieved = unq_arr[unq_tags]

In [74]: old_classes_retrieved
Out[74]: array([0, 1, 2, 6, 6, 2, 6, 1, 1, 0])
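
The same mapping can probably be reused later in the script, for example to turn a model's predicted IDs back into the old labels, or to encode a second array that uses the same old labels. Below is a minimal sketch of that idea. The helper names encode_labels, decode_labels and encode_with, as well as the predictions array, are made up for illustration and are not part of NumPy. encode_with relies on np.searchsorted and assumes every label it sees is already present in the unique array.

```python
import numpy as np

def encode_labels(labels):
    """Return (classes, codes): sorted unique labels and consecutive 0..k-1 codes."""
    classes, codes = np.unique(labels, return_inverse=True)
    return classes, codes

def decode_labels(classes, codes):
    """Map consecutive codes back to the original labels by integer indexing."""
    return classes[np.asarray(codes)]

def encode_with(classes, labels):
    """Encode further labels with an existing mapping.

    Assumes `classes` is sorted (as returned by np.unique) and that every
    value in `labels` occurs in `classes`; raises ValueError otherwise.
    """
    labels = np.asarray(labels)
    codes = np.searchsorted(classes, labels)
    # searchsorted only returns insertion points, so check each label was found
    clipped = np.minimum(codes, len(classes) - 1)
    if not np.array_equal(classes[clipped], labels):
        raise ValueError("labels contain values not present in classes")
    return codes

old_classes = np.array([0, 1, 2, 6, 6, 2, 6, 1, 1, 0])
classes, codes = encode_labels(old_classes)

# Hypothetical model output, expressed in consecutive codes
predictions = np.array([3, 0, 2])
restored = decode_labels(classes, predictions)              # array([6, 0, 2])

# Hypothetical second array that uses the same original labels
more_codes = encode_with(classes, np.array([6, 6, 0, 1]))   # array([3, 3, 0, 1])
```

Keeping classes around is all that is needed for the way back: any array of valid consecutive IDs can be used as an index into it, not just the unq_tags array that np.unique returned.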
