I have an array of integer labels and I would like to determine how many of each label is present and store those values in an array of the same size as the input.
This can be accomplished with the following loop:
def counter(labels):
sizes = numpy.zeros(labels.shape)
for num in numpy.unique(labels):
mask = labels == num
sizes[mask] = numpy.count_nonzero(mask)
return sizes
with input:
array = numpy.array([
[0, 1, 2, 3],
[0, 1, 1, 3],
[3, 1, 3, 1]])
counter()
returns:
array([[ 2., 5., 1., 4.],
[ 2., 5., 5., 4.],
[ 4., 5., 4., 5.]])
However, for large arrays, with many unique labels, 60,000 in my case, this takes a considerable amount time. This is the first step in a complex algorithm and I can't afford to spend more than about 30 seconds on this step. Is there a function that already exists that can accomplish this? If not, how can I speed up the existing loop?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…