Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
126 views
in Technique[技术] by (71.8m points)

python - combine two arrays and sort

Given two sorted arrays like the following:

a = array([1,2,4,5,6,8,9])

b = array([3,4,7,10])

I would like the output to be:

c = array([1,2,3,4,5,6,7,8,9,10])

or:

c = array([1,2,3,4,4,5,6,7,8,9,10])

I'm aware that I can do the following:

c = unique(concatenate((a,b))

I'm just wondering if there is a faster way to do it as the arrays I'm dealing with have millions of elements.

Any idea is welcomed. Thanks

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Since you use numpy, I doubt that bisec helps you at all... So instead I would suggest two smaller things:

  1. Do not use np.sort, use c.sort() method instead which sorts the array in place and avoids the copy.
  2. np.unique must use np.sort which is not in place. So instead of using np.unique do the logic by hand. IE. first sort (in-place) then do the np.unique method by hand (check also its python code), with flag = np.concatenate(([True], ar[1:] != ar[:-1])) with which unique = ar[flag] (with ar being sorted). To be a bit better, you should probably make the flag operation in place itself, ie. flag = np.ones(len(ar), dtype=bool) and then np.not_equal(ar[1:], ar[:-1], out=flag[1:]) which avoids basically one full copy of flag.
  3. I am not sure about this. But .sort has 3 different algorithms, since your arrays maybe are almost sorted already, changing the sorting method might make a speed difference.

This would make the full thing close to what you got (without doing a unique beforehand):

def insort(a, b, kind='mergesort'):
    # took mergesort as it seemed a tiny bit faster for my sorted large array try.
    c = np.concatenate((a, b)) # we still need to do this unfortunatly.
    c.sort(kind=kind)
    flag = np.ones(len(c), dtype=bool)
    np.not_equal(c[1:], c[:-1], out=flag[1:])
    return c[flag]

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...