Haskell lists are lazy. [x] ++ xs
first produces the x
, and then it produces all the elements in xs
.
In e.g. Lisp the lists are singly-linked lists and appending them copies the first list, so prepending a singleton is an O(1) operation.
In Python though the appending copies the second list (as confirmed by @chepner in the comments), i.e. [x] + xs
will copy the whole list xs
and thus is an O(n) operation (where n
is the length of xs
).
This means that both your [xs[0]] + merge(xs[1:], ys)
and [ys[0]] + merge(xs, ys[1:])
lead to quadratic behavior which you observe as the dramatic slowdown you describe.
Python's equivalent to Haskell's lazy lists is not lists, it's generators, which produce their elements one by one on each yield
. Thus the rewrite could look something like
def merge(xs,ys):
if len(xs) == 0:
return ys
elif len(ys) == 0:
return xs
else:
a = (x for x in xs) # or maybe iter(xs)
b = (y for y in ys) # or maybe iter(ys)
list( merge_gen(a,b))
Now what's left is to re-implement your merge
logic as merge_gen
which expects two generators (or should that be iterators? do find out) as its input and generates the ordered stream of elements which it gets by pulling them one by one from the two sources as needed. The resulting stream of elements is converted back to list, as expected by the function's caller. No redundant copying will be performed.
If I've made some obvious Python errors, please treat the above as a pseudocode.
Your other option is to pre-allocate a second list of the same length and copy the elements between the two lists back and forth while merging, using indices to reference the elements of the arrays and mutating the contents to store the results.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…