This is something of a follow-up to this question.
First, you'll notice that you cannot use sum on a list of strings to concatenate them: Python tells you to use str.join instead. That's good advice, because no matter how you use + on strings, the performance is bad.
The "cannot use sum" restriction doesn't apply to lists, though itertools.chain.from_iterable is the preferred way to perform such list flattening.
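To make the contrast concrete, here is a minimal sketch of the behaviour described above. The variable names (words, nested, and so on) are made up for illustration.

```python
import itertools

words = ["a", "b", "c"]

# sum() explicitly refuses to concatenate strings and raises TypeError.
try:
    sum(words, "")
except TypeError as exc:
    error_message = str(exc)  # in CPython the message points to ''.join(seq)

# The recommended way to concatenate strings.
joined = "".join(words)

nested = [[1, 2], [3], [4, 5]]

# No such check exists for lists, so this is accepted...
flat_sum = sum(nested, [])

# ...and this is the usually preferred alternative.
flat_chain = list(itertools.chain.from_iterable(nested))
```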
But sum(x, []) when x is a list of lists is definitely bad.
But should it stay that way?
I compared three approaches:

import time
import itertools

a = [list(range(1, 1000)) for _ in range(1000)]

# 1. sum() with an empty list as the start value
start = time.time()
sum(a, [])
print(time.time() - start)

# 2. itertools.chain.from_iterable
start = time.time()
list(itertools.chain.from_iterable(a))
print(time.time() - start)

# 3. explicit loop with in-place addition
start = time.time()
z = []
for s in a:
    z += s
print(time.time() - start)
Results (in seconds):
- sum() on the list of lists: 10.46647310256958. Okay, we knew.
- itertools.chain: 0.07705187797546387
- custom accumulated sum using in-place addition: 0.057044029235839844 (can be faster than itertools.chain, as you see)
So sum is way behind because it performs result = result + b instead of result += b.
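The difference between the two accumulation styles can be sketched in pure Python. The function names flatten_rebinding and flatten_in_place are hypothetical, and this only illustrates the idea; it is not how the built-in is actually implemented in C. With n sublists of k items each, the first version copies roughly k + 2k + ... + nk elements in total, while the second only appends n*k elements.

```python
def flatten_rebinding(lists):
    # Mirrors result = result + b: every step builds a brand-new list,
    # copying everything accumulated so far plus the new items.
    result = []
    for b in lists:
        result = result + b
    return result


def flatten_in_place(lists):
    # Mirrors result += b: every step extends the same list object,
    # so only the new items are copied.
    result = []
    for b in lists:
        result += b
    return result


data = [[i, i + 1] for i in range(0, 10, 2)]
flat_a = flatten_rebinding(data)
flat_b = flatten_in_place(data)
```

Both functions return the same list; only the amount of copying differs.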
So now my question:
Why can't sum use this accumulative approach when available?
(That would be transparent for existing applications and would make it possible to use the sum built-in to flatten lists efficiently.)
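For illustration only, one possible shape of such an accumulative sum is sketched below. The name inplace_sum is hypothetical, and the shallow copy of the start value is an assumption added here: += on a list mutates the object in place, so without the copy the caller's start argument would be modified.

```python
import copy
import operator


def inplace_sum(iterable, start=0):
    # Assumption: shallow-copy the start value so that in-place addition
    # never mutates the object the caller passed in.
    result = copy.copy(start)
    for item in iterable:
        # operator.iadd(a, b) is equivalent to a += b: it uses __iadd__
        # when the type defines it and falls back to + otherwise.
        result = operator.iadd(result, item)
    return result


start = []
flat = inplace_sum([[1], [2, 3], [4]], start)
total = inplace_sum([1, 2, 3])
```

Numbers still work as before because int has no in-place addition and falls back to +.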