performance - How does python optimize conditional list comprehensions

Question

posted Oct 24, 2021 in Technique by 深蓝 (71.8m points)

I read "List comprehension without [ ] in Python", so now I know that

''.join([str(x) for x in mylist])

is faster than

''.join(str(x) for x in mylist)

because "list comprehensions are highly optimized".

So I suppose the optimization relies on parsing the for expression: Python sees mylist, computes its length, and uses it to pre-allocate a list of exactly the right size, which saves a lot of reallocation.

When using ''.join(str(x) for x in mylist), join receives a generator blindly and has to build its list without knowing the size in advance.

But now consider this:

mylist = [1,2,5,6,3,4,5]
''.join([str(x) for x in mylist if x < 4])

How does Python decide the size of the list built by the comprehension? Is it computed from the size of mylist and shrunk once the iteration is done (which could be very bad if the list is big and the condition filters out 99% of the elements), or does it fall back to the "don't know the size in advance" case?

EDIT: I've run some small benchmarks, and they seem to confirm that there is an optimization:

without a condition:

import timeit

print(timeit.timeit("''.join([str(x) for x in [1,5,6,3,5,23,334,23234]])"))
print(timeit.timeit("''.join(str(x) for x in [1,5,6,3,5,23,334,23234])"))

yields (as expected):

3.11010817019474
3.3457350077491026

with a condition:

print(timeit.timeit("''.join([str(x) for x in [1,5,6,3,5,23,334,23234] if x < 50])"))
print(timeit.timeit("''.join(str(x) for x in [1,5,6,3,5,23,334,23234] if x < 50)"))

yields:

2.7942209702566965
3.0316467566203276

so the conditional list comprehension is still faster.

1 Reply

深蓝 · Answer 1 · 2021-10-23T19:02:59+0000

List comprehensions don't pre-size the list, even when they could. You're assuming an optimization that isn't actually performed.
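One way to check this yourself is to look at the bytecode. The sketch below assumes CPython, and `disassemble` is a hypothetical helper name introduced only for this example. It compiles the conditional comprehension from the question without running it, so `mylist` does not need to exist. Exact opcode listings differ between CPython versions, but the list should be created empty and grown one element at a time.

```python
import dis
import io


def disassemble(source):
    """Return CPython's bytecode listing for a source string as text."""
    buffer = io.StringIO()
    # dis.dis compiles the string and also walks nested code objects,
    # which is where older CPython versions put the comprehension body.
    dis.dis(source, file=buffer)
    return buffer.getvalue()


listing = disassemble("[str(x) for x in mylist if x < 4]")
print(listing)

# The list is built empty (BUILD_LIST) and then extended item by item
# (LIST_APPEND); nothing in the listing asks for len(mylist).
has_build_list = "BUILD_LIST" in listing
has_list_append = "LIST_APPEND" in listing
print(has_build_list, has_list_append)
```

The same listing shows up with or without the `if` clause, which is consistent with the condition not changing how the list is allocated.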

The list comprehension is faster because the iterator machinery and the work of entering and exiting the generator expression's stack frame have a cost. The list comprehension doesn't need to pay that cost.
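The generator side of that cost can be made visible with the standard `inspect` module. This is only an illustrative sketch, not a benchmark: it shows that a generator expression is a separate object with its own suspended frame that has to be resumed for every item, while both forms produce the same string. The variable names are chosen for this example.

```python
import inspect

mylist = [1, 2, 5, 6, 3, 4, 5]

# A generator expression is a lazy object with its own frame.
gen = (str(x) for x in mylist if x < 4)
is_generator = inspect.isgenerator(gen)

state_before = inspect.getgeneratorstate(gen)  # not started yet
first = next(gen)                              # resume the frame until it yields
state_after = inspect.getgeneratorstate(gen)   # paused again, waiting for the next resume

print(is_generator, state_before, first, state_after)

# Both spellings give the same result; only the route differs.
from_list = ''.join([str(x) for x in mylist if x < 4])
from_gen = ''.join(str(x) for x in mylist if x < 4)
print(from_list, from_gen)
```

Each item `join` pulls from the generator goes through that suspend and resume cycle, which is the overhead the list comprehension avoids.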
