Building a string through repeated string concatenation is an anti-pattern, but I'm still curious why its performance switches from linear to quadratic after string length exceeds approximately 10 ** 6:
```python
# This loop takes time linear in n with the CPython optimization,
# and quadratic time without it.
import time

n = 10 ** 7
start = time.perf_counter()
s = ''
for i in range(n):
    s += 'a'
total_time = time.perf_counter() - start
time_per_iteration = total_time / n
```
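To see the transition, the measurement can be wrapped in a small harness and repeated at several sizes (a sketch; the crossover point around `10 ** 6` is from my machine and will vary):

```python
import time

def time_per_iteration(n):
    """Append 'a' to a string n times; return seconds per append."""
    start = time.perf_counter()
    s = ''
    for _ in range(n):
        s += 'a'
    return (time.perf_counter() - start) / n

# Per-iteration cost is roughly constant while the in-place resize
# succeeds, and starts growing once each append has to copy the string.
for n in (10 ** 4, 10 ** 5, 10 ** 6):
    print(n, time_per_iteration(n))
```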
For example, on my machine (Windows 10, Python 3.6.1):

- for `10 ** 4 < n < 10 ** 6`, the `time_per_iteration` is almost perfectly constant at 170±10 μs
- for `10 ** 6 < n`, the `time_per_iteration` grows almost perfectly linearly, reaching 520 μs at `n == 10 ** 7`

Linear growth in `time_per_iteration` is equivalent to quadratic growth in `total_time`.
The linear complexity results from an optimization in recent CPython versions (2.4+) that reuses the original storage if no other references to the original object remain. But I expected the linear performance to continue indefinitely, rather than switching to quadratic at some point.
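That "no other references" condition can be demonstrated directly: holding a second reference to `s` inside the loop forces every `+=` to allocate and copy, restoring quadratic behavior. A sketch (function and parameter names are my own):

```python
import time

def build(n, keep_alias=False):
    """Append 'a' n times; optionally hold a second reference each step."""
    s = ''
    alias = None
    start = time.perf_counter()
    for _ in range(n):
        if keep_alias:
            alias = s  # extra reference: refcount > 1, so += must copy
        s += 'a'
    return time.perf_counter() - start

n = 10 ** 5
fast = build(n)                    # in-place resize usually possible
slow = build(n, keep_alias=True)   # every += allocates a new string
print(fast, slow)
```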
My question is based on this comment. For some odd reason, running

```
python -m timeit -s "s=''" "for i in range(10**7): s += 'a'"
```

takes an incredibly long time (much longer than quadratic), so I never got actual timing results from `timeit`. Instead, I used the simple loop above to obtain the performance numbers.
Update:
My question might as well have been titled "How can a list-like append have O(1) performance without over-allocation?". From observing the constant `time_per_iteration` on small strings, I assumed the string optimization must be over-allocating. But `realloc` is (unexpectedly to me) quite successful at avoiding a memory copy when extending small memory blocks.
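The contrast with lists is observable from the outside via `sys.getsizeof`: `list.append` really does over-allocate, while a string reports an exactly-fitting size after each append, even when `+=` resizes in place (a sketch; the exact byte counts are CPython implementation details):

```python
import sys

# Lists grow their capacity in chunks: getsizeof stays flat
# between resizes, revealing the over-allocated spare slots.
lst, list_sizes = [], []
for _ in range(20):
    lst.append('a')
    list_sizes.append(sys.getsizeof(lst))

# Strings report a size that tracks their current length exactly:
# no spare capacity is visible (one byte per ASCII character in CPython).
s, str_sizes = '', []
for _ in range(20):
    s += 'a'
    str_sizes.append(sys.getsizeof(s))

print(list_sizes)  # repeated values show over-allocation
print(str_sizes)   # strictly increasing, no slack
```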