The Question
Why, in CPython, does
def add_string(n):
s = ''
for _ in range(n):
s += ' '
take linear time, but
def add_string_in_list(n):
l = ['']
for _ in range(n):
l[0] += ' '
take quadratic time?
Proof:
Timer(partial(add_string, 1000000)).timeit(1)
#>>> 0.1848409200028982
Timer(partial(add_string, 10000000)).timeit(1)
#>>> 1.1123797750042286
Timer(partial(add_string_in_list, 10000)).timeit(1)
#>>> 0.0033865350123960525
Timer(partial(add_string_in_list, 100000)).timeit(1)
#>>> 0.25131178900483064
What I know
CPython has an optimisation for string addition when the string being added to has a reference count of 1.
This is because strings in Python are immutable, and so normally they cannot be edited. If multiple references exist to a string and it is mutated, both references will see the changed string. This is obviously not wanted, so mutation cannot happen with multiple references.
If there is only one reference to the string, however, mutating the value will only change the string for that one reference, which wants it changed. You can test that this is a likely cause as so:
from timeit import Timer
from functools import partial
def add_string_two_references(n):
s = ''
for _ in range(n):
s2 = s
s += ' '
Timer(partial(add_string_two_references, 20000)).timeit(1)
#>>> 0.032532954995986074
Timer(partial(add_string_two_references, 200000)).timeit(1)
#>>> 1.0898985149979126
I'm unsure why the factor is only 30x, instead of the expected 100x, but I believe that it's overhead.
What I don't know
So why is the list version creating two references? Is this even what's preventing the optimisation?
You can check that it's not treating normal objects any differently:
class Counter:
def __iadd__(self, other):
print(sys.getrefcount(self))
s = Counter()
s += None
#>>> 6
class Counter:
def __iadd__(self, other):
print(sys.getrefcount(self))
l = [Counter()]
l[0] += None
#>>> 6
See Question&Answers more detail:
os