I have not worked with threading in Python at all, so I am asking this question as a complete newcomer.
I am wondering if defaultdict is thread-safe. Let me explain:
I have d = defaultdict(list), which creates a list for missing keys by default. Let's say I have multiple threads doing this at the same time:
d['key'].append('value')
At the end, with two threads, I'm supposed to end up with ['value', 'value']. However, if defaultdict is not thread-safe and thread 1 yields to thread 2 after checking if 'key' in dict and before d['key'] = default_factory(), the two operations will interleave: the other thread will create a list in d['key'] and maybe append 'value' to it.
Then, when thread 1 runs again, it will continue from d['key'] = default_factory(), which will destroy the existing list and its value, and we will end up with ['value'].
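To make the interleaving I am worried about concrete, here is a minimal single-threaded sketch that replays it by hand. The helpers check_missing and store_default are hypothetical names for the two steps I imagine; this is not how CPython actually implements defaultdict, only the check-then-set logic I have in mind.

```python
def check_missing(d, key):
    """Step 1 of the imagined logic: is the key absent?"""
    return key not in d


def store_default(d, key, default_factory):
    """Step 2 of the imagined logic: unconditionally store a fresh default."""
    d[key] = default_factory()


d = {}  # a plain dict standing in for the defaultdict

# "Thread 1" checks for the key, sees it is missing, and is then suspended.
t1_saw_missing = check_missing(d, 'key')

# "Thread 2" runs the whole operation without interruption.
if check_missing(d, 'key'):
    store_default(d, 'key', list)
d['key'].append('value')

# "Thread 1" resumes and acts on its stale check, replacing thread 2's list.
if t1_saw_missing:
    store_default(d, 'key', list)
d['key'].append('value')

print(d['key'])  # ['value'] -- thread 2's append has been lost
```

If this ordering can really happen inside defaultdict, one of the two appends is silently dropped.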
I looked at the CPython source code for defaultdict, but I could not find any locks or mutexes. I guess it should not be assumed thread-safe unless it is documented as such.
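Since reading the source did not settle it for me, the other thing I can think of is an experiment along these lines. It is only a rough sketch: NUM_THREADS and NUM_KEYS are arbitrary values I picked, and I realize that a run with no lost appends would not prove thread safety, only that the race did not show up this time on this interpreter.

```python
import threading
from collections import defaultdict

NUM_THREADS = 8    # arbitrary, chosen for illustration
NUM_KEYS = 2000    # many keys, so the "missing key" path is hit many times

d = defaultdict(list)
start = threading.Barrier(NUM_THREADS)


def worker():
    start.wait()  # release all threads at roughly the same moment
    for key in range(NUM_KEYS):
        d[key].append('value')


threads = [threading.Thread(target=worker) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every key should hold one 'value' per thread if no append was lost.
lost = sum(NUM_THREADS - len(d[key]) for key in range(NUM_KEYS))
print(f"lost appends: {lost} out of {NUM_THREADS * NUM_KEYS}")
```

A nonzero count would confirm the problem; a zero count would leave the question open.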
Some people on IRC last night said that Python has the GIL, so it is conceptually thread-safe. Others said that threading should not be done in Python at all. I'm pretty confused. Any ideas?