I was tinkering around with Python's set
and frozenset
collection types.
Initially, I assumed that frozenset
would provide a better lookup performance than set
, as its immutable and thus could exploit the structure of the stored items.
However, this does not seem to be the case, regarding the following experiment:
import random
import time
import sys
def main(n):
numbers = []
for _ in xrange(n):
numbers.append(random.randint(0, sys.maxint))
set_ = set(numbers)
frozenset_ = frozenset(set_)
start = time.time()
for number in numbers:
number in set_
set_duration = time.time() - start
start = time.time()
for number in numbers:
number in frozenset_
frozenset_duration = time.time() - start
print "set : %.3f" % set_duration
print "frozenset: %.3f" % frozenset_duration
if __name__ == "__main__":
n = int(sys.argv[1])
main(n)
I executed this code using both CPython and PyPy, which gave the following results:
> pypy set.py 100000000
set : 6.156
frozenset: 6.166
> python set.py 100000000
set : 16.824
frozenset: 17.248
It seems that frozenset
is actually slower regarding the lookup performance, both in CPython and in PyPy. Does anybody have an idea why this is the case? I did not look into the implementations.
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…