I'm optimizing some code whose main bottleneck is running through and accessing a very large list of struct-like objects. Currently I'm using namedtuples, for readability. But some quick benchmarking using 'timeit' shows that this is really the wrong way to go where performance is a factor:
Named tuple with a, b, c:
>>> timeit("z = a.c", "from __main__ import a")
0.38655471766332994
Class using __slots__
, with a, b, c:
>>> timeit("z = b.c", "from __main__ import b")
0.14527461047146062
Dictionary with keys a, b, c:
>>> timeit("z = c['c']", "from __main__ import c")
0.11588272541098377
Tuple with three values, using a constant key:
>>> timeit("z = d[2]", "from __main__ import d")
0.11106188992948773
List with three values, using a constant key:
>>> timeit("z = e[2]", "from __main__ import e")
0.086038238242508669
Tuple with three values, using a local key:
>>> timeit("z = d[key]", "from __main__ import d, key")
0.11187358437882722
List with three values, using a local key:
>>> timeit("z = e[key]", "from __main__ import e, key")
0.088604143037173344
First of all, is there anything about these little timeit
tests that would render them invalid? I ran each several times, to make sure no random system event had thrown them off, and the results were almost identical.
It would appear that dictionaries offer the best balance between performance and readability, with classes coming in second. This is unfortunate, since, for my purposes, I also need the object to be sequence-like; hence my choice of namedtuple.
Lists are substantially faster, but constant keys are unmaintainable; I'd have to create a bunch of index-constants, i.e. KEY_1 = 1, KEY_2 = 2, etc. which is also not ideal.
Am I stuck with these choices, or is there an alternative that I've missed?
See Question&Answers more detail:
os