When choosing purely between attrgetter('attributename') and lambda o: o.attributename as a sort key, using attrgetter() is the faster option of the two.
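As a quick illustration of the equivalence (a minimal sketch using types.SimpleNamespace instances as stand-in objects; both calls produce the same ordering, only the key callable differs):
>>> from operator import attrgetter
>>> from types import SimpleNamespace
>>> objects = [SimpleNamespace(bar=n) for n in (3, 1, 2)]
>>> [o.bar for o in sorted(objects, key=attrgetter('bar'))]
[1, 2, 3]
>>> [o.bar for o in sorted(objects, key=lambda o: o.bar)]
[1, 2, 3]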
Remember that the key function is only applied once to each element in the list, before sorting, so to compare the two we can use them directly in a time trial:
>>> from timeit import Timer
>>> from random import randint
>>> from dataclasses import dataclass, field
>>> @dataclass
... class Foo:
...     bar: int = field(default_factory=lambda: randint(1, 10**6))
...
>>> testdata = [Foo() for _ in range(1000)]
>>> def test_function(objects, key):
...     [key(o) for o in objects]
...
>>> stmt = 't(testdata, key)'
>>> setup = 'from __main__ import test_function as t, testdata; '
>>> tests = {
...     'lambda': setup + 'key=lambda o: o.bar',
...     'attrgetter': setup + 'from operator import attrgetter; key=attrgetter("bar")'
... }
>>> for name, tsetup in tests.items():
...     count, total = Timer(stmt, tsetup).autorange()
...     print(f"{name:>10}: {total / count * 10 ** 6:7.3f} microseconds ({count} repetitions)")
...
    lambda: 130.495 microseconds (2000 repetitions)
attrgetter:  92.850 microseconds (5000 repetitions)
So applying attrgetter('bar') 1000 times is roughly 40 μs faster than applying the equivalent lambda. That's because calling a Python function has a certain amount of overhead, more than calling into a native callable such as the one produced by attrgetter().
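To make the difference in kind concrete, here is a small sketch (reusing the Foo dataclass defined above): the object returned by attrgetter() does exactly what the lambda does, it just isn't a Python-level function:
>>> from operator import attrgetter
>>> obj = Foo(bar=42)
>>> get_bar = attrgetter('bar')   # a callable object, implemented in C in CPython
>>> get_bar(obj)                  # same result as (lambda o: o.bar)(obj)
42
>>> (lambda o: o.bar)(obj)
42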
This speed advantage translates into faster sorting too:
>>> def test_function(objects, key):
...     sorted(objects, key=key)
...
>>> for name, tsetup in tests.items():
...     count, total = Timer(stmt, tsetup).autorange()
...     print(f"{name:>10}: {total / count * 10 ** 6:7.3f} microseconds ({count} repetitions)")
...
    lambda: 218.715 microseconds (1000 repetitions)
attrgetter: 169.064 microseconds (2000 repetitions)
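In actual use you'd simply pass the attrgetter as the sort key (a usage sketch, reusing the testdata list from above):
>>> from operator import attrgetter
>>> sorted_data = sorted(testdata, key=attrgetter('bar'))
>>> sorted_data[0].bar <= sorted_data[-1].bar
True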