Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
582 views
in Technique[技术] by (71.8m points)

performance - Python: iterating over list vs over dict items efficiency

Is iterating over some_dict.items() as efficient as iterating over a list of the same items in CPython?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

It depends on which version of Python you're using. In Python 2, some_dict.items() creates a new list, which takes up some additional time and uses up additional memory. On the other hand, once the list is created, it's a list, and so should have identical performance characteristics after the overhead of list creation is complete.

In Python 3, some_dict.items() creates a view object instead of a list, and I anticipate that creating and iterating over items() would be faster than in Python 2, since nothing has to be copied. But I also anticipate that iterating over an already-created view would be a bit slower than iterating over an already-created list, because dictionary data is stored somewhat sparsely, and I believe there's no good way for python to avoid iterating over every bin in the dictionary -- even the empty ones.

In Python 2, some timings confirm my intuitions:

>>> some_dict = dict(zip(xrange(1000), reversed(xrange(1000))))
>>> some_list = zip(xrange(1000), xrange(1000))
>>> %timeit for t in some_list: t
10000 loops, best of 3: 25.6 us per loop
>>> %timeit for t in some_dict.items(): t
10000 loops, best of 3: 57.3 us per loop

Iterating over the items is roughly twice as slow. Using iteritems is a tad bit faster...

>>> %timeit for t in some_dict.iteritems(): t
10000 loops, best of 3: 41.3 us per loop

But iterating over the list itself is basically the same as iterating over any other list:

>>> some_dict_list = some_dict.items()
>>> %timeit for t in some_dict_list: t
10000 loops, best of 3: 26.1 us per loop

Python 3 can create and iterate over items faster than Python 2 can (compare to 57.3 us above):

>>> some_dict = dict(zip(range(1000), reversed(range(1000))))
>>> %timeit for t in some_dict.items(): t      
10000 loops, best of 3: 33.4 us per loop 

But the time to create a view is negligable; it is actually slower to iterate over than a list.

>>> some_list = list(zip(range(1000), reversed(range(1000))))
>>> some_dict_view = some_dict.items()
>>> %timeit for t in some_list: t
10000 loops, best of 3: 18.6 us per loop
>>> %timeit for t in some_dict_view: t
10000 loops, best of 3: 33.3 us per loop

This means that in Python 3, if you want to iterate many times over the items in a dictionary, and performance is critical, you can get a 30% speedup by caching the view as a list.

>>> some_list = list(some_dict_view)
>>> %timeit for t in some_list: t
100000 loops, best of 3: 18.6 us per loop

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...