Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Best and/or fastest way to create lists in Python

In Python, as far as I know, there are at least four ways to create and initialize a list of a given size:

Simple loop with append:

my_list = []
for i in range(50):
    my_list.append(0)

Simple loop with +=:

my_list = []
for i in range(50):
    my_list += [0]

List comprehension:

my_list = [0 for i in range(50)]

List and integer multiplication:

my_list = [0] * 50

With only 50 elements I doubt there is any meaningful performance difference between these examples, but what if I need a list of a million elements? Would using xrange improve anything? Which is the preferred/fastest way to create and initialize lists in Python?

1 Reply

Let's run some time tests* with timeit.timeit (each result is the total time in seconds for timeit's default of 1,000,000 runs):

>>> from timeit import timeit
>>>
>>> # Test 1
>>> test = """
... my_list = []
... for i in xrange(50):
...     my_list.append(0)
... """
>>> timeit(test)
22.384258893239178
>>>
>>> # Test 2
>>> test = """
... my_list = []
... for i in xrange(50):
...     my_list += [0]
... """
>>> timeit(test)
34.494779364416445
>>>
>>> # Test 3
>>> test = "my_list = [0 for i in xrange(50)]"
>>> timeit(test)
9.490926919482774
>>>
>>> # Test 4
>>> test = "my_list = [0] * 50"
>>> timeit(test)
1.5340533503559755
>>>

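As you can see above, the last method is by far the fastest: roughly 6 times faster than the list comprehension and about 15 to 22 times faster than the two loops.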
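
For readers on Python 3, where xrange no longer exists, the same comparison can be reproduced with a sketch along these lines. The helper name `time_list_builders` and its `size` and `number` parameters are illustrative choices, not part of the original tests, and absolute timings will vary by machine and interpreter version.

```python
from timeit import timeit

# Hypothetical helper (not from the original answer): times four ways
# of building a list of `size` zeros on Python 3.
STATEMENTS = {
    "append": "my_list = []\nfor i in range(size):\n    my_list.append(0)",
    "plus_equals": "my_list = []\nfor i in range(size):\n    my_list += [0]",
    "comprehension": "my_list = [0 for i in range(size)]",
    "multiplication": "my_list = [0] * size",
}


def time_list_builders(size=50, number=100_000):
    """Return {label: total seconds for `number` runs} for each statement."""
    return {
        label: timeit(stmt, globals={"size": size}, number=number)
        for label, stmt in STATEMENTS.items()
    }


if __name__ == "__main__":
    for label, seconds in time_list_builders().items():
        print(f"{label:>15}: {seconds:.3f} s")
```

Calling `time_list_builders(size=1_000_000, number=10)` should give a rough answer to the million-element question on your own machine.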


However, it should only be used with immutable items (such as integers). Multiplying a list does not copy its items; it creates a list that holds repeated references to the same object.

Below is a demonstration:

>>> lst = [[]] * 3
>>> lst
[[], [], []]
>>> # The ids of the items in `lst` are the same
>>> id(lst[0])
28734408
>>> id(lst[1])
28734408
>>> id(lst[2])
28734408
>>>

This behavior is usually undesirable: mutating one item appears to change every item, which can lead to subtle bugs in the code.
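
A minimal sketch of how this goes wrong in practice (the name `rows` is purely illustrative): appending to what looks like the first inner list changes all three slots, because they are one and the same object.

```python
# Illustrative only: `rows` is a hypothetical name, not from the answer.
rows = [[]] * 3          # three references to ONE inner list

rows[0].append("x")      # mutate "the first" inner list

print(rows)                            # [['x'], ['x'], ['x']]
print(rows[0] is rows[1] is rows[2])   # True
```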

If the items are mutable (such as lists), use a list comprehension instead. It is still very fast and creates a new object for each position:

>>> lst = [[] for _ in xrange(3)]
>>> lst
[[], [], []]
>>> # The ids of the items in `lst` are different
>>> id(lst[0])
28796688
>>> id(lst[1])
28796648
>>> id(lst[2])
28736168
>>>
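
The two techniques can also be combined, for example when building a 2D grid. The sketch below assumes Python 3 and uses a hypothetical helper, `make_grid`, that is not part of the original answer: multiplication fills each row (safe because the fill value is an immutable integer), while the comprehension ensures every row is a separate list.

```python
def make_grid(n_rows, n_cols, fill=0):
    """Build an n_rows x n_cols grid whose rows are independent lists."""
    # Inner multiplication is safe when `fill` is immutable (e.g. an int);
    # the outer comprehension creates a fresh row list on every iteration.
    return [[fill] * n_cols for _ in range(n_rows)]


grid = make_grid(2, 3)
grid[0][0] = 1           # only the first row changes
print(grid)              # [[1, 0, 0], [0, 0, 0]]
```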

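*Note: In all of the tests, I replaced range with xrange. In Python 2, xrange returns a lazy sequence object (not an iterator) instead of building a full list, so it uses less memory and is usually faster for loops like these. In Python 3, xrange no longer exists and range itself is lazy.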
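
On Python 3, where the built-in range plays the role that xrange had in Python 2, the laziness can be checked directly. The following is a small sketch under that assumption; the variable names `lazy` and `eager` are illustrative, and the exact byte counts depend on the interpreter build.

```python
import sys
from collections.abc import Iterator, Sequence

lazy = range(1_000_000)   # computes values on demand
eager = list(lazy)        # materializes all one million integers

# range is a lazy sequence, not an iterator: it supports len() and indexing,
# and can be looped over more than once.
print(isinstance(lazy, Sequence), isinstance(lazy, Iterator))  # True False
print(len(lazy), lazy[-1])                                     # 1000000 999999

# The range object stays small no matter how long it is; the list does not.
print(sys.getsizeof(lazy), sys.getsizeof(eager))
```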

