Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Why does creating a list of lists produce unexpected behavior?

EDIT: This question is about why the behavior is what it is, not how to get around it, which is what the alleged duplicate is about.


I've used the following notation to create lists of a certain size in different cases. For example:

>>> [None] * 5
[None, None, None, None, None]
>>>

This appears to work as expected and is shorter than:

>>> [None for _ in range(5)]
[None, None, None, None, None]
>>>

I then tried to create a list of lists using the same approach:

>>> [[]] * 5
[[], [], [], [], []]
>>>

Fair enough. It seems to work as expected.

However, while going through the debugger, I noticed that all the sub-list buckets had the same value, even though I had added only a single item. For example:

>>> t = [[]] * 5
>>> t
[[], [], [], [], []]
>>> t[1].append(4)
>>> t
[[4], [4], [4], [4], [4]]
>>> t[0] is t[1]
True
>>>

I was not expecting all elements of the outer list to be references to a single sub-list; I expected 5 independent sub-lists.

For that, I had to write code like so:

>>> t = [[] for _ in range(5)]
>>> t
[[], [], [], [], []]
>>> t[2].append(4)
>>> t
[[], [], [4], [], []]
>>> t[0] is t[1]
False
>>>

I'm clearly missing something, probably a historical fact or simply a different way in which the consistency here is viewed.

Can someone explain why two code snippets that one would reasonably expect to be equivalent actually produce different and (IMO) non-obvious results, especially given the Zen of Python's emphasis on being explicit and obvious?

Please note that I'm already aware of this question, which is different to what I'm asking.

I'm simply looking for a detailed explanation/justification. If there are historical, technical, and/or theoretical reasons for this behavior, then please be sure to include a reference or two.


1 Reply


When you do the following:

[[]]*n

You are first creating a list, then using the * operator with an int n. This takes whatever objects are in your list and builds a new list containing n repetitions of references to those same objects.

But since, in Python, explicit is better than implicit, no copy of those objects is made implicitly. Indeed, this is consistent with the semantics of the rest of the language.

Try to name a single case where Python implicitly makes a copy.
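A minimal sketch of what the repetition does, in case it helps. The names inner, t and nones are made up for illustration. It shows that every slot holds the same object, that mutation through one slot is visible through all of them, and that rebinding a slot only replaces that one reference, which is why [None] * 5 never causes trouble:

```python
# Illustrative names only; nothing here comes from the original question.
inner = []
t = [inner] * 3

# All three slots refer to the very same list object.
shared = t[0] is t[1] is t[2] is inner
print(shared)            # True

# Mutating the object through one slot shows up in every slot.
t[1].append(4)
after_mutation = repr(t)
print(after_mutation)    # [[4], [4], [4]]

# Rebinding a slot replaces one reference and leaves the others alone.
t[1] = ["new"]
print(t)                 # [[4], ['new'], [4]]

# With immutable elements you can only rebind, so sharing is harmless.
nones = [None] * 3
nones[0] = 1
print(nones)             # [1, None, None]
```

So the surprise only appears when the repeated element is mutable and you mutate it in place.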

Furthermore, it is consistent with how list concatenation with + behaves:

l = [1, [], 'a']
l2 = l + l + l
l[1].append('foo')
print(l2)

And the output:

[1, ['foo'], 'a', 1, ['foo'], 'a', 1, ['foo'], 'a']

Now, as noted in the comments, coming from C++ it makes sense that the above would be surprising, but if one is used to Python, the above is what one would expect.
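To make the parallel concrete, here is a small sketch (by_addition and by_repetition are names I made up) suggesting that l + l + l and l * 3 build separate outer lists holding exactly the same element objects:

```python
l = [1, [], 'a']

by_addition = l + l + l
by_repetition = l * 3

# The two outer lists are distinct objects...
print(by_addition is by_repetition)   # False

# ...but position by position they hold the very same element objects.
same_objects = all(a is b for a, b in zip(by_addition, by_repetition))
print(same_objects)                   # True

# Mutating the shared inner list is therefore visible through both.
l[1].append('foo')
print(by_addition == by_repetition)   # True
print(by_repetition[4])               # ['foo']
```

In both cases only the outer list is new; the elements are never copied.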

On the other hand:

[[] for _ in range(5)]

is a list comprehension. It is equivalent to:

lst = []
for _ in range(5):
    lst.append([])

Here, clearly, every pass through the loop creates a new list, because the expression [] is evaluated afresh on each iteration. That is how literal syntax works.
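One way to see the difference is to count how often the element expression runs. The sketch below uses a hypothetical helper, demo, with an inner make_list that logs each call. Neither is from the question; they only stand in for the [] literal:

```python
def demo(n):
    """Hypothetical helper: compare [expr] * n with [expr for _ in range(n)]."""
    calls = []

    def make_list():
        # Stand-in for the [] literal; records every evaluation.
        calls.append("call")
        return []

    # The element expression is evaluated once, then the reference is repeated.
    repeated = [make_list()] * n
    calls_for_repetition = len(calls)

    calls.clear()

    # The element expression is evaluated once per iteration.
    comprehended = [make_list() for _ in range(n)]
    calls_for_comprehension = len(calls)

    return (
        calls_for_repetition,
        len({id(x) for x in repeated}),       # number of distinct objects
        calls_for_comprehension,
        len({id(x) for x in comprehended}),   # number of distinct objects
    )

print(demo(5))   # (1, 1, 5, 5)
```

The * form sees an already-built list and has no expression left to re-evaluate, whereas the comprehension re-runs its expression every time.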

As an aside, I almost never use the * operator on lists, except for one particular idiom I am fond of:

>>> x = list(range(1, 22))
>>> it_by_three = [iter(x)]*3
>>> for a,b,c in zip(*it_by_three):
...    print(a, b, c)
...
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17 18
19 20 21
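That idiom works precisely because of the shared reference: zip pulls from "three" iterators that are really one, so each tuple consumes three consecutive items. A rough sketch with two hypothetical helpers, chunks_of and lockstep, that contrasts it with independent iterators:

```python
def chunks_of(iterable, n):
    """Hypothetical helper: group items into n-tuples using one shared iterator."""
    it = iter(iterable)        # a single iterator object...
    return zip(*[it] * n)      # ...referenced n times

def lockstep(seq, n):
    """Hypothetical helper: same shape, but with n independent iterators."""
    return list(zip(*[iter(seq) for _ in range(n)]))

groups = list(chunks_of(range(1, 8), 3))
print(groups)                  # [(1, 2, 3), (4, 5, 6)]

print(lockstep([1, 2, 3], 3))  # [(1, 1, 1), (2, 2, 2), (3, 3, 3)]
```

Note that zip stops at the first exhausted iterator, so a trailing incomplete group (the 7 above) is silently dropped. If the tail matters, itertools.zip_longest can be used in place of zip.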
