python - Why does creating a list of lists produce unexpected behavior?


posted Oct 24, 2021 in Technique by 深蓝 (71.8m points)


EDIT: This question is about why the behavior is what it is, not how to get around it, which is what the alleged duplicate is about.

I've used the following notation to create lists of a certain size in different cases. For example:

>>> [None] * 5
[None, None, None, None, None]
>>>

This appears to work as expected and is shorter than:

>>> [None for _ in range(5)]
[None, None, None, None, None]
>>>

I then tried to create a list of lists using the same approach:

>>> [[]] * 5
[[], [], [], [], []]
>>>

Fair enough. It seems to work as expected.

However, while going through the debugger, I noticed that all the sub-list buckets had the same value, even though I had added only a single item. For example:

>>> t = [[]] * 5
>>> t
[[], [], [], [], []]
>>> t[1].append(4)
>>> t
[[4], [4], [4], [4], [4]]
>>> t[0] is t[1]
True
>>>

I was not expecting all top-level array elements to be references to a single sub-list; I expected 5 independent sub-lists.

For that, I had to write code like so:

>>> t = [[] for _ in range(5)]
>>> t
[[], [], [], [], []]
>>> t[2].append(4)
>>> t
[[], [], [4], [], []]
>>> t[0] is t[1]
False
>>>

I'm clearly missing something, probably a historical fact or simply a different way in which the consistency here is viewed.

Can someone explain why two different code snippets that one would reasonably expect to be equivalent to each other actually end up implicitly producing different and non-obvious (IMO) results, especially given Python's zen of always being explicit and obvious?

Please note that I'm already aware of this question, which is different to what I'm asking.

I'm simply looking for a detailed explanation/justification. If there are historical, technical, and/or theoretical reasons for this behavior, then please be sure to include a reference or two.



1 Reply

深蓝 · Answer 1 · 2021-10-23T18:42:29+0000

When you do the following:

[[]]*n

You are first creating a list, then using the * operator with an int n. This takes whatever objects are in your list and builds a new list that repeats references to those same objects n times. The objects themselves are not copied.
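A minimal sketch of what that repetition means in practice. The names inner and repeated are made up for illustration; the point is only that every slot of the repeated list refers to the one object that was in the original list.

```python
# Repeating a list with * copies references, not the objects they refer to.
inner = []
repeated = [inner] * 3

# Every slot is the very same list object, so there is only one distinct id.
same_object = all(item is inner for item in repeated)
distinct_ids = len({id(item) for item in repeated})

inner.append("x")  # mutate the single shared list through its original name

print(same_object)   # True
print(distinct_ids)  # 1
print(repeated)      # [['x'], ['x'], ['x']]
```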

But since in Python explicit is better than implicit, the * operator does not implicitly copy those objects. Indeed, this is consistent with the semantics of the rest of the language.

Try to name a single case where Python implicitly makes a copy.
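To illustrate the point, here is a small sketch showing that neither assignment nor argument passing copies anything, and that a copy only appears when you ask for one through the standard copy module. The function add_marker and the variable names are hypothetical, chosen just for this example.

```python
import copy

def add_marker(bucket):
    # The parameter is bound to the caller's object; nothing is copied.
    bucket.append("marker")

original = [[1], [2]]
alias = original                 # plain assignment: same object, new name
shallow = copy.copy(original)    # new outer list, inner lists still shared
deep = copy.deepcopy(original)   # new outer list and new inner lists

add_marker(original)             # visible through alias, not through the copies
original[0].append(99)           # visible through alias and the shallow copy

print(alias is original)  # True
print(alias)              # [[1, 99], [2], 'marker']
print(shallow)            # [[1, 99], [2]]
print(deep)               # [[1], [2]]
```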

Furthermore, it is consistent with how the + operator concatenates lists:

l = [1, [], 'a']
l2 = l + l + l
l[1].append('foo')
print(l2)

And the output:

[1, ['foo'], 'a', 1, ['foo'], 'a', 1, ['foo'], 'a']

Now, as noted in the comments, coming from C++ it makes sense that the above would be surprising, but if one is used to Python, the above is what one would expect.

On the other hand:

[[] for _ in range(5)]

Is a list comprehension. It is equivalent to:

lst = []
for _ in range(5):
    lst.append([])

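Here, clearly, every time you go through the loop you create a new list. That is how literal syntax works: the expression [] is evaluated again on each iteration, and each evaluation produces a new list object.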
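The same difference shows up when building a two-dimensional grid, which is probably where this bites most often. The sketch below uses made-up names (rows, cols, shared, independent) and contrasts repeating one row object with re-evaluating the row expression for each row.

```python
rows, cols = 2, 3

# The inner [0] * cols is harmless: ints are immutable, so assigning to a
# cell rebinds that slot rather than mutating a shared object.
shared = [[0] * cols] * rows                     # one row object, referenced twice
independent = [[0] * cols for _ in range(rows)]  # row expression evaluated per row

shared[0][0] = 1
independent[0][0] = 1

print(shared)       # [[1, 0, 0], [1, 0, 0]]
print(independent)  # [[1, 0, 0], [0, 0, 0]]
```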

As an aside, I almost never use the * operator on lists, except for one particular idiom I am fond of:

>>> x = list(range(1, 22))
>>> it_by_three = [iter(x)]*3
>>> for a,b,c in zip(*it_by_three):
...    print(a, b, c)
...
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17 18
19 20 21
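The idiom above works precisely because * repeats a reference: all three arguments to zip are the same iterator, so each tuple consumes three consecutive items. A hedged sketch that wraps it in a helper follows; chunks_of is a hypothetical name, not a standard library function.

```python
def chunks_of(iterable, n):
    """Return an iterator of consecutive n-tuples (hypothetical helper)."""
    shared = iter(iterable)
    # [shared] * n holds n references to ONE iterator, so each tuple that
    # zip builds pulls n consecutive items from it.
    return zip(*[shared] * n)

print(list(chunks_of(range(1, 8), 3)))  # [(1, 2, 3), (4, 5, 6)]
```

Note that zip stops at the first exhausted argument, so a trailing partial group (the 7 here) is silently dropped.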
