Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
387 views
in Technique[技术] by (71.8m points)

python copy.deepcopy lists seems shallow

I am trying to initialize a list of lists representing a 3x3 array:

import copy
m = copy.deepcopy(3*[3*[0]])
print(m)
m[1][2] = 100
print(m)

and the output is:

[[0, 0, 0], [0, 0, 0], [0, 0, 0]]
[[0, 0, 100], [0, 0, 100], [0, 0, 100]]

which is not what I expected since the last elements of each row are shared! I did get the result I need by using:

m = [ copy.deepcopy(3*[0]) for i in range(3) ]

but I don't understand why the first (and simpler) form does not work. Isn't deepcopy supposed to be deep?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The problem is that deepcopy keeps a memo that contains all instances that have been copied already. That's to avoid infinite recursions and intentional shared objects. So when it tries to deepcopy the second sublist it sees that it has already copied it (the first sublist) and just inserts the first sublist again. In short deepcopy doesn't solve the "shared sublist" problem!

To quote the documentation:

Two problems often exist with deep copy operations that don’t exist with shallow copy operations:

  • Recursive objects (compound objects that, directly or indirectly, contain a reference to themselves) may cause a recursive loop.
  • Because deep copy copies everything it may copy too much, such as data which is intended to be shared between copies.

The deepcopy() function avoids these problems by:

  • keeping a “memo” dictionary of objects already copied during the current copying pass; and
  • letting user-defined classes override the copying operation or the set of components copied.

(emphasis mine)

That means that deepcopy regards shared references as intention. For example consider the class:

from copy import deepcopy

class A(object):
    def __init__(self, x):
        self.x = x
        self.x1 = x[0]  # intentional sharing of the sublist with x attribute
        self.x2 = x[1]  # intentional sharing of the sublist with x attribute
        
a1 = A([[1, 2], [2, 3]])
a2 = deepcopy(a1)
a2.x1[0] = 10
print(a2.x)
# [[10, 2], [2, 3]]

Neglecting that the class doesn't make much sense as is it intentionally shares the references between its x and x1 and x2 attribute. It would be weird if deepcopy broke those shared references by doing a separate copy of each of these. That's why the documentation mentions this as a "solution" to the problem of "copy too much, such as data which is intended to be shared between copies.".

Back to your example: If you don't want to have shared references it would be better to avoid them completely:

m = [[0]*3 for _ in range(3)]

In your case the inner elements are immutable because 0 is immutable - but if you deal with mutable instances inside the innermost lists you must have to avoid the inner list multiplication as well:

m = [[0 for _ in range(3)] for _ in range(3)] 

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...