The pandas documentation says:
Returning a view versus a copy
The rules about when a view on the data is returned are entirely
dependent on NumPy. Whenever an array of labels or a boolean vector
are involved in the indexing operation, the result will be a copy.
With single label / scalar indexing and slicing, e.g. df.ix[3:6] or
df.ix[:, 'A'], a view will be returned.
In df[df.key==1]['D'], you first do boolean slicing (which produces a copy of the DataFrame), then select the column ['D'] from that copy.
In df.D[df.key==1] = 3.4, you first choose a column, then do boolean slicing on the resulting Series.
This seems to make the difference, although I must admit that it is a little counterintuitive.
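A minimal sketch of the first case, assuming a small made-up DataFrame with the columns key and D (the values are illustrative, not from the question). It only exercises version 1, because whether version 2 writes through to df depends on the pandas version and its copy-on-write settings. The .loc call at the end is the usual single-step alternative, shown here for contrast:

```python
import warnings

import pandas as pd

# Hypothetical data: only the column names come from the discussion above.
df = pd.DataFrame({"key": [1, 2, 1], "D": [0.0, 0.0, 0.0]})
mask = df.key == 1

with warnings.catch_warnings():
    # pandas warns about chained assignment here; silence it for the demo.
    warnings.simplefilter("ignore")
    # Version 1: boolean indexing returns a copy, so the value lands in the copy.
    df[mask]["D"] = 3.4

unchanged = df["D"].tolist()

# Selecting rows and column in one .loc call assigns into df itself.
df.loc[mask, "D"] = 3.4
updated = df["D"].tolist()

print(unchanged)
print(updated)
```

After the chained assignment the original column is untouched; only the .loc assignment changes it.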
Edit: The difference was identified by Dougal, see his comment: With version 1, the copy is made as the __getitem__ method is called for the boolean slicing. For version 2, only the __setitem__ method is accessed, so no copy is returned and the value is simply assigned.
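The call sequence can be made visible with a toy class. This is a sketch in plain Python, not pandas internals: Traced, its column attribute, and the log names are all hypothetical, and __getitem__ here always hands back a fresh object to mimic the copy made by boolean slicing.

```python
class Traced:
    """Hypothetical container that records which special methods are called."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.column = None  # stands in for attribute access such as df.D

    def __getitem__(self, key):
        self.log.append(f"{self.name}.__getitem__")
        # Mimic boolean slicing: hand back a new object, not self.
        return Traced("copy", self.log)

    def __setitem__(self, key, value):
        self.log.append(f"{self.name}.__setitem__")


# Version 1 pattern: obj[mask]['D'] = value
log1 = []
v1 = Traced("df", log1)
v1["mask"]["D"] = 3.4

# Version 2 pattern: obj.column[mask] = value
log2 = []
v2 = Traced("df", log2)
v2.column = Traced("column", log2)
v2.column["mask"] = 3.4

print(log1)  # ['df.__getitem__', 'copy.__setitem__']
print(log2)  # ['column.__setitem__']
```

In the first pattern the assignment is received by the object that __getitem__ returned; in the second, __setitem__ is called directly on the column object.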