I'm using a MultiIndexed pandas DataFrame and would like to multiply a subset of the DataFrame by a certain number.
It's the same as this but with a MultiIndex.
>>> import pandas as pd
>>> d = pd.DataFrame({'year':[2008,2008,2008,2008,2009,2009,2009,2009],
...                   'flavour':['strawberry','strawberry','banana','banana',
...                              'strawberry','strawberry','banana','banana'],
...                   'day':['sat','sun','sat','sun','sat','sun','sat','sun'],
...                   'sales':[10,12,22,23,11,13,23,24]})
>>> d = d.set_index(['year','flavour','day'])
>>> d
                     sales
year flavour    day
2008 strawberry sat     10
                sun     12
     banana     sat     22
                sun     23
2009 strawberry sat     11
                sun     13
     banana     sat     23
                sun     24
So far, so good. But let's say I spot that all the Saturday figures are only half what they should be! I'd like to multiply all sat sales by 2.
My first attempt at this was:
sat = d.xs('sat', level='day')
sat = sat * 2
d.update(sat)
but this doesn't work because the variable sat has lost the day level of the index:
>>> sat
                 sales
year flavour
2008 strawberry     20
     banana         44
2009 strawberry     22
     banana         46
so pandas doesn't know how to join the new sales figures back onto the old dataframe.
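For reference, here is a minimal sketch of two ways the lost-level problem described above might be avoided. It assumes a reasonably recent pandas in which xs accepts a drop_level argument (that may not hold for the old version in this question), and the name is_sat is just an illustrative choice of mine, not anything from pandas itself.

```python
import pandas as pd

# Rebuild the example frame from the question.
d = pd.DataFrame({
    'year': [2008] * 4 + [2009] * 4,
    'flavour': ['strawberry', 'strawberry', 'banana', 'banana'] * 2,
    'day': ['sat', 'sun'] * 4,
    'sales': [10, 12, 22, 23, 11, 13, 23, 24],
}).set_index(['year', 'flavour', 'day'])

# Option 1 (assumes xs supports drop_level): keep the 'day' level in the
# cross-section so its index still lines up with the original frame.
sat = d.xs('sat', level='day', drop_level=False)
print(sat.index.names)

# Option 2: build a boolean mask from the 'day' level values and
# multiply the matching rows in place. is_sat is a hypothetical name.
is_sat = d.index.get_level_values('day') == 'sat'
d.loc[is_sat, 'sales'] *= 2
print(d)
```

The mask version never takes a cross-section, so there is nothing to join back onto the original frame afterwards.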
I had a quick stab at:
>>> sat = d.xs('sat', level='day', copy=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2248, in xs
    raise ValueError('Cannot retrieve view (copy=False)')
ValueError: Cannot retrieve view (copy=False)
I have no idea what that error means, but I feel like I'm making a mountain out of a molehill. Does anyone know the right way to do this?
Thanks in advance,
Rob