Why do we use 'loc' for pandas DataFrames? It seems the following code, with or without loc, runs at a similar speed:
%timeit df_user1 = df.loc[df.user_id=='5561']
100 loops, best of 3: 11.9 ms per loop
or
%timeit df_user1_noloc = df[df.user_id=='5561']
100 loops, best of 3: 12 ms per loop
So why use loc?
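To make the comparison reproducible, here is a minimal sketch. The small DataFrame is a made-up stand-in, since the original df is not shown; only the user_id and time column names come from the post. It checks that, for plain selection, both spellings return the same rows and the same column:

```python
import pandas as pd

# Hypothetical stand-in for the question's df; the real data is not shown.
df = pd.DataFrame({
    "user_id": ["5561", "1234", "5561", "9999"],
    "time": [10, 20, 30, 40],
})

# Boolean mask selecting the rows for one user.
mask = df.user_id == "5561"

df_user1 = df.loc[mask]       # row selection with loc
df_user1_noloc = df[mask]     # row selection without loc

# Compare the two row selections, then the two column retrievals.
same_rows = df_user1.equals(df_user1_noloc)
same_column = df["time"].equals(df.loc[:, "time"])

print(same_rows, same_column)
```

This only confirms that the two forms agree for read-only selection on this toy data; it says nothing about the cases where they differ.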
Edit: This has been flagged as a duplicate question. Although "pandas iloc vs ix vs loc explanation?" does mention that
"you can do column retrieval just by using the data frame's __getitem__:"
df['time'] # equivalent to df.loc[:, 'time']
it does not say why we use loc. It explains many of loc's features, but my specific question is "why not just omit loc altogether?", for which I have accepted a very detailed answer below.
Also, in that other post the answer (which I do not think is really an answer) is buried in the discussion. Anyone searching for what I was looking for would find it hard to locate, and would be much better served by the answer to my question.