If you do want a count for every column (e.g. because some of them may contain NaN, which count() excludes), then:
cnt = job_hist.groupby('employee_id').count()
out = cnt.loc[cnt['start_date'] > 1]
But a more customary goal is simply to count how many rows there are for each employee_id:
cnt = job_hist.groupby('employee_id').size()
out = cnt.loc[cnt > 1]
Or, in one go:
out = job_hist.groupby('employee_id').size().to_frame('cnt').query('cnt > 1')
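To see how the two approaches differ, here is a minimal sketch on made-up data. The job_hist frame below is hypothetical: only employee_id and start_date appear in the snippets above, and the job_id column and all values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; job_id and all values are invented for illustration.
job_hist = pd.DataFrame({
    "employee_id": [101, 101, 102, 200, 200, 200],
    "start_date": pd.to_datetime(
        ["2001-01-13", "2005-07-01", "2003-09-17", "1997-09-21", None, "2012-07-01"]
    ),
    "job_id": ["AC_ACCOUNT", "AC_MGR", "MK_REP", None, "AD_ASST", "AC_ACCOUNT"],
})

# count() tallies non-NaN values per column, so employee 200 gets 2 in
# start_date even though the employee has 3 rows.
per_column = job_hist.groupby("employee_id").count()

# size() tallies rows per group, NaN or not.
rows = job_hist.groupby("employee_id").size()
out = rows.loc[rows > 1]

# The same result as a one-column DataFrame, in one go.
one_go = job_hist.groupby("employee_id").size().to_frame("cnt").query("cnt > 1")

print(per_column)
print(out)
print(one_go)
```

If a column used in the count() filter has missing values, an employee with several rows can be undercounted, which is why size() is usually the safer choice when the goal is rows per employee.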