grouping - Dask Dataframe groupby and aggregate for column

Question

Posted Oct 6, 2021 in Technique by 深蓝 (71.8m points)

I had a pd.DataFrame that I converted to a Dask DataFrame for faster computation. I need to find the total views of each channel.

In pandas it would be df.groupby(['ChannelTitle'])['VideoViewCount'].sum(), but in Dask the column's dtype is object, so groupby treats the values as strings rather than ints (see image 2).

To work around this, I added two columns that split each view count into its figure (e.g. 115) and its power-of-ten multiplier (6 for M, 3 for K), hoping to compute something like ddf['new_views_f'] * (10**ddf['new_views_m']). However, I cannot find a mul operation for two columns in Dask.

Either I am missing something or I am overcomplicating the requirement.

Question from: https://stackoverflow.com/questions/66053231/dask-dataframe-groupby-and-aggregate-for-column

1 Reply

深蓝 · Answer 1 · 2021-10-06T03:12:57+0000

It does sound like you are overcomplicating the requirement. For column multiplication, the regular pandas syntax also works in Dask (ddf['c'] = ddf['a'] * ddf['b']). In your case, you can use pd.eval to get the actual numeric value of the views:

import random

import numpy as np
import pandas as pd
import dask.dataframe as dd

# Build sample data: view counts stored as strings such as '12.34K views'
df = pd.DataFrame(15 * np.random.rand(15), columns=['views'])
df['views'] = df['views'].round(2).astype('str') + [random.choice(['K views', 'M views']) for _ in range(len(df))]
df['group'] = [random.choice([1, 2, 3]) for _ in range(len(df))]
ddf = dd.from_pandas(df, npartitions=2)

# Turn '12.34K views' into the expression '12.34*1e3', then evaluate it to a number.
# meta tells Dask that the resulting column holds floats.
ddf['views_digits'] = (
    ddf['views']
    .replace({'K views': '*1e3', 'M views': '*1e6'}, regex=True)
    .map(pd.eval, meta=('views_digits', 'f8'))
)

# Sum the numeric view counts per group
aggregate_df = ddf.groupby(['group']).agg({'views_digits': 'sum'}).compute()
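
If you would rather not evaluate strings with pd.eval (it evaluates whatever expression the string happens to contain), a small parsing function is one possible alternative. The sketch below is only an illustration: the name parse_views, the MULTIPLIERS table and the assumed input format ('115M views', '3.5K views', '42 views') are hypothetical and not taken from your data, so adjust them as needed.

```python
import re

# Hypothetical suffix table; extend it if the data uses other suffixes.
MULTIPLIERS = {"K": 1e3, "M": 1e6}

# Assumed format: a number, an optional K/M suffix, then 'view' or 'views'.
_VIEWS_RE = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*([KM]?)\s*views?\s*$", re.IGNORECASE)


def parse_views(text):
    """Convert a string such as '115M views' to a float; return NaN if it does not match."""
    if not isinstance(text, str):
        return float("nan")
    match = _VIEWS_RE.match(text)
    if match is None:
        return float("nan")
    number, suffix = match.groups()
    # A missing suffix means the number is already the full count.
    return float(number) * MULTIPLIERS.get(suffix.upper(), 1.0)


# Possible use with the Dask DataFrame from the answer (not executed here):
# ddf['views_digits'] = ddf['views'].map(parse_views, meta=('views_digits', 'f8'))
```

Values that do not match the assumed format become NaN instead of raising, so they are easy to spot before the groupby.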
