Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Using Power Query M code for data wrangling: how to fill in missing values from grouped rows

Using Power Query M code, how can I fill in required values that are missing, using the most common value within each group of rows?

For example, starting with this table:

id group attribute 1 attribute 2 attribute 3
4 AA example1 example2
8 AA example2
9 AA example1 example1
13 AB example4 example2 example3
14 AB example4 example2 example3
15 AB
19 BB
20 BB example5
23 BB
Question from: https://stackoverflow.com/questions/65850925/using-powerquery-m-code-data-wrangling-how-to-fill-in-missing-value-from-groupe


1 Reply

My first thought was to suggest Fill Down, but your data does not lend itself to that. It seems you want to replace the nulls in each group with the most frequent value from that group. The query below does this for attribute 1:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // Count how often each attribute 1 value occurs within each group
    #"Grouped Rows" = Table.Group(Source, {"group", "attribute 1"}, {{"Count", each Table.RowCount(_), type number}}),
    // Take out the nulls so they cannot be picked as the most frequent value
    #"Filtered Rows" = Table.SelectRows(#"Grouped Rows", each ([attribute 1] <> null)),
    // Group again, sort by count (descending), and add an index. The row with Index = 1 holds the most frequent attribute for the group
    #"Grouped rows2" = Table.Group(#"Filtered Rows", {"group"}, {{"NiceTable", each Table.AddIndexColumn(Table.Sort(_, {{"Count", Order.Descending}}), "Index", 1, 1), type table}}),
    #"Expanded NiceTable" = Table.ExpandTableColumn(#"Grouped rows2", "NiceTable", {"attribute 1", "Index"}, {"NiceTable.attribute 1", "NiceTable.Index"}),
    #"Filtered Rows1" = Table.SelectRows(#"Expanded NiceTable", each ([NiceTable.Index] = 1)),
    // Merge this into the original table, then add a custom column that replaces nulls with the group's most frequent value
    #"Merged Queries" = Table.NestedJoin(Source, {"group"}, #"Filtered Rows1", {"group"}, "FR", JoinKind.LeftOuter),
    #"Expanded FR" = Table.ExpandTableColumn(#"Merged Queries", "FR", {"NiceTable.attribute 1"}, {"NiceTable.attribute 1"}),
    #"Added Custom" = Table.AddColumn(#"Expanded FR", "Custom", each if [attribute 1] = null then [NiceTable.attribute 1] else [attribute 1])
in
    #"Added Custom"
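
The query above fills only attribute 1 and writes the result to a new Custom column. If all three attribute columns need the same treatment, one possible variation is sketched below. It is not part of the original answer: it swaps the count/sort/index steps for List.Mode, and the sample rows and the names FillColumns, ModeOrNull and FillGroup are made up for illustration (the placement of values in the sample rows is an assumption, since the question's table lost its blank cells). In a real workbook, Source would be the Excel.CurrentWorkbook() call used above.

```
let
    // Hypothetical sample rows; replace with Excel.CurrentWorkbook(){[Name="Table1"]}[Content]
    Source = #table(
        {"id", "group", "attribute 1", "attribute 2", "attribute 3"},
        {
            {4, "AA", "example1", "example2", null},
            {8, "AA", null, "example2", null},
            {9, "AA", "example1", null, "example1"},
            {15, "AB", null, null, null}
        }
    ),
    // Columns whose nulls should be filled (assumed names, taken from the question's header row)
    FillColumns = {"attribute 1", "attribute 2", "attribute 3"},
    // Most frequent non-null value in a list, or null when the list holds only nulls
    ModeOrNull = (values as list) as any =>
        let
            nonNull = List.RemoveNulls(values)
        in
            if List.IsEmpty(nonNull) then null else List.Mode(nonNull),
    // For one group's rows, replace the nulls in each listed column with that column's mode
    FillGroup = (t as table) as table =>
        List.Accumulate(
            FillColumns,
            t,
            (state, col) =>
                let
                    mode = ModeOrNull(Table.Column(state, col))
                in
                    Table.ReplaceValue(state, null, mode, Replacer.ReplaceValue, {col})
        ),
    // Apply FillGroup to every group, then stack the filled groups back into one table
    Grouped = Table.Group(Source, {"group"}, {{"Rows", FillGroup, type table}}),
    Result = Table.Combine(Grouped[Rows])
in
    Result
```

Unlike the Custom column approach, this replaces the nulls in place, so no extra columns are left to remove afterwards. A group with no non-null value in a column (such as AB in the sample rows) keeps its nulls. When two values are tied for most frequent, the choice is whatever List.Mode returns, which may differ from the row picked by the sort-and-index steps in the query above.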
