Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
403 views
in Technique[技术] by (71.8m points)

powerquery - fill time gaps with power query

I have following data

   start        stop       status
+-----------+-----------+-----------+
| 09:01:10  | 09:01:40  |  active   |
| 09:02:30  | 09:04:50  |  active   |
| 09:10:01  | 09:11:50  |  active   |
+-----------+-----------+-----------+

I want to fill in the gaps with "passive"

   start        stop       status
+-----------+-----------+-----------+
| 09:01:10  | 09:01:40  |  active   |
| 09:01:40  | 09:02:30  |  passive  |
| 09:02:30  | 09:04:50  |  active   |
| 09:04:50  | 09:10:01  |  passive  |
| 09:10:01  | 09:11:50  |  active   |
+-----------+-----------+-----------+

How can I do this in M Query language?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You could try something like the below (my first two steps someTable and changedTypes are just to re-create your sample data on my end):

let
    someTable = Table.FromColumns({{"09:01:10", "09:02:30", "09:10:01"}, {"09:01:40", "09:04:50", "09:11:50"}, {"active", "active", "active"}}, {"start","stop","status"}),
    changedTypes = Table.TransformColumnTypes(someTable, {{"start", type duration}, {"stop", type duration}, {"status", type text}}),
    listOfRecords = Table.ToRecords(changedTypes),
    transformList = List.Accumulate(List.Skip(List.Positions(listOfRecords)), {listOfRecords{0}}, (listState, currentIndex) =>
        let
            previousRecord = listOfRecords{currentIndex-1},
            currentRecord = listOfRecords{currentIndex},
            thereIsAGap = currentRecord[start] <> previousRecord[stop],
            recordsToAdd = if thereIsAGap then {[start=previousRecord[stop], stop=currentRecord[start], status="passive"], currentRecord} else {currentRecord},
            append = listState & recordsToAdd
        in
            append
    ),
    backToTable = Table.FromRecords(transformList, type table [start=duration, stop=duration, status=text])
in
    backToTable

This is what I start off with (at the changedTypes step):

Input table

This is what I end up with:

Output table

To integrate with your existing M code, you'll probably need to:

  • remove someTable and changedTypes from my code (and replace with your existing query)
  • change changedTypes in the listOfRecords step to whatever your last step is called (otherwise you'll get an error if you don't have a changedTypes expression in your code).

Edit:

Further to my answer, what I would suggest is:

Try changing this line in the code above:

listOfRecords = Table.ToRecords(changedTypes),

to

listOfRecords = List.Buffer(Table.ToRecords(changedTypes)),

I found that storing the list in memory reduced my refresh time significantly (maybe ~90% if quantified). I imagine there are limits and drawbacks (e.g. if the list can't fit), but might be okay for your use case.

Peformance graphed

Do you experience similar behaviour? Also, my basic graph indicates non-linear complexity of the code overall unfortunately.

Final note: I found that generating and processing 100k rows resulted in a stack overflow whilst refreshing the query (this might have been due to the generation of input rows and may not the insertion of new rows, don't know). So clearly, this approach has limits.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...