Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
226 views
in Technique[技术] by (71.8m points)

optimization - Optimal solution for this interview question

Can anyone provide optimal solution with Time Complexity? Very appreciated!

Determine bandwidth required to run prescheduled video streams.
There can be 10s of thousands of streams and all the scheduling data is available at the start.
There may be time intervals when no streams are running

Inputs

int startTime; //start time for the stream  
int duration; //how long the stream runs  
int bandwidth; // consumed bandwidth from start to end

Output:

list[int] totalBW; // bandwidth needed at every time instant

Example:
input (list may not sorted)

[ [0,2,10], [1,2,10], [2,1,20] ] 

output

[10, 20, 30]

Explanation

At time 0: only the first element needs bandwidth 10 =>  [10, ...]
At time 1: first two elements need bandwidth 10 + 10 => [10, 20, ...]
At time 2: the second and third element need bandwidth 10 + 20 => [10, 20, 30]

The brute force approach using python:

def solution(streams):
    _max = max(streams, key=lambda x: (x[0]+x[1]))
    _max_time = _max[0] + _max[1] 
    res = [0] * _max_time
        
    for s, d, bw in streams:
        for i in range(d):
            res[s+i] += bw
    return res

Is there any more efficient approach?

question from:https://stackoverflow.com/questions/65895229/optimal-solution-for-this-interview-question

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Is there any more efficient approach?

Yes.

The first step would be to transform the original data into a set of "at time = T, bandwidth changes by N" events, sorted in chronological order, and simplified by merging events that happen at the same time.

For your example, if the input is [ [0,2,10], [1,2,10], [2,1,20] ] then it would be broken up into:

** [ [0,2,10] **
    At 0, bandwidth += 10 
    At 2, bandwidth += -10

** [1,2,10] **
    At 1, bandwidth += 10
    At 3, bandwidth += -10

** [2,1,20] **
    At 2, bandwidth += 20
    At 3, bandwidth += -20

..then sorted and simplified (merging events that happen at the same time - e.g. bandwidth += -10, bandwidth += 20 becomes a single bandwidth += 10) to get:

At 0, bandwidth += 10 
At 1, bandwidth += 10
At 2, bandwidth += 10
At 3, bandwidth += -30

From there it's a simple matter of generating the final array from the sorted list:

 10, 20, 30, 0

To understand why this is more efficient, imagine what happens if the time is tracked with higher precision (e.g. maybe milliseconds instead of seconds) and the input is [ [0,2000,10], [1000,2000,10], [2000,1000,20] ] instead. For my approach generating the final array would become an outer loop with 4 iterations and an inner loop that can be a highly optimized memset() (C) or rep stosd (80x86 assembly) or np.full() (Python with NumPy); and for your approach the outer loop needs 30000 iterations where the inner loop wastes a huge amount of time repeating a linear search that (for most iterations of the outer loop) finds the same answer as the previous iteration of the outer loop.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...