There is an O(N*logC)
algorithm, where C
is proportional to the maximum element value of the array. Comparing with some more complicated algorithms in recent papers, this algorithm is easier to understand, and can be implemented in a short time, and still fast enough in practical.
For simplicity, We assume there is at least one non-negative integers in the array.
The algorithm is based on binary search. At first, we can find that the final answer must be in the range [0, max(A)]
, and we half this interval in each iteration, until it is small enough (10-6 for example). In each iteration, assume the available interval is [a,b]
, we need to check whether the maximum average is no less than (a+b)/2
. If so, we get a smaller interval [(a+b)/2, b]
, or else we get [a, (a+b)/2]
.
Now the problem is: Given a number K
, how to check that the final answer is at least K
?
Assume the average is at least K
, there exist some i
, j
such that (A[i] + A[i+1] + ... + A[j]) / (j - i + 1) >= K
. We multiply both sides by (j-i+1)
, and move the right side to left, and we get (A[i] - K) + (A[i+1] - K) + ... + (A[j] - K) >= 0
.
So, let B[i] = A[i] - K
, we only need to find an interval [i, j]
(j - i + 1 > L
) such that B[i] + ... + B[j] >= 0
. Now the problem is: Given array B
and length L
, we are to find an interval of maximum sum whose length is more than L
. If the maximum sum is >= 0
, the original average number K
is possible.
The second problem can be solved by linear scan. Let sumB[0] = 0
, sumB[i] = B[1] + B[2] + ... + B[i]
. For each index i
, the max-sum interval which ended at B[i]
is sumB[i] - min(sumB[0], sumB[1], ..., sumB[i-L-1])
. When scanning the array with increasing i
, we can maintain the min(sumB[0], ..., sumB[i-L-1])
on the fly.
The time complexity of the sub-problem is O(N)
. And we need O(logC)
iterations, so the total complexity is O(N*logC)
.
P.s. This kinds of "average problem" belongs to a family of problems called fractional programming. The similar problems are minimum average-weighted spanning tree, minimum average-weighted cycle, etc.
P.s. again. The O(logC)
is a loose bound. I think we can reduce it by some careful analysis.