Run them all through a min-heap of size 100: for each input number k
, replace the current min m
with max(k, m)
. Afterwards the heap holds the 100 largest inputs.
A search engine like Lucene can use this method, with refinements, to choose the most-relevant search answers.
Edit: I fail the interview -- I got the details wrong twice (after having done this before, in production). Here's code to check it; it's almost the same as Python's standard heapq.nlargest()
:
import heapq
def funnel(n, numbers):
if n == 0: return []
heap = numbers[:n]
heapq.heapify(heap)
for k in numbers[n:]:
if heap[0] < k:
heapq.heapreplace(heap, k)
return heap
>>> funnel(4, [3,1,4,1,5,9,2,6,5,3,5,8])
[5, 8, 6, 9]
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…