Given the following problem:
"Store the largest 5000 numbers from a stream of numbers"
The solution which springs to mind is a binary search tree maintaining a count of the number of nodes in the tree and a reference to the smallest node once the count reaches 5000. When the count reaches 5000, each new number to add can be compared to the smallest item in the tree. If greater, the new number can be added then the smallest removed and the new smallest calculated (which should be very simple already having the previous smallest).
My concern with this solution is that the binary tree is naturally going to get skewed (as I'm only deleting on one side).
Is there a way to solve this problem which won't create a terribly skewed tree?
In case anyone wants it, I've included pseudo-code for my solution so far below:
process(number)
{
if (count == 5000 && number > smallest.Value)
{
addNode( root, number)
smallest = deleteNodeAndGetNewSmallest ( root, smallest)
}
}
deleteNodeAndGetNewSmallest( lastSmallest)
{
if ( lastSmallest has parent)
{
if ( lastSmallest has right child)
{
smallest = getMin(lastSmallest.right)
lastSmallest.parent.right = lastSmallest.right
}
else
{
smallest = lastSmallest.parent
}
}
else
{
smallest = getMin(lastSmallest.right)
root = lastSmallest.right
}
count--
return smallest
}
getMin( node)
{
if (node has left)
return getMin(node.left)
else
return node
}
add(number)
{
//standard implementation of add for BST
count++
}
Question&Answers:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…