It works mathematically. Can be proven by induction.
Clearly works for n = 1 element satisfying p.
For n+1 elements, we will choose the element n+1 with probability 1/(n+1), so its probability is OK. But how does that effect the end probability of choosing one of the prior n elements?
Each of the prior n had a chance of being selected with probability 1/n, until we found element n+1. Now, after finding n+1, there is a 1/(n+1) chance that element n+1 is chosen, so there is a n/(n+1) chance that the previously chosen element remains chosen. Which means its final probability of being the chosen after n+1 finds is 1/n * (n/n+1) = 1/n+1 -- which is the probability we want for all n+1 elements for uniform distribution.
If it works for n = 1, and it works for n+1 given n, then it works for all n.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…