Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
299 views
in Technique[技术] by (71.8m points)

Algorithm to select a single, random combination of values?

Say I have y distinct values and I want to select x of them at random. What's an efficient algorithm for doing this? I could just call rand() x times, but the performance would be poor if x, y were large.

Note that combinations are needed here: each value should have the same probability to be selected but their order in the result is not important. Sure, any algorithm generating would qualify, but I wonder if it's possible to do this more efficiently without the random order requirement.

How do you efficiently generate a list of K non-repeating integers between 0 and an upper bound N covers this case for permutations.

question from:https://stackoverflow.com/questions/2394246/algorithm-to-select-a-single-random-combination-of-values

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Robert Floyd invented a sampling algorithm for just such situations. It's generally superior to shuffling then grabbing the first x elements since it doesn't require O(y) storage. As originally written it assumes values from 1..N, but it's trivial to produce 0..N and/or use non-contiguous values by simply treating the values it produces as subscripts into a vector/array/whatever.

In pseuocode, the algorithm runs like this (stealing from Jon Bentley's Programming Pearls column "A sample of Brilliance").

initialize set S to empty
for J := N-M + 1 to N do
    T := RandInt(1, J)
    if T is not in S then
        insert T in S
    else
        insert J in S

That last bit (inserting J if T is already in S) is the tricky part. The bottom line is that it assures the correct mathematical probability of inserting J so that it produces unbiased results.

It's O(x)1 and O(1) with regard to y, O(x) storage.

Note that, in accordance with the tag in the question, the algorithm only guarantees equal probability of each element occuring in the result, not of their relative order in it.


1O(x2) in the worst case for the hash map involved which can be neglected since it's a virtually nonexistent pathological case where all the values have the same hash


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...