What you want is an implementation of the multivariate hypergeometric distribution.
I don't know of one in numpy or scipy, but it might already exist out there somewhere.
I contributed an implementation of the multivariate hypergeometric distribution to numpy 1.18.0; see numpy.random.Generator.multivariate_hypergeometric
.
For example, to draw 15 samples from an urn containing 12 red, 4 green and 18 blue marbles, and repeat the process 10 times:
In [4]: import numpy as np
In [5]: rng = np.random.default_rng()
In [6]: colors = [12, 4, 18]
In [7]: rng.multivariate_hypergeometric(colors, 15, size=10)
Out[7]:
array([[ 5, 4, 6],
[ 3, 3, 9],
[ 6, 2, 7],
[ 7, 2, 6],
[ 3, 0, 12],
[ 5, 2, 8],
[ 6, 2, 7],
[ 7, 1, 7],
[ 8, 1, 6],
[ 6, 1, 8]])
The rest of this answer is now obsolete, but I'll leave for posterity (whatever that means...).
You can implement it using repeated calls to numpy.random.hypergeometric
. Whether that will be more efficient than your implementation depends on how many colors there are and how many balls of each color.
For example, here's a script that prints the result of drawing from an urn containing three colors (red, green and blue):
from __future__ import print_function
import numpy as np
nred = 12
ngreen = 4
nblue = 18
m = 15
red = np.random.hypergeometric(nred, ngreen + nblue, m)
green = np.random.hypergeometric(ngreen, nblue, m - red)
blue = m - (red + green)
print("red: %2i" % red)
print("green: %2i" % green)
print("blue: %2i" % blue)
Sample output:
red: 6
green: 1
blue: 8
The following function generalizes that to choosing m
balls given an array colors
holding the number of each color:
def sample(m, colors):
"""
Parameters
----------
m : number balls to draw from the urn
colors : one-dimensional array of number balls of each color in the urn
Returns
-------
One-dimensional array with the same length as `colors` containing the
number of balls of each color in a random sample.
"""
remaining = np.cumsum(colors[::-1])[::-1]
result = np.zeros(len(colors), dtype=np.int)
for i in range(len(colors)-1):
if m < 1:
break
result[i] = np.random.hypergeometric(colors[i], remaining[i+1], m)
m -= result[i]
result[-1] = m
return result
For example,
>>> sample(10, [2, 4, 8, 16])
array([2, 3, 1, 4])