This is the solution to the problem posed a couple weeks earlier.
The trick is to realize we can write \[ f(x|N,M) = \sum^{M}_{k=1}\frac{(M)_{k}}{(N)_{k}}x^{k} \tag{1}\] where \((M)_{k} = M(M-1)(\dots)(M-(k-1))\) is the falling factorial. This permits us to rewrite the expected length of a run as \[ \mathbb{E}[L|N,M] = \left.\frac{\mathrm{d}}{\mathrm{d}x}f(x|N,M)\right|_{x=1}. \tag{2} \] But astute readers will recognize Eq (1) is Gauss's hypergeometric function \[ f(x|N,M) = \frac{M}{N}x {}_{2}F_{1}(1, 1-M;1-N;x) \tag{3} \] which permits us to deduce \[ \mathbb{E}[L|N,M] = \frac{M(1+N)}{(1+N-M))(1+N-(M-1))} \] which is the expected run for M red balls in an urn with N balls. This assumes \(M\gt2\), otherwise we can manually compute the expected run trivially.
HOWEVER, we need to normalize the probabilities, since we are considering a slightly different experiment: we are varying the sample size \(n\) from 1 to M (as opposed to varying the portion k of the fixed sample size n balls drawn are of the specified color). That is to say, we need to divide through by \(f(1|N,M)=M/(N+1-M)\) to give us \[ \mathbb{E}[L|N,M] = \frac{(1+N)}{(1+N-(M-1))} \tag{4} \] which, if we fix \(p=M/N\) as both \(M\to\infty\) and \(N\to\infty\), recovers the geometric distribution. [The expected value is approximately \(N/(N-M) = (N - M + M)/(N - M)\) which is \(1 + (M/(N-M)) = 1 + (p/(1-p)) = 1/(1-p)\). Then use the geometric distribution with \(1-p\) for the probability of "success", so we count the number of "failures" with the geometric distribution to coincide with the length of a run in our situation.] (This is an example of a sanity check.)
I suspect there's a way to solve this puzzle elegantly without invoking hypergeometric functions, but this accidentally fell into my lap (literally).
Homework 1. What if we fix \(n\) draws from an urn, sampled without replacement. Suppose this urn contains K red balls, the remainder are white, for a total of N balls. What's the expected first run's length? I.e., if you draw a white, what's the number white balls you expect to draw until you draw a red? And if you first draw a red ball, then what's the number of red balls you expect to draw until drawing a white?
Homework 2. If we consider sampling without replacement from an urn (possibly with balls of multiple colors), then what does the distribution look like on finite sequences of runs of different colors? I.e., specifying a sample as drawing \(n_{1}\) of a color \(c_{1}\), then \(n_{2}\) of \(c_{2}\), then..., such that \(n_{1}+\dots + n_{k}=n\). What does the sample space look like? What's the probability distribution look like?
No comments:
Post a Comment