A bag of N balls has black and white balls. We define a function f such that it is 1 for a black ball and 0 for a white ball, and \frac{1}{N}\sum_i f(x_i) = p, where x_i could be either a black or white ball. If we draw balls in groups G of n balls, what would the mathematical expectation of the random variableNow, \frac{1}{n}\sum_{i\in G} f(x_i) is a random variable because the withdrawal of balls is random. Its expectation is E[Y]=\frac{1}{n} E[\sum_{i\in G} f(x_i)]. Now, since f marks balls as 0 or 1, this is a variable that counts black withdrawals. Let's see how we can make groups of black and white balls.
\frac{1}{n}\sum_{i\in G} f(x_i)
Within a group of n balls, how can we withdrawn k black balls? There are Np black balls in the bag. Therefore
\left(\begin{array}{c} Np \\ k \end{array}\right)
To fill up the rest of the group, we need to pick up white balls. There are N-Np white balls and we need n-k. Therefore
\left(\begin{array}{c} N-Np \\ n-k \end{array}\right)
We combine the two parts of a possible group.
\left(\begin{array}{c} Np \\ k \end{array}\right) \left(\begin{array}{c} N-Np \\ n-k \end{array}\right)
Now, how many groups of n balls can we withdraw altogether, disregarding the color of the balls?
\left(\begin{array}{c} N \\ n \end{array}\right)
To build the probability of a specific group of n balls, k black and n-k white drawn from the bag, we compute its frequency
\frac{\left(\begin{array}{c} Np \\ k \end{array}\right) \left(\begin{array}{c} N-Np \\ n-k \end{array}\right)}{\left(\begin{array}{c} N \\ n \end{array}\right)}
If we look for a probability distribution with this print, we find that it is the Hypergeometric distribution, which meets the description of the problem
[...] describes the probability of k successes in n draws from a finite population of size N containing m successes without replacement.Therefore, since cumulative successes \sum_{i\in G} f(x_i) would have a hypergeometric distribution with mean \frac{nNp}{N}=np. But we want \frac{1}{n}\sum_{i\in G} f(x_i), so that E[Y]=\frac{1}{n} E[\sum_{i\in G} f(x_i)]=\frac{np}{n}=p.
I wrote a program that takes N,n,p as inputs (and a random bag of white and black balls) and computes the expectation, which is always around p (depending on the generated random bag). Do not go very far with N, keep it under 30!
push <- function(i,n,N,ind,v) { es=0; res=list() res[[1]]=0 res[[2]]=0 if (i==1) { for (j in seq(1,N)) { ind[1]=j res2=push(i+1,n,N,ind,v) res[[1]]=res[[1]]+res2[[1]] res[[2]]=res[[2]]+res2[[2]] } print(res[[1]]) print(res[[2]]) return (res[[1]]/(res[[2]]*n)) } if (i>=n+1) { es=sum(v[ind]) #print(ind) res[[1]]=es res[[2]]=1 return (res) } if (ind[i-1]+1<=N) { for (j in seq(ind[i-1]+1,N)) { ind[i]=j res2=push(i+1,n,N,ind,v) res[[1]]=res[[1]]+res2[[1]] res[[2]]=res[[2]]+res2[[2]] } } return (res) } N=20 p=.2 v=(runif(N)<=p) * 1 ind=rep(0,N) n=5 push(1,n,N,ind,v)
No comments:
Post a Comment