## Probability and Computing: Chapter 2 Exercises (Cont. 4)

Exercise 2.34: We roll a standard fair dice over and over. What is the expected number of rolls until the first pair of consecutive sixes appears? (Hint: The answer is not 36.)

Solution: Let $X$ be the expected number of rolls until the first pair of consecutive sixes appears,$Y_{i}$ the random variable takes on value $1$ if the $i$th roll is a $6$ and $0$ otherwise.

$Y_{i}= \left\{ \begin{array}{ll} 1 & \text{if } i \text{th roll is } 6 \\ 0 & \text{otherwise} \end{array} \right.$

$E[X] = Pr(Y_{1} = 0)E[X \mid Y_{1}=0] + Pr(Y_{1}=1)E[X \mid Y_{1} = 1]$

$=\frac{5}{6}(1+E[x]) + \frac{1}{6}E[X \mid Y_{1} = 1] \qquad (1)$

$E[X \mid Y_{1} = 1] = Pr(Y_{2} =0)E[X \mid (Y_{1}=1)\cap (Y_{2} = 0)] +$

$Pr(Y_{2} =1)E[X \mid (Y_{1}=1)\cap (Y_{2} = 1)]$

$= \frac{5}{6}(2 + E[X]) + \frac{1}{6}\cdot 2$

$= 2 + \frac{5}{6}E[X]$

$(1)$ becomes  $E[X] = \frac{5}{6}(1+E[x]) +\frac{1}{6}(2 + \frac{5}{6}E[X])$

$\Rightarrow E[X] = 42$

Exercise 2.25: A blood test is being performed on $n$ individuals. Each person can tested separately, but this is expensive. Pooling can decrease the cost. The blood samples of $k$ people can be pooled and analyzed together. If the test is negative, this one test suffices for the group of $k$ individuals. If the test is positive, then each of the $k$ person must be test separately and thus $k+1$ total tests are required for the $k$ people.
Suppose that we create $n/k$ disjoint groups of $k$ people(where $k$ divides $n$) and use the pooling method. Assume that each person has a positive result on the test independently with probability $p$.

(a) What is the probability that the test for a pooled sample of $k$ people will be positive.

(b) What is the expected number of tests necessary?

(c) Describe how to find the best value of $k$.

(d) Give an inequality that shows for what values of $p$ pooling is better than just testing every individual.

Solution:

(a) $(1-p)^k$ is the probability that the test for a pooled sample of $k$ will be negative, so $1-(1-p)^k$ is the probability that the test will be positive.

(b) Let $m = \frac{n}{k}$. Let $X_{i}$ be random variable that take on $1$ if the test for the $i$th group is positive and $0$ if the test is negative. Let $Y= \sum_{i=1}^{m}X_{i}$$X$ be the total number of tests needed.
It easy to see that $X = m + k\sum_{i=1}^{m}X_{i} = m + kY$.
$Y$ is a binomial random variable with parameter $m$ and $1-(1-p)^{k}$, thus $E[Y]=m[1-(1-p)^{k}]$.
This yields $E[X] = E[m+kY] = m+kE[Y]= m+km[1-(1-p)^k]$
$n +\frac{n}{k}-n(1-p)^{k} = n[1+\frac{1}{k}-(1-p)^{k}]$.

(c) We will find among divisors of $n$ the value that minimizes $f = \frac{1}{k}-(1-p)^{k}$.
If $k$ is that value then we have $f_{k} < f_{k+1}$ and $f_{k} < f_{k-1}$.

$f_{k} < f_{k-1} \Leftrightarrow \frac{1}{k}-(1-p)^{k} < \frac{1}{k-1} - (1-p)^{k-1}$

This leads to $k(k-1) < \frac{1}{p(1-p)^{k-1}}$.

$f_{k} < f_{k+1} \Leftrightarrow \frac{1}{k} - (1-p)^{k} < \frac{1}{k+1} - (1-p)^{k+1}$

This leads to $k(k+1) > \frac{1}{p(1-p)^{k}}$.

Then we obtain $k(k+1) > \frac{1}{1-p}\frac{1}{p(1-p)^{k-1}} > \frac{1}{1-p}k(k-1)$.

This leads to $0 < k < \frac{2-p}{p}$         $(1)$

If we have value of $p$ then for all divisors $k$ of $n$ that satisfy $(1)$, we calculate

$f_{k}$ and then choose $k$ corrersponding to the smallest $f$.

For example, if $p = 0.1$ then $2 \le k \le 18$. But if  $p = 0.01$ then $2 \le k \le 198$ (too large).

/* Wrong

(d) The total number of tests needed if we test every individual is $n$. Assume that we choose $k$ to be the greatest divisor of $n$, so that we don’t need to worry about $k$.
We have $E[X]= n[1 + \frac{1}{k}-(1-p)^{k}]$.
Pooling is better than just testing every individual if
$\frac{1}{k} - (1-p)^{k} <0$
$\frac{1}{k}<(1-p)^{k}$
$-lnk
$-\frac{lnk}{k}<\text{ln}(1-p)$

$e^{-\frac{lnk}{k}}<1-p$
$p < 1 - e^{-\frac{lnk}{k}} = 1- \frac{1}{\sqrt[k]{k}}$.

*/

### 3 responses to this post.

1. Posted by muie geoana on December 12, 2009 at 11:04 am

Part c is wrong.

You are differentiating by k, so d/dx ((1-p)^k) = (1-p)^k ln(1-p), not k (1-p)^(k-1) as you said.