## Probability and Computing: Chapter 2 Exercises (Cont. 2)

Exercise 2.18: The following approach is often called reservoir sampling. Suppose we have a sequence of item passing by one at a time. We want to maintain a sample of one item with the property that it is uniformly distributed over all the items that we have seen at each step. Moreover, we want to accomplish this without knowing the total number of items in advance or storing all of the items that we see.
Consider the following algorithm, which store just one item in memory at all times. When the first item appears, it is stored in the memory. When the $k$th item appears, it replaces the item in memory with probability $1/k$. Explain why this algorithm solves the problem.

We will prove that right after the $k$th item had been replaced the item in memory with probability $1/k$, the item in memory is equally likely to be any of the items had appeared so far, that is the probability of the $i$th item be the item in memory is $1/k$ for $i=1,\ldots ,k$.
The statement can be proved by induction.

For $k = 1$, the $1$st item will be the item in memory with probability $1/k=1/1=1$.

Then suppose the statement is true for some $k=n$, that is after the $n$th item has been replaced the item in memory with probability $1/n$, the probability of the $i$th item be the item in memory is $1/n$ for $i=1,\ldots ,n$.

We will prove that the statement is also true for $k = n+1$, that is after the $n+1$th item has been replaced the item in memory with probability $\frac{1}{n+1}$, the probability of $i$th item be the item in memory is $\frac{1}{n+1}$ for $i = 0, \ldots ,n+1$.
Indeed, after being replaced, the item in memory can be:
–  the new $n+1$th item with probability $\frac{1}{n+1}$.
– the same item as the previous turn with probability $1-\frac{1}{n+1}=\frac{n}{n+1}$. But the same old item can be any item from $1$st to $n$th with probability $\frac{1}{n}$, therefore the item in memory in this case can be any item from $1$st to $n$th item with probability $\frac{1}{n}\cdot \frac{n}{n+1}=\frac{1}{n+1}$.
So the probability of $i$th item be the item in memory is $\frac{1}{n+1}$ for $i= 1, \ldots ,n+1$, thus by induction yields the rightness of the statement.

Exercise 2.19: Suppose that we modify the reservoir sampling algorithm so that, when the $k$th item appears, it replaces the item in memory with probability $1/2$. Describe the distribution of the item in memory.

In this version of the algorithm, let $X$ be random variable which is equal to $i$ if and only if the item in memory is the $i$th item.
We have the following probability distribution on $i=1,\ldots , n$:

$Pr(X=i)=(\frac{1}{2})^{n-i}$.

Exercise 2.21: Let $a_{1},a_{2},\ldots ,a_{n}$ be a random permutation of $\{1,2,\ldots ,n\}$, equally likely to be any of the $n!$ possible permutations. When sorting the list $a_{1},a_{2},\ldots ,a_{n}$, the element $a_{i}$ must move a distance of $\mid a_{i} - i\mid$ places from it current position to reach its position in the sorted order. Find

$E[\sum_{i=1}^{n}\mid a_{i}-i\mid ]$,

the expected total distance the elements will have to be moved.

Solution:

$E[\sum_{i=1}^{n} \mid a_{i} -i \mid] = \sum_{i=1}^{n}E[\mid a_{i} - i \mid]$

Now we calculate $E[\mid a_{i} - i\mid]$.

Because the input is equally likely to be any of $n!$ permutations, the position of $a_{i}$ is equally likely to be any number from $1$ to $n$. In other words, $Pr(a_{i} = j) =\frac{1}{n}\quad j=1,\ldots , n$

Then $E[\mid a_{i} - i\mid] = \sum_{j=1 ; j \ne i}^{n}\frac{1}{n}j$

$=\frac{n+1}{2} - \frac{i}{n}$.

Therefore

$E[\sum_{i=1}^{n}\mid a_{i} - i\mid]$

$=\sum_{i=1}^{n}(\frac{n+1}{2}- \frac{i}{n})$

$= \frac{n(n+1)}{2} -\frac{1}{n}\frac{n(n+1)}{2}$

$= \frac{n^{2}-1}{2}$.

### 5 responses to this post.

1. Posted by Anonymous on May 26, 2012 at 11:21 pm

My answer for ex 21 is (n^2-1)/3