
Interview problem: an unfriendly seating arrangement

The following problem and its solution follow the paper by H. D. Friedman et al. (1962). Problem: There are n seats in a row at a luncheonette and people sit down one at a time at random. They are unfriendly and so never sit next to one another (no moving over). What is the expected number of persons to sit down? Solution: Let $E_n$ be the expected number of persons seated when there are n consecutive seats. The first person picks the ith seat uniformly at random, which blocks seats $i-1$ and $i+1$ and leaves independent runs of $i-2$ and $n-i-1$ usable seats; this gives the recursion \begin{equation} E_n = 1 + \frac{1}{n}\sum_{i=1}^n \left(E_{i-2}+E_{n-i-1}\right)\,,\tag{1}\end{equation} with $E_{-1}=E_0=0$. What I learned from the paper is to solve the above recursion by a generating function. For this purpose, we define the generating function $F(x)\equiv \sum_{n=1}^{\infty} E_n \,x^n$ and rewrite the recursion as $n\,E_n = n + 2\sum_{i=1}^{n-2} E_i$. We have \begin{equation}F'(x) = \sum_{n=1}^{\infty} n\, E_n \,x^{n-1}= \sum_{n=1}^{\infty} n\,x...
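
As a quick numerical check, here is a short Python sketch (mine, not from the original post) that evaluates the simplified recursion $n\,E_n = n + 2\sum_{i=1}^{n-2} E_i$ and compares it with a direct Monte Carlo simulation of the seating process:

    import random

    def expected_seated(n_max):
        # E_n via the simplified recursion n*E_n = n + 2*sum_{i=1}^{n-2} E_i
        E = [0.0] * (n_max + 1)
        partial = 0.0  # holds sum_{i=1}^{n-2} E_i when E[n] is computed
        for n in range(1, n_max + 1):
            E[n] = 1.0 + 2.0 * partial / n
            if n >= 2:
                partial += E[n - 1]  # extend the running sum for the next n
        return E

    def simulate(n, trials=200_000):
        # Seats are tried in a uniformly random order and taken only when
        # both neighbors are empty; this realizes the same random process.
        total = 0
        for _ in range(trials):
            occupied = [False] * (n + 2)  # padded to skip boundary checks
            order = list(range(1, n + 1))
            random.shuffle(order)
            for s in order:
                if not (occupied[s - 1] or occupied[s + 1]):
                    occupied[s] = True
                    total += 1
        return total / trials

    print(expected_seated(10)[10])  # exact E_10 from the recursion
    print(simulate(10))             # Monte Carlo estimate, should agree

For large n, both approach the known packing density $E_n/n \to (1-e^{-2})/2 \approx 0.432$.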

Interview problems: counter examples in probability

Problems: (1) Zero correlation does not imply independence. (2) For two marginally normal random variables, zero correlation does not imply independence. (3) Pairwise independence does not imply joint independence. (4) For three events A, B, C: A is independent of B and A is independent of C. Is A independent of the intersection of B and C? (5) Convergence in distribution does not imply convergence in probability. (6) Convergence in probability does not imply almost sure convergence. Solution: (1) Let X take values -1, 0, +1 with equal probability and $Y=X^2$; then $\mathbb{E}(XY)=\mathbb{E}(X^3)=0=\mathbb{E}(X)\mathbb{E}(Y)$, yet Y is a function of X. (2) X is standard Gaussian. Let W = +1 or -1 with equal probability, independent of X, and define Y=WX. X and Y are both standard Gaussian and uncorrelated, but dependent. Note: only for jointly Gaussian random variables does zero correlation imply independence. (3) Throw two dice. A: the sum is 7, B: the first die shows 3, C: the second die shows 4. (4) No. Throw a die. A: get an even number, B: get 1 or 2, ...
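
A short numerical illustration of counterexample (2) (my own sketch, not part of the original post): Y = WX is uncorrelated with X even though $|Y|=|X|$ exactly, so the two are clearly dependent.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000
    X = rng.standard_normal(n)
    W = rng.choice([-1.0, 1.0], size=n)   # random sign, independent of X
    Y = W * X                             # marginally standard normal

    print(np.corrcoef(X, Y)[0, 1])        # ~ 0: uncorrelated
    print(np.corrcoef(X**2, Y**2)[0, 1])  # exactly 1: Y^2 = X^2, dependent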

Interview problems: Bayes

Problems: Suppose a family has two children: (1) If the first child is a boy, what is the probability that the second child is a boy? (2) If one child is a boy, what is the probability that the other child is a boy? (3) If one child is a boy, you make a random guess of the boy's name and it happens that your guess is correct. Then what is the probability that the other child is a boy (assume two boys can have the same name and a girl cannot have a boy's name)? Solution: (1) 1/2. Independent events. (2) 1/3. There are three events with equal probability: (boy, boy), (boy, girl) and (girl, boy). Compared to (1), here the one boy can be either the first child or the second child. (3) Let $p$ be the probability that the random guess of a boy's name is correct. In general, $p\ll 1$, \begin{eqnarray}&&\mathbb{P}(\text{correct guess}|\text{two boys})=1-(1-p)^2=2p-p^2\approx 2p\,, \\ &&\mathbb{P}(\text{correct guess}|\text{one boy and one g...
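
For (3), a simulation makes the move from 1/3 toward 1/2 concrete. The sketch below is mine, not from the original post; it assumes a hypothetical pool of 20 equally likely boy names, so a guess matches a given boy with probability $p=1/20$, and the exact posterior is $\frac{(1/3)(2p-p^2)}{(1/3)(2p-p^2)+(2/3)p}=\frac{2-p}{4-p}\approx 0.494$.

    import random

    def simulate(trials=1_000_000, n_names=20):
        names = range(n_names)  # hypothetical equally likely boy names
        correct = correct_and_two_boys = 0
        for _ in range(trials):
            # each of two children is a boy w.p. 1/2; boys get random names
            boys = [random.choice(names)
                    for _ in range(2) if random.random() < 0.5]
            if not boys:
                continue  # the condition "one child is a boy" fails
            if random.choice(names) in boys:  # the random name guess hits
                correct += 1
                correct_and_two_boys += (len(boys) == 2)
        return correct_and_two_boys / correct

    print(simulate())  # ~ 0.494, close to 1/2 rather than 1/3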

Interview problem: distance between median and mean

Problem: Provide a bound on the distance between the median and the mean. Solution: Let $X$ be a random variable and $M(X)$ its median. Besides $\mathbb{E}(X)=\text{argmin}_c\,\mathbb{E}\left(X-c\right)^2$, we have \begin{equation}M(X)=\text{argmin}_c\,\mathbb{E}\left|X-c\right|\,.\end{equation} As a result, we can prove, using Jensen's inequality for the first and third lines and the minimizing property of the median for the second: \begin{eqnarray} \left|\mathbb{E}(X)-M(X)\right| &\leq& \mathbb{E}\left|X-M(X)\right| \\ &\leq& \mathbb{E}\left|X-\mathbb{E}(X)\right| \\ &\leq& \sqrt{\mathbb{E}\left(X-\mathbb{E}(X)\right)^2}=\sqrt{\text{Var}(X)}\,.\end{eqnarray}
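
A quick numerical sanity check (mine, not part of the original post) on a skewed distribution: for the unit exponential, the mean is 1, the median is $\ln 2 \approx 0.693$, and the standard deviation is 1, so the gap of about 0.307 sits well inside the bound.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.exponential(scale=1.0, size=1_000_000)
    gap = abs(x.mean() - np.median(x))   # ~ |1 - ln 2| ~ 0.307
    print(gap <= x.std(), gap, x.std())  # True: |mean - median| <= std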

Interview problem: Welford's online algorithm

Problem: Provide an online algorithm to compute the sample mean and sample variance of streaming data. Solution: Let $x_1, x_2, \cdots, x_n, \cdots$ be the stream of data; the sample mean $\bar{x}_n \equiv\sum_{i=1}^n x_i / n$ can be computed by \begin{equation}\bar{x}_n=\bar{x}_{n-1} +\frac{x_n - \bar{x}_{n-1}}{n}\,.\tag{1}\end{equation} For the sample variance $s^2_{n}\equiv \frac{1}{n-1}\sum_{i=1}^n\left(x_i-\bar{x}_n\right)^2$, a naive method is to compute \begin{equation} s^2_n = \frac{1}{n-1}\left(\sum_{i=1}^n x_i^2 - \frac{\left(\sum_{i=1}^n x_i\right)^2}{n}\right)\end{equation} from the running sum and the running sum of squares. However, this method is numerically unstable since it may involve the cancellation of two nearly equal numbers. Instead, as suggested in this wiki, we use Welford's algorithm: \begin{eqnarray} M_{2,n} &=& M_{2, n-1}+\left(x_n-\bar{x}_{n-1}\right)\left(x_n-\bar{x}_n\right)\,, \\ s_n^2 &=&\frac{M_{2, n}}{n-1}\,.\tag{2}\end{eqnarray} Pro...
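
A minimal Python sketch of updates (1) and (2); the class name and interface are my own illustration, not from the post or the wiki.

    class RunningStats:
        """Welford's online mean/variance: one O(1) update per observation."""

        def __init__(self):
            self.n = 0
            self.mean = 0.0
            self.m2 = 0.0  # M_{2,n}: running sum of squared deviations

        def update(self, x):
            self.n += 1
            delta = x - self.mean               # x_n - xbar_{n-1}
            self.mean += delta / self.n         # recursion (1)
            self.m2 += delta * (x - self.mean)  # M_{2,n} update

        def variance(self):
            return self.m2 / (self.n - 1)       # sample variance (2), n >= 2

    stats = RunningStats()
    for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        stats.update(x)
    print(stats.mean, stats.variance())  # 5.0 and 32/7 ~ 4.571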

Interview problems: correlation

Problems: (1) Why is a correlation matrix positive semi-definite? (2) For n random variables, if all pairwise correlations are $\rho$, what is the range of $\rho$? (3) For three random variables X, Y and Z, the correlation between X and Y is $a$ and the correlation between X and Z is $b$. What is the range of the correlation between Y and Z? Solve it in three ways. (4) For two covariance matrices A and B, is AB a covariance matrix when AB=BA? Solution: (1) Since $\text{Var}(\sum_{i} a_iX_i)=\sum_{i,j}a_ia_j\text{Cov}(X_i, X_j) \geq 0$, a covariance matrix is positive semi-definite. So is a correlation matrix, since it is the covariance matrix of the standardized variables (all with unit variance). (2) The correlation matrix is of the form $\boldsymbol{\Sigma}=(1-\rho)\mathbf{I}+\rho\, \mathbf{e}\mathbf{e}^T$ where $\mathbf{e}\equiv[1, \cdots, 1]^T$. $\mathbf{e}\mathbf{e}^T$ has one eigenvector $\mathbf{e}$ with eigenvalue $n$ and $n-1$ eigenvectors th...
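
A numerical check of (2) (my own sketch, not from the post): build $\boldsymbol{\Sigma}=(1-\rho)\mathbf{I}+\rho\,\mathbf{e}\mathbf{e}^T$ and confirm it is positive semi-definite exactly when $-1/(n-1)\leq\rho\leq 1$.

    import numpy as np

    n = 5
    for rho in (-0.30, -1.0 / (n - 1), 0.0, 0.5, 1.0):
        sigma = (1 - rho) * np.eye(n) + rho * np.ones((n, n))
        eigs = np.linalg.eigvalsh(sigma)  # 1+(n-1)*rho once, 1-rho repeated
        print(f"rho={rho:+.2f}  min eig={eigs.min():+.3f}  "
              f"PSD={bool(eigs.min() >= -1e-12)}")
    # only rho = -0.30 < -1/(n-1) = -0.25 fails the PSD test here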

Interview problems: Gaussian

[Figure: the region $\{X>3Y,\ X>0\}$ in the (X, Y) plane is a wedge of angle $\theta$ at the origin.]
Problems: (1) X, Y are i.i.d. standard normal $\mathcal{N}(0, 1)$. What is $\mathbb{P}(X >3Y| X > 0)$? (2) X, Y, Z are standard normal $\mathcal{N}(0, 1)$ with $\rho_{X, Y}=\rho_{X, Z}=\rho_{Y, Z}=0.5$. What is $\mathbb{P}(X+Y>Z+2)$? (3) X, Y are i.i.d. standard normal $\mathcal{N}(0, 1)$. What is $\mathbb{E}[X|X+Y=1]$? Solution: (1) The joint pdf of (X, Y) is rotation invariant. Therefore, as shown in the figure, $$\mathbb{P}(X >3Y| X > 0)=\frac{\theta}{\pi} = \frac{1}{2} + \frac{\arctan 1/3}{\pi}.$$ Similarly, we can compute, for example, $\mathbb{P}(X >5Y)=1/2$ or $\mathbb{P}(X>0| X+Y>0)=3/4$. (2) Let $W\equiv X+Y-Z$. Then $\text{Var}(W)=\text{Cov}(X+Y-Z, X+Y-Z)=3+2\rho_{X,Y}-2\rho_{X,Z}-2\rho_{Y,Z}=2$. As a result, $W\sim \mathcal{N}(0, 2)$ and $\mathbb{P}(W>2)=1-\Phi(\sqrt{2})$. (3) $\mathbb{E}[X|X+Y=1]=\mathbb{E}[X+Y-Y|X+Y=1]=1-\mathbb{E}[Y|X+Y=1]=1-\mathbb{E}[X|X+Y=1]$, and thus $\mathbb{E}[X|X+Y=1]=1/2$. For a more general case, such a conditional expectation can be computed using two properties of Gaus...
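
Monte Carlo checks of all three answers (a sketch of mine, not from the post); for (2), the correlated triple is built from a hypothetical common factor F so that every pairwise correlation is 0.5.

    import numpy as np
    from math import erf

    rng = np.random.default_rng(0)
    n = 2_000_000
    X, Y = rng.standard_normal((2, n))

    # (1) P(X > 3Y | X > 0) = 1/2 + arctan(1/3)/pi ~ 0.6024
    print(np.mean(X[X > 0] > 3 * Y[X > 0]), 0.5 + np.arctan(1 / 3) / np.pi)

    # (2) X = sqrt(1/2) F + sqrt(1/2) eps gives rho = 0.5 for every pair
    F = rng.standard_normal(n)
    Xc, Yc, Zc = np.sqrt(0.5) * F + np.sqrt(0.5) * rng.standard_normal((3, n))
    print(np.mean(Xc + Yc > Zc + 2), 0.5 * (1 - erf(1.0)))  # 1 - Phi(sqrt(2))

    # (3) E[X | X + Y = 1] via a thin slab around X + Y = 1
    slab = np.abs(X + Y - 1) < 0.01
    print(X[slab].mean())  # ~ 0.5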