My Favourite Martingale Problems

Polya’s Urn

At time 0, an urn contains 1 black and 1 white ball. At each time 1,2,3,…, a ball is chosen at random from the urn and is replaced together with a new ball of the same colour. Just after time n, there are therefore n+2 balls in the urn, of which B_n+1 are black, where B_n is the number of black balls chosen by time n.

Let M_n=(B_n+1)/(n+2), the proportion of black balls in the urn just after time n. Prove that (relative to a natural filtration, which you should specify) M is a martingale.

Prove that P(B_n=k) = \frac{1}{n+1} for 0 \leq k \leq n. What is the distribution of \Theta, where \Theta := \lim M_n?

Prove that for 0 < \theta < 1 ,

N_n^\theta : = \frac{(n+1)!}{B_n! (n-B_n)!} \theta^{B_n} (1-\theta)^{n-B_n}

defines a martingale.
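A quick simulation makes the uniformity of B_n easy to believe. This is an illustrative sketch (the function and parameter names are my own, not part of the problem):

```python
import random

def polya_bn(n, rng):
    """Run Polya's urn (1 black, 1 white at time 0) for n draws.
    Returns B_n, the number of black balls drawn by time n."""
    black, white = 1, 1
    drawn_black = 0
    for _ in range(n):
        # Draw a ball at random; replace it together with one of the same colour.
        if rng.random() < black / (black + white):
            black += 1
            drawn_black += 1
        else:
            white += 1
    return drawn_black

def empirical_dist(n, runs=200_000, seed=0):
    """Empirical distribution of B_n over many independent runs."""
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(runs):
        counts[polya_bn(n, rng)] += 1
    return [c / runs for c in counts]

# P(B_5 = k) should be 1/6 for each k = 0, ..., 5.
freqs = empirical_dist(5)
```

With 200,000 runs each empirical frequency lands within about ±0.01 of 1/6, consistent with P(B_n = k) = 1/(n+1).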

JAM 2022 [ 41 -50]

Let \{a_n \}_{n \geq 1} be a sequence of real numbers such that a_{1+5m}=2 , a_{2+5m}=3 , a_{3+5m}=4 , a_{4+5m}=5 , a_{5+5m}=6 , for m=0,1,2, … . Then

 \limsup_{n \rightarrow \infty} a_n + \liminf_{n \rightarrow \infty} a_n 

equals ____________________________ .

It is evident that a_n is defined for all n=1,2,3,…, and that the sequence cycles through the values 2, 3, 4, 5, 6 with period 5, so each of these values occurs infinitely often. Hence

\limsup_{n \rightarrow \infty} a_n = 6 ,\quad  \liminf_{n \rightarrow \infty} a_n = 2 \\
So, \limsup_{n \rightarrow \infty} a_n + \liminf_{n \rightarrow \infty} a_n  = 8
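The periodic pattern can be checked in a couple of lines (the closed form a_n = 2 + (n-1) mod 5 is my own rewriting of the given conditions):

```python
def a(n):
    # a_{1+5m} = 2, a_{2+5m} = 3, ..., a_{5+5m} = 6: a period-5 sequence
    return 2 + (n - 1) % 5

values = [a(n) for n in range(1, 101)]
# For a periodic sequence, limsup and liminf are the max and min over one period.
answer = max(values) + min(values)
```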


What are random numbers?

Imagine yourself picking up a card from a well shuffled full deck of cards. What could the card you picked be? The jack of spades, the king of diamonds, or the queen of hearts? Well, it could be the ace of clubs too, or any of the other cards in the deck.

In terms of probability theory, picking up a card from a well shuffled full deck of cards is a ‘random experiment’. It is an experiment because picking up a card produces an outcome, a result of the effort. And it is a random one because, although you know all the possible cards that may be picked, you cannot say in advance which one will turn up.

Also, in this particular random experiment of picking a card from a well shuffled full deck of cards, no card is preferred over another. In simple words, the probability of picking any card is the same; in the statistical sense, the events of picking any two particular cards are equally likely. This is what we call a random phenomenon. So, picking up a card from a well shuffled full deck of cards can be interpreted as picking a card at random from the deck.

Now imagine yourself selecting a number from the set \{0,1,2,3,4,5,6,7,8,9\} at random. It means that selecting 0, or 1, or 2, …, or 9 are equally likely events. Thus, we are selecting a random number from the set.

How to generate random numbers?

A random number from a given set of numbers can be generated using the following methods:

  • Lottery Method: Suppose there are n numbers in the set. One can then take n similar balls, give each ball a unique number from the set, and put them in an urn. Shuffle the balls, then pick balls one by one with replacement and note down the number on each ball picked. The numbers noted will be the random numbers.
  • Roulette Wheel: One can also take a roulette wheel, divide it into n equal sectors, write the numbers (uniquely from the set) on the sectors, spin the wheel, and note down the number it stops at. Here too the numbers noted will be random numbers.
  • Random Number Table: The above two methods are physical, and it always takes a considerable amount of time to draw random numbers that way. Instead, one can use a random number table, a table in which random digits are stacked up. However, since the random numbers drawn by one person may quite easily be duplicated by another, there has always been some uncertainty about the randomness of this method; we quite often call the numbers drawn this way pseudo-random numbers.

Drawing Random Numbers using algorithms (for computers)

Since a computer is a deterministic device, it might seem impossible for it to generate random numbers. Indeed, the numbers generated are algorithmically computed and entirely deterministic. However, they appear random, and they must pass stringent tests designed to ensure that they behave as truly random numbers (such as those produced by the first two methods above) would.

Requirements of a Random Number Generator:

  1. It should be fast.
  2. It should be repeatable.
  3. It should be amenable to analysis.
  4. It should have a long period.
  5. It should be apparently random.

Some Random Number Generator (RNG) Methods
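One of the simplest such methods is the linear congruential generator (LCG), which iterates x_{k+1} = (a·x_k + c) mod m. The sketch below uses the widely quoted Numerical Recipes constants and is only an illustration, not a recommendation for serious work:

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Linear congruential generator: yields pseudo-random numbers in [0, 1)."""
    x = seed % m
    while True:
        x = (a * x + c) % m
        yield x / m  # scale the integer state to the unit interval

gen = lcg(seed=2024)
sample = [next(gen) for _ in range(5)]
```

The same seed always reproduces the same stream, which is exactly the “repeatable” requirement listed above; the period of this particular generator is 2^32.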


Simulation is a numerical technique for conducting experiments on a digital computer; it involves certain types of mathematical and logical models that describe the behaviour of a business or economic system (or some component thereof) over an extended period of time.

Simulation deals with both abstract and physical models. Some simulations with physical and abstract models might involve real people. Two types of simulation involving real people are:

  1. Operational gaming
  2. Man-machine simulator

Merits of Simulation

  1. In cases where obtaining data is either impossible or very expensive, simulation is used.
  2. The observed system may be so complex that it cannot be described by a mathematical model.
  3. Even when a mathematical model can be formulated to describe the system, a straightforward analytical solution may not exist.
  4. It may either be impossible or very costly to perform experiments to validate a mathematical model describing the system.

Demerits of Simulation

  1. Simulation is an invaluable and very versatile tool in those problems where analytical techniques are inadequate. Impressive as the technique is, however, it provides only statistical estimates rather than exact results, and it only compares alternatives rather than generating the optimal one.
  2. Simulation can also be a slow and costly way to study a problem. It usually requires a large amount of time and great expense for analysis and programming.
  3. Finally, simulation yields only numerical data about the performance of the system, and sensitivity analysis of the model parameters is very expensive.

Monte Carlo Methods

A Monte Carlo method is a computational / numerical method that uses random numbers to compute / estimate a quantity of interest. The quantity of interest may be the mean of a random variable, a function of several means, the distribution of a random variable, or a high-dimensional integral.

Basically, Monte Carlo methods may be grouped into two types:

The direct/simple/classical Monte Carlo methods involve generating sequences of independent and identically distributed random samples. The other class, the Markov chain Monte Carlo methods, instead generates a sequence of random samples that are not independent.

Monte Carlo Integration

Let f(x) be a function of x and suppose we are interested in computation of the integral:

I= \int_0^1 f(x) dx

We can write the integral as,

I=\int_0^1 f(x) p(x) dx =E(f(X))\\
\text{; where } p(x) \text{ is the pdf of a r.v. } X \sim \mathrm{Unif}(0,1)

Now suppose that x_1,x_2,\dots,x_n are independent random samples drawn from \mathrm{Unif}(0,1); then by the law of large numbers we have,

\frac{1}{n} \sum_{i=1}^n f(x_i) \rightarrow E(f(X)) 

Thus an estimator of I may be:

\hat{I}=\frac{1}{n} \sum_{i=1}^n f(x_i)
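As a concrete check of this estimator, take f(x) = x², for which I = 1/3 (the example and function names are my own):

```python
import random

def mc_integrate(f, n=100_000, seed=1):
    """Estimate integral_0^1 f(x) dx as the sample mean of f(U_i), U_i ~ Unif(0,1)."""
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n)) / n

estimate = mc_integrate(lambda x: x * x)  # true value: 1/3
```

The error of the estimate is O(n^{-1/2}) by the central limit theorem, so 100,000 draws put it within about a thousandth of 1/3.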

On a more general note, if -\infty < a < b < \infty then,

I= \int_a^b f(x) dx \\
\textit{Taking $y=\frac{x-a}{b-a}$,}\\
I= \int_0^1 (b-a) f \left(a+(b-a)y \right) dy  \\
=  \int_0^1 h(y) dy \quad \text{;where } h(y)=(b-a)\, f \left(a+(b-a)y \right)  \\
= E\left[ h(Y) \right], \quad Y \sim \mathrm{Unif}(0,1)

And when a=0 and b=\infty,

I= \int_0^\infty f(x) dx \\
\text{Taking $y=\frac{1}{x+1}$, so that $x=\frac{1}{y}-1$ and $dx=-\frac{dy}{y^2}$,} \\
I= - \int_1^0  f\left( \frac{1}{y} -1 \right) \frac{dy}{y^2} \\
= \int_0^1  h(y) dy \\
\text{; where } h(y) =  f\left( \frac{1}{y} -1 \right)/y^2 \\
=E\left[h(Y)\right], \quad Y \sim \mathrm{Unif}(0,1)
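The half-line substitution can be checked the same way; here on ∫₀^∞ e^{-x} dx = 1 (an example of my choosing):

```python
import math
import random

def mc_integrate_halfline(f, n=200_000, seed=2):
    """Estimate integral_0^infty f(x) dx via the substitution y = 1/(x+1):
    I = integral_0^1 f(1/y - 1) / y**2 dy, estimated at Unif(0,1) draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        y = 1.0 - rng.random()  # uniform on (0, 1]; avoids division by zero at y = 0
        total += f(1.0 / y - 1.0) / (y * y)
    return total / n

estimate = mc_integrate_halfline(lambda x: math.exp(-x))  # true value: 1
```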

JAM 2022 [ 51-60 ]


Let X_1,X_2,\dots,X_9 be a random sample from a N(\mu,\sigma^2) distribution, where \mu \in \mathbb{R} \text{ and } \sigma>0 are unknown. Let the observed values of \bar{X} = \frac{1}{9} \sum_{i=1}^9 X_i \text{ and } S^2 = \frac{1}{8} \sum_{i=1}^{9} ( X_i -\bar{X} )^2 be 9.8 and 1.44, respectively. If the likelihood ratio test is used to test the hypothesis H_0: \mu=8.8 \text{ against } H_1:\mu>8.8, then the p-value of the test equals __________________ (round off to 3 decimal places).

We know that,

Y=\frac{3(\bar{X}-\mu)}{\sigma} \sim N(0,1) \quad \left(\text{since } \bar{X} \sim N(\mu, \sigma^2/9)\right) \\
\text{and} \\
Z^2=\frac{8S^2}{\sigma^2} \sim \chi^2_{(8)}, \text{ independently}

Thus, the test statistic to be used for testing the null against the alternative is:

T = \frac{Y}{Z/\sqrt{8}} = \frac{3(\bar{X}-\mu_0)}{S} \sim t_{(8)} \text{ under } H_0

Judging from the alternative, the rule for rejecting the null hypothesis is: reject H_0 if the observed value of T is larger than the critical (tabulated) value, t_1 (say).

Now, the p-value of the test is given as:

Pr [T > t_{obs} \mid H_0 \text{ is true} ] = Pr \left[ T > \frac{ 3(9.8 - 8.8)}{\sqrt{1.44} } \right] \\= Pr \left[T > \frac{3}{1.2} \right] = Pr[T > 2.5] \approx 0.0185
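The value P(T > 2.5) for a t-distribution with 8 degrees of freedom can be verified without tables. This sketch integrates the t₈ density numerically using only the standard library (with SciPy available one would simply call `scipy.stats.t.sf(2.5, 8)`):

```python
import math

def t_sf(x, df, steps=200_000, upper=100.0):
    """P(T > x) for Student's t with df degrees of freedom, by trapezoidal
    integration of the density over [x, upper] (the tail beyond is negligible)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    def pdf(t):
        return c * (1 + t * t / df) ** (-(df + 1) / 2)
    h = (upper - x) / steps
    total = 0.5 * (pdf(x) + pdf(upper))
    for i in range(1, steps):
        total += pdf(x + i * h)
    return total * h

p_value = t_sf(2.5, 8)  # approximately 0.0185
```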


Suppose an urn contains N coloured balls, out of which M are blue and the rest are red. Suppose we draw a sample of n balls from the urn at random and without replacement. Then, if X is the random variable denoting the number of blue balls drawn, the probability that exactly x blue balls are drawn is given by:

p(x ; n,N,M) = Pr[X=x] \\ \hskip{1.8cm}= \frac{\binom M x \binom {N-M}{n-x}}{\binom N n}  \\ \hskip{3cm}; x=\max(0,\, n-N+M),\dots,\min(n,M)


E(X)= \sum_{x=0}^{M} x\, p(x;n,N,M) = \sum_{x=1}^{M} x  \frac{\binom M x \binom {N-M}{n-x}}{\binom N n} \\
= \sum_{x=1}^{M} x  \frac{n!\,(N-n)!}{N!} \frac{M!}{x!(M-x)!} \frac{(N-M)!}{(n-x)!(N-M-n+x)!} \\
= \sum_{x=1}^{M}  \frac{n!\,(N-n)!}{N!} \frac{M(M-1)!}{(x-1)!(M-x)!} \frac{(N-M)!}{(n-x)!(N-M-n+x)!} \\
= \frac{M}{\binom N n} \sum_{x=1}^{M}  \binom {M-1}{x-1} \binom {(N-1)-(M-1)}{(n-1)-(x-1)} \\
= \frac{M}{\binom N n} \binom {N-1}{n-1} \sum_{x=1}^{M}  p(x-1 ; n-1 , N-1 , M-1) \\
= \frac{M}{\binom N n} \binom {N-1}{n-1} \sum_{x=0}^{M-1}  p(x ; n-1 , N-1 , M-1) \\
= \frac{M\, n!\, (N-n)!\, (N-1)!}{(n-1)!\,(N-n)!\,N!} = n \frac{M}{N}
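The closing identity E(X) = nM/N is easy to confirm by direct summation (the parameter values below are arbitrary):

```python
from math import comb

def hyper_pmf(x, n, N, M):
    """Hypergeometric pmf: probability of x blue balls in n draws without
    replacement from N balls of which M are blue."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def hyper_mean(n, N, M):
    lo, hi = max(0, n - (N - M)), min(n, M)
    return sum(x * hyper_pmf(x, n, N, M) for x in range(lo, hi + 1))

n, N, M = 5, 20, 8
mean = hyper_mean(n, N, M)  # equals n*M/N = 2.0
```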


For any integer s=0,1,…,M, we have,

\mu_{[s]}=E(X(X-1)\dots(X-s+1))  \\= \sum_{x=0}^{M} \frac{x!}{(x-s)!}p(x; n,N,M) \\
=  \sum_{x=s}^{M} \frac{x!}{(x-s)!} \frac{\binom M x \binom {N-M}{n-x}}{\binom N n} \\
=  \sum_{x=s}^{M} \frac{x!}{(x-s)!} \frac{M!}{x!(M-x)!} \frac{ \binom {N-M}{n-x}}{\binom N n} \\
= \frac{M!}{(M-s)!}  \sum_{x=s}^{M}  \frac{(M-s)!}{(x-s)!(M-x)!} \frac{ \binom {N-M}{n-x}}{\binom N n} \\
= \frac{M!}{(M-s)!} \frac{1}{\binom N n} \sum_{x=s}^{M} \binom {M-s}{x-s} { \binom {N-M}{n-x}} \\
 = \frac{M!}{(M-s)!} \frac{1}{\binom N n} \sum_{x=0}^{M-s} \binom {M-s}{x} { \binom {(N-s)-(M-s)}{(n-s)-x}} \\
 = \frac{M!}{(M-s)!} \frac{\binom {N-s}{n-s}}{\binom N n} \sum_{x=0}^{M-s} p(x ; n'=n-s,N'=N-s,M'=M-s) \\
= \frac{M!}{(M-s)!} \frac{(N-s)!\, n!\, (N-n)!}{N!\, (n-s)!\, (N-n)!} \\
= \frac{M!}{(M-s)!} \frac{(N-s)!}{N!} \frac{n!}{(n-s)!}


E(X(X-1)) = \frac{M!}{(M-2)!} \frac{(N-2)!}{N!} \frac{n!}{(n-2)!} = \frac{n(n-1)M(M-1)}{N(N-1)} \\
------------------------------ \\
E(X^2) = E(X(X-1)) +E(X) \\
=  \frac{n(n-1)M(M-1)}{N(N-1)} + \frac{nM}{N}  \\
= \frac{nM}{N(N-1)} \left[ (n-1)(M-1) + (N-1)  \right]  \\
------------------------------ \\
V(X) =  \frac{nM}{N(N-1)} \left[ (n-1)(M-1) + (N-1)  \right]  - \frac{n^2M^2}{N^2} \\
=  \frac{nM}{N^2 (N-1)}  \left[ N(n-1)(M-1) + N(N-1) - nM(N-1)) \right]  \\
=  \frac{nM}{N^2 (N-1)}  \left[ nNM-nN-NM+N + N^2-N - nMN+nM \right]  \\
=  \frac{nM}{N^2 (N-1)}  \left[ -nN-NM + N^2 +nM \right]  \\
=  \frac{nM}{N^2 (N-1)}  (N-n)(N-M)  \\
= n \frac{M}{N} \frac{N-M}{N} \frac{N-n}{N-1}
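The variance formula just derived can likewise be checked by direct summation against the pmf (arbitrary parameters, stdlib only):

```python
from math import comb

def pmf(x, n, N, M):
    # hypergeometric pmf
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def variance(n, N, M):
    xs = range(max(0, n - (N - M)), min(n, M) + 1)
    mean = sum(x * pmf(x, n, N, M) for x in xs)
    return sum((x - mean) ** 2 * pmf(x, n, N, M) for x in xs)

n, N, M = 6, 25, 10
closed_form = n * (M / N) * ((N - M) / N) * ((N - n) / (N - 1))
# variance(n, N, M) agrees with closed_form
```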


Provided that the specified moment exists, it is straightforward (though tedious) to show via the factorial moments that (cf. Univariate Discrete Distributions – Johnson, Kemp, Kotz):
\mu_3= \frac{nM(N-M)(N-n)(N-2M)(N-2n)}{N^3(N-1)(N-2)} \\
\mu_4 = \frac{nM(N-M)(N-2n)}{N^3(N-1)(N-2)(N-3)} \left[\frac{}{} N(N+1)-6n(N-n) \right. \\
+ \left. \frac{M(N-M)}{N^2} \left( N^2(n-2) -Nn^2 +6n(N-n) \right) \right]

It’s really tedious!!!

\mu_{[3]}=E(X(X-1)(X-2)) = \frac{n(n-1)(n-2) M(M-1)(M-2)}{N(N-1)(N-2)} \\
------------------------------ \\
\mu'_3 =  E(X^3) = \mu_{[3]} + 3 \mu'_2 - 2 \mu'_1 \\
= \frac{n(n-1)(n-2) M(M-1)(M-2)}{N(N-1)(N-2)} + 3  \frac{n(n-1)M(M-1)}{N(N-1)} -2 \frac{nM}{N} \\
\mu_3 = \mu'_3 -3 \mu'_2 \mu'_1  +2 (\mu'_1)^3   \\
= \mu_{[3]}+3(\mu_2 + \mu^2) -2\mu-3 (\mu_2 + \mu^2) \mu  +2 \mu^3 \\
= ( \mu_{[3]} +3 \mu^2 -2\mu - \mu^3)  +3\mu_2 -3\mu_2 \mu \\
 \mu_{[3]} +3 \mu^2 -2\mu - \mu^3 = \frac{n(n-1)(n-2)M(M-1)(M-2)}{N(N-1)(N-2)} \\ \hskip{3cm}+ \frac{3n^2M^2N-2nMN^2-n^3M^3}{N^3} \\
=\frac{nM}{N^3(N-1)(N-2)} \left( n^2M^2N^2-3nM^2N^2+2M^2N^2 -3n^2MN^2 + 9nMN^2 \right. \\ \left.-6MN^2 +2n^2N^2 -6nN^2 +4N^2 + 3nMN^3-9nMN^2 +6nMN -2N^4 \right. \\ \left. +6N^3 -4N^2 -n^2M^2N^2 +3n^2M^2N -2n^2M^2 \right) \\
=\frac{nM}{N^3(N-1)(N-2)}  \left[ -3nM^2N(N-n)+3nMN^2(N-n) \right. \\ \left. -6MN(N-n) +6N^2(N-n) \right. \\
\left. +2M^2(N^2-n^2)-2N^2(N^2-n^2) \right]\\
=\frac{nM}{N^3(N-1)(N-2)}  \left( (N-n) \left[ 3nMN(N-M) +6N(N-M)  \right. \right. \\ \left. \left.-2(N+n)(N^2-M^2) \right] \right) \\
=\frac{nM(N-n)(N-M)}{N^3(N-1)(N-2)} \left( 3nMN +6N - 2(N+n)(N+M) \right) \\
\mu_3 = \frac{nM(N-n)(N-M)}{N^3(N-1)(N-2)} \left( 3nMN +6N - 2(N+n)(N+M) \right) \\
+ 3n \frac{M}{N} \frac{N-M}{N} \frac{N-n}{N-1} -3 n^2 \frac{M^2}{N^2} \frac{N-M}{N} \frac{N-n}{N-1} \\
= \frac{nM(N-n)(N-M)}{N^3(N-1)(N-2)} \left[ 3nMN +6N - 2N^2 -2nN -2NM -2nM  \right. \\ \left. +3N^2 -6N -3nMN +6nM \right] \\
= \frac{nM(N-M)(N-n)(N-2M)(N-2n)}{N^3(N-1)(N-2)} \\
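A numerical spot-check of this third central moment formula, again by direct summation (arbitrary parameters):

```python
from math import comb

def pmf(x, n, N, M):
    # hypergeometric pmf
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def mu3_direct(n, N, M):
    """Third central moment computed directly from the pmf."""
    xs = range(max(0, n - (N - M)), min(n, M) + 1)
    mean = sum(x * pmf(x, n, N, M) for x in xs)
    return sum((x - mean) ** 3 * pmf(x, n, N, M) for x in xs)

n, N, M = 7, 30, 12
mu3_formula = (n * M * (N - M) * (N - n) * (N - 2 * M) * (N - 2 * n)
               / (N ** 3 * (N - 1) * (N - 2)))
```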
\mu_4 = \frac{nM(N-M)(N-2n)}{N^3(N-1)(N-2)(N-3)} \left[\frac{}{} N(N+1)-6n(N-n) \right. \\
+ \left. \frac{M(N-M)}{N^2} \left( N^2(n-2) -Nn^2 +6n(N-n) \right) \right]

Survival Analysis: Competing Risk Theory


A distinctive feature of survival data is the concept of censoring. An implicit concept in the definition of censoring is that, had the study been prolonged (or had subjects not dropped out), the outcome of interest would eventually have been observed for all subjects. Conventional statistical methods for the analysis of survival data make the important assumption of independent or non-informative censoring. This means that at a given point in time, subjects who remain under follow-up have the same future risk for the occurrence of the event as those subjects who are no longer being followed (either because of censoring or study dropout), as if losses to follow-up were random and thus non-informative.

A competing risk is an event whose occurrence precludes the occurrence of the primary event of interest. For instance, in a study in which the primary outcome is time to death due to a cardiovascular cause, a death due to a non-cardiovascular cause serves as a competing risk.

Conventional statistical methods for the analysis of survival data assume that competing risks are absent. Two competing risks are said to be independent if information about a subject’s risk of experiencing one type of event provides no information about the subject’s risk of experiencing the other type of event. The methods that will be described later on will cover both settings: competing risks that are independent of one another, and competing risks that are not.

In biomedical applications, the biology often suggests at least some dependence between competing risks, which in many cases may be quite strong. Accordingly, independent competing risks may be relatively rare in biomedical applications.

When analyzing survival data in which competing risks are present, analysts frequently censor subjects when a competing event occurs. Thus, when the outcome is time to death attributable to cardiovascular causes, an analyst may consider a subject as censored once that subject dies of noncardiovascular causes. However, censoring subjects at the time of death attributable to noncardiovascular causes may be problematic.

First, it may violate the assumption of noninformative censoring: it may be unreasonable to assume that subjects who died of noncardiovascular causes (and were thus treated as censored) can be represented by those subjects who remained alive and had not yet died of any cause.

Second, even when the competing events are independent, censoring subjects at the time of the occurrence of a competing event may lead to incorrect conclusions because the event probability being estimated is interpreted as occurring in a setting where the censoring (eg, the competing events) does not occur.

In the cardiovascular example described above, this corresponds to a setting where death from noncardiovascular causes is not a possibility. Although such probabilities may be of theoretical interest, they are of questionable relevance in many practical applications, and generally lead to overestimation of the cumulative incidence of an event in the presence of the competing events.

STB – 2019

  • Consider a count variable X following a Poisson distribution with parameter θ > 0, where the zero count (i.e., X = 0) is not observable. We have n observations X1, . . . , Xn from this distribution. Let \bar{X} denote the sample mean.

a)  Derive the quantity for which \bar{X} is an unbiased estimator.

b)  Suppose that the observed value of \bar{X} is strictly greater than 1. Show that the likelihood function of θ has a unique maximiser.

We know that X follows a Poisson distribution with parameter \theta and that X=0 is not observable. Under this condition the probability mass function (pmf) of X is given by:

g(x \mid \theta ) = \Pr[ X=x \mid X \neq 0] \\
\hskip{1.8cm}= \Pr[X=x] / \Pr[X \neq 0] \\
\hskip{0.8cm}= f(x)/(1-f(0) ) \\
\hskip{2.8cm}=\frac{\theta^x e^{-\theta}} {x!\,(1-e^{-\theta} )} \quad ,x=1,2,3,\dots

(a) Now,

E (X)= \sum_{x=1}^{\infty} x\, g(x \mid \theta) = \sum_{x=1}^{\infty} x\, \frac{\theta^x e^{-\theta}} {x!(1-e^{-\theta} )}= \sum_{x=1}^{\infty} \frac{\theta^x e^{-\theta}} {(x-1)!(1-e^{-\theta} )} \\
= \frac{\theta e^{-\theta}} {(1-e^{-\theta} )}\sum_{x=1}^{\infty} \frac{\theta^{x-1}} {(x-1)!} = \frac{\theta e^{-\theta}} {(1-e^{-\theta} )}\sum_{x=0}^{\infty} \frac{\theta^{x}} {x!} =  \frac{\theta} {(1-e^{-\theta} )}

Also, as X_1, X_2, \dots, X_n are random samples drawn from this distribution,

E(\bar{X})= \frac{1}{n} \sum_{i=1}^n E(X_i) =  \frac{\theta} {(1-e^{-\theta} )} \quad
\left[ \text{i.e., } \bar{X} \text{ is an unbiased estimator of }   \frac{\theta} {(1-e^{-\theta} )} \right]
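The target quantity θ/(1−e^{−θ}) can be confirmed numerically by summing the zero-truncated pmf (the value of θ below is arbitrary):

```python
import math

def zt_pmf(x, theta):
    """Zero-truncated Poisson pmf, x = 1, 2, 3, ..."""
    return theta ** x * math.exp(-theta) / (math.factorial(x) * (1 - math.exp(-theta)))

def zt_mean(theta, terms=60):
    # the series tail beyond 60 terms is negligible for moderate theta
    return sum(x * zt_pmf(x, theta) for x in range(1, terms + 1))

theta = 2.3
closed_form = theta / (1 - math.exp(-theta))
```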

(b) The likelihood function of x_1,x_2,\dots,x_n is given by:

L(\theta)= \prod_{i=1}^{n} g(x_i \mid \theta ) \\
\hskip{2.4cm} =\prod_{i=1}^{n} (1-e^{-\theta} )^{-1} \frac{ \theta^{x_i} e^{-\theta}}{x_i !} \\
\hskip{1.8cm}=\frac{1}{(e^{\theta}-1)^n}   \frac{\theta^{\sum x_i }}{\prod_{i=1}^{n}x_i !}

Taking ‘ln’ on both sides we get,

l(\theta)=\ln(L(\theta))=-n \ln(e^{\theta}-1)+n \bar{x}\, \ln(\theta)-\ln \left( \prod_i x_i ! \right)

Differentiating w.r.t. \theta gives,

\frac{d}{d\theta}l(\theta) =  -\frac{n}{1-e^{-\theta}} + \frac{n\bar{x}}{\theta} \\
\hskip{.5cm} \frac{d^2}{d\theta^2}l(\theta) = \frac{n e^{-\theta}}{(1-e^{-\theta})^2}-\frac{n\bar{x}}{\theta^2}


\left[\frac{d}{d\theta}l(\theta) \right]_{\theta=\hat{\theta}} =0 \Rightarrow \frac{\hat{\theta}}{1-e^{-\hat{\theta}}} = \bar{x} \\
 \left[\frac{d^2}{d\theta^2}l(\theta) \right]_{\theta=\hat{\theta}}= \frac{n e^{-\hat{\theta}}}{(1-e^{-\hat{\theta}})^2}-\frac{n\bar{x}}{\hat{\theta}^2} = \frac{n}{\hat{\theta}(1-e^{-\hat{\theta}})^2}\left[ \hat{\theta} e^{-\hat{\theta}} - (1-e^{-\hat{\theta}}) \right]<0

The inequality holds because \hat{\theta} e^{-\hat{\theta}} < 1-e^{-\hat{\theta}} for every \hat{\theta}>0: the function \theta e^{-\theta}-(1-e^{-\theta}) vanishes at \theta=0 and has derivative -\theta e^{-\theta}<0. Moreover, since \theta/(1-e^{-\theta}) increases strictly from 1 (as \theta \to 0) to \infty, the equation \hat{\theta}/(1-e^{-\hat{\theta}})=\bar{x} has exactly one root when \bar{x}>1. Thus, the likelihood function of θ has a unique maximiser.
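Since θ/(1−e^{−θ}) is strictly increasing from 1 to ∞, the MLE can be found by bisection whenever x̄ > 1. A stdlib-only sketch (function name is my own):

```python
import math

def zt_poisson_mle(xbar, tol=1e-12):
    """Solve theta / (1 - exp(-theta)) = xbar for theta by bisection.
    The left side increases strictly from 1 (theta -> 0) to infinity,
    so a unique root exists iff xbar > 1."""
    if xbar <= 1:
        raise ValueError("sample mean must be strictly greater than 1")
    g = lambda t: t / -math.expm1(-t)  # numerically stable t / (1 - exp(-t))
    lo, hi = 1e-12, xbar  # g(t) > t for t > 0, so g(xbar) > xbar brackets the root
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < xbar:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Round-trip check: compute xbar from theta = 1.5, then recover theta.
theta_hat = zt_poisson_mle(1.5 / -math.expm1(-1.5))
```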