# Confidence set

For a parameter $\theta$, a $1 - \alpha$ confidence interval is $C_n = (a, b)$, where $a = A(X_1, \ldots, X_n)$ and $b = B(X_1, \ldots, X_n)$ are functions of the data such that

$$P_\theta(\theta \in C_n) \ge 1 - \alpha, \quad \text{for all } \theta \in \Theta.$$

In simpler words, (a, b) traps θ with probability at least 1 – α. We call 1 – α the **coverage** of the confidence interval. When we say 95% confidence interval, we mean that we choose α = 0.05.

Suppose that we calculate the mean height of 50 people chosen at random from the world. The mean turns out to be 170 cm, and the standard deviation of the heights in this sample is 20 cm. The 95% confidence interval is 170 ± 5.54 cm.

This means we are 95% confident that the mean height of all the people in the world lies between 164.46 cm and 175.54 cm. More precisely, if we repeated the sampling many times, about 95% of the intervals built this way would contain the true mean. Now let's see how to calculate the confidence interval. For this we need three quantities:

- **Number of observations n**: 50
- **Mean (sample mean) X̄**: 170
- **Standard deviation (of sample) s**: 20

The confidence interval is X̄ ± Z*(s/√n)

The only unknown quantity is Z, which we read from the Z table below this example. For a 95% confidence interval, Z is 1.96. Therefore the confidence interval is

170 ± 1.96 * ( 20/√50) = 170 ± 5.54cms
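The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `normal_confidence_interval` is made up for illustration, and Z is obtained from the inverse CDF of the standard Normal rather than looked up in a table.

```python
import math
from statistics import NormalDist


def normal_confidence_interval(mean, sd, n, confidence=0.95):
    """Return (lower, upper, margin) for mean ± Z * (sd / sqrt(n)).

    Hypothetical helper for illustration, not part of any library.
    """
    alpha = 1.0 - confidence
    # Z is the (1 - alpha/2) quantile of the standard Normal.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin, margin


# Values from the example: n = 50, sample mean 170 cm, sample sd 20 cm.
lower, upper, margin = normal_confidence_interval(170.0, 20.0, 50)
print(f"170 ± {margin:.2f} cm -> ({lower:.2f}, {upper:.2f})")
```

With the example's numbers this gives a margin of about 5.54 cm, matching the hand calculation.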

| Confidence level | Z |
| --- | --- |
| 80% | 1.282 |
| 85% | 1.440 |
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |
| 99.5% | 2.807 |
| 99.9% | 3.291 |
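The Z values in the table do not have to be memorised. As a rough sketch, they can be regenerated from the inverse CDF of the standard Normal with Python's standard library. The helper name `z_value` is an illustrative assumption.

```python
from statistics import NormalDist


def z_value(confidence):
    """Two-sided critical value Z for a given confidence level (illustrative helper)."""
    alpha = 1.0 - confidence
    # Put alpha/2 in each tail, so Z is the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


levels = [0.80, 0.85, 0.90, 0.95, 0.99, 0.995, 0.999]
z_table = {level: round(z_value(level), 3) for level in levels}

for level, z in z_table.items():
    print(f"{level:.1%}  {z:.3f}")
```

Any confidence level not listed in the table can be handled the same way, for example `z_value(0.98)`.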

### Normal based confidence interval (Derivation and intuition)

You just saw a handy method for finding a confidence interval. Ever wondered where the formula comes from? Let's find out.

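Suppose that $\hat{\theta}_n \approx N(\theta, \widehat{se}^2)$. Let $\Phi$ be the CDF of a standard Normal and let

$$z_{\alpha/2} = \Phi^{-1}\left(1 - \frac{\alpha}{2}\right),$$

that is, $P(Z > z_{\alpha/2}) = \alpha/2$ and $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$, where $Z \sim N(0, 1)$. Let

$$C_n = \left(\hat{\theta}_n - z_{\alpha/2}\,\widehat{se},\ \hat{\theta}_n + z_{\alpha/2}\,\widehat{se}\right).$$

Then

$$P_\theta(\theta \in C_n) \to 1 - \alpha.$$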

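Proof: Let $Z_n = (\hat{\theta}_n - \theta)/\widehat{se}$. By assumption $Z_n \rightsquigarrow Z$, where $Z \sim N(0, 1)$. Hence

$$
\begin{aligned}
P_\theta(\theta \in C_n) &= P_\theta\left(\hat{\theta}_n - z_{\alpha/2}\,\widehat{se} < \theta < \hat{\theta}_n + z_{\alpha/2}\,\widehat{se}\right) \\
&= P_\theta\left(-z_{\alpha/2} < \frac{\hat{\theta}_n - \theta}{\widehat{se}} < z_{\alpha/2}\right) \\
&\to P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = 1 - \alpha.
\end{aligned}
$$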

Note that when the point estimate is the sample mean **X̄**, its standard error is **σ/√n**, which we estimate by ŝe = **s/√n** using the sample standard deviation s. Substituting X̄ for the estimator and s/√n for ŝe in the interval above gives X̄ ± Z*(**s/√n**), the equation used in the previous example, with Z = z_{α/2}. The Z table is computed from the inverse CDF of the standard Normal (Φ^{-1}).
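The coverage claim can also be checked empirically. The sketch below repeatedly draws samples from a Normal population, builds X̄ ± Z*(s/√n) each time, and counts how often the interval contains the true mean. The population parameters (mean 170, standard deviation 20), the number of trials, and the function name `estimate_coverage` are assumptions chosen for illustration, not values from a real study.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def estimate_coverage(true_mean, true_sd, n, confidence=0.95, trials=2000, seed=0):
    """Fraction of simulated intervals X̄ ± Z * s / sqrt(n) that contain true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = mean(sample)
        margin = z * stdev(sample) / math.sqrt(n)  # s / sqrt(n) estimates the standard error
        if x_bar - margin < true_mean < x_bar + margin:
            hits += 1
    return hits / trials


# Hypothetical population: mean 170 cm, sd 20 cm, samples of size 50.
coverage = estimate_coverage(170.0, 20.0, 50)
print(f"Estimated coverage: {coverage:.3f}")
```

The estimated coverage should land close to 0.95. It is typically slightly below, because the interval is only approximately valid for finite n when s is estimated from the sample.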