Maximum Likelihood Estimation
Let X1, . . . , Xn be IID with PDF f(x; θ).
The likelihood function is defined by Ln(θ) = ∏ f(Xi; θ), where the product runs over i = 1, . . . , n.
The log-likelihood function is defined by ℓn(θ) = log Ln(θ).
The maximum likelihood estimator (MLE), denoted by θ̂n, is the value of θ that maximizes Ln(θ).
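As a concrete illustration of these definitions (a minimal sketch, not from the text): the log-likelihood is a sum of log-densities, and the MLE can be found by maximizing it numerically. The N(θ, 1) model, the simulated data, and the use of scipy.optimize.minimize_scalar are all assumptions made for this example.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def log_likelihood(theta, data, log_pdf):
    # ln(theta) = sum over i of log f(Xi; theta)
    return np.sum(log_pdf(data, theta))

# Hypothetical setup: data assumed IID N(theta, 1) with true theta = 2.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=100)
log_pdf = lambda x, theta: -0.5 * np.log(2 * np.pi) - 0.5 * (x - theta) ** 2

# The MLE maximizes ln(theta); equivalently, minimize its negative.
result = minimize_scalar(lambda t: -log_likelihood(t, data, log_pdf),
                         bounds=(-10.0, 10.0), method="bounded")
print(result.x, data.mean())  # for N(theta, 1) the MLE is the sample mean
```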
Suppose that X1, . . . , Xn ~ Bernoulli(p). The probability function is f(x; p) = p^x (1 − p)^(1−x). The unknown parameter is p. Then
Ln(p) = ∏ p^Xi (1 − p)^(1−Xi) = p^S (1 − p)^(n−S),
where S = ∑Xi, and hence
ℓn(p) = S log p + (n − S) log(1 − p).
Taking the derivative of ℓn(p), setting it equal to 0, and solving gives us the MLE p̂ = S/n.
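To check the closed form, here is a small sketch (my own, not from the text) that maximizes ℓn(p) numerically for simulated Bernoulli data; the true p = 0.3 and the sample size are arbitrary assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
x = rng.binomial(1, 0.3, size=200)  # simulated Bernoulli(p = 0.3) sample
S, n = x.sum(), x.size

def neg_log_lik(p):
    # negative of ln(p) = S log p + (n - S) log(1 - p)
    return -(S * np.log(p) + (n - S) * np.log(1 - p))

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(res.x, S / n)  # the numerical maximizer matches the closed form S/n
```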
Properties of Maximum Likelihood Estimators
- The MLE is consistent: θ̂n converges in probability to θ*, where θ* denotes the true value of the parameter θ.
- The MLE is equivariant: if θ̂n is the MLE of θ, then g(θ̂n) is the MLE of g(θ).
- The MLE is asymptotically normal: (θ̂n − θ*)/sê ⇝ N(0, 1), where sê is the estimated standard error of θ̂n (see the simulation sketch after this list).
- The MLE is asymptotically optimal or efficient: roughly, this means that among all well-behaved estimators, the MLE has the smallest variance, at least for large samples.
- The MLE is approximately the Bayes estimator.
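The asymptotic normality property can be illustrated by simulation (a sketch under assumptions, not from the text): for the Bernoulli model, the estimated standard error is sê = √(p̂(1 − p̂)/n), so standardized MLEs from repeated samples should look approximately standard normal. The true p, sample size, and replication count below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
p_true, n, reps = 0.3, 500, 5000

# MLE p_hat = S/n for each of `reps` simulated Bernoulli samples of size n.
p_hat = rng.binomial(n, p_true, size=reps) / n

# Estimated standard error for the Bernoulli MLE: sqrt(p_hat (1 - p_hat) / n).
se_hat = np.sqrt(p_hat * (1 - p_hat) / n)
z = (p_hat - p_true) / se_hat

print(z.mean(), z.std())  # close to 0 and 1, consistent with N(0, 1)
```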