# Maximum Likelihood Estimation

Let $X_1, \ldots, X_n$ be IID with PDF $f(x; \theta)$.

The likelihood function is defined by

$$L_n(\theta) = \prod_{i=1}^{n} f(X_i; \theta).$$

The log-likelihood function is defined by $\ell_n(\theta) = \log L_n(\theta)$.

The maximum likelihood estimator (MLE), denoted by $\hat{\theta}_n$, is the value of $\theta$ that maximizes $L_n(\theta)$.
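The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exponential density `exp_pdf`, the made-up `data`, and the helper names `likelihood`, `log_likelihood`, and `grid_mle` are all hypothetical and not part of the text. The sketch evaluates both functions on a grid of candidate values and picks the maximizer of each.

```python
import math

def likelihood(pdf, data, theta):
    """L_n(theta): product of the density over the sample."""
    return math.prod(pdf(x, theta) for x in data)

def log_likelihood(pdf, data, theta):
    """l_n(theta) = log L_n(theta), computed as a sum of log densities."""
    return sum(math.log(pdf(x, theta)) for x in data)

def grid_mle(objective, grid):
    """Return the grid point at which the objective is largest."""
    return max(grid, key=objective)

# Illustrative assumption: exponential density with rate theta.
def exp_pdf(x, theta):
    return theta * math.exp(-theta * x)

data = [0.3, 1.2, 0.7, 2.1, 0.4]          # made-up sample
grid = [k / 100 for k in range(1, 501)]   # candidate theta values in (0, 5]

theta_from_L = grid_mle(lambda t: likelihood(exp_pdf, data, t), grid)
theta_from_logL = grid_mle(lambda t: log_likelihood(exp_pdf, data, t), grid)
print(theta_from_L, theta_from_logL)
```

Because the logarithm is strictly increasing, both searches should select the same grid point. In practice the log-likelihood is usually preferred, since a product of many small densities can underflow while a sum of logs does not.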

Example:
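Suppose that $X_1, \ldots, X_n \sim \text{Bernoulli}(p)$. The probability function is $f(x; p) = p^x (1 - p)^{1 - x}$ for $x = 0, 1$. The unknown parameter is $p$. Then

$$L_n(p) = \prod_{i=1}^{n} f(X_i; p) = \prod_{i=1}^{n} p^{X_i} (1 - p)^{1 - X_i} = p^S (1 - p)^{n - S},$$

where $S = \sum_{i=1}^{n} X_i$. Hence

$$\ell_n(p) = S \log p + (n - S) \log(1 - p).$$

Taking the derivative of $\ell_n(p)$ and setting it equal to 0, that is $S/p - (n - S)/(1 - p) = 0$, gives the MLE $\hat{p} = S/n$.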
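As a quick numerical check of the Bernoulli example, the sketch below simulates a sample and compares the closed-form estimate S/n with a brute-force maximization of the log-likelihood over a grid. The value `true_p = 0.3`, the sample size, the seed, and the function name `bernoulli_log_likelihood` are assumptions chosen only for illustration.

```python
import math
import random

def bernoulli_log_likelihood(p, s, n):
    """l_n(p) = S log p + (n - S) log(1 - p), valid for 0 < p < 1."""
    return s * math.log(p) + (n - s) * math.log(1 - p)

random.seed(0)   # fixed seed so reruns draw the same sample
true_p = 0.3     # assumed value, used only to simulate data
n = 200
sample = [1 if random.random() < true_p else 0 for _ in range(n)]
s = sum(sample)

# Closed-form MLE from the derivation
p_closed_form = s / n

# Numerical check: maximize over a fine grid in the open interval (0, 1)
grid = [k / 1000 for k in range(1, 1000)]
p_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, s, n))

print(p_closed_form, p_grid)
```

The two estimates should agree up to the grid spacing of 0.001.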

### Properties of the Maximum Likelihood Estimator

1. The MLE is consistent: $\hat{\theta}_n$ converges in probability to $\theta^{\ast}$, where $\theta^{\ast}$ denotes the true value of the parameter $\theta$.
2. The MLE is equivariant: if $\hat{\theta}_n$ is the MLE of $\theta$, then $g(\hat{\theta}_n)$ is the MLE of $g(\theta)$.
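3. The MLE is asymptotically normal: $(\hat{\theta}_n - \theta^{\ast}) / \widehat{\mathrm{se}} \rightsquigarrow N(0, 1)$, where $\widehat{\mathrm{se}}$ is the estimated standard error of $\hat{\theta}_n$.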
4. The MLE is asymptotically optimal or efficient: roughly, this means that among all well-behaved estimators, the MLE has the smallest variance, at least for large samples.
5. The MLE is approximately the Bayes estimator.
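Properties 1 and 3 can be illustrated by simulation for the Bernoulli model. The sketch below is not a proof, only a sanity check under stated assumptions: `true_p`, the sample sizes, the number of replications, and the helper names are hypothetical, and the estimated standard error is taken to be $\sqrt{\hat{p}(1 - \hat{p})/n}$, the usual plug-in formula for a Bernoulli proportion, which the text above does not state.

```python
import math
import random
import statistics

random.seed(1)
true_p = 0.3   # assumed true parameter, used only to simulate data

def bernoulli_mle(n, p):
    """Draw n Bernoulli(p) values and return the MLE S/n."""
    s = sum(1 for _ in range(n) if random.random() < p)
    return s / n

def mean_abs_error(n, reps):
    """Average |p_hat - true_p| over reps simulated samples of size n."""
    return statistics.mean(abs(bernoulli_mle(n, true_p) - true_p)
                           for _ in range(reps))

# Consistency: the typical estimation error shrinks as n grows
error_small_n = mean_abs_error(50, 200)
error_large_n = mean_abs_error(5000, 200)

# Asymptotic normality: standardized errors should look like N(0, 1)
n, reps = 500, 2000
z_scores = []
for _ in range(reps):
    p_hat = bernoulli_mle(n, true_p)
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)   # plug-in standard error
    z_scores.append((p_hat - true_p) / se_hat)

z_mean = statistics.mean(z_scores)
z_sd = statistics.stdev(z_scores)
# Fraction of replications with |z| < 1.96; about 0.95 under N(0, 1)
coverage = sum(abs(z) < 1.96 for z in z_scores) / reps

print(error_small_n, error_large_n)
print(z_mean, z_sd, coverage)
```

If the normal approximation is reasonable at this sample size, `z_mean` should be near 0, `z_sd` near 1, and `coverage` near 0.95, while `error_large_n` should be much smaller than `error_small_n`.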