# Empirical Distribution Function and Estimation of Statistical Functionals

A natural starting point for statistical inference is the non-parametric estimation of the CDF and of functions of the CDF.

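Let $X_1, \ldots, X_n \sim F$ be IID, where $F$ is a distribution function on the real line. The empirical distribution function $\hat{F}_n$ is the CDF that puts mass $1/n$ at each data point $X_i$. Formally,

$$\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x)$$

where

$$I(X_i \le x) = \begin{cases} 1 & \text{if } X_i \le x \\ 0 & \text{if } X_i > x. \end{cases}$$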

Intuitively, the EDF (also called the ECDF, for empirical cumulative distribution function) evaluated at $x$ gives the fraction of sample observations that are less than or equal to $x$.
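The "fraction of observations less than or equal to x" reading translates directly into code. The following is a minimal sketch, assuming NumPy is available; the helper name `ecdf` and the toy sample are hypothetical and chosen only for illustration.

```python
import numpy as np


def ecdf(sample):
    """Return a function x -> fraction of sample values <= x (hypothetical helper)."""
    data = np.sort(np.asarray(sample, dtype=float))
    n = data.size

    def F_hat(x):
        # With side="right", searchsorted returns the number of sorted values <= x
        return np.searchsorted(data, x, side="right") / n

    return F_hat


# Toy sample, made up for illustration
sample = [3.0, 1.0, 4.0, 1.0, 5.0]
F_hat = ecdf(sample)

for x in (0.5, 1.0, 3.5, 5.0):
    print(x, F_hat(x))
```

For this sample the EDF is 0 below the smallest observation, jumps by 2/5 at the repeated value 1, and reaches 1 at the largest observation, which is the step-function behaviour implied by putting mass 1/n on each data point.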

#### Theorem

- $E(\hat{F}_n(x)) = F(x)$
- $V(\hat{F}_n(x)) = \dfrac{F(x)(1 - F(x))}{n}$
- $\text{MSE} = \dfrac{F(x)(1 - F(x))}{n} \to 0$ as $n \to \infty$
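These moments can be checked numerically. The sketch below is a Monte Carlo check under assumed settings that do not come from the text: a Uniform(0, 1) population (so that F(x) = x), sample size n = 50, evaluation point x = 0.3, and 20,000 replications.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

# Assumed settings, chosen only for illustration
n, reps, x = 50, 20_000, 0.3

# Each row is one sample of size n from Uniform(0, 1), for which F(x) = x
samples = rng.uniform(0.0, 1.0, size=(reps, n))

# EDF evaluated at x, once per replicated sample
F_hat_x = (samples <= x).mean(axis=1)

F_x = x
print("mean of F_hat(x):", F_hat_x.mean(), " theory:", F_x)
print("var  of F_hat(x):", F_hat_x.var(), " theory:", F_x * (1 - F_x) / n)
```

The simulated mean should sit close to F(x) and the simulated variance close to F(x)(1 - F(x))/n, up to Monte Carlo error. Because the estimator is unbiased, its MSE coincides with its variance.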

### Plug-in estimator (for estimating functions of CDF)

A statistical functional $T(F)$ is any function of $F$. Examples include the mean $\mu = \int x \, dF(x)$ and the variance $\sigma^2 = \int (x - \mu)^2 \, dF(x)$.

The plug-in estimator of $\theta = T(F)$ is defined by

$$\hat{\theta}_n = T(\hat{F}_n)$$

In other words, just plug in $\hat{F}_n$ (the EDF) for the unknown $F$.
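As an illustration of the plug-in idea, consider the variance functional. Replacing F by the EDF turns each integral into an average over the data, so the plug-in variance divides by n rather than n - 1. A minimal sketch, assuming NumPy; the function name `plug_in_variance` and the sample are hypothetical.

```python
import numpy as np


def plug_in_variance(sample):
    """Plug-in estimate of the variance: average squared deviation from the sample mean."""
    data = np.asarray(sample, dtype=float)
    mu_hat = data.mean()                   # plug-in estimate of the mean
    return np.mean((data - mu_hat) ** 2)   # divides by n, not n - 1


# Toy sample, made up for illustration
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sigma2_hat = plug_in_variance(sample)
print(sigma2_hat)
```

This matches `np.var(sample)` with its default `ddof=0`. The more familiar unbiased sample variance (`ddof=1`) differs from it by a factor of n/(n - 1).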

#### Linear Functional

A functional of the form $\int r(x) \, dF(x)$ is called a linear functional. When $F$ is discrete with probability mass function $f$, the integral is defined to be $\sum_j r(x_j) f(x_j)$. The EDF $\hat{F}_n$ is discrete, putting mass $1/n$ at each $X_i$.

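The plug-in estimator for the linear functional $T(F) = \int r(x) \, dF(x)$ is:

$$T(\hat{F}_n) = \int r(x) \, d\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} r(X_i)$$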
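Because the EDF puts mass 1/n on each observation, this estimator is simply the sample average of r(X_i). A minimal sketch, assuming NumPy and a vectorized function `r`; the helper name `plug_in_linear` and the sample are hypothetical.

```python
import numpy as np


def plug_in_linear(r, sample):
    """Plug-in estimate of the linear functional: the average of r over the data."""
    data = np.asarray(sample, dtype=float)
    return np.mean(r(data))


# Toy sample, made up for illustration
sample = [1.0, 2.0, 3.0, 4.0]

first_moment = plug_in_linear(lambda x: x, sample)        # r(x) = x gives the sample mean
second_moment = plug_in_linear(lambda x: x ** 2, sample)  # r(x) = x^2
# r(x) = I(x <= 2) recovers the EDF evaluated at 2
prob_le_2 = plug_in_linear(lambda x: (x <= 2.0).astype(float), sample)

print(first_moment, second_moment, prob_le_2)
```

Different choices of `r` give different estimators from the same one-line average, which is what makes linear functionals convenient to work with.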

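**Example**: **Estimating the mean**.

Let $\mu = T(F) = \int x \, dF(x)$.

The plug-in estimator is $\hat{\mu} = \int x \, d\hat{F}_n(x) = \bar{X}_n$.

The standard error is $\text{se} = \sqrt{V(\bar{X}_n)} = \sigma / \sqrt{n}$.

The estimated standard error is $\hat{\sigma} / \sqrt{n}$, where $\hat{\sigma}$ is an estimate of $\sigma$.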
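The mean example can be sketched in a few lines. The text does not say which estimate of σ is used, so the code below assumes the plug-in standard deviation (dividing by n); the sample standard deviation (dividing by n - 1) is an equally common choice. The function name `mean_with_se` and the data are hypothetical.

```python
import numpy as np


def mean_with_se(sample):
    """Return the plug-in estimate of the mean and its estimated standard error."""
    data = np.asarray(sample, dtype=float)
    n = data.size
    mu_hat = data.mean()
    # Assumption: plug-in standard deviation (ddof=0); use ddof=1 for the sample version
    sigma_hat = data.std(ddof=0)
    return mu_hat, sigma_hat / np.sqrt(n)


# Toy sample, made up for illustration
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, se_hat = mean_with_se(sample)
print(mu_hat, se_hat)
```

For this sample the estimated mean is 5 and the plug-in standard deviation is 2, so the estimated standard error is 2/√8, roughly 0.71.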