# Empirical Distribution Function and Estimation of Statistical Functionals

The most basic inference problem is the non-parametric estimation of the CDF and of functions of the CDF.

Let X1, . . . , Xn ~ F be IID, where F is a distribution function on the real line. The empirical distribution function F̂n is the CDF that puts mass 1/n at each data point Xi. Formally,

F̂n(x) = (1/n) ∑ᵢ I(Xi ≤ x), summing over i = 1, . . . , n,

where I(Xi ≤ x) = 1 if Xi ≤ x and 0 otherwise.

In other words, the EDF, also called the ECDF (empirical cumulative distribution function), gives the fraction of sample observations that are less than or equal to x.
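As a minimal sketch of this definition, the EDF can be computed by sorting the sample and counting how many observations are at most x. The helper name `ecdf` and the toy data are illustrative assumptions, not part of the original text.

```python
import numpy as np

def ecdf(sample):
    """Return a function x -> fraction of sample values <= x (hypothetical helper)."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = xs.size

    def F_hat(x):
        # With side="right", searchsorted returns the number of entries <= x.
        return np.searchsorted(xs, x, side="right") / n

    return F_hat

# Toy data, chosen only for illustration.
data = [2.0, 1.0, 3.0, 3.0, 5.0]
F_hat = ecdf(data)
print(F_hat(3.0))  # four of the five observations are <= 3
```

Because `searchsorted` accepts arrays, `F_hat` can also be evaluated on a whole grid of x values at once.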

#### Theorem

• E(F̂n(x)) = F(x)
• V(F̂n(x)) = F(x)(1 – F(x)) / n
• MSE = F(x)(1 – F(x)) / n → 0 as n → ∞
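
The mean and variance formulas can be checked with a small simulation. The sketch below assumes Uniform(0, 1) data, for which F(x) = x, and uses arbitrary illustrative values for n, x, the number of replications, and the seed.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, x = 50, 20000, 0.3  # illustrative choices

# Uniform(0, 1) has F(x) = x, so the true values are known exactly.
samples = rng.uniform(size=(reps, n))

# One value of F_hat_n(x) per replicated sample: the fraction of points <= x.
F_hat_at_x = (samples <= x).mean(axis=1)

sim_mean = F_hat_at_x.mean()   # should be close to F(x) = 0.3
sim_var = F_hat_at_x.var()     # should be close to F(x)(1 - F(x)) / n
theory_var = x * (1 - x) / n

print(sim_mean, sim_var, theory_var)
```

Increasing n shrinks both the simulated and the theoretical variance at rate 1/n, which is what drives the MSE to zero.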

### Plug-in estimator (for estimating functions of CDF)

A statistical functional T(F) is any function of F. Examples include the mean µ = ∫x dF(x) and the variance σ² = ∫(x – µ)² dF(x).

The plug-in estimator of θ = T(F) is defined by

θ̂n = T(F̂n)

In other words, just plug in F̂n (the EDF) for the unknown F.
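
To make the plug-in idea concrete, here is a minimal sketch that treats F̂n as a discrete distribution with weight 1/n on each observation and evaluates the mean and variance functionals against it. The function name `plug_in_mean_and_variance` and the toy data are assumptions made for illustration.

```python
import numpy as np

def plug_in_mean_and_variance(sample):
    """Evaluate the mean and variance functionals at the EDF (hypothetical helper)."""
    xs = np.asarray(sample, dtype=float)
    weights = np.full(xs.size, 1.0 / xs.size)  # mass 1/n at each X_i

    mu_hat = np.sum(weights * xs)                    # integral of x dF_hat_n(x)
    var_hat = np.sum(weights * (xs - mu_hat) ** 2)   # integral of (x - mu_hat)^2 dF_hat_n(x)
    return mu_hat, var_hat

mu_hat, var_hat = plug_in_mean_and_variance([2.0, 4.0, 4.0, 6.0])
print(mu_hat, var_hat)
```

Note that the plug-in variance divides by n rather than n – 1, so it matches `np.var` with its default `ddof=0`.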

#### Linear Functional

A functional of the form ∫r(x)dF(x) is called a linear functional. The EDF F̂n(x) is discrete, putting mass 1/n at each Xi. For a discrete distribution with probability mass function f, the linear functional is defined as ∑j r(xj) f(xj).

The plug-in estimator for a linear functional is therefore

T(F̂n) = ∫r(x)dF̂n(x) = (1/n) ∑ᵢ r(Xi), summing over i = 1, . . . , n.

Example: estimating the mean.
Let µ = T(F) = ∫x dF(x).
The plug-in estimator is μ̂ = ∫x dF̂n(x) = X̄n.

The standard error is se = √V(X̄n) = σ/√n.
The estimated standard error is σ̂/√n, where σ̂ is an estimate of σ.
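
The mean example can be sketched in a few lines. The text does not say which estimate of σ is meant, so this sketch assumes the plug-in estimate (dividing by n); the function name `mean_with_se` and the toy data are also assumptions.

```python
import numpy as np

def mean_with_se(sample):
    """Return the sample mean and its estimated standard error (hypothetical helper)."""
    xs = np.asarray(sample, dtype=float)
    mu_hat = xs.mean()

    # Assumption: sigma_hat is the plug-in estimate (ddof=0).
    # Using ddof=1 (the sample standard deviation) is a common alternative.
    sigma_hat = xs.std(ddof=0)

    se_hat = sigma_hat / np.sqrt(xs.size)
    return mu_hat, se_hat

mu_hat, se_hat = mean_with_se([2.0, 4.0, 4.0, 6.0])
print(mu_hat, se_hat)
```

For moderate or large n, the choice between `ddof=0` and `ddof=1` changes the estimated standard error only slightly.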