# Variance and Covariance

### Variance

Variance measures the **spread** of a distribution.

Let X be a random variable with mean μ. The variance of X, denoted by σ^{2} or σ^{2}_{X} or V(X) or VX, is defined by

V(X) = E[(X – μ)^{2}]

assuming this expectation exists. The standard deviation is sd(X) = √V(X) and is also denoted by σ and σ_{X}.

**Variance has the following properties.**

1. V(X) = E(X^{2}) – μ^{2}

2. If a and b are constants then V(aX+b) = a^{2}V(X)

3. If X_{1}, . . . , X_{n} are independent and a_{1}, . . . , a_{n} are constants, then

V(∑_{i}a_{i}X_{i}) = ∑_{i}a_{i}^{2}V(X_{i})

Example: Let X ~ Binomial(n, p). We write X = ∑_{i}X_{i} where X_{i} = 1 if toss i is heads and X_{i} = 0 otherwise. Here, P(X_{i} = 1) = p and P(X_{i} = 0) = 1 – p.

Using the definition of expectation,

E(X_{i}) = [p x 1] + [(1 – p) x 0] = p

E(X_{i}^{2}) = [p x 1^{2}] + [(1 – p) x 0^{2}] = p

Now, V(X_{i}) = E(X_{i}^{2}) – p^{2} = p – p^{2} = p(1 – p).

Finally, V(X) = V(∑_{i}X_{i}) = ∑_{i}V(X_{i}) = ∑_{i}p(1 – p) = np(1 – p).
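The derivation above can be checked numerically by computing V(X) = E(X^{2}) – E(X)^{2} directly over the binomial pmf (the values of n and p below are arbitrary, chosen just for the check):

```python
import math

# Exact check of V(X) = np(1 - p) for X ~ Binomial(n, p),
# using V(X) = E(X^2) - E(X)^2 over the binomial pmf.
n, p = 10, 0.3
pmf = [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
second_moment = sum(k * k * pk for k, pk in enumerate(pmf))
var = second_moment - mean**2

print(var, n * p * (1 - p))  # both ≈ 2.1, up to float rounding
```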

#### Sample mean & Sample variance

If X_{1}, . . . , X_{n} are random variables then we define the sample mean to be

X̄ = (X_{1} + X_{2} + . . . + X_{n}) / n = (1/n) ∑_{i}X_{i}

and the sample variance to be

S^{2} = (1/(n – 1)) ∑_{i}(X_{i} – X̄)^{2}

Note that the denominator in the sample variance is **n – 1** instead of n; this is called Bessel's correction. The sample mean is closer to the sample data points than the true population mean, so the sum of squared deviations in the numerator is smaller than it would be if μ were known. To compensate for this underestimate, the denominator is reduced by 1.
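Bessel's correction can be seen in simulation: averaging sample variances over many samples, the n – 1 denominator recovers the true variance while the n denominator underestimates it. A sketch with an assumed Normal population:

```python
import numpy as np

# Many samples of size n = 5 from Normal(0, sd = 2), so the true variance is 4.
rng = np.random.default_rng(42)
samples = rng.normal(0, 2, size=(100_000, 5))

# ddof=0 divides by n (biased); ddof=1 divides by n - 1 (Bessel's correction).
biased = np.var(samples, axis=1, ddof=0).mean()    # ≈ (n-1)/n * 4 = 3.2
unbiased = np.var(samples, axis=1, ddof=1).mean()  # ≈ 4.0
print(biased, unbiased)
```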

When writing

Sample mean = (X_{1} + X_{2} + . . . + X_{n}) / n

given a sample of size *n*, consider *n* independent random variables *X_{1}*, *X_{2}*, …, *X_{n}*, each corresponding to one randomly selected observation. **Each of these variables has the distribution of the population, with mean µ and standard deviation σ**.

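As a check in code, here is a minimal NumPy sketch (on made-up sample data, since the original script is not shown) comparing `np.var` against the two textbook formulas for the population variance:

```python
import numpy as np

# Synthetic data standing in for the original (unavailable) sample.
rng = np.random.default_rng(0)
x = rng.normal(loc=50, scale=30, size=1_000)

mu = x.mean()
var_numpy = np.var(x)           # NumPy's population variance (ddof=0)
var_1 = np.mean((x - mu) ** 2)  # V(X) = E[(X - mu)^2]
var_2 = np.mean(x**2) - mu**2   # V(X) = E(X^2) - mu^2

print(var_numpy, var_1, var_2)  # all three agree
```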

### Covariance

Let X and Y be random variables with means μ_{X} and μ_{Y} and standard deviations σ_{X} and σ_{Y}. Define the covariance between X and Y by

Cov(X, Y) = E[(X – μ_{X})(Y – μ_{Y})]

and the correlation by

ρ_{X, Y} = Cov(X, Y) / (σ_{X}σ_{Y})

Covariance satisfies the following property

Cov(X, Y) = E(XY) – E(X)E(Y)

Example:

Let X and Y have joint probability mass function as follows:

| f(x, y) | y = 1 | y = 2 | y = 3 | f_{X}(x) |
|---|---|---|---|---|
| x = 1 | 1/4 | 1/4 | 0 | 1/2 |
| x = 2 | 0 | 1/4 | 1/4 | 1/2 |
| f_{Y}(y) | 1/4 | 1/2 | 1/4 | 1 |

Let's calculate µ_{X} and µ_{Y}.

µ_{X} = ∑xf_{X}(x) = (1 x 1/2) + (2 x 1/2) = 3/2

µ_{Y} = ∑yf_{Y}(y) = (1 x 1/4) + (2 x 1/2) + (3 x 1/4) = 2

Let's calculate σ_{X} and σ_{Y}.

σ_{X} = √∑(x – µ_{X})^{2}f_{X}(x) = √( (1 – 3/2)^{2}1/2 + (2 – 3/2)^{2} 1/2) = 1/2

and similarly σ_{Y} = √( (1 – 2)^{2} 1/4 + (2 – 2)^{2} 1/2 + (3 – 2)^{2} 1/4 ) = √(1/2)

Now Cov(X, Y) is

Cov(X, Y) = E(XY) – E(X)E(Y) = 13/4 – (3/2 x 2) = 1/4

where E(XY) = ∑_{x}∑_{y}xy f(x, y) = (1 x 1 x 1/4) + (1 x 2 x 1/4) + (2 x 2 x 1/4) + (2 x 3 x 1/4) = 13/4,

and correlation coefficient,

ρ_{X, Y} = Cov(X, Y) / (σ_{X}σ_{Y}) = (1/4) / (1/2 x √(1/2)) = 1/√2 ≈ 0.71
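The worked example can be verified exactly with rational arithmetic over the joint pmf from the table:

```python
from fractions import Fraction as F

# Joint pmf of (X, Y) from the table in the covariance example.
pmf = {(1, 1): F(1, 4), (1, 2): F(1, 4), (1, 3): F(0),
       (2, 1): F(0),    (2, 2): F(1, 4), (2, 3): F(1, 4)}

mu_x = sum(x * p for (x, y), p in pmf.items())      # 3/2
mu_y = sum(y * p for (x, y), p in pmf.items())      # 2
e_xy = sum(x * y * p for (x, y), p in pmf.items())  # 13/4

cov = e_xy - mu_x * mu_y                                       # 1/4
var_x = sum(x * x * p for (x, y), p in pmf.items()) - mu_x**2  # 1/4
var_y = sum(y * y * p for (x, y), p in pmf.items()) - mu_y**2  # 1/2
rho = float(cov) / float(var_x * var_y) ** 0.5                 # ≈ 0.707
```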

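For sample data, NumPy provides `np.cov` and `np.corrcoef`. A sketch on synthetic data (the original dataset is not shown, so these arrays are assumptions for illustration):

```python
import numpy as np

# Two weakly related synthetic variables.
rng = np.random.default_rng(1)
x = rng.normal(0, 30, size=500)
y = 0.5 * x + rng.normal(0, 300, size=500)

cov_matrix = np.cov(x, y)  # [[Var(X), Cov(X,Y)], [Cov(X,Y), Var(Y)]], ddof=1
cov_xy = cov_matrix[0, 1]

pearson = np.corrcoef(x, y)[0, 1]
# By hand from the definition, using matching ddof=1 standard deviations:
pearson_manual = cov_xy / (x.std(ddof=1) * y.std(ddof=1))
```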

### Conditional Expectation

The conditional expectation of X given Y = y is

E(X | Y = y) = ∑_{x}x f_{X|Y}(x | y) in the discrete case, or ∫x f_{X|Y}(x | y) dx in the continuous case.

If r(x, y) is a function of x and y then

E(r(X, Y) | Y = y) = ∑_{x}r(x, y) f_{X|Y}(x | y), with the analogous integral in the continuous case.

Conditional expectation corresponds to the mean of X over those trials in which Y = y.
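Using the joint pmf table from the covariance example, E(X | Y = y) can be computed directly from f(x | y) = f(x, y) / f_{Y}(y):

```python
from fractions import Fraction as F

# Joint pmf of (X, Y) from the covariance example's table.
pmf = {(1, 1): F(1, 4), (1, 2): F(1, 4), (1, 3): F(0),
       (2, 1): F(0),    (2, 2): F(1, 4), (2, 3): F(1, 4)}

def cond_exp_x(y):
    # E(X | Y = y) = sum over x of x * f(x, y) / f_Y(y)
    f_y = sum(p for (x, yy), p in pmf.items() if yy == y)  # marginal f_Y(y)
    return sum(x * p / f_y for (x, yy), p in pmf.items() if yy == y)

print(cond_exp_x(1))  # 1: only X = 1 occurs when Y = 1
print(cond_exp_x(2))  # 3/2
print(cond_exp_x(3))  # 2: only X = 2 occurs when Y = 3
```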