Independent Random Variables and Conditional Distribution
Independent Random Variables
Two random variables X and Y are independent if, for every A and B,
P(X ∈ A, Y ∈ B) = P(X ∈ A)P(Y ∈ B)
Two random variables X and Y with joint pdf fX,Y are independent if and only if fX,Y(x, y) = fX(x) fY(y) for all values of x and y.
Note that fX(x) and fY(y) above are marginal distributions.
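Let us take the following example of a bivariate distribution, with joint pmf
f(0, 0) = 1/9, f(1, 0) = 2/9, f(0, 1) = 2/9 and f(1, 1) = 4/9.
Summing over the other variable gives the marginals fX(0) = 1/3, fX(1) = 2/3, fY(0) = 1/3 and fY(1) = 2/3. Here X and Y are independent because the joint pmf factorizes at every point: fX(0) fY(0) = f(0, 0), fX(1) fY(0) = f(1, 0), fX(0) fY(1) = f(0, 1) and fX(1) fY(1) = f(1, 1).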
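The factorization check in this example can be sketched in a few lines of Python. The joint values 1/9, 2/9, 2/9 and 4/9 are the ones implied by the stated marginals together with the claim of independence, and the helper names `marginals` and `is_independent` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction as F

# Joint pmf f(x, y) implied by the example; keys are (x, y) pairs.
joint = {
    (0, 0): F(1, 9), (1, 0): F(2, 9),
    (0, 1): F(2, 9), (1, 1): F(4, 9),
}

def marginals(joint):
    """Sum the joint pmf over the other variable to get fX and fY."""
    fx, fy = {}, {}
    for (x, y), p in joint.items():
        fx[x] = fx.get(x, 0) + p
        fy[y] = fy.get(y, 0) + p
    return fx, fy

def is_independent(joint):
    """True if f(x, y) == fX(x) * fY(y) for every pair (x, y)."""
    fx, fy = marginals(joint)
    return all(joint.get((x, y), 0) == fx[x] * fy[y] for x in fx for y in fy)

fx, fy = marginals(joint)
independent = is_independent(joint)
print(fx, fy, independent)
```

Exact fractions are used so that the equality test is not affected by floating-point rounding. With probabilities estimated from data, an exact comparison would not be appropriate and a statistical test would be needed instead.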
There are many approaches for checking in Python whether two random variables (features) are related. One of the most common is computing the correlation. A correlation of 0 between two features means that there is no linear association between them (for 'pearson') or no monotonic association (for 'kendall' and 'spearman'). This is related to independence but weaker: independent random variables have zero correlation, whereas zero correlation does not by itself imply independence. Common correlation methods include 'pearson', 'kendall' and 'spearman'.
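As a caution when using correlation as a check, the sketch below builds a small hypothetical example (X uniform on {-1, 0, 1} and Y = X**2, neither of which comes from the text above) in which the covariance, and hence the Pearson correlation, is exactly 0 even though Y is completely determined by X. It uses plain Python rather than a data-frame library, so the `expectation` helper is an illustrative assumption.

```python
from fractions import Fraction as F

# Hypothetical example: X is uniform on {-1, 0, 1} and Y = X**2.
joint = {(-1, 1): F(1, 3), (0, 0): F(1, 3), (1, 1): F(1, 3)}

def expectation(joint, g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

ex = expectation(joint, lambda x, y: x)
ey = expectation(joint, lambda x, y: y)
cov = expectation(joint, lambda x, y: (x - ex) * (y - ey))

# Marginal probabilities needed for the factorization check at (0, 0).
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
factorizes = joint[(0, 0)] == p_x0 * p_y0

print(cov)         # zero covariance, so zero Pearson correlation
print(factorizes)  # the joint pmf does not factorize at (0, 0)
```

Here f(0, 0) = 1/3 while fX(0) fY(0) = 1/9, so the variables are dependent despite the zero correlation.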
Conditional Distribution
Let X and Y be two discrete random variables. The conditional distribution of X given that we observe Y = y is expressed as
P(X = x| Y = y) = P(X = x, Y = y)/P(Y = y).
The conditional probability mass function is
fX|Y(x | y) = P(X = x | Y = y) = fX,Y(x, y)/fY(y)
if fY(y) > 0.
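A minimal sketch of the discrete case follows: the conditional pmf of X given Y = y is obtained by dividing the joint pmf by the marginal fY(y). The joint table is an assumed one (values 1/9, 2/9, 2/9, 4/9, consistent with the marginals in the independence example), and `conditional_pmf_x_given_y` is a hypothetical helper name.

```python
from fractions import Fraction as F

# Assumed joint pmf f(x, y); keys are (x, y) pairs.
joint = {
    (0, 0): F(1, 9), (1, 0): F(2, 9),
    (0, 1): F(2, 9), (1, 1): F(4, 9),
}

def conditional_pmf_x_given_y(joint, y):
    """Return {x: f(x, y) / fY(y)}; requires fY(y) > 0."""
    fy = sum(p for (_, yy), p in joint.items() if yy == y)
    if fy == 0:
        raise ValueError("fY(y) must be positive")
    return {x: p / fy for (x, yy), p in joint.items() if yy == y}

cond = conditional_pmf_x_given_y(joint, 1)
print(cond)
```

For y = 1 the result is 1/3 for x = 0 and 2/3 for x = 1, the same as the marginal fX. That is what independence predicts: observing Y does not change the distribution of X.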
For continuous random variables, the conditional probability density function is
fX|Y(x | y) = fX,Y(x, y)/fY(y)
assuming that fY(y) > 0. Then,
P(X ∈ A | Y = y) = ∫_A fX|Y(x | y) dx.
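The continuous case can be illustrated numerically. The sketch below assumes a joint density f(x, y) = x + y on the unit square (an illustrative choice, not taken from the text above), computes fY(y) by integrating over x, and then approximates P(X < 1/4 | Y = 1/3) by integrating the conditional density fX,Y(x, y)/fY(y) over x in [0, 1/4]. The `integrate` helper is a simple midpoint rule written for this example.

```python
def joint_pdf(x, y):
    """Assumed joint density: f(x, y) = x + y on the unit square, else 0."""
    return x + y if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

def integrate(g, a, b, n=1000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def marginal_y(y):
    """fY(y): integrate the joint density over x in [0, 1]."""
    return integrate(lambda x: joint_pdf(x, y), 0.0, 1.0)

def conditional_pdf(x, y):
    """Conditional density f(x, y) / fY(y); requires fY(y) > 0."""
    fy = marginal_y(y)
    if fy <= 0:
        raise ValueError("fY(y) must be positive")
    return joint_pdf(x, y) / fy

y0 = 1 / 3
# P(X < 1/4 | Y = 1/3): integrate the conditional density over [0, 1/4].
prob = integrate(lambda x: conditional_pdf(x, y0), 0.0, 0.25, n=100)
print(prob)
```

Working the same integral by hand gives fY(1/3) = 5/6 and a probability of 11/80, which the numerical result should match closely because the midpoint rule is exact for integrands that are linear in x.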
