Introduction to Probability

Probability quantifies uncertainty. It measures how likely an event is to occur, on a scale from 0 to 1 inclusive. An event that can never occur has probability 0, and an event that is certain to occur has probability 1.

Some Basic Terminology

  • Sample Space: The sample space Ω is the set of possible outcomes of an experiment.
  • Sample Outcome: Points ω in Ω are called sample outcomes or realizations.
  • Events: Events are subsets of Ω.

Example: Let us suppose that the experiment is tossing a coin twice.
Sample Space: {HH, HT, TH, TT}
Sample Outcome: HH, HT, TH, TT (each point in Sample space)
Events: A = {HH, HT} (the first toss is a head) and B = {TH, HT} (the two tosses are different)
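
The example above can be reproduced in a few lines of Python. This is only a minimal sketch: the variable names (omega, A, B) are chosen here for illustration, and each outcome is written as a two-letter string.

```python
from itertools import product

# Sample space for two coin tosses: every ordered pair of H and T.
omega = {"".join(pair) for pair in product("HT", repeat=2)}

# Events are subsets of the sample space.
A = {w for w in omega if w[0] == "H"}   # the first toss is a head
B = {w for w in omega if w[0] != w[1]}  # the two tosses are different

print(sorted(omega))  # ['HH', 'HT', 'TH', 'TT']
print(sorted(A))      # ['HH', 'HT']
print(sorted(B))      # ['HT', 'TH']
```

Building events as set comprehensions over omega guarantees that every event is a subset of the sample space.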

We will use the following set notation for events:

  • |A| : Number of points in A (if A is finite)
  • A^c : A complement (not A)
  • A ∪ B : A union B (A or B)
  • A ∩ B or AB : A intersection B (A and B)
  • A – B : Set difference (points in A that are not in B)
  • A ⊂ B : Set inclusion (A is a subset of or equal to B)
  • Φ : Null event (always false)
  • Ω : True event (always true)

From the previous example:
|A| = 2
A^c = {TH, TT}
A ∪ B = {HH, HT, TH}
A ∩ B = {HT}
The probability of occurrence of event A is:
P(A) = (number of points in A) / (number of points in the sample space), i.e. P(A) = 2/4 = 1/2.

Note that this holds only when the coin is fair, i.e. the probability of a head and the probability of a tail in a single toss are both 0.5. A sample space in which every outcome has the same probability of occurrence is said to have a uniform probability distribution. We will study it in the next section.
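
The set operations and the counting formula for the two-toss example map directly onto Python's built-in set type. The sketch below assumes a fair coin (so every outcome is equally likely); the helper name prob is hypothetical and introduced only for illustration.

```python
from fractions import Fraction

omega = {"HH", "HT", "TH", "TT"}  # sample space for two tosses
A = {"HH", "HT"}                  # the first toss is a head
B = {"TH", "HT"}                  # the two tosses are different

def prob(event, space):
    """P(event) under the uniform assumption: |event| / |space|."""
    return Fraction(len(event & space), len(space))

complement_A = omega - A   # complement of A
union_AB = A | B           # A union B
intersection_AB = A & B    # A intersection B

print(len(A))                   # 2
print(sorted(complement_A))     # ['TH', 'TT']
print(sorted(union_AB))         # ['HH', 'HT', 'TH']
print(sorted(intersection_AB))  # ['HT']
print(prob(A, omega))           # 1/2
```

Using Fraction keeps the probabilities exact, so 2/4 is reduced to 1/2 instead of being printed as a floating-point number.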

Disjoint or Mutually exclusive events

We say that A1, A2, … are disjoint or mutually exclusive if Ai ∩ Aj = Φ (the null event) whenever i ≠ j.

Now let us build some intuition for this definition. Disjoint (mutually exclusive) events are events that cannot occur together, so the probability of them occurring together is zero. No sample outcome belongs to more than one of these events. The probability that any one of them occurs is equal to the sum of their individual probabilities.

Example: Let us suppose a fair die is rolled.
Events: A = {4, 6}, B = {3, 5}
Notice that the two events cannot occur together, i.e. A ∩ B is Φ.
P(A) = 1/3, P(B) = 1/3 and P(A ∪ B) = 2/3, which is equal to P(A) + P(B), as expected because A and B are disjoint.
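
The die example can be checked numerically. This is a small sketch that assumes a fair six-sided die; the helper name prob is hypothetical.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # sample space for one roll of a die
A = {4, 6}
B = {3, 5}

def prob(event):
    """P(event) assuming all six faces are equally likely."""
    return Fraction(len(event), len(omega))

# Disjoint events share no sample outcome.
print(A.isdisjoint(B))                  # True
print(prob(A), prob(B), prob(A | B))    # 1/3 1/3 2/3
print(prob(A | B) == prob(A) + prob(B)) # True
```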

What if the events are not mutually exclusive? Then the two events overlap, so when their probabilities are added, the overlap is counted twice. Therefore, the formula for P(A ∪ B) becomes
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
which subtracts the overlap of the two sets once.
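
To see the correction for overlap at work, the sketch below uses two overlapping events on a fair six-sided die: rolling an odd number and rolling a prime number. The events, the fairness assumption, and the helper name prob are chosen here purely for illustration.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # sample space for one roll of a die
odd = {1, 3, 5}
prime = {2, 3, 5}

def prob(event):
    """P(event) assuming all six faces are equally likely."""
    return Fraction(len(event), len(omega))

overlap = odd & prime                 # {3, 5}, counted twice by a plain sum
naive_sum = prob(odd) + prob(prime)   # 1, which overstates P(odd or prime)
corrected = naive_sum - prob(overlap) # subtract the overlap once

print(naive_sum)          # 1
print(corrected)          # 2/3
print(prob(odd | prime))  # 2/3
```

The plain sum gives 1 even though the outcomes 4 and 6 are in neither event; subtracting P(A ∩ B) brings it into agreement with the directly counted P(A ∪ B).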


We assign a real number P(A), called the probability of A, to every event A.
P is called a probability distribution or a probability measure.

A function P that assigns a real number P(A) to each event A is a probability distribution or a probability measure if it satisfies the following three axioms:
Axiom 1: P(A) ≥ 0 for every A.
Axiom 2: P(Ω) = 1
Axiom 3: If A1, A2, … are disjoint then
P(A1 ∪ A2 ∪ …) = P(A1) + P(A2) + …

The first two axioms are quite intuitive, whereas the third generalizes the addition rule for disjoint (mutually exclusive) events. If the Ai in Axiom 3 form a partition of an event A, the formula can be summarized as
P(A) = P(A1) + P(A2) + P(A3) + . . .

Let us take the example of rolling a fair die. The sample space is Ω = {1, 2, 3, 4, 5, 6}. Take any event A, such as getting an odd number, A = {1, 3, 5}, or getting a prime number, A = {2, 3, 5}. Every such event has probability greater than or equal to zero. The probability of the whole sample space, i.e. P(A) where A = {1, 2, 3, 4, 5, 6}, is 1. This illustrates the first two axioms of probability.

For the third axiom, take three disjoint events A1 = {1}, A2 = {2, 4, 6} and A3 = {3, 5}.
Let A = A1 ∪ A2 ∪ A3 = {1, 2, 3, 4, 5, 6}
Therefore P(A) = P(A1) + P(A2) + P(A3) = 1/6 + 1/2 + 1/3 = 1
Here A1, A2, A3 form a partition of the sample space.
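
The partition example can be verified with a short script. This is a sketch under the assumption of a fair six-sided die; the names parts and prob are hypothetical and used only for illustration.

```python
from fractions import Fraction
from itertools import combinations

omega = {1, 2, 3, 4, 5, 6}          # sample space for one roll of a die
parts = [{1}, {2, 4, 6}, {3, 5}]    # A1, A2, A3

def prob(event):
    """P(event) assuming all six faces are equally likely."""
    return Fraction(len(event), len(omega))

# A partition is pairwise disjoint and covers the whole sample space.
pairwise_disjoint = all(a.isdisjoint(b) for a, b in combinations(parts, 2))
covers_omega = set().union(*parts) == omega

# Axiom 3: the probability of the union is the sum of the probabilities.
total = sum(prob(part) for part in parts)

print(pairwise_disjoint, covers_omega)  # True True
print([str(prob(p)) for p in parts])    # ['1/6', '1/2', '1/3']
print(total)                            # 1
```

The total equals P(Ω) = 1, so this example is consistent with both Axiom 2 and Axiom 3.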
