Two Random Variables

Introduction

In the previous post, Basic Probability, I discussed my chance of visiting Paris next year. The sample space looked like this: \( S = \{Meet, No\_more\_holiday, No\_money, Paris\_gone\_from\_Earth, ...\} \). The random variable \( X \) was all about whether I go to Paris next year.

What I am going to write about in this post is the case where we have two sample spaces, two outcomes and two random variables. Having two random variables means that we need to consider the following: both events happening simultaneously (joint probability), one event happening regardless of the other (marginal probability) and one event happening given the other (conditional probability).

Let's define a simple sample space of me visiting Paris next year: \( S_X = \{me\_in\_paris, me\_not\_in\_paris\} \).

Let's define another sample space to have two events at the same time. The second event is whether Emily goes to Paris next year. We now have another sample space \( S_Y = \{emily\_in\_paris, emily\_not\_in\_paris\} \). 

Joint Sample Space

The two sample spaces defined can form a new joint sample space. \[ S_{XY} = S_X \times S_Y \] This is the product of two sets referred to as the Cartesian product. We get all possible joint outcomes from the two individual sample spaces:
  • (Me not in Paris, Emily not in Paris)
  • (Me not in Paris, Emily in Paris)
  • (Me in Paris, Emily not in Paris)
  • (Me in Paris, Emily in Paris)
Hmm, this doesn't look tidy. 

Two Random Variables

Now, random variables come to rescue us. A random variable is a function that takes an outcome and translates it into a numeric value. We can define "in Paris" as 1 and "not in Paris" as 0 using random variables. \[ X(me\_in\_paris) = 1 \] \[ X(me\_not\_in\_paris) = 0\] \[Y(emily\_in\_paris) = 1\] \[ Y(emily\_not\_in\_paris) = 0 \] It is common to express the joint sample space with the outputs of the random variables. We can now write the joint sample space \( S_{XY} \): \[ S_{XY} = \{(0, 0), (0, 1), (1, 0), (1, 1)\} \] The joint space looks much tidier now!
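As a quick sketch (the outcome names and the Python representation are my own illustrative choices, not anything formal), the joint sample space is just the Cartesian product of the two individual spaces:

```python
from itertools import product

# Random variables as mappings from outcomes to numbers (names are illustrative).
X = {"me_in_paris": 1, "me_not_in_paris": 0}
Y = {"emily_in_paris": 1, "emily_not_in_paris": 0}

# S_XY is the Cartesian product S_X x S_Y, expressed with the numeric outputs.
S_XY = set(product(X.values(), Y.values()))
print(sorted(S_XY))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```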

Independence / Dependence

Independence and dependence of two events are extremely important. Independence means that the occurrence of one event has no impact on the other event. Dependence is the other way around: one event affects the chance of the other event.

Suppose Emily and I don't know each other. Then my decision to visit Paris next year has no impact on Emily's decision to go to Paris. I go to Paris in May; Emily goes to Paris in October. Or we may happen to be in Paris at the same time, but that's just a coincidence.

If we know each other, does one's decision change? Most likely. I text Emily, "let's meet in Paris June next year". She replies "okay". My visit to Paris modifies the chance of Emily visiting Paris. One event affects the outcome of the other event. 

A more classical example of dependent events (a pedagogical one) is drawing cards without replacement. There are 52 cards in a deck. If I randomly pick one card, my chance of drawing the ace of spades (♠) is \( \frac{1}{52} \). Suppose the first card turns out to be the ace of hearts (♥). I still want the ace of spades, so I draw again from the remaining 51 cards. Now, the probability of drawing the ace of spades is \( \frac{1}{51} \).
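The card example can be checked with a couple of exact fractions (a small sketch; `Fraction` just avoids floating-point noise):

```python
from fractions import Fraction

deck_size = 52
# P(ace of spades on the first draw) from a full deck:
p_first = Fraction(1, deck_size)
# The first card was the ace of hearts, so 51 cards remain for the second draw:
p_second = Fraction(1, deck_size - 1)
print(p_first, p_second)  # 1/52 1/51
```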

Probability of Two Events

We have defined the two random variables \(X\) and \(Y\) and the joint sample space \( S_{XY} \). When one event affects the chance of the other, the events are dependent. If neither event has any impact on the outcome of the other, the events are independent of each other.

Let's create example probabilities of me and Emily visiting Paris next year: \[ P(X=0) = 0.3, P(X=1) = 0.7 \] \[P(Y=0) = 0.5, P(Y=1) = 0.5 \] As you can see, I am more inclined to go to Paris next year. Emily is not sure about her trip yet.

Joint probability (independent events)

The joint probability tells us the chance of both me and Emily visiting Paris next year. The joint probability is expressed as \( P(X \cap Y) \) using the intersection symbol or simply \( P(X, Y) \).

The most important thing to remember is that dependence and independence change the way we calculate the joint probability of two events.

If Emily and I do not know each other, we will independently decide whether to visit Paris next year. In this case, we simply multiply the probability of me going to Paris by the probability of Emily going to Paris. The formula to compute the joint probability of two independent events is: \[ P(X=x, Y=y) = P(X=x) P(Y=y) \] That's it. The lower-case \(x\) and \(y\) represent values that the random variables take. In this example, both \(X\) and \(Y\) take values in \(\{0, 1\}\). So: \[ P(X=1, Y=1) = 0.7 \times 0.5 = 0.35 \] Emily and I will both be in Paris next year with a 35% chance.
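In code, the independent case is a single multiplication (a sketch using the example probabilities defined above):

```python
# Example probabilities from the post: P(X=1) = 0.7, P(Y=1) = 0.5, etc.
P_X = {0: 0.3, 1: 0.7}
P_Y = {0: 0.5, 1: 0.5}

def joint_independent(x: int, y: int) -> float:
    """P(X=x, Y=y) for independent events: the product of the marginals."""
    return P_X[x] * P_Y[y]

print(joint_independent(1, 1))  # 0.35
```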

Conditional probability

Computing the joint probability of dependent events is a little more complex. This is because one's decision (or outcome) can have an impact on the other. We use the conditional probability to compute the joint probability.

The conditional probability expresses the impact of one event on the other. If the chance of Emily's visit to Paris (\(Y\)) depends on my visit (\(X\)), the conditional probability is \( P(Y|X)\). If Emily's visit modifies my decision to go to Paris, the conditional probability is \( P(X|Y) \).

The conditional probability \( P(Y|X)\) is the "probability of Y given X", focusing on Emily's visit to Paris, given my chance of visiting Paris. The conditional probability \( P(X|Y)\) is about my chance of visiting Paris, given Emily's chance of visiting Paris. 

We don't yet know the final probability of Emily visiting Paris; that is what \( P(Y|X) \) captures. If she doesn't like me, she will be less likely to be there. If she likes me, she will want to be there, knowing I will be there.

The conditional probability of dependent events is formally: \[ P(X|Y) = \frac{P(X, Y)}{P(Y)}  \] This formula is the ratio of the joint probability of \(X\) and \(Y\) to the probability of \(Y\).

The conditional probability exists for independent events too. That is: \[ P(X|Y) = P(X) \] This is because the random variable \(Y\) (e.g., Emily) has no influence on the chance of \(X\) (e.g., me), so it is just the probability of \(X\).
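A quick numeric check of both facts, using the independent numbers from earlier (a sketch, not a general implementation): for independent events, the ratio simply recovers \(P(X)\).

```python
P_X1 = 0.7  # P(X=1), me in Paris
P_Y1 = 0.5  # P(Y=1), Emily in Paris

joint = P_X1 * P_Y1   # independent events: P(X=1, Y=1) = 0.35
cond = joint / P_Y1   # P(X=1 | Y=1) = P(X=1, Y=1) / P(Y=1)
print(cond)           # 0.7, i.e. P(X=1): conditioning changes nothing
```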

Joint probability (dependent events)

The formula for the conditional probability of dependent events contains the joint probability \( P(X,Y) \) inside. We can rewrite that formula like this: \[ P(X, Y) = P(Y)P(X|Y) \] Equivalently, this joint probability can also be expressed like this: \[ P(X, Y) = P(X)P(Y|X) \] Let's say that Emily goes to Paris next year and I want to see her there: \[ P(X=1|Y=1) = 1.0 \] The interpretation of this is "if Emily goes to Paris, I will be there with 100% certainty". Emily dictates my decision. The joint probability of the dependent events, Emily in Paris and me in Paris, is: \[ P(X=1, Y=1) = P(Y=1) P(X=1|Y=1) = 0.5 \times 1.0 = 0.5 \] Remember, at the beginning of this section (Probability of Two Events), we defined that Emily would go to Paris next year with a 50:50 chance. That is why \( P(Y=1) = 0.5 \).
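The chain-rule computation above is tiny in code (a sketch with the post's numbers):

```python
P_Y1 = 0.5           # Emily goes to Paris with a 50:50 chance
P_X1_given_Y1 = 1.0  # if Emily goes, I go for certain

# Chain rule: P(X=1, Y=1) = P(Y=1) * P(X=1 | Y=1)
joint = P_Y1 * P_X1_given_Y1
print(joint)  # 0.5
```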

Marginal probability

When we deal with two or more random variables and we want to focus on the probability of a single random variable, e.g., \(P(X=x)\), this is referred to as the marginal probability. To obtain the marginal probability \(P(X=x)\), we sum (for discrete values) or integrate (for continuous values) the joint probability over all values of the other variables.

Marginalisation essentially means that "we want to focus on only one variable and ignore all the other variables". With the Paris example, "we want to focus on whether I (\(X\)) go to Paris and we don't mind what Emily (\(Y\)) does".

The formula to compute the marginal probability of \(P(X=x)\) is as follows: \[ P(X=x) = \sum_{y \in S_Y}P(X=x, Y=y) \] This formula has the joint probability of \(P(X, Y)\). Remember that the way to compute the joint probability varies by dependence and independence of the events. In case two events are independent, the marginal probability of \( P(X=x) \) is expressed as: \[ P(X=x) = \sum_{y \in S_Y}P(X=x)P(Y=y) \] When the two events are dependent, we need to consider dependence of the joint probability for marginalising \(Y\): \[ P(X=x) = \sum_{y \in S_Y}P(Y=y)P(X=x|Y=y) \]
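Here is the marginalisation formula applied in code (a sketch; `Fraction` keeps the sums exact, and the joint values match the dependent-case probability table later in this post):

```python
from fractions import Fraction as F

# Joint probabilities P(X=x, Y=y) for the dependent example, keyed by (x, y).
joint = {
    (0, 0): F(3, 10), (0, 1): F(0),     # me not in Paris
    (1, 0): F(1, 5),  (1, 1): F(1, 2),  # me in Paris
}

# Marginalise Y out: P(X=x) = sum over y of P(X=x, Y=y).
P_X = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}
print(P_X)  # {0: Fraction(3, 10), 1: Fraction(7, 10)}
```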

Wrapping Up

I hope that the difference between independent and dependent events is clear now. Let's recap:

Probabilities of me going to Paris or not. \[ P(X=0) = 0.3, P(X=1) = 0.7 \] Probabilities of Emily going to Paris or not. \[P(Y=0) = 0.5, P(Y=1) = 0.5 \] When the two events are independent, a probability table would look like below:
Independent Sample Space   Emily Not Paris (Y=0)   Emily Paris (Y=1)   Marginal (X)
Me Not Paris (X=0)         0.15 (0.3 × 0.5)        0.15 (0.3 × 0.5)    0.3
Me Paris (X=1)             0.35 (0.7 × 0.5)        0.35 (0.7 × 0.5)    0.7
Marginal (Y)               0.5                     0.5                 1.0
The (X=1, Y=1) cell is the joint probability we saw in the section "Joint probability (independent events)".

The probability table for the two dependent events is below: 
Dependent Sample Space   Emily Not Paris (Y=0)   Emily Paris (Y=1)      Marginal (X)
Me Not Paris (X=0)       0.3 (0.3 - 0.0)         0.0 (P(X=0|Y=1) = 0)   0.3
Me Paris (X=1)           0.2 (0.7 - 0.5)         0.5 (P(Y) × P(X|Y))    0.7
Marginal (Y)             0.5                     0.5                    1.0
The (X=1, Y=1) cell is the joint probability we saw in the section "Joint probability (dependent events)".
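As a sanity check on the dependent table (a sketch; the numbers are copied from the table), the row and column sums should reproduce the marginals, and the whole table should sum to 1:

```python
# Joint cells of the dependent-case table, keyed by (x, y).
joint = {(0, 0): 0.3, (0, 1): 0.0,
         (1, 0): 0.2, (1, 1): 0.5}

marg_X = {x: joint[(x, 0)] + joint[(x, 1)] for x in (0, 1)}  # row sums
marg_Y = {y: joint[(0, y)] + joint[(1, y)] for y in (0, 1)}  # column sums

print(marg_X, marg_Y, sum(joint.values()))
```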

Extra: Venn Diagram

A Venn diagram is a common illustration of probabilities. I find it a little misleading because the diagram cannot illustrate the dependence of two events.
A standard Venn diagram (Sorry for poor resolution)
Here is a standard Venn diagram, expressing the probability of me going to Paris \(P(X=1)\) and the probability of Emily going to Paris \(P(Y=1)\). The first question arises. Where is the conditional probability?

The answer is that we cannot show it in the figure: What is P(A|B) in Venn diagram.

The second Venn diagram below shows the situation where Emily and I hate each other so much and there is no chance for both of us to visit Paris the same year.
Venn diagram: we hate each other 
This is mutually exclusive: \(P(X=1, Y=1) = 0\). This demonstrates that being mutually exclusive is NOT the same as being independent. In fact, these two events are very much dependent. If the events were independent, with \(P(X=1) = 0.7\) and \(P(Y=1) = 0.5\), the joint probability would be \(P(X=1, Y=1) = 0.35\). However, the joint probability here is 0, which means these events are dependent!
