Let $X$ and $Y$ be two random variables defined on the same sample space $\Omega$.
In this lecture we discuss how to compute the expected value of $X$ when we observe the realization of $Y$, i.e., when we receive the information that $Y=y$.
The expected value of $X$ conditional on the information that $Y=y$ is called the conditional expectation of $X$ given $Y=y$.
As in the case of the expected value, a completely rigorous definition of
conditional expected value (or conditional expectation) requires some
background in measure theory. To keep things simple, we do not give a
completely rigorous definition in this lecture. Instead, we give an informal
definition and show how conditional expectation is computed.
Definition (informal)
The conditional expected value (or conditional expectation) of a random variable $X$ given $Y=y$ is the weighted average of the values that $X$ can take on, where each possible value is weighted by its respective conditional probability (conditional on the information that $Y=y$).
The expectation of a random variable $X$ conditional on $Y=y$ is denoted by $\mathrm{E}[X \mid Y=y]$ and it is often called the conditional mean of $X$ given $Y=y$.
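In the case in which $[X \ Y]$ is a discrete random vector (as a consequence, $X$ is a discrete random variable), the conditional expectation of $X$ given $Y=y$ is computed as follows:
$$\mathrm{E}[X \mid Y=y] = \sum_{x \in R_X} x \, p_{X \mid Y=y}(x)$$
where $p_{X \mid Y=y}(x)$ is the conditional probability mass function of $X$ given $Y=y$ and $R_X$ is the support of $X$.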
Note that the above summation is not guaranteed to be well-defined, and it can be infinite. If it is not well-defined, $X$ does not possess a conditional expectation (conditional on $Y=y$).
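To see how the summation can fail to produce a finite value, the following minimal sketch uses a hypothetical conditional probability mass function, $p(x) = c/x^2$ on the positive integers with $c = 6/\pi^2$ (an assumption made only for illustration, not taken from the lecture). Each term $x \, p(x)$ equals $c/x$, so the partial sums behave like a harmonic series and keep growing.

```python
import math

# Hypothetical conditional pmf (an illustrative assumption):
# p(x) = C / x**2 for x = 1, 2, 3, ..., with C chosen so that the
# probabilities sum to 1 (since 1 + 1/4 + 1/9 + ... = pi**2 / 6).
C = 6.0 / math.pi ** 2

def partial_sum(n_terms):
    """Sum of x * p(x) over x = 1..n_terms, i.e. C * (1 + 1/2 + ... + 1/n_terms)."""
    return sum(x * (C / x ** 2) for x in range(1, n_terms + 1))

sizes = (10, 1_000, 100_000)
sums = [partial_sum(n) for n in sizes]
for n, s in zip(sizes, sums):
    print(n, round(s, 3))  # the printed values keep increasing with n
```

The partial sums grow roughly like $c \ln n$, so no finite weighted average exists for this choice of probabilities.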
Thus, the formula for computing the conditional expectation of a discrete random variable is a straightforward implementation of the informal definition of conditional expectation given above: the conditional expectation of a discrete variable $X$ given $Y=y$ is the weighted average of the values that $X$ can take on (the elements of $R_X$), where each possible value $x$ is weighted by its respective conditional probability $p_{X \mid Y=y}(x)$ (conditional on $Y=y$).
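The weighted average described above can be sketched in a few lines of Python. The conditional probability mass function below is hypothetical, and the names `cond_pmf` and `weighted_average` are chosen only for this illustration.

```python
from fractions import Fraction

# Hypothetical conditional pmf of the variable given the observed value
# of the conditioning variable: keys are possible values, values are
# their conditional probabilities.
cond_pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}

def weighted_average(pmf):
    """Return the sum of value * probability over the support."""
    if sum(pmf.values()) != 1:
        raise ValueError("conditional probabilities must sum to 1")
    return sum(x * p for x, p in pmf.items())

result = weighted_average(cond_pmf)
print(result)  # 5/4
```

Using `Fraction` keeps the arithmetic exact, which makes it easy to check the result by hand: $0 \cdot \tfrac14 + 1 \cdot \tfrac12 + 3 \cdot \tfrac14 = \tfrac54$.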
Example
Let the support of
be:
and
its joint probability mass function
be:
Let
us compute the conditional probability mass function of
given
.
The support of
is:
The
marginal probability mass function of
evaluated at
is:
The
support of
is:
Thus,
the conditional probability mass function of
given
is:
The conditional expectation of
given
is:
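The same sequence of steps (marginal probability of the conditioning value, conditional probability mass function, weighted average) can be sketched in Python. The joint probability mass function below is hypothetical and is not the one used in the example above; the function names are likewise assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical joint pmf: keys are (x, y) pairs in the joint support.
joint_pmf = {
    (0, 0): Fraction(1, 4),
    (2, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 2),
}

def conditional_pmf_given_y(joint, y):
    """Conditional pmf of the first coordinate given that the second equals y."""
    # Marginal probability of the conditioning value.
    marginal_y = sum(p for (_, yy), p in joint.items() if yy == y)
    if marginal_y == 0:
        raise ValueError("y is not in the support of the conditioning variable")
    cond = {}
    for (x, yy), p in joint.items():
        if yy == y:
            # Joint probability divided by the marginal probability.
            cond[x] = cond.get(x, 0) + p / marginal_y
    return cond

def conditional_expectation(joint, y):
    """Weighted average of the values, weighted by the conditional pmf."""
    return sum(x * p for x, p in conditional_pmf_given_y(joint, y).items())

pmf_given_0 = conditional_pmf_given_y(joint_pmf, 0)
mean_given_0 = conditional_expectation(joint_pmf, 0)
print(pmf_given_0, mean_given_0)
```

With these hypothetical numbers, the marginal probability of the conditioning value 0 is 1/2, the conditional pmf puts probability 1/2 on each of 0 and 2, and the conditional expectation is 1.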
In the next subsections we will explain how to compute the conditional expectation of absolutely continuous random variables and of random variables that are neither discrete nor absolutely continuous. To calculate the conditional expectation of these variables, we will need to compute integrals instead of summations; we can think of these integrals as limiting cases of the summation $\sum_{x \in R_X} x \, p_{X \mid Y=y}(x)$.
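In the case in which $[X \ Y]$ is an absolutely continuous random vector (as a consequence, $X$ is an absolutely continuous random variable), the conditional expectation of $X$ given $Y=y$ is computed as follows:
$$\mathrm{E}[X \mid Y=y] = \int_{-\infty}^{\infty} x \, f_{X \mid Y=y}(x) \, dx$$
where $f_{X \mid Y=y}(x)$ is the conditional probability density function of $X$ given $Y=y$.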
As in the discrete case, the above integral is not guaranteed to be well-defined, and it can be infinite. If it is not well-defined, $X$ does not possess a conditional expectation (conditional on $Y=y$).
As we anticipated in the previous subsection, this integral can be thought of as the limiting case of the summation $\sum_{x \in R_X} x \, p_{X \mid Y=y}(x)$ found in the discrete case, where $f_{X \mid Y=y}(x) \, dx$ is the (infinitesimal) conditional probability of $x$ and $\int$ can be thought of as a summation sign.
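The "limiting case of a summation" reading can be made concrete with a midpoint Riemann sum: split the support into small cells, treat density times cell width as the approximate probability of each cell, and take the weighted average. The sketch below assumes a hypothetical conditional density $f(x) = 2x$ on $[0, 1]$, for which the exact conditional expectation is $2/3$.

```python
def riemann_conditional_mean(density, lower, upper, n_cells):
    """Approximate the integral of x * density(x) by a midpoint Riemann sum."""
    width = (upper - lower) / n_cells
    total = 0.0
    for i in range(n_cells):
        x = lower + (i + 0.5) * width        # midpoint of the i-th cell
        total += x * density(x) * width      # value * (approximate probability)
    return total

def density(x):
    """Hypothetical conditional density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

approximations = [
    riemann_conditional_mean(density, 0.0, 1.0, n) for n in (10, 100, 1000)
]
print(approximations)  # values approach 2/3 as the cells shrink
```

As the number of cells grows, the discrete weighted average gets closer to the value of the integral.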
Example
Let the support of
be:
and
its joint probability density function
be:
Let
us compute the conditional probability density function of
given
.
The support of
is:
When
,
the marginal probability density function of
is
;
when
,
the marginal probability density function
is:
Thus,
the marginal probability density function of
is:
When
evaluated at
,
it
is:
The
support of
is:
Thus,
the conditional probability density function of
given
is:
The conditional expectation of
given
is:
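The same steps (marginal density of the conditioning variable, conditional density, integral of value times conditional density) can be sketched numerically. The joint density below, $f(x, y) = x + y$ on the unit square, is a hypothetical choice and is not the one used in the example above; `integrate` is a simple midpoint rule written for this illustration rather than a library routine.

```python
def joint_density(x, y):
    """Hypothetical joint density: f(x, y) = x + y on [0, 1] x [0, 1]."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0

def integrate(func, lower, upper, n_cells=2000):
    """Midpoint-rule approximation of the integral of func on [lower, upper]."""
    width = (upper - lower) / n_cells
    return sum(func(lower + (i + 0.5) * width) for i in range(n_cells)) * width

def conditional_mean_given_y(y, lower=0.0, upper=1.0):
    """Integral of x * f(x, y) / f_Y(y) over x, with f_Y obtained by integration."""
    marginal_y = integrate(lambda x: joint_density(x, y), lower, upper)
    if marginal_y <= 0.0:
        raise ValueError("the marginal density is zero at this y")
    return integrate(lambda x: x * joint_density(x, y) / marginal_y, lower, upper)

mean_given_half = conditional_mean_given_y(0.5)
print(mean_given_half)  # close to 7/12
```

For this hypothetical density the marginal density at $y = 1/2$ is 1, the conditional density is $x + 1/2$ on $[0, 1]$, and the exact conditional expectation is $1/3 + 1/4 = 7/12$.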
The general formula for computing the conditional expectation of $X$ given $Y=y$ does not require $[X \ Y]$ to be discrete or absolutely continuous; it is applicable to any random vector:
$$\mathrm{E}[X \mid Y=y] = \int_{-\infty}^{\infty} x \, dF_{X \mid Y=y}(x)$$
where the integral is a Riemann-Stieltjes integral and $F_{X \mid Y=y}(x)$ is the conditional distribution function of $X$ given $Y=y$.
As in the two special cases above, this integral is not guaranteed to be well-defined, and it can be infinite. If it is not well-defined, $X$ does not possess a conditional expectation (conditional on $Y=y$).
Again, the above integral can be thought of as the limiting case of the summation $\sum_{x \in R_X} x \, p_{X \mid Y=y}(x)$ found in the discrete case, where $dF_{X \mid Y=y}(x)$ is the conditional probability of $x$ and $\int$ can be thought of as a summation sign.
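A Riemann-Stieltjes sum makes this interpretation explicit: each cell of a partition is weighted by the increment of the conditional distribution function over that cell. The sketch below assumes a hypothetical mixed conditional distribution (neither discrete nor absolutely continuous) with a point mass of 1/2 at 0 and the remaining probability spread uniformly over $(0, 1]$; its exact conditional expectation is $1/4$.

```python
def cond_cdf(x):
    """Hypothetical conditional distribution function.

    Jump of size 0.5 at x = 0, then a uniform increase up to 1 at x = 1.
    """
    if x < 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 0.5 + 0.5 * x

def stieltjes_mean(cdf, lower, upper, n_cells):
    """Approximate the integral of x dF(x) by a Riemann-Stieltjes sum."""
    width = (upper - lower) / n_cells
    total = 0.0
    for i in range(n_cells):
        left = lower + i * width
        right = left + width
        # Value in the cell (right endpoint) times the probability of the cell.
        total += right * (cdf(right) - cdf(left))
    return total

approx = stieltjes_mean(cond_cdf, -1.0, 2.0, 30_000)
print(approx)  # close to 0.25
```

The point mass at 0 and the continuous part are handled by the same sum, which is why the general formula needs no separate treatment of the discrete and absolutely continuous cases.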
The above formula follows the same logic as the formula already introduced in the lecture on the expected value:
$$\mathrm{E}[X] = \int_{-\infty}^{\infty} x \, dF_X(x)$$
with the only difference that the unconditional distribution function $F_X(x)$ has now been replaced with the conditional distribution function $F_{X \mid Y=y}(x)$.
The reader who feels unfamiliar with this formula can go back to the lecture
entitled Expected value to read an intuitive
introduction to the Riemann-Stieltjes integral and its use in probability
theory.
From the above sections, it should be clear that the conditional expectation is computed exactly as the expected value, with the only difference that probabilities and probability densities are replaced by conditional probabilities and conditional probability densities. Therefore, the properties enjoyed by the expected value are, in general, also enjoyed by the conditional expectation. For example, two important properties enjoyed by the expected value that are also enjoyed by the conditional expectation are: