# Conditional probability with respect to a sigma-algebra

In this lecture we provide a rigorous definition of conditional probability, based on sigma-algebras.

The main advantage of this definition over the more elementary one is that it allows us to condition on zero-probability events.

## Motivation

In the introduction to conditional probability, we have stated a number of properties that conditional probabilities should satisfy to be rational in some sense.

We have proved that, whenever $P(G)>0$, these properties are satisfied if and only if

$$P(E\mid G)=\frac{P(E\cap G)}{P(G)}.$$

However, we have not been able to derive a formula for probabilities conditional on zero-probability events. In other words, we have not been able to find a way to compute $P(E\mid G)$ when $P(G)=0$.

Thus, we have concluded that the above elementary formula cannot be taken as a general definition of conditional probability because it does not cover zero-probability events.

We now discuss a completely general definition of conditional probability, which also covers the case in which $P(G)=0$. The resulting concept is called conditional probability with respect to a sigma-algebra.

The plan of the lecture is as follows:

1. We define the concept of a partition of events.

2. We show that, given a partition of events, conditional probability can be regarded as a random variable (probability conditional on a partition).

3. We show that, when no zero-probability events are involved, probabilities conditional on a partition satisfy a certain property (the fundamental property of conditional probability).

4. We require that the fundamental property of conditional probability be satisfied also when zero-probability events are involved. We show that this requirement is sufficient to unambiguously pin down probabilities conditional on a partition; therefore, it can be used to provide a completely general definition of conditional probability.

5. Finally, we discuss how to replace partitions with sigma-algebras.

## Partitions of events

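Let $\Omega$ be a sample space and let $P(E)$ denote the probability assigned to the events $E\subseteq\Omega$.

Define a partition of events of $\Omega$ as follows.

Definition Let $\mathcal{G}$ be a collection of non-empty subsets of $\Omega$. $\mathcal{G}$ is called a partition of events of $\Omega$ if

1. all subsets $G\in\mathcal{G}$ are events;

2. if $G,F\in\mathcal{G}$ then either $G=F$ or $G\cap F=\emptyset$;

3. $\bigcup_{G\in\mathcal{G}}G=\Omega$.

In other words, a partition is a subdivision of $\Omega$ into non-overlapping and non-empty events that cover all of $\Omega$.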

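Example Consider the sample space

$$\Omega=\{1,2,3,4,5,6\},$$

which describes the possible outcomes of the toss of a die. Define two non-empty events $G_1$ and $G_2$ such that every outcome belongs to exactly one of them. Then, $\mathcal{G}=\{G_1,G_2\}$ is a partition of events of $\Omega$. In fact, $G_1\cap G_2=\emptyset$ and $G_1\cup G_2=\Omega$.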
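The three conditions in the definition can be checked mechanically on a finite sample space. The following is a minimal sketch, not part of the original lecture: the helper `is_partition` and the two collections of events are illustrative assumptions. On a finite sample space we treat every subset as an event, so condition 1 reduces to checking that each set is contained in the sample space.

```python
from itertools import combinations

def is_partition(blocks, omega):
    """Return True if `blocks` is a partition of the finite set `omega`."""
    blocks = [frozenset(b) for b in blocks]
    non_empty = all(len(b) > 0 for b in blocks)
    inside = all(b <= omega for b in blocks)                  # condition 1 (finite case)
    disjoint = all(a == b or not (a & b)
                   for a, b in combinations(blocks, 2))       # condition 2
    covers = frozenset().union(*blocks) == omega              # condition 3
    return non_empty and inside and disjoint and covers

omega = frozenset(range(1, 7))             # outcomes of the toss of a die
low_high = [{1, 2, 3}, {4, 5, 6}]          # hypothetical choice of events
overlapping = [{1, 2, 3}, {3, 4, 5, 6}]    # outcome 3 belongs to both sets

print(is_partition(low_high, omega))       # True
print(is_partition(overlapping, omega))    # False
```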

## Probability conditional on a partition

Take a finite partition $\mathcal{G}=\{G_1,\ldots,G_n\}$ of events of $\Omega$, such that $P(G_i)>0$ for every $i$.

Suppose that at a certain time in the future we will be told that the realized outcome belongs to a set $G_i\in\mathcal{G}$.

After receiving this information, we will update our assessment of the probability of an event $E$, by computing its conditional probability

$$P(E\mid G_i)=\frac{P(E\cap G_i)}{P(G_i)}.$$

Before receiving the information, this conditional probability is unknown and can be regarded as a random variable, denoted by $P(E\mid\mathcal{G})$ and defined as follows:

$$P(E\mid\mathcal{G})(\omega)=P(E\mid G_i)\quad\text{if }\omega\in G_i.$$

The random variable $P(E\mid\mathcal{G})$ is the probability of $E$ conditional on the partition $\mathcal{G}$.

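Example Let us continue with the example above, where the sample space is

$$\Omega=\{1,2,3,4,5,6\}.$$

We consider a partition $\mathcal{G}=\{G_1,G_2\}$ of $\Omega$ into two events. We assign equal probability to all the outcomes:

$$P(\{\omega\})=\frac{1}{6}\quad\text{for }\omega=1,\ldots,6.$$

We would like to analyze the conditional probability of an event $E$. We have

$$P(E\mid G_1)=\frac{P(E\cap G_1)}{P(G_1)},\qquad P(E\mid G_2)=\frac{P(E\cap G_2)}{P(G_2)}.$$

The conditional probability $P(E\mid\mathcal{G})$ is a discrete random variable defined as follows:

$$P(E\mid\mathcal{G})(\omega)=\begin{cases}P(E\mid G_1)&\text{if }\omega\in G_1\\P(E\mid G_2)&\text{if }\omega\in G_2.\end{cases}$$

Since $P(E\mid\mathcal{G})$ takes the value $P(E\mid G_i)$ exactly when $\omega\in G_i$, which happens with probability $P(G_i)$, the probability mass function of $P(E\mid\mathcal{G})$ is

$$p(x)=\sum_{i\,:\,P(E\mid G_i)=x}P(G_i).$$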
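To make the construction concrete, here is a small numerical sketch for the fair die. The partition and the event $E$ used below are hypothetical choices made only for illustration (they are not taken from the lecture), and the helper names `P`, `conditional_on_partition` and `pmf` are likewise assumptions. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

prob = {w: Fraction(1, 6) for w in range(1, 7)}   # equally likely outcomes

def P(event):
    """Probability of an event (a set of outcomes)."""
    return sum((prob[w] for w in event), Fraction(0))

def conditional_on_partition(E, partition):
    """Map each outcome w to P(E | G_i), where G_i is the set containing w."""
    rv = {}
    for G in partition:
        value = P(E & G) / P(G)        # requires P(G) > 0
        for w in G:
            rv[w] = value
    return rv

def pmf(rv):
    """Probability mass function of a random variable given as outcome -> value."""
    out = {}
    for w, x in rv.items():
        out[x] = out.get(x, Fraction(0)) + prob[w]
    return out

partition = [frozenset({1, 2, 3}), frozenset({4, 5, 6})]   # hypothetical partition
E = frozenset({2, 3, 4})                                   # hypothetical event

cond = conditional_on_partition(E, partition)
print(cond[1], cond[6])   # 2/3 1/3
print(pmf(cond))          # each of the two values has probability 1/2
```

With these choices, the random variable equals 2/3 on the first set of the partition and 1/3 on the second, and each value is taken with probability 1/2.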

## The fundamental property of conditional probability

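A property of $P(E\mid\mathcal{G})$ is that its expected value equals the unconditional probability $P(E)$:

$$\mathrm{E}[P(E\mid\mathcal{G})]=P(E).$$

Proof

This is proved as follows, where $G_1,\ldots,G_n$ are the sets of the partition $\mathcal{G}$:

$$\begin{aligned}
\mathrm{E}[P(E\mid\mathcal{G})]&=\sum_{i=1}^{n}P(E\mid G_i)\,P(G_i)\\
&=\sum_{i=1}^{n}P(E\cap G_i)\\
&=P\Big(\bigcup_{i=1}^{n}(E\cap G_i)\Big)\\
&=P(E\cap\Omega)\\
&=P(E).
\end{aligned}$$

The third equality holds because the events $E\cap G_i$ are disjoint, so their probabilities add up.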

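Example In the previous example, $P(E\mid\mathcal{G})$ takes the value $P(E\mid G_1)$ with probability $P(G_1)$ and the value $P(E\mid G_2)$ with probability $P(G_2)$. It is easy to verify that

$$\mathrm{E}[P(E\mid\mathcal{G})]=P(E\mid G_1)\,P(G_1)+P(E\mid G_2)\,P(G_2)=P(E\cap G_1)+P(E\cap G_2)=P(E).$$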

The property above can be generalized as follows.

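Proposition (Fundamental property) Let $\mathcal{G}=\{G_1,\ldots,G_n\}$ be a finite partition of events of $\Omega$ such that $P(G_i)>0$ for every $i$. Let $H$ be any event obtained as a union of events $G_i\in\mathcal{G}$. Let $1_H$ be the indicator function of $H$. Let $P(E\mid\mathcal{G})$ be defined as above. Then,

$$\mathrm{E}[1_H\,P(E\mid\mathcal{G})]=P(E\cap H).$$

Proof

Without loss of generality, we can assume that $H$ is obtained as the union of the first $k$ ($k\le n$) sets of the partition (we can always rearrange the sets by changing their indices):

$$H=\bigcup_{i=1}^{k}G_i.$$

First note that

$$1_H(\omega)\,P(E\mid\mathcal{G})(\omega)=\begin{cases}P(E\mid G_i)&\text{if }\omega\in G_i\text{ and }i\le k\\0&\text{otherwise.}\end{cases}$$

The property is proved as follows:

$$\begin{aligned}
\mathrm{E}[1_H\,P(E\mid\mathcal{G})]&=\sum_{i=1}^{k}P(E\mid G_i)\,P(G_i)\\
&=\sum_{i=1}^{k}P(E\cap G_i)\\
&=P\Big(E\cap\bigcup_{i=1}^{k}G_i\Big)\\
&=P(E\cap H).
\end{aligned}$$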
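The fundamental property can also be checked numerically by enumerating every union $H$ of sets of a partition. The sketch below does this for a fair die. The partition, the event $E$ and the helper names (`unions`, `expectation_on`) are assumptions introduced for illustration only.

```python
from fractions import Fraction
from itertools import combinations

prob = {w: Fraction(1, 6) for w in range(1, 7)}                         # fair die
partition = [frozenset({1, 2}), frozenset({3}), frozenset({4, 5, 6})]   # hypothetical
E = frozenset({2, 3, 4})                                                # hypothetical

def P(event):
    return sum((prob[w] for w in event), Fraction(0))

# P(E | partition) as a random variable: outcome -> P(E | G_i)
cond = {w: P(E & G) / P(G) for G in partition for w in G}

def unions(blocks):
    """Yield every union of a sub-collection of blocks (the empty union is the empty set)."""
    for k in range(len(blocks) + 1):
        for chosen in combinations(blocks, k):
            yield frozenset().union(*chosen)

def expectation_on(H, rv):
    """E[1_H * rv] for a random variable given as a dict outcome -> value."""
    return sum((prob[w] * rv[w] for w in H), Fraction(0))

# One check per union H: does E[1_H * P(E | partition)] equal P(E & H)?
checks = {H: expectation_on(H, cond) == P(E & H) for H in unions(partition)}
print(len(checks))            # 8 unions for a partition with three sets
print(all(checks.values()))   # True
```

Taking $H$ equal to the whole sample space recovers the earlier result that the expected value of the conditional probability equals the unconditional probability.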

## The fundamental property as a defining property

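Suppose that we are not able to explicitly define $P(E\mid\mathcal{G})$ because $\mathcal{G}$ contains a zero-probability event $G$ and, therefore, we cannot use the formula

$$P(E\mid G)=\frac{P(E\cap G)}{P(G)}$$

to define $P(E\mid\mathcal{G})(\omega)$ for $\omega\in G$.

When we are not able to explicitly define $P(E\mid\mathcal{G})$, what we can do is to define it implicitly, by requiring that it satisfies the fundamental property of conditional probability

$$\mathrm{E}[1_H\,P(E\mid\mathcal{G})]=P(E\cap H)$$

for all events $H$ obtained as unions of events $G\in\mathcal{G}$.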

How can we be sure that there exists a random variable satisfying this property?

Existence is guaranteed by the following important theorem, that we state without providing a proof.

Proposition Let $\mathcal{G}$ be an arbitrary partition of events of $\Omega$. Let $E$ be an event. Then, there exists at least one random variable $Y$ that satisfies the property

$$\mathrm{E}[1_H\,Y]=P(E\cap H)$$

for all the events $H$ obtained as unions of events $G\in\mathcal{G}$. Furthermore, if $Y_1$ and $Y_2$ are two random variables such that

$$\mathrm{E}[1_H\,Y_1]=\mathrm{E}[1_H\,Y_2]=P(E\cap H)$$

for all $H$, then $Y_1$ and $Y_2$ are almost surely equal.

Thus, a random variable satisfying the fundamental property of conditional probability exists and is unique (up to almost sure equality).

As a consequence, we can indirectly define the probability of an event $E$ conditional on the partition $\mathcal{G}$ as $P(E\mid\mathcal{G})=Y$.

This indirect way of defining conditional probability is summarized as follows.

Definition (Probability conditional on a partition) Let $\mathcal{G}$ be a partition of events of $\Omega$. Let $E$ be an event. The probability of $E$ conditional on the partition $\mathcal{G}$ is a random variable $P(E\mid\mathcal{G})$ that satisfies

$$\mathrm{E}[1_H\,P(E\mid\mathcal{G})]=P(E\cap H)$$

for all events $H$ obtainable as unions of events $G\in\mathcal{G}$.

As we have seen above, such a random variable is guaranteed to exist and is unique up to almost sure equality.

Different random variables satisfying the criterion in the definition are called versions of the conditional probability.
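The idea of versions can be illustrated on a tiny sample space in which one set of the partition has probability zero. The sketch below is an illustration under stated assumptions, not part of the original lecture: the three-outcome space, the probabilities, the event `E` and the candidate random variables are all hypothetical, and we only consider candidates that are constant on each set of the partition.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical sample space {1, 2, 3} in which outcome 3 has probability zero
prob = {1: Fraction(1, 2), 2: Fraction(1, 2), 3: Fraction(0)}
partition = [frozenset({1, 2}), frozenset({3})]
E = frozenset({1, 3})

def P(event):
    return sum((prob[w] for w in event), Fraction(0))

def satisfies_fundamental_property(Y):
    """Check E[1_H * Y] == P(E & H) for every union H of sets of the partition."""
    for k in range(len(partition) + 1):
        for chosen in combinations(partition, k):
            H = frozenset().union(*chosen)
            lhs = sum((prob[w] * Y[w] for w in H), Fraction(0))
            if lhs != P(E & H):
                return False
    return True

# Two candidates that differ only on the zero-probability set {3}
version_a = {1: Fraction(1, 2), 2: Fraction(1, 2), 3: Fraction(0)}
version_b = {1: Fraction(1, 2), 2: Fraction(1, 2), 3: Fraction(1)}
# A candidate with the wrong value on the positive-probability set {1, 2}
wrong = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(0)}

differ = frozenset(w for w in prob if version_a[w] != version_b[w])

print(satisfies_fundamental_property(version_a))   # True
print(satisfies_fundamental_property(version_b))   # True
print(satisfies_fundamental_property(wrong))       # False
print(P(differ))                                   # 0
```

Both `version_a` and `version_b` satisfy the defining property, and the set on which they differ has probability zero, so they are almost surely equal.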

## Conditioning with respect to sigma-algebras

In rigorous probability theory, conditional probability is defined with respect to sigma-algebras, rather than with respect to partitions.

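Let $(\Omega,\mathcal{F},P)$ be a probability space.

Let $\mathcal{I}$ be a sub-sigma-algebra of $\mathcal{F}$ (i.e., $\mathcal{I}$ is a sigma-algebra and $\mathcal{I}\subseteq\mathcal{F}$).

Let $E\in\mathcal{F}$ be an event.

We say that a random variable $P(E\mid\mathcal{I})$ is a conditional probability of $E$ with respect to the sigma-algebra $\mathcal{I}$ if and only if $P(E\mid\mathcal{I})$ is $\mathcal{I}$-measurable and

$$\mathrm{E}[1_H\,P(E\mid\mathcal{I})]=P(E\cap H)\quad\text{for all }H\in\mathcal{I}.$$

It can be shown that this definition is equivalent to our definition of probability conditional on a partition.

In particular, if $\mathcal{G}$ is a partition of events of $\Omega$ and $\mathcal{I}$ is the smallest sigma-algebra containing all the possible unions of events $G\in\mathcal{G}$, then $P(E\mid\mathcal{I})$ is the same as $P(E\mid\mathcal{G})$.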

## How to interpret conditioning with respect to sigma-algebras

Thus, when we see the abstract definition of conditional probability with respect to a sigma-algebra $\mathcal{I}$, we can think about it as follows:

1. there is a partition $\mathcal{G}$ of the sample space;

2. the sigma-algebra $\mathcal{I}$ is the smallest sigma-algebra that contains all the elements of $\mathcal{G}$;

3. at some future time we will be told that the realized outcome belongs to a set $G\in\mathcal{G}$;

4. at that time, we will be able to compute a conditional probability $P(E\mid G)$;

5. until that time, this conditional probability is unknown and it can be regarded as a random variable, denoted by $P(E\mid\mathcal{I})$.

## Finer and coarser sigma-algebras

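Let us get back to our toss-of-a-die example, in which the sample space is

$$\Omega=\{1,2,3,4,5,6\}.$$

Consider a partition $\mathcal{G}_1$ of $\Omega$.

The smallest sigma-algebra containing $\mathcal{G}_1$, which we denote by $\mathcal{I}_1$, is made up of the empty set and of all the unions of sets belonging to $\mathcal{G}_1$.

Now, consider a partition $\mathcal{G}_2$ obtained from $\mathcal{G}_1$ by splitting at least one of its sets into smaller events.

The smallest sigma-algebra containing $\mathcal{G}_2$, which we denote by $\mathcal{I}_2$, is built in the same way from the sets belonging to $\mathcal{G}_2$.

Since $\mathcal{I}_2$ contains all the sets included in $\mathcal{I}_1$ plus some more, we say that $\mathcal{I}_2$ is finer than $\mathcal{I}_1$. Conversely, we say that $\mathcal{I}_1$ is coarser than $\mathcal{I}_2$.

We also say that the conditional probability $P(E\mid\mathcal{I}_2)$ is based on a larger information set as compared to $P(E\mid\mathcal{I}_1)$. In intuitive terms, this means that the information we receive when we are able to actually calculate the conditional probability is more precise.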
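For a finite partition, the smallest sigma-algebra containing it consists of all unions of its sets (including the empty union), so the comparison between a coarser and a finer sigma-algebra can be carried out by direct enumeration. The following sketch uses two hypothetical partitions of the die outcomes, chosen only for illustration, and a helper `generated_sigma_algebra` that is an assumption of this example.

```python
from itertools import combinations

def generated_sigma_algebra(partition):
    """All unions of sets of a finite partition (the empty union gives the empty set)."""
    blocks = [frozenset(b) for b in partition]
    return {
        frozenset().union(*chosen)
        for k in range(len(blocks) + 1)
        for chosen in combinations(blocks, k)
    }

coarse = [{1, 2, 3}, {4, 5, 6}]     # hypothetical partition of the die outcomes
fine = [{1, 2}, {3}, {4, 5, 6}]     # hypothetical refinement: {1, 2, 3} is split in two

sigma_coarse = generated_sigma_algebra(coarse)
sigma_fine = generated_sigma_algebra(fine)

print(len(sigma_coarse), len(sigma_fine))   # 4 8
print(sigma_coarse < sigma_fine)            # True: strict inclusion
```

The strict inclusion says that every set in the coarser sigma-algebra also belongs to the finer one, while the finer one contains additional sets such as `{3}`.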

## Regular conditional probabilities

Until now, we have kept fixed the event $E$ in the conditional probability $P(E\mid\mathcal{I})$. Moreover, we have regarded $P(E\mid\mathcal{I})$ as a random variable, that is, as a function $P(E\mid\mathcal{I})(\omega)$ of the outcome $\omega$.

If we allow $E$ to vary, $P(E\mid\mathcal{I})$ becomes a random probability measure.

Or does it? Mathematicians have found some examples in which, despite a careful choice of the versions of $P(E\mid\mathcal{I})$, it is not possible to simultaneously satisfy the two requirements that

1. $P(E\mid\mathcal{I})(\omega)$, as a function of $\omega$, be $\mathcal{I}$-measurable (i.e., a proper random variable) for each choice of $E$;

2. $P(E\mid\mathcal{I})(\omega)$, as a function of $E$, be a proper probability measure for each choice of $\omega$ (except possibly for some $\omega$ forming a set of measure zero).

When these two requirements can be satisfied, we say that the probability space $(\Omega,\mathcal{F},P)$ admits a regular probability conditional on the sigma-algebra $\mathcal{I}$.
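On a finite sample space with a partition made of positive-probability sets, the second requirement can be verified by brute force. The sketch below is a hedged illustration only: the four-outcome space, the partition and the helper names `kernel` and `is_probability_measure` are assumptions. It fixes an outcome and checks that the conditional probability, seen as a function of the event, is non-negative, assigns probability one to the sample space and is additive on disjoint events.

```python
from fractions import Fraction
from itertools import chain, combinations

outcomes = [1, 2, 3, 4]                               # hypothetical sample space
prob = {w: Fraction(1, 4) for w in outcomes}
partition = [frozenset({1, 2}), frozenset({3, 4})]    # hypothetical partition

def P(event):
    return sum((prob[w] for w in event), Fraction(0))

def kernel(w, E):
    """Conditional probability of E given the set of the partition that contains w."""
    G = next(block for block in partition if w in block)
    return P(E & G) / P(G)

# All 16 events (subsets) of the sample space
events = [frozenset(c) for c in chain.from_iterable(
    combinations(outcomes, k) for k in range(len(outcomes) + 1))]

def is_probability_measure(w):
    """Check that E -> kernel(w, E) is a probability measure on the finite space."""
    total_one = kernel(w, frozenset(outcomes)) == 1
    non_negative = all(kernel(w, E) >= 0 for E in events)
    additive = all(
        kernel(w, A | B) == kernel(w, A) + kernel(w, B)
        for A in events for B in events if not (A & B)
    )
    return total_one and non_negative and additive

print(all(is_probability_measure(w) for w in outcomes))   # True
```

In this finite setting the requirement holds for every outcome. The problematic examples mentioned above arise only in more general probability spaces.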

## Applications

The abstract definition of conditional probability with respect to a sigma-algebra is extremely useful.

One of its most important applications is the derivation of conditional probability density functions for continuous random vectors. To learn about this application, read the next lecture, on Conditional probability distributions.