The concept of expected value of a random variable is one of the most important concepts in probability theory.
The concept was first devised in the 17th century to analyze gambling games and answer questions such as:
how much do I gain - or lose - on average, if I repeatedly play a given gambling game?
how much can I expect to gain - or lose - by making a certain bet?
If the possible outcomes of the game (or the bet) and their associated probabilities are described by a random variable, then these questions can be answered by computing its expected value.
The expected value is a weighted average of the possible realizations of the random variable (the possible outcomes of the game). Each realization is weighted by its probability.
For example, if you play a game where you gain 2 dollars with probability 1/2 and you lose 1 dollar with probability 1/2, then the expected value of the game is half a dollar:
$$2 \cdot \frac{1}{2} + (-1) \cdot \frac{1}{2} = \frac{1}{2}$$
What does this mean? Roughly speaking, it means that if you play this game many times, and the number of times each of the two possible outcomes occurs is proportional to its probability, then on average you gain half a dollar each time you play the game.
For instance, if you play the game 100 times, win 50 times and lose the remaining 50, then your average winning is equal to the expected value:
$$\frac{50 \cdot 2 + 50 \cdot (-1)}{100} = \frac{1}{2}$$
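The arithmetic of this example can be checked with a few lines of Python. This is only a minimal sketch: the variable names are ours, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

# Outcomes of the game from the text: gain 2 dollars or lose 1 dollar,
# each with probability 1/2.
outcomes = {2: Fraction(1, 2), -1: Fraction(1, 2)}

# Expected value: each outcome weighted by its probability.
expected_value = sum(x * p for x, p in outcomes.items())

# Average winning over 100 plays with exactly 50 wins and 50 losses.
average_winning = Fraction(50 * 2 + 50 * (-1), 100)

print(expected_value)   # 1/2
print(average_winning)  # 1/2
```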
In general, giving a rigorous definition of expected value requires quite a heavy mathematical apparatus. To keep things simple, we provide an informal definition of expected value and we discuss its computation in this lecture, while we relegate a more rigorous definition to the (optional) lecture entitled Expected value and the Lebesgue integral.
The following is an informal definition of expected value.
Definition (informal) The expected value of a random variable $X$ is the weighted average of the values that $X$ can take on, where each possible value is weighted by its respective probability.
The expected value of a random variable $X$ is denoted by $\mathrm{E}[X]$ and it is often called the expectation of $X$ or the mean of $X$.
The following sections discuss how the expected value of a random variable is computed.
When $X$ is a discrete random variable having support $R_X$ and probability mass function $p_X(x)$, the formula for computing its expected value is a straightforward implementation of the informal definition given above: the expected value of $X$ is the weighted average of the values that $X$ can take on (the elements of $R_X$), where each possible value $x \in R_X$ is weighted by its respective probability $p_X(x)$.
Definition Let $X$ be a discrete random variable with support $R_X$ and probability mass function $p_X(x)$. The expected value of $X$ is
$$\mathrm{E}[X] = \sum_{x \in R_X} x \, p_X(x)$$
provided that
$$\sum_{x \in R_X} |x| \, p_X(x) < \infty$$
The symbol $\sum_{x \in R_X}$ indicates summation over all the elements of the support $R_X$.
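For example, if the support contains three elements, $R_X = \{x_1, x_2, x_3\}$, then
$$\sum_{x \in R_X} x \, p_X(x) = x_1 \, p_X(x_1) + x_2 \, p_X(x_2) + x_3 \, p_X(x_3)$$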
The requirement that $\sum_{x \in R_X} |x| \, p_X(x) < \infty$ is called absolute summability and ensures that the summation is well-defined also when the support $R_X$ contains infinitely many elements.
When summing infinitely many terms, the order in which you sum them can change the result of the sum. However, if the terms are absolutely summable, then the order in which you sum becomes irrelevant.
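To see why the order can matter, consider a classic series that converges but is not absolutely summable: $1 - 1/2 + 1/3 - 1/4 + \dots$. The sketch below is an illustration of ours, not a probability example. It sums the same terms in two different orders and gets partial sums that settle near two different values (about $\ln 2$ in the natural order and about $\ln 2 / 2$ in the rearranged order).

```python
import math

def natural_order_sum(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in its natural order."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))

def rearranged_sum(n_blocks):
    """Same terms, taken as one positive term followed by two negative ones:
    1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + ..."""
    total = 0.0
    for j in range(1, n_blocks + 1):
        total += 1 / (2 * j - 1) - 1 / (4 * j - 2) - 1 / (4 * j)
    return total

natural = natural_order_sum(200_000)
rearranged = rearranged_sum(200_000)

print(natural, math.log(2))         # both close to 0.693
print(rearranged, math.log(2) / 2)  # both close to 0.347
```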
In the above definition of expected value, the order of the sum $\sum_{x \in R_X} x \, p_X(x)$ is not specified, therefore the requirement of absolute summability is introduced in order to ensure that the expected value is well-defined.
When the absolute summability condition is not satisfied, we say that the expected value of $X$ is not well-defined or that it does not exist.
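The formula for the discrete case translates almost literally into code. The following is a minimal sketch under our own assumptions: the function name `expected_value` and the fair-die example are hypothetical, and the support is finite, so absolute summability holds automatically.

```python
from fractions import Fraction

def expected_value(pmf):
    """Weighted average of the support points of a discrete random variable.

    `pmf` maps each support point x to its probability p(x).
    """
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in pmf.items())

# Hypothetical example: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(expected_value(die))  # 7/2
```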
Example Let be a random variable with support and probability mass functionIts expected value is
When $X$ is a continuous random variable with probability density function $f_X(x)$, the formula for computing its expected value involves an integral, which can be thought of as the limiting case of the summation $\sum_{x \in R_X} x \, p_X(x)$ found in the discrete case above.
Definition Let $X$ be a continuous random variable with probability density function $f_X(x)$. The expected value of $X$ is
$$\mathrm{E}[X] = \int_{-\infty}^{\infty} x \, f_X(x) \, dx$$
provided that
$$\int_{-\infty}^{\infty} |x| \, f_X(x) \, dx < \infty$$
Roughly speaking, this integral is the limiting case of the formula for the expected value of a discrete random variable
$$\mathrm{E}[X] = \sum_{x \in R_X} x \, p_X(x)$$
Here, $p_X(x)$ is replaced by $f_X(x) \, dx$ (the infinitesimal probability of $x$) and the integral sign $\int_{-\infty}^{\infty}$ replaces the summation sign $\sum_{x \in R_X}$.
The requirement that $\int_{-\infty}^{\infty} |x| \, f_X(x) \, dx < \infty$ is called absolute integrability and ensures that the improper integral $\int_{-\infty}^{\infty} x \, f_X(x) \, dx$ is well-defined.
This improper integral is a shorthand for
$$\lim_{t \to -\infty} \int_{t}^{0} x \, f_X(x) \, dx + \lim_{t \to \infty} \int_{0}^{t} x \, f_X(x) \, dx$$
and it is well-defined only if both limits are finite. Absolute integrability guarantees that the latter condition is met and that the expected value is well-defined.
When the absolute integrability condition is not satisfied, we say that the expected value of $X$ is not well-defined or that it does not exist.
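When the integral cannot be solved by hand, it can be approximated numerically. The sketch below uses a plain midpoint rule. Both the function `expected_value_continuous` and the density $f(x) = 2x$ on $[0, 1]$ are hypothetical choices of ours, and the code assumes that the density is zero outside the interval of integration. The exact answer for this density is 2/3.

```python
def expected_value_continuous(density, lower, upper, n=100_000):
    """Approximate the integral of x * density(x) over [lower, upper]
    with the midpoint rule. Assumes the density is zero outside the interval."""
    width = (upper - lower) / n
    total = 0.0
    for i in range(n):
        x = lower + (i + 0.5) * width  # midpoint of the i-th sub-interval
        total += x * density(x) * width
    return total

# Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere.
def triangular_density(x):
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

approx = expected_value_continuous(triangular_density, 0.0, 1.0)
print(round(approx, 4))  # 0.6667
```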
Example Let be a continuous random variable with support and probability density functionwhere . Its expected value is
This section introduces a general formula for computing the expected value of a random variable $X$. The formula, which does not require $X$ to be discrete or continuous and is applicable to any random variable, involves an integral called Riemann-Stieltjes integral. While we briefly discuss this formula for the sake of completeness, no deep understanding of this formula or of the Riemann-Stieltjes integral is required to understand the other lectures.
Definition Let $X$ be a random variable having distribution function $F_X(x)$. The expected value of $X$ is
$$\mathrm{E}[X] = \int_{-\infty}^{\infty} x \, dF_X(x)$$
where the integral is a Riemann-Stieltjes integral and the expected value exists and is well-defined only as long as the integral is well-defined.
Roughly speaking, this integral is the limiting case of the formula for the expected value of a discrete random variable
$$\mathrm{E}[X] = \sum_{x \in R_X} x \, p_X(x)$$
Here $dF_X(x)$ replaces $p_X(x)$ (the probability of $x$) and the integral sign $\int_{-\infty}^{\infty}$ replaces the summation sign $\sum_{x \in R_X}$.
The following section contains a brief and informal introduction to the Riemann-Stieltjes integral and an explanation of the above formula. Less technically oriented readers can safely skip it: when they encounter a Riemann-Stieltjes integral, they can just think of it as a formal notation which allows a unified treatment of discrete and continuous random variables and can be treated as a sum in one case and as an ordinary Riemann integral in the other.
As we have already seen above, the expected value of a discrete random variable is straightforward to compute: the expected value of a discrete variable $X$ is the weighted average of the values that $X$ can take on (the elements of the support $R_X$), where each possible value $x$ is weighted by its respective probability $p_X(x)$:
$$\mathrm{E}[X] = \sum_{x \in R_X} x \, p_X(x)$$
or, written in a slightly different fashion,
$$\mathrm{E}[X] = \sum_{x \in R_X} x \, \mathrm{P}(X = x)$$
When $X$ is not discrete, the above summation does not make sense. However, there is a workaround that allows us to extend the formula to random variables that are not discrete. The workaround entails approximating $X$ with discrete variables that can take on only finitely many values.
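Let $x_0$, $x_1$, ..., $x_n$ be $n+1$ real numbers ($n \in \mathbb{N}$) such that:
$$x_0 < x_1 < \dots < x_n$$
Define a new random variable $X_n$ (function of $X$) as follows:
$$X_n = x_i \quad \text{when } x_{i-1} < X \le x_i, \qquad i = 1, \dots, n$$
with $X_n = 0$ when $X$ falls outside the interval $(x_0, x_n]$.
As the number of points increases, the points become closer and closer (the maximum distance between two successive points tends to zero) and the interval they cover widens, $X_n$ becomes a very good approximation of $X$, until, in the limit, it is indistinguishable from $X$.
The expected value of $X_n$ is easy to compute:
$$\mathrm{E}[X_n] = \sum_{i=1}^{n} x_i \, \mathrm{P}(x_{i-1} < X \le x_i) = \sum_{i=1}^{n} x_i \left[ F_X(x_i) - F_X(x_{i-1}) \right]$$
where $F_X(x)$ is the distribution function of $X$.
The expected value of $X$ is then defined as the limit of $\mathrm{E}[X_n]$ when $n$ tends to infinity (i.e., when the approximation becomes better and better):
$$\mathrm{E}[X] = \lim_{n \to \infty} \sum_{i=1}^{n} x_i \left[ F_X(x_i) - F_X(x_{i-1}) \right]$$
When the latter limit exists and is well-defined, it is called the Riemann-Stieltjes integral of $x$ with respect to $F_X(x)$ and it is indicated as follows:
$$\int_{-\infty}^{\infty} x \, dF_X(x)$$
Roughly speaking, the integral notation $\int_{-\infty}^{\infty}$ can be thought of as a shorthand for $\lim_{n \to \infty} \sum_{i=1}^{n}$ and the differential notation $dF_X(x)$ can be thought of as a shorthand for $F_X(x_i) - F_X(x_{i-1})$.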
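The approximating sums described above, in which each grid point is weighted by the increment of the distribution function over the preceding sub-interval, can be computed directly. The sketch below is only an illustration under our own assumptions: the distribution function is that of an exponential variable with rate 1 (whose expected value is known to be 1), the grid is equally spaced on $[0, 40]$, and the helper names are hypothetical.

```python
import math

def stieltjes_sum(cdf, points):
    """Sum of x_i * [F(x_i) - F(x_{i-1})] over consecutive grid points."""
    return sum(
        x_i * (cdf(x_i) - cdf(x_prev))
        for x_prev, x_i in zip(points, points[1:])
    )

# Hypothetical distribution function: exponential with rate 1.
def exponential_cdf(x):
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def grid(lower, upper, n):
    """n + 1 equally spaced points x_0 < x_1 < ... < x_n."""
    step = (upper - lower) / n
    return [lower + i * step for i in range(n + 1)]

# Finer grids should bring the sum closer to the true expected value, 1.
approximations = {
    n: stieltjes_sum(exponential_cdf, grid(0.0, 40.0, n))
    for n in (10, 100, 10_000)
}
for n, value in approximations.items():
    print(n, value)
```

With a coarse grid the sum is far from 1, and it gets closer as the number of points grows, which is the behaviour the limit in the definition describes.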
If you are not familiar with the Riemann-Stieltjes integral, make sure you also read the lecture entitled Computing the Riemann-Stieltjes integral: some rules, before reading the next example.
Example Let be a random variable with support and distribution functionIts expected value is
A completely general and rigorous definition of expected value is based on the Lebesgue integral. We report it below without further comments. Less technically inclined readers can safely skip it, while interested readers can read more about it in the lecture entitled Expected value and the Lebesgue integral.
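Definition Let $\Omega$ be a sample space, $\mathrm{P}$ a probability measure defined on the events of $\Omega$ and $X$ a random variable defined on $\Omega$. The expected value of $X$ is
$$\mathrm{E}[X] = \int X \, d\mathrm{P}$$
provided $\int X \, d\mathrm{P}$ (the Lebesgue integral of $X$ with respect to $\mathrm{P}$) exists and is well-defined.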
The next sections contain more details about the expected value.
An important property of the expected value, known as the transformation theorem, allows us to easily compute the expected value of a function of a random variable.
Let $X$ be a random variable. Let $g: \mathbb{R} \to \mathbb{R}$ be a real function. Define a new random variable $Y$ as follows:
$$Y = g(X)$$
Then,
$$\mathrm{E}[Y] = \int_{-\infty}^{\infty} g(x) \, dF_X(x)$$
provided the above integral exists.
This is an important property. It says that, if you need to compute the expected value of $Y = g(X)$, you do not need to know the support of $Y$ and its distribution function $F_Y(y)$: you can compute it just by replacing $x$ with $g(x)$ in the formula for the expected value of $X$.
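For discrete random variables the formula becomes
$$\mathrm{E}[Y] = \sum_{x \in R_X} g(x) \, p_X(x)$$
while for continuous random variables it is
$$\mathrm{E}[Y] = \int_{-\infty}^{\infty} g(x) \, f_X(x) \, dx$$
It is possible (albeit non-trivial) to prove that the above two formulae hold also when $X$ is a $K$-dimensional random vector, $g: \mathbb{R}^K \to \mathbb{R}$ is a real function of $K$ variables and $Y = g(X)$.
When $X$ is a discrete random vector and $p_X(x)$ is its joint probability function, then
$$\mathrm{E}[Y] = \sum_{x \in R_X} g(x) \, p_X(x)$$
When $X$ is a continuous random vector and $f_X(x)$ is its joint density function, then
$$\mathrm{E}[Y] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, \dots, x_K) \, f_X(x_1, \dots, x_K) \, dx_1 \dots dx_K$$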
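A small discrete example makes the transformation theorem concrete. In the sketch below, the distribution of $X$ (uniform on $\{-1, 0, 1\}$) and the function $g(x) = x^2$ are hypothetical choices of ours. The expected value of $g(X)$ is computed in two ways: directly from the probability mass function of $X$, and by first deriving the probability mass function of the transformed variable.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical discrete X: uniform on {-1, 0, 1}.
pmf_x = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}

def g(x):
    return x ** 2

# Route 1: transformation theorem, sum of g(x) * p_X(x) over the support of X.
via_theorem = sum(g(x) * p for x, p in pmf_x.items())

# Route 2: first derive the probability mass function of Y = g(X),
# then apply the definition of expected value to Y.
pmf_y = defaultdict(Fraction)
for x, p in pmf_x.items():
    pmf_y[g(x)] += p
via_definition = sum(y * p for y, p in pmf_y.items())

print(via_theorem, via_definition)  # 2/3 2/3
```

The two routes agree, but the first one never needs the distribution of the transformed variable.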
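If $X$ is a random variable and $Y$ is another random variable such that
$$Y = a + bX$$
where $a$ and $b$ are two constants, then the following holds:
$$\mathrm{E}[Y] = a + b \, \mathrm{E}[X]$$
For discrete random variables this is proved as follows:
$$\mathrm{E}[Y] = \sum_{x \in R_X} (a + bx) \, p_X(x) = a \sum_{x \in R_X} p_X(x) + b \sum_{x \in R_X} x \, p_X(x) = a + b \, \mathrm{E}[X]$$
For continuous random variables the proof is
$$\mathrm{E}[Y] = \int_{-\infty}^{\infty} (a + bx) \, f_X(x) \, dx = a \int_{-\infty}^{\infty} f_X(x) \, dx + b \int_{-\infty}^{\infty} x \, f_X(x) \, dx = a + b \, \mathrm{E}[X]$$
In general, the linearity property is a consequence of the transformation theorem and of the fact that the Riemann-Stieltjes integral is a linear operator:
$$\mathrm{E}[Y] = \int_{-\infty}^{\infty} (a + bx) \, dF_X(x) = a \int_{-\infty}^{\infty} dF_X(x) + b \int_{-\infty}^{\infty} x \, dF_X(x) = a + b \, \mathrm{E}[X]$$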
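The linearity property is easy to verify numerically on a small discrete example. The probability mass function and the two constants in the sketch below are hypothetical values of ours, chosen only so that the arithmetic is exact.

```python
from fractions import Fraction

# Hypothetical discrete X and constants a, b.
pmf_x = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
a, b = 3, 5

mean_x = sum(x * p for x, p in pmf_x.items())

# Left-hand side: expected value of a + bX via the transformation theorem.
lhs = sum((a + b * x) * p for x, p in pmf_x.items())
# Right-hand side: a plus b times the expected value of X.
rhs = a + b * mean_x

print(mean_x, lhs, rhs)  # 1 8 8
```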
A stronger linearity property holds, which involves two (or more) random variables. The property can be proved only using the Lebesgue integral (see the lecture entitled Expected value and the Lebesgue integral).
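The property is as follows: let $X_1$ and $X_2$ be two random variables and let $c_1$ and $c_2$ be two constants; then
$$\mathrm{E}[c_1 X_1 + c_2 X_2] = c_1 \, \mathrm{E}[X_1] + c_2 \, \mathrm{E}[X_2]$$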
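The property does not require the two random variables to be independent. The sketch below checks it on a hypothetical joint probability mass function of our own choosing, in which the two variables are clearly dependent; the names `c1`, `c2` and `joint` are ours.

```python
from fractions import Fraction

# Hypothetical joint pmf of two dependent discrete variables.
# Keys are pairs (value of the first variable, value of the second variable).
joint = {
    (0, 0): Fraction(1, 2),
    (1, 1): Fraction(1, 4),
    (1, 2): Fraction(1, 4),
}
c1, c2 = 2, -3

# Expected values of the two variables taken one at a time.
mean_1 = sum(x1 * p for (x1, _), p in joint.items())
mean_2 = sum(x2 * p for (_, x2), p in joint.items())

# Expected value of the linear combination, computed from the joint pmf.
lhs = sum((c1 * x1 + c2 * x2) * p for (x1, x2), p in joint.items())
# Linear combination of the two expected values.
rhs = c1 * mean_1 + c2 * mean_2

print(lhs, rhs)  # -5/4 -5/4
```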
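Let $X$ be a $K$-dimensional random vector and denote its components by $X_1$, ..., $X_K$. The expected value of $X$, denoted by $\mathrm{E}[X]$, is just the vector of the expected values of the $K$ components of $X$. Suppose, for example, that $X$ is a row vector; then
$$\mathrm{E}[X] = \left[ \mathrm{E}[X_1] \ \dots \ \mathrm{E}[X_K] \right]$$
Let $\Sigma$ be a $K \times L$ random matrix, i.e., a $K \times L$ matrix whose entries are random variables. Denote its $(i,j)$-th entry by $\Sigma_{ij}$. The expected value of $\Sigma$, denoted by $\mathrm{E}[\Sigma]$, is just the matrix of the expected values of the entries of $\Sigma$:
$$\mathrm{E}[\Sigma] = \begin{bmatrix} \mathrm{E}[\Sigma_{11}] & \dots & \mathrm{E}[\Sigma_{1L}] \\ \vdots & \ddots & \vdots \\ \mathrm{E}[\Sigma_{K1}] & \dots & \mathrm{E}[\Sigma_{KL}] \end{bmatrix}$$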
Denote the absolute value of a random variable $X$ by $|X|$. If $\mathrm{E}[|X|]$ exists and is finite, we say that $X$ is an integrable random variable, or just that $X$ is integrable.
Let $1 \le p < \infty$. The space of all random variables $X$ such that $\mathrm{E}[|X|^p]$ exists and is finite is denoted by $L^p$ or $L^p(\Omega, \mathcal{F}, \mathrm{P})$, where the triple $(\Omega, \mathcal{F}, \mathrm{P})$ makes the dependence on the underlying probability space explicit.
If $X$ belongs to $L^p$, we write $X \in L^p$.
Hence, if $X$ is integrable, we write $X \in L^1$.
The following lectures contain more material about the expected value.
Introduces the conditional version of the expected value operator
Properties of the expected value
Statements, proofs and examples of the main properties of the expected value operator
Expected value and the Lebesgue integral
Provides a rigorous definition of expected value, based on the Lebesgue integral
Some solved exercises on expected value can be found below.
Let $X$ be a discrete random variable. Let its support be
Let its probability mass function be
Compute the expected value of $X$.
Since $X$ is discrete, its expected value is computed as a sum over the support of $X$:
Let $X$ be a discrete variable with support
and probability mass function
Compute its expected value.
Since $X$ is discrete, its expected value is computed as a sum over the support of $X$:
Let $X$ be a discrete variable. Let its support be
Let its probability mass function be
Compute the expected value of $X$.
Since $X$ is discrete, its expected value is computed as a sum over the support of $X$:
Let $X$ be a continuous random variable with uniform distribution on the interval .
Its support is
Its probability density function is
Compute the expected value of $X$.
Since $X$ is continuous, its expected value can be computed as an integral:
Note that the trick is to: 1) subdivide the interval of integration to isolate the sub-intervals where the density is zero; 2) split up the integral among the various sub-intervals.
Let $X$ be a continuous random variable. Its support is
Its probability density function is
Compute the expected value of $X$.
Since $X$ is continuous, its expected value can be computed as an integral:
Let $X$ be a continuous random variable. Its support is
Its probability density function is
Compute the expected value of $X$.
Since $X$ is continuous, its expected value can be computed as an integral: