
Covariance

by Marco Taboga, PhD

Covariance is a measure of association between two random variables.


Definition

Let us start with a definition of covariance.

Definition The covariance between two random variables X and Y, denoted by $\operatorname{Cov}[X,Y]$, is defined as $$\operatorname{Cov}[X,Y]=\operatorname{E}\left[\left(X-\operatorname{E}[X]\right)\left(Y-\operatorname{E}[Y]\right)\right]$$ provided the above expected values exist and are well-defined.

Understanding the definition

In order to better understand the definition of covariance, let us analyze how it is constructed.

Covariance is the expected value of the product $\overline{X}\,\overline{Y}$, where $\overline{X}$ and $\overline{Y}$ are defined as follows: $$\overline{X}=X-\operatorname{E}[X],\qquad \overline{Y}=Y-\operatorname{E}[Y]$$ $\overline{X}$ and $\overline{Y}$ are the deviations of X and Y from their respective means.

When $\overline{X}\,\overline{Y}$ is positive, it means that either both deviations are positive (X and Y are both above their respective means) or both deviations are negative (X and Y are both below their respective means).

On the contrary, when $\overline{X}\,\overline{Y}$ is negative, it means that the two deviations have opposite signs (one variable is above its mean while the other is below its mean).

In other words, when $\overline{X}\,\overline{Y}$ is positive, X and Y are concordant (their deviations from the mean have the same sign); when $\overline{X}\,\overline{Y}$ is negative, X and Y are discordant (their deviations from the mean have opposite signs).

Thus, the product $\overline{X}\,\overline{Y}$ can be interpreted as a measure of the similarity between the deviations $\overline{X}$ and $\overline{Y}$. As a consequence, the covariance $$\operatorname{Cov}[X,Y]=\operatorname{E}\left[\overline{X}\,\overline{Y}\right]$$ tells us how similar the deviations of the two variables from their respective means are on average. Intuitively, we could express the concept as follows: covariance is the average degree to which the two variables deviate from their respective means in the same direction.

When $\operatorname{Cov}[X,Y]=0$, X and Y do not display any of the above two tendencies.

An equivalent definition

The covariance between two random variables can also be defined by the formula $$\operatorname{Cov}[X,Y]=\operatorname{E}[XY]-\operatorname{E}[X]\operatorname{E}[Y]$$ which is equivalent to the formula in the definition above.

Proof

The equivalence of the two definitions is proved as follows: $$\begin{aligned}\operatorname{Cov}[X,Y] &= \operatorname{E}\left[\left(X-\operatorname{E}[X]\right)\left(Y-\operatorname{E}[Y]\right)\right] \\ &= \operatorname{E}\left[XY - X\operatorname{E}[Y] - \operatorname{E}[X]Y + \operatorname{E}[X]\operatorname{E}[Y]\right] \\ &= \operatorname{E}[XY] - \operatorname{E}[X]\operatorname{E}[Y] - \operatorname{E}[X]\operatorname{E}[Y] + \operatorname{E}[X]\operatorname{E}[Y] \\ &= \operatorname{E}[XY] - \operatorname{E}[X]\operatorname{E}[Y]\end{aligned}$$

It is easy to see from this formula that the covariance between X and Y exists and is well-defined only as long as the expected values $\operatorname{E}[XY]$, $\operatorname{E}[X]$ and $\operatorname{E}[Y]$ exist and are well-defined.

This formula is of great practical relevance, and it is used very often in these lectures. It will often be referred to as the covariance formula.
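
To make the equivalence concrete, here is a minimal Python sketch that computes the covariance of a small discrete joint distribution both with the definitional formula and with the covariance formula; the four support points and their probabilities are invented purely for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction as F

# A small joint probability mass function, chosen only for illustration:
# P(X = x, Y = y) for the four points below (the probabilities sum to 1).
joint_pmf = {
    (0, 0): F(1, 4),
    (0, 1): F(1, 4),
    (1, 0): F(1, 6),
    (1, 1): F(1, 3),
}

def expectation(g):
    """E[g(X, Y)] computed directly from the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in joint_pmf.items())

mean_x = expectation(lambda x, y: x)
mean_y = expectation(lambda x, y: y)

# Definition: expected value of the product of the deviations from the means.
cov_definition = expectation(lambda x, y: (x - mean_x) * (y - mean_y))

# Covariance formula: E[XY] - E[X]E[Y].
cov_formula = expectation(lambda x, y: x * y) - mean_x * mean_y

print(cov_definition, cov_formula)   # both print 1/24
assert cov_definition == cov_formula
```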

Example

The following example shows how to compute the covariance between two discrete random variables.

Example Let X be a $2\times 1$ random vector and denote its components by X_1 and X_2. Let the support of X be [eq18]and its joint probability mass function be[eq19]The support of X_1 is[eq20]and its marginal probability mass function is[eq21]The expected value of X_1 is[eq22]The support of X_2 is[eq23]and its marginal probability mass function is[eq24]The expected value of X_2 is[eq25]Using the transformation theorem, we can compute the expected value of $X_{1}X_{2}$:[eq26]Hence, the covariance between X_1 and X_2 is[eq27]
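
The same steps can be scripted. The Python sketch below mirrors the procedure of the example (marginals, expected values, transformation theorem, covariance formula) for a hypothetical joint probability mass function; the support and probabilities are made up and are not those of the example above.

```python
from fractions import Fraction as F
from collections import defaultdict

# Hypothetical joint pmf of a 2x1 discrete random vector X = (X1, X2);
# the support and probabilities below are invented for illustration only.
joint_pmf = {
    (1, 1): F(1, 3),
    (1, 2): F(1, 6),
    (2, 1): F(1, 6),
    (2, 2): F(1, 3),
}

# Marginal pmfs, obtained by summing the joint pmf over the other component.
marginal_x1, marginal_x2 = defaultdict(F), defaultdict(F)
for (x1, x2), p in joint_pmf.items():
    marginal_x1[x1] += p
    marginal_x2[x2] += p

# Expected values of X1 and X2, computed from their marginal pmfs.
mean_x1 = sum(x1 * p for x1, p in marginal_x1.items())
mean_x2 = sum(x2 * p for x2, p in marginal_x2.items())

# Transformation theorem: E[X1*X2] is a weighted sum over the joint support.
mean_x1x2 = sum(x1 * x2 * p for (x1, x2), p in joint_pmf.items())

# Covariance formula: Cov[X1, X2] = E[X1*X2] - E[X1]E[X2].
covariance = mean_x1x2 - mean_x1 * mean_x2
print(mean_x1, mean_x2, mean_x1x2, covariance)   # 3/2 3/2 7/3 1/12
```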

More examples, including examples of how to compute the covariance between two continuous random variables, can be found in the solved exercises at the bottom of this page.

More details

The following subsections contain more details on covariance.

Covariance of a random variable with itself

Let X be a random variable. Then $$\operatorname{Cov}[X,X]=\operatorname{Var}[X]$$

Proof

This follows from the definitions of covariance and variance: $$\operatorname{Cov}[X,X]=\operatorname{E}\left[\left(X-\operatorname{E}[X]\right)\left(X-\operatorname{E}[X]\right)\right]=\operatorname{E}\left[\left(X-\operatorname{E}[X]\right)^{2}\right]=\operatorname{Var}[X]$$

Symmetry

The covariance operator is symmetric: $$\operatorname{Cov}[X,Y]=\operatorname{Cov}[Y,X]$$

Proof

By the definition of covariance, we have $$\operatorname{Cov}[X,Y]=\operatorname{E}\left[\left(X-\operatorname{E}[X]\right)\left(Y-\operatorname{E}[Y]\right)\right]=\operatorname{E}\left[\left(Y-\operatorname{E}[Y]\right)\left(X-\operatorname{E}[X]\right)\right]=\operatorname{Cov}[Y,X]$$

Variance of the sum of two random variables

Let X_1 and X_2 be two random variables. Then the variance of their sum is $$\operatorname{Var}[X_1+X_2]=\operatorname{Var}[X_1]+\operatorname{Var}[X_2]+2\operatorname{Cov}[X_1,X_2]$$

Proof

The above formula is derived as follows: $$\begin{aligned}\operatorname{Var}[X_1+X_2] &= \operatorname{E}\left[\left(X_1+X_2-\operatorname{E}[X_1+X_2]\right)^{2}\right] \\ &= \operatorname{E}\left[\left(\left(X_1-\operatorname{E}[X_1]\right)+\left(X_2-\operatorname{E}[X_2]\right)\right)^{2}\right] \\ &= \operatorname{E}\left[\left(X_1-\operatorname{E}[X_1]\right)^{2}\right]+\operatorname{E}\left[\left(X_2-\operatorname{E}[X_2]\right)^{2}\right]+2\operatorname{E}\left[\left(X_1-\operatorname{E}[X_1]\right)\left(X_2-\operatorname{E}[X_2]\right)\right] \\ &= \operatorname{Var}[X_1]+\operatorname{Var}[X_2]+2\operatorname{Cov}[X_1,X_2]\end{aligned}$$

Thus, to compute the variance of the sum of two random variables we need to know their covariance.

Obviously, then, the formula $$\operatorname{Var}[X_1+X_2]=\operatorname{Var}[X_1]+\operatorname{Var}[X_2]$$ holds only when X_1 and X_2 have zero covariance.
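
As a quick numerical sanity check, the following Python snippet simulates two correlated variables and verifies the identity on their sample moments; the data-generating choices (normal draws, the 0.5 coefficient) are arbitrary and used only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two correlated samples, generated only to illustrate the identity.
x1 = rng.normal(size=100_000)
x2 = 0.5 * x1 + rng.normal(size=100_000)

# Sample versions of the quantities involved (ddof=0 throughout,
# so that the variance/covariance identity holds exactly).
var_sum = np.var(x1 + x2)
var_x1, var_x2 = np.var(x1), np.var(x2)
cov_x1_x2 = np.cov(x1, x2, ddof=0)[0, 1]

# Var[X1 + X2] = Var[X1] + Var[X2] + 2 Cov[X1, X2]
assert np.isclose(var_sum, var_x1 + var_x2 + 2 * cov_x1_x2)
```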

The formula for the variance of a sum of two random variables can be generalized to sums of more than two random variables (see variance of the sum of n random variables).

Bilinearity of the covariance operator

The covariance operator is linear in both of its arguments. Let X_1, X_2 and Y be three random variables and let a_1 and a_2 be two constants. Then, covariance is linear in its first argument: $$\operatorname{Cov}[a_1X_1+a_2X_2,\,Y]=a_1\operatorname{Cov}[X_1,Y]+a_2\operatorname{Cov}[X_2,Y]$$

Proof

This is proved by using the linearity of the expected value: $$\begin{aligned}\operatorname{Cov}[a_1X_1+a_2X_2,\,Y] &= \operatorname{E}\left[\left(a_1X_1+a_2X_2-\operatorname{E}[a_1X_1+a_2X_2]\right)\left(Y-\operatorname{E}[Y]\right)\right] \\ &= \operatorname{E}\left[\left(a_1\left(X_1-\operatorname{E}[X_1]\right)+a_2\left(X_2-\operatorname{E}[X_2]\right)\right)\left(Y-\operatorname{E}[Y]\right)\right] \\ &= a_1\operatorname{E}\left[\left(X_1-\operatorname{E}[X_1]\right)\left(Y-\operatorname{E}[Y]\right)\right]+a_2\operatorname{E}\left[\left(X_2-\operatorname{E}[X_2]\right)\left(Y-\operatorname{E}[Y]\right)\right] \\ &= a_1\operatorname{Cov}[X_1,Y]+a_2\operatorname{Cov}[X_2,Y]\end{aligned}$$

By symmetry, covariance is also linear in its second argument: $$\operatorname{Cov}[Y,\,a_1X_1+a_2X_2]=a_1\operatorname{Cov}[Y,X_1]+a_2\operatorname{Cov}[Y,X_2]$$

Linearity in both the first and second argument is called bilinearity.

By iteratively applying the above arguments, one can prove that bilinearity also holds for linear combinations of more than two variables: $$\operatorname{Cov}\left[\sum_{i=1}^{n}a_iX_i,\ \sum_{j=1}^{m}b_jY_j\right]=\sum_{i=1}^{n}\sum_{j=1}^{m}a_ib_j\operatorname{Cov}[X_i,Y_j]$$
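
The following short sketch illustrates bilinearity numerically with simulated data and arbitrary constants a_1 and a_2; the identity holds exactly for sample covariances computed with a matching normalization, so the check passes up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Simulated data and arbitrary constants, only to illustrate bilinearity.
x1, x2, y = rng.normal(size=(3, n))
a1, a2 = 2.0, -3.0

def cov(u, v):
    """Sample covariance (ddof=0), for which bilinearity holds exactly."""
    return np.cov(u, v, ddof=0)[0, 1]

# Cov[a1*X1 + a2*X2, Y] = a1*Cov[X1, Y] + a2*Cov[X2, Y]
lhs = cov(a1 * x1 + a2 * x2, y)
rhs = a1 * cov(x1, y) + a2 * cov(x2, y)
assert np.isclose(lhs, rhs)
```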

Variance of the sum of n random variables

The variance of the sum of n random variables $X_1,\dots,X_n$ is $$\operatorname{Var}\left[\sum_{i=1}^{n}X_i\right]=\sum_{i=1}^{n}\operatorname{Var}[X_i]+2\sum_{i=1}^{n}\sum_{j=i+1}^{n}\operatorname{Cov}[X_i,X_j]$$

Proof

This is demonstrated using the bilinearity of the covariance operator (see above): $$\begin{aligned}\operatorname{Var}\left[\sum_{i=1}^{n}X_i\right] &= \operatorname{Cov}\left[\sum_{i=1}^{n}X_i,\ \sum_{j=1}^{n}X_j\right] \\ &= \sum_{i=1}^{n}\sum_{j=1}^{n}\operatorname{Cov}[X_i,X_j] \\ &= \sum_{i=1}^{n}\operatorname{Var}[X_i]+2\sum_{i=1}^{n}\sum_{j=i+1}^{n}\operatorname{Cov}[X_i,X_j]\end{aligned}$$ where the last equality follows from the facts that $\operatorname{Cov}[X_i,X_i]=\operatorname{Var}[X_i]$ and $\operatorname{Cov}[X_i,X_j]=\operatorname{Cov}[X_j,X_i]$.

This formula implies that when all the random variables in the sum have zero covariance with each other, then the variance of the sum is just the sum of the variances: $$\operatorname{Var}\left[\sum_{i=1}^{n}X_i\right]=\sum_{i=1}^{n}\operatorname{Var}[X_i]$$ This is true, for example, when the random variables in the sum are mutually independent (because independence implies zero covariance).
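
The sketch below checks the n-variable formula numerically: the variance of the sum equals the sum of all entries of the covariance matrix, that is, the sum of the variances plus twice the sum of the pairwise covariances. The five simulated variables and the mixing matrix used to correlate them are arbitrary choices made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n_vars, n_obs = 5, 50_000

# Simulated, mutually correlated variables (one per row of X);
# the mixing matrix A is arbitrary and only used to induce correlation.
A = rng.normal(size=(n_vars, n_vars))
X = A @ rng.normal(size=(n_vars, n_obs))

# Covariance matrix of the five variables (ddof=0 to match np.var below).
C = np.cov(X, ddof=0)

# Var[sum_i X_i] equals the sum of all entries of the covariance matrix,
# i.e. the sum of the variances plus twice the sum of the covariances.
var_of_sum = np.var(X.sum(axis=0))
assert np.isclose(var_of_sum, C.sum())
```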

Solved exercises

Below you can find some exercises with explained solutions.

Exercise 1

Let X be a $2\times 1$ discrete random vector and denote its components by X_1 and X_2. Let the support of X be [eq42]and its joint probability mass function be[eq43]

Compute the covariance between X_1 and X_2.

Solution

The support of X_1 is[eq44]and its marginal probability mass function is[eq45]The expected value of X_1 is[eq46]The support of X_2 is[eq47]and its marginal probability mass function is[eq48]The expected value of X_2 is[eq49]By using the transformation theorem, we can compute the expected value of $X_{1}X_{2}$:[eq50]Hence, the covariance between X_1 and X_2 is[eq51]

Exercise 2

Let X be a $2\times 1$ discrete random vector and denote its entries by X_1 and X_2. Let the support of X be[eq52]and its joint probability mass function be[eq53]

Compute the covariance between X_1 and X_2.

Solution

The support of X_1 is[eq54]and its marginal probability mass function is[eq55]The mean of X_1 is[eq56]The support of X_2 is[eq57]and its probability mass function is[eq58]The mean of X_2 is[eq59]The expected value of the product $X_{1}X_{2}$ can be derived by using the transformation theorem:[eq60]Therefore, putting the pieces together, we obtain that the covariance between X_1 and X_2 is[eq61]

Exercise 3

Let X and Y be two random variables such that[eq62]

Compute the following covariance:[eq63]

Solution

By the bilinearity of the covariance operator, we have[eq64]

Exercise 4

Let [eq65] be a continuous random vector with support: [eq66]In other words, $R_{XY}$ is the set of all couples $\left( x,y\right)$ such that [eq67] and [eq68]. Let the joint probability density function of [eq65] be[eq70]Compute the covariance between X and Y.

Solution

The support of X is[eq71]thus, when [eq72], the marginal probability density function of X is 0, while, when [eq73], the marginal probability density function of X is[eq74]Therefore, the marginal probability density function of X is[eq75]The expected value of X is[eq76]The support of Y is[eq77]When [eq78], the marginal probability density function of Y is 0, while, when [eq79], the marginal probability density function of Y is[eq80]Therefore, the marginal probability density function of Y is[eq81]The expected value of Y is:[eq82]The expected value of the product $XY$ can be computed thanks to the transformation theorem:[eq83]Hence, by the covariance formula, the covariance between X and Y is[eq84]
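
For continuous random vectors, the same computations can also be checked symbolically. The sketch below uses SymPy on a hypothetical joint density, f(x, y) = x + y on the unit square (not the density of this exercise), and applies the transformation theorem and the covariance formula.

```python
import sympy as sp

x, y = sp.symbols('x y')

# Hypothetical joint density on the unit square (not the one in the
# exercise above): f(x, y) = x + y for 0 <= x <= 1, 0 <= y <= 1.
f = x + y

# Sanity check: the density integrates to 1 over its support.
assert sp.integrate(f, (x, 0, 1), (y, 0, 1)) == 1

# Expected values via the transformation theorem (double integrals).
mean_x = sp.integrate(x * f, (x, 0, 1), (y, 0, 1))        # 7/12
mean_y = sp.integrate(y * f, (x, 0, 1), (y, 0, 1))        # 7/12
mean_xy = sp.integrate(x * y * f, (x, 0, 1), (y, 0, 1))   # 1/3

# Covariance formula: Cov[X, Y] = E[XY] - E[X]E[Y].
cov_xy = sp.simplify(mean_xy - mean_x * mean_y)
print(cov_xy)   # -1/144
```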

Exercise 5

Let [eq65] be a continuous random vector with support [eq86]and let its joint probability density function be[eq87]Compute the covariance between X and Y.

Solution

The support of Y is[eq88]When [eq89], the marginal probability density function of Y is 0, while, when [eq90], the marginal probability density function of Y is[eq91]By putting pieces together, we have that the marginal probability density function of Y is[eq92]The expected value of Y is[eq93]The support of X is[eq94]When [eq95], the marginal probability density function of X is 0, while, when [eq96], the marginal probability density function of X is:[eq97]We do not explicitly compute the integral, but we write the marginal probability density function of X as follows:[eq98]The expected value of X is[eq99]The expected value of the product $XY$ can be computed thanks to the transformation theorem:[eq100]Hence, the covariance formula gives[eq101]

Exercise 6

Let X and Y be two random variables such that[eq102]

Compute the following covariance:[eq103]

Solution

By the bilinearity of the covariance operator, we have that[eq104]

How to cite

Please cite as:

Taboga, Marco (2021). "Covariance", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-probability/covariance.
