Statlect - The Digital Textbook

# Linear correlation

Let $X$ and $Y$ be two random variables. The linear correlation coefficient (or Pearson's correlation coefficient) between $X$ and $Y$, denoted by $\operatorname{Corr}[X,Y]$ or by $\rho_{XY}$, is defined as follows:$$\operatorname{Corr}[X,Y]=\frac{\operatorname{Cov}[X,Y]}{\operatorname{std}[X]\,\operatorname{std}[Y]}$$where $\operatorname{Cov}[X,Y]$ is the covariance between $X$ and $Y$, and $\operatorname{std}[X]$ and $\operatorname{std}[Y]$ are the standard deviations of $X$ and $Y$. Of course, the linear correlation coefficient is well-defined only as long as $\operatorname{Cov}[X,Y]$, $\operatorname{std}[X]$ and $\operatorname{std}[Y]$ exist. Moreover, while the ratio is well-defined only if $\operatorname{std}[X]$ and $\operatorname{std}[Y]$ are strictly greater than zero, it is often assumed that $\operatorname{Corr}[X,Y]=0$ when one of the two standard deviations is zero. This is equivalent to assuming that $\operatorname{Corr}[X,Y]=0$ because $\operatorname{Cov}[X,Y]=0$ when one of the two standard deviations is zero.
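As a concrete illustration, the formula can be mirrored on sample data. The sketch below (a hypothetical helper, not part of the lecture) computes the sample analogue of the coefficient and applies the convention of returning zero when one of the standard deviations is zero:

```python
import math

def pearson_corr(x, y):
    """Sample analogue of Corr[X, Y] = Cov[X, Y] / (std[X] std[Y])."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    if sx == 0 or sy == 0:
        return 0.0  # convention: correlation is set to 0 when a std dev is 0
    return cov / (sx * sy)

print(pearson_corr([1, 2, 3, 4], [2, 4, 6, 8]))  # ≈ 1.0: y is a linear function of x
print(pearson_corr([1, 1, 1, 1], [2, 4, 6, 8]))  # → 0.0 by the zero-std convention
```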

## Interpretation

Linear correlation is a measure of dependence (or association) between two random variables. Its interpretation is similar to the interpretation of covariance (see the lecture entitled Covariance for a detailed explanation).

The correlation between $X$ and $Y$ provides a measure of the degree to which $X$ and $Y$ tend to "move together": $\operatorname{Corr}[X,Y]>0$ indicates that deviations of $X$ and $Y$ from their respective means tend to have the same sign; $\operatorname{Corr}[X,Y]<0$ indicates that deviations of $X$ and $Y$ from their respective means tend to have opposite signs; when $\operatorname{Corr}[X,Y]=0$, $X$ and $Y$ do not display either of these two tendencies.

Linear correlation has the property of being bounded between $-1$ and $1$:$$-1\leq \operatorname{Corr}[X,Y]\leq 1$$

Thanks to this property, correlation makes it easy to gauge the intensity of the linear dependence between two random variables: the closer the correlation is to $1$, the stronger the positive linear dependence between $X$ and $Y$ (and the closer it is to $-1$, the stronger the negative linear dependence between $X$ and $Y$).

## Terminology

The following terminology is often used:

1. If $\operatorname{Corr}[X,Y]>0$, then $X$ and $Y$ are said to be positively linearly correlated (or simply positively correlated).

2. If $\operatorname{Corr}[X,Y]<0$, then $X$ and $Y$ are said to be negatively linearly correlated (or simply negatively correlated).

3. If $\operatorname{Corr}[X,Y]\neq 0$, then $X$ and $Y$ are said to be linearly correlated (or simply correlated).

4. If $\operatorname{Corr}[X,Y]=0$, then $X$ and $Y$ are said to be uncorrelated. Also note that $\operatorname{Corr}[X,Y]=0$ if and only if $\operatorname{Cov}[X,Y]=0$; therefore, two random variables $X$ and $Y$ are uncorrelated whenever $\operatorname{Cov}[X,Y]=0$.

## Example

The following example shows how to compute the coefficient of linear correlation between two discrete random variables.

Example. Let a discrete random vector have two components, denoted by $X$ and $Y$, with a given support and joint probability mass function $p_{XY}(x,y)$. The support of $X$ and its marginal probability mass function are obtained from the joint distribution by summing the joint pmf over the values of $Y$. From the marginal we compute the expected value $\operatorname{E}[X]$, the second moment $\operatorname{E}[X^{2}]$, the variance$$\operatorname{Var}[X]=\operatorname{E}[X^{2}]-\operatorname{E}[X]^{2}$$and the standard deviation$$\operatorname{std}[X]=\sqrt{\operatorname{Var}[X]}$$The same quantities are computed for $Y$. Using the transformation theorem, we can compute the expected value of the product:$$\operatorname{E}[XY]=\sum_{(x,y)}xy\,p_{XY}(x,y)$$Hence, the covariance between $X$ and $Y$ is$$\operatorname{Cov}[X,Y]=\operatorname{E}[XY]-\operatorname{E}[X]\operatorname{E}[Y]$$and the linear correlation coefficient is$$\operatorname{Corr}[X,Y]=\frac{\operatorname{Cov}[X,Y]}{\operatorname{std}[X]\,\operatorname{std}[Y]}$$
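The stepwise computation above can be sketched numerically. The joint probability mass function below is hypothetical (chosen only for illustration, not the one used in the example):

```python
import math

# Hypothetical joint pmf of (X, Y); illustrative values only.
joint_pmf = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

ex  = sum(x * p for (x, y), p in joint_pmf.items())        # E[X]
ey  = sum(y * p for (x, y), p in joint_pmf.items())        # E[Y]
ex2 = sum(x * x * p for (x, y), p in joint_pmf.items())    # E[X^2]
ey2 = sum(y * y * p for (x, y), p in joint_pmf.items())    # E[Y^2]
exy = sum(x * y * p for (x, y), p in joint_pmf.items())    # E[XY], transformation theorem

cov = exy - ex * ey                    # covariance formula
std_x = math.sqrt(ex2 - ex ** 2)
std_y = math.sqrt(ey2 - ey ** 2)
rho = cov / (std_x * std_y)
print(round(rho, 6))  # → 0.6
```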

## More details

The following sections contain more details about the linear correlation coefficient.

### Correlation of a random variable with itself

Let $X$ be a random variable. Then$$\operatorname{Corr}[X,X]=1$$

Proof

This is proved as follows:$$\operatorname{Corr}[X,X]=\frac{\operatorname{Cov}[X,X]}{\operatorname{std}[X]\,\operatorname{std}[X]}=\frac{\operatorname{Var}[X]}{\operatorname{Var}[X]}=1$$where we have used the fact that $\operatorname{Cov}[X,X]=\operatorname{Var}[X]$ and that $\operatorname{std}[X]^{2}=\operatorname{Var}[X]$.

### Symmetry

The linear correlation coefficient is symmetric:$$\operatorname{Corr}[X,Y]=\operatorname{Corr}[Y,X]$$

Proof

This is proved as follows:$$\operatorname{Corr}[X,Y]=\frac{\operatorname{Cov}[X,Y]}{\operatorname{std}[X]\,\operatorname{std}[Y]}=\frac{\operatorname{Cov}[Y,X]}{\operatorname{std}[Y]\,\operatorname{std}[X]}=\operatorname{Corr}[Y,X]$$where we have used the fact that covariance is symmetric: $\operatorname{Cov}[X,Y]=\operatorname{Cov}[Y,X]$.

## Solved exercises

Below you can find some exercises with explained solutions.

### Exercise 1

Let a discrete random vector have components denoted by $X$ and $Y$, with a given support and joint probability mass function.

Compute the coefficient of linear correlation between $X$ and $Y$.

Solution

The support of $X$ and its marginal probability mass function are obtained from the joint distribution by summing the joint pmf over the values of $Y$. From the marginal we compute $\operatorname{E}[X]$, $\operatorname{E}[X^{2}]$, the variance $\operatorname{Var}[X]=\operatorname{E}[X^{2}]-\operatorname{E}[X]^{2}$ and the standard deviation $\operatorname{std}[X]=\sqrt{\operatorname{Var}[X]}$. The same computations give the marginal pmf and moments of $Y$. Using the transformation theorem, we can compute the expected value of the product:$$\operatorname{E}[XY]=\sum_{(x,y)}xy\,p_{XY}(x,y)$$Hence, the covariance between $X$ and $Y$ is$$\operatorname{Cov}[X,Y]=\operatorname{E}[XY]-\operatorname{E}[X]\operatorname{E}[Y]$$and the coefficient of linear correlation between $X$ and $Y$ is$$\operatorname{Corr}[X,Y]=\frac{\operatorname{Cov}[X,Y]}{\operatorname{std}[X]\,\operatorname{std}[Y]}$$

### Exercise 2

Let a discrete random vector have entries denoted by $X$ and $Y$, with a given support and joint probability mass function.

Compute the covariance between $X$ and $Y$.

Solution

The computation follows the same steps as in Exercise 1: derive the support and marginal probability mass function of $X$ and of $Y$; compute their means, second moments, variances and standard deviations; obtain the expected value of the product, $\operatorname{E}[XY]$, with the transformation theorem. Therefore, putting the pieces together, the covariance between $X$ and $Y$ is $\operatorname{Cov}[X,Y]=\operatorname{E}[XY]-\operatorname{E}[X]\operatorname{E}[Y]$, and the coefficient of linear correlation is $\operatorname{Corr}[X,Y]=\operatorname{Cov}[X,Y]/(\operatorname{std}[X]\,\operatorname{std}[Y])$.

### Exercise 3

Let an absolutely continuous random vector have components denoted by $X$ and $Y$, with a given support and joint probability density function $f_{XY}(x,y)$. Compute the covariance between $X$ and $Y$.

Solution

The marginal probability density function of $X$ is obtained by integrating $y$ out of the joint probability density function on the relevant part of the support:$$f_{X}(x)=\int f_{XY}(x,y)\,dy$$(outside the support, $f_{X}(x)=0$). From the marginal we compute $\operatorname{E}[X]$, $\operatorname{E}[X^{2}]$, the variance $\operatorname{Var}[X]=\operatorname{E}[X^{2}]-\operatorname{E}[X]^{2}$ and the standard deviation $\operatorname{std}[X]=\sqrt{\operatorname{Var}[X]}$. The same steps yield the marginal density and moments of $Y$. The expected value of the product can be computed by using the transformation theorem:$$\operatorname{E}[XY]=\iint xy\,f_{XY}(x,y)\,dx\,dy$$Hence, by the covariance formula, the covariance between $X$ and $Y$ is $\operatorname{Cov}[X,Y]=\operatorname{E}[XY]-\operatorname{E}[X]\operatorname{E}[Y]$, and the coefficient of linear correlation is $\operatorname{Corr}[X,Y]=\operatorname{Cov}[X,Y]/(\operatorname{std}[X]\,\operatorname{std}[Y])$.
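For continuous vectors, the same pipeline can be approximated numerically. The density below is hypothetical, $f(x,y)=x+y$ on the unit square (it integrates to one there), not the density of this exercise; the expectations are approximated with a plain midpoint rule instead of exact integration:

```python
# Hypothetical joint density on the unit square; illustrative only.
def f(x, y):
    return x + y

n = 400                      # midpoint-rule grid resolution
h = 1.0 / n
pts = [(i + 0.5) * h for i in range(n)]

ex  = sum(x * f(x, y) for x in pts for y in pts) * h * h      # E[X]
ey  = sum(y * f(x, y) for x in pts for y in pts) * h * h      # E[Y]
exy = sum(x * y * f(x, y) for x in pts for y in pts) * h * h  # E[XY]

cov = exy - ex * ey          # covariance formula
print(round(cov, 5))         # exact value is -1/144 ≈ -0.00694
```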

