
Mean squared error

by Marco Taboga, PhD

In the theory of point estimation, the mean squared error is frequently used to assess the risk of an estimator, that is, how large are on average the losses generated by the estimation errors committed when employing the estimator in question.

When the parameter to be estimated is a scalar, the mean squared error is equal to the expected value of the square of the difference between the value taken by the estimator and the true value of the parameter.

When the parameter to be estimated is a vector, we take the Euclidean norm of the difference before computing the square.

The acronym MSE is often employed.


The following is a possible definition of mean squared error.

Definition Let $\widehat{\theta}$ be an estimator of an unknown parameter $\theta_0$. The estimation error is $$e = \widehat{\theta} - \theta_0$$ When the squared error $$L(e) = \left\Vert e \right\Vert^2 = \left\Vert \widehat{\theta} - \theta_0 \right\Vert^2$$ is used as a loss function, then the risk of the estimator (i.e., the expected value of the loss) is $$\operatorname{MSE}(\widehat{\theta}) = \operatorname{E}\left[ \left\Vert \widehat{\theta} - \theta_0 \right\Vert^2 \right]$$ and it is called the mean squared error of the estimator $\widehat{\theta}$.

When $\theta_0$ is a scalar, the squared error is $$\left\Vert \widehat{\theta} - \theta_0 \right\Vert^2 = \left(\widehat{\theta} - \theta_0\right)^2$$ because the Euclidean norm of a scalar is equal to its absolute value. Therefore, the MSE becomes $$\operatorname{MSE}(\widehat{\theta}) = \operatorname{E}\left[ \left(\widehat{\theta} - \theta_0\right)^2 \right]$$
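As an illustration, the MSE of an estimator can be approximated by Monte Carlo simulation: generate many samples, apply the estimator to each, and average the squared estimation errors. The sketch below does this for the sample mean of a normal distribution; the true parameter, sample size, and number of replications are illustrative assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_0 = 2.0          # true mean (assumed for illustration)
n, reps = 50, 100_000  # sample size and number of Monte Carlo replications

# Each row is one simulated sample; each sample mean is one
# realization of the estimator.
samples = rng.normal(loc=theta_0, scale=1.0, size=(reps, n))
theta_hat = samples.mean(axis=1)

# The MSE is the expected squared estimation error, approximated here
# by averaging over the replications.
mse = np.mean((theta_hat - theta_0) ** 2)
print(mse)  # close to the theoretical value sigma^2 / n = 1/50 = 0.02
```

Because the sample mean is unbiased, this estimate of the MSE is also an estimate of the estimator's variance, which here equals $\sigma^2 / n$.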

Bias-variance decomposition

The following decomposition is often used to distinguish between the two main sources of error, called bias and variance.

Proposition The mean squared error of an estimator $\widehat{\theta}$ can be written as $$\operatorname{MSE}(\widehat{\theta}) = \operatorname{tr}\left[\operatorname{Var}(\widehat{\theta})\right] + \left\Vert \operatorname{B}(\widehat{\theta}) \right\Vert^2$$ where $\operatorname{tr}\left[\operatorname{Var}(\widehat{\theta})\right]$ is the trace of the covariance matrix of $\widehat{\theta}$ and $$\operatorname{B}(\widehat{\theta}) = \operatorname{E}[\widehat{\theta}] - \theta_0$$ is the bias of the estimator, that is, the expected difference between the estimator and the true value of the parameter.


Proof. Suppose the true parameter and its estimator are column vectors. Writing $\widehat{\theta} - \theta_0 = \left(\widehat{\theta} - \operatorname{E}[\widehat{\theta}]\right) + \operatorname{B}(\widehat{\theta})$, we have $$\begin{aligned} \operatorname{MSE}(\widehat{\theta}) &= \operatorname{E}\left[\left(\widehat{\theta} - \theta_0\right)^{\top}\left(\widehat{\theta} - \theta_0\right)\right] \\ &= \operatorname{E}\left[\left(\widehat{\theta} - \operatorname{E}[\widehat{\theta}]\right)^{\top}\left(\widehat{\theta} - \operatorname{E}[\widehat{\theta}]\right) + 2\operatorname{B}(\widehat{\theta})^{\top}\left(\widehat{\theta} - \operatorname{E}[\widehat{\theta}]\right) + \operatorname{B}(\widehat{\theta})^{\top}\operatorname{B}(\widehat{\theta})\right] \\ &= \operatorname{E}\left[\left(\widehat{\theta} - \operatorname{E}[\widehat{\theta}]\right)^{\top}\left(\widehat{\theta} - \operatorname{E}[\widehat{\theta}]\right)\right] + 2\operatorname{B}(\widehat{\theta})^{\top}\operatorname{E}\left[\widehat{\theta} - \operatorname{E}[\widehat{\theta}]\right] + \left\Vert \operatorname{B}(\widehat{\theta}) \right\Vert^2 \\ &= \operatorname{tr}\left[\operatorname{Var}(\widehat{\theta})\right] + \left\Vert \operatorname{B}(\widehat{\theta}) \right\Vert^2 \end{aligned}$$ where we have first expanded the product, then used the linearity of the expected value operator; the middle term vanishes because $\operatorname{E}\left[\widehat{\theta} - \operatorname{E}[\widehat{\theta}]\right] = 0$, and the first term equals the sum of the diagonal elements of the covariance matrix of $\widehat{\theta}$, that is, its trace.

When the parameter $\theta_0$ is a scalar, the above formula for the bias-variance decomposition becomes $$\operatorname{MSE}(\widehat{\theta}) = \operatorname{Var}(\widehat{\theta}) + \operatorname{B}(\widehat{\theta})^2$$

It is then clear that the mean squared error of an unbiased estimator (an estimator that has zero bias) is equal to the variance of the estimator itself.
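The decomposition can also be checked numerically. The sketch below uses a deliberately biased estimator (the sample mean of a two-dimensional vector, shrunk toward zero; the shrinkage factor, true parameter, and simulation sizes are illustrative assumptions) and compares the simulated MSE with the trace of the covariance matrix plus the squared norm of the bias.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_0 = np.array([1.0, -2.0])  # true parameter vector (assumed)
n, reps = 30, 100_000            # sample size and Monte Carlo replications

# A deliberately biased estimator: shrink the sample mean toward zero.
samples = rng.normal(loc=theta_0, scale=1.0, size=(reps, n, 2))
theta_hat = 0.9 * samples.mean(axis=1)  # shape (reps, 2)

# Direct Monte Carlo estimate of the MSE (squared Euclidean norm of the error).
mse = np.mean(np.sum((theta_hat - theta_0) ** 2, axis=1))

# Bias-variance decomposition: MSE = tr[Var(theta_hat)] + ||B(theta_hat)||^2
bias = theta_hat.mean(axis=0) - theta_0
trace_var = np.trace(np.cov(theta_hat, rowvar=False))
print(mse, trace_var + bias @ bias)  # the two numbers agree up to simulation noise
```

Setting the shrinkage factor to 1 makes the estimator unbiased, in which case the bias term is (approximately) zero and the MSE reduces to the trace of the covariance matrix, as stated above.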

More details

More details about loss functions, statistical risk and the mean squared error can be found in the lecture entitled Point estimation.


How to cite

Please cite as:

Taboga, Marco (2017). "Mean squared error", Lectures on probability theory and mathematical statistics, Third edition. Kindle Direct Publishing. Online appendix.
