The QR decomposition (or QR factorization) allows to express a matrix having linearly independent columns as the product of 1) a matrix Q having orthonormal columns and 2) an upper triangular matrix R.
In order to fully understand how the QR decomposition is obtained, we should be familiar with the Gram-Schmidt process.
Remember that the Gram-Schmidt process is a procedure used to transform a set of linearly independent vectors into a set of orthonormal vectors (i.e., a set of vectors that have unit norm and are orthogonal to each other).
In the case of a matrix , denote its columns by . If these columns are linearly independent, they can be transformed into a set of orthonormal column vectors by using the Gram-Schmidt process, which alternates normalization steps and projection steps:
we start with the normalizationwhere denotes the norm of ;
we project on :where is the inner product between and and is the residual of the projection, orthogonal to ;
we normalize the residual:
we project on and :where the residual is orthogonal to and ;
we keep on alternating normalization steps (where projection residuals are divided by their norms) and projection steps (where is projected on ) until we have produced a set of orthonormal vectors .
Note that the residuals can be expressed in terms of normalized vectors asfor , where we have defined
Thus, the projections can be written as
The orthonormal vectors can be adjoined to form a matrixwhose columns are orthonormal.
The coefficients of the projections can be collected in an upper triangular matrix
By computing the matrix product between and , we recover the projections in equation (1). As a matter of fact, each column of the product is a linear combination of the columns of with coefficients taken from the corresponding column of (see the lecture on matrix products and linear combinations).
Therefore, we have that
We now provide a formal statement of the QR decomposition.
Proposition Let be a matrix. If the columns of are linearly independent, then can be factorized aswhere is a matrix whose columns form an orthonormal set, and is an upper triangular matrix whose diagonal entries are strictly positive.
In the previous section we have already shown a constructive proof of how the QR decomposition is obtained. The only important detail we have not mentioned is that the linear independence of the columns of guarantees that the residuals of the projections performed in the Gram-Schmidt process are different from zero. As a consequence, the normalized vectorsare well-defined because the norms are strictly positive. Moreover, the entries on the main diagonal of are strictly positive.
Note that is invertible because a triangular matrix is invertible if its diagonal entries are strictly positive.
It is time to make an example.
Example Define the matrixThe norm of the first column isso that the first normalized vector isThe inner product between and isThe projection of the second column on isand the residual of the projection isThe norm of the residual isThus,Let us verify that and are orthogonal:We now have performed all the calculations that lead to the QR factorizationThe matrix with orthonormal columns isand the upper triangular matrix isLet us check that indeed the product of and equals :
The QR decomposition is unique.
Proposition Under the assumptions of the previous proposition, the QR decomposition is unique, that is, the matrices and satisfying the stated properties are unique.
Suppose where is a second decomposition into a matrix having orthonormal columns and an upper triangular matrix having strictly positive diagonal elements. Since the columns of are orthonormal, we have thatwhere is the conjugate transpose of , and is the identity matrix (see Non-square matrices with orthonormal columns). By the same token,If we pre-multiply both sides of the equalityby we getorIf we instead pre-multiply the equality by we obtainorBy plugging equation (3) into equation (2), we obtainThe latter equation implies that, for , the -th row of can be written as a linear combination of all the rows of with coefficients taken from the -th row of the matrix (see Matrix multiplication and linear combinations). But is triangular with strictly positive diagonal entries, so its rows are linearly independent and they form a basis for the space of vectors. As a consequence, the only way to represent the -th row of as a linear combination of all the rows of is to put a unitary coefficient on the -th row itself and a zero coefficient on all the other rows (by the uniqueness of the representation in terms of a basis). In other words, the -th row of is the -th vector of the canonical basis. Since this is true for , we have thatorThus, is a unitary matrix (its conjugate transpose is equal to its inverse). Moreover, we have thatSince the inverse of an upper triangular matrix (UT) is UT and the product of two UT matrices is UT, is UT. It is also invertible, which means that its diagonal entries are strictly positive. To sum up, is both unitary and UT with strictly positive diagonal entries. Therefore, by a result on unitary and triangular matrices, we have that and, as a consequence,andThus, the two matrices involved in the QR decomposition are unique.
An important fact that we have discussed in the previous proof but we have not separately stated until now is that the matrix in the decomposition is such thatwhere is the conjugate transpose of . As a consequence,
If has only real entries, then the conjugate transpose coincides with the transpose and the two equations above becomeand
When the matrix
being decomposed is a square
where and are both square matrices.
But a square matrix having orthonormal columns is a unitary matrix.
Therefore, the QR decomposition of a square matrix having linearly independent columns is the product of a unitary matrix and an upper triangular matrix with strictly positive entries.
The QR method is often used to estimate linear regressions.
In a linear regression we have an vector of outputs and an matrix of inputs whose columns are assumed to be linearly independent. We need to find the coefficient vector that minimizes the mean squared errors made by using the fitted valuesto predict the actual values .
The well-known solution to this problem is the so-called ordinary least squares (OLS) estimator
We can simplify the formula for the OLS estimator, avoid to invert a matrix and thus reduce the computational burden (and the possible numerical instabilities) by computing the QR decomposition of :where is and is .
Then, the OLS estimator becomesor
The latter way of writing the solution is more convenient: since is upper triangular, we do not need to invert it, but we can use the back-substitution algorithm to find the solution .
Below you can find some exercises with explained solutions.
Compute the QR decomposition of
The norm of the first column of isThus, the first orthonormal vector isThe inner product between the second column of and isThe projection of on isand the residual of the projection isThe norm of the residual isThe second orthonormal vector isThus, the QR decomposition is given byand
Please cite as:
Taboga, Marco (2017). "QR decomposition", Lectures on matrix algebra. https://www.statlect.com/matrix-algebra/QR-decomposition.
Most of the learning materials found on this website are now available in a traditional textbook format.