Search for probability and statistics terms on Statlect
StatLect

Cayley-Hamilton theorem

by , PhD

The Cayley-Hamilton theorem shows that the characteristic polynomial of a square matrix is identically equal to zero when it is transformed into a polynomial in the matrix itself. In other words, a square matrix satisfies its own characteristic equation.

Table of Contents

Matrix polynomial

In the lecture on matrix polynomials we have explained that, if F is a field, such as the set of real numbers R or the set of complex numbers $U{2102} $, and $p:F
ightarrow F$ is an ordinary polynomial[eq1]then we can use p to define, by extension, an analogous matrix polynomial[eq2]provided that the entries of the square matrix A belong to the field F.

Characteristic polynomial

Remember that the characteristic polynomial of a square matrix A is[eq3]where [eq4] are the eigenvalues of A.

As explained above, an ordinary polynomial $pleft( z
ight) $ can be used to define a matrix polynomial $pleft( A
ight) $.

The proposition in the next section, known as Cayley-Hamilton theorem, shows that the characteristic polynomial of A is identically equal to zero when it is transformed into a polynomial in A.

The theorem

Here is the Cayley-Hamilton theorem.

Proposition Let A be a $K	imes K$ matrix. Let [eq4] be the eigenvalues of A. Then,[eq6]

Proof

Define[eq7]The matrix A has a Schur decomposition[eq8]where $T$ is an upper triangular matrix, $Q$ is a unitary matrix and $Q^{st }$ denotes the conjugate transpose of $Q$. Moreover, the diagonal entries of $T$ are the eigenvalues of A. Since $Q$ is unitary, [eq9]. Therefore,[eq10]We are going to show that [eq11]and, as a consequence, [eq12]. Define[eq13]We are going to prove by induction that the first k columns of $Y_{k}$ are zero (hence all the columns of $Y_{K}$ are zero and the proposition is true). Let us start from $Y_{1}$. Since $T$ is upper triangular, [eq14] is the only non-zero entry of the first column of $T$. Therefore, the first column of the matrix [eq15]is zero. Now suppose that the first $k-1$ columns of $Y_{k-1}$ are zero, that is,[eq16]The columns of [eq17]can be seen as linear combinations of the columns of $Y_{k-1}$ with coefficients taken from the corresponding columns of $T-lambda _{k}I$. In particular, the $j$-th column of $Y_{k}$ is [eq18]where: in step $rame{A}$ we have used the fact that $T-lambda _{k}I$ is upper triangular and, as a consequence, [eq19] when $m>j$; in step $rame{B}$ we have used the fact that the first $k-1$ columns of $Y_{k-1}$ are zero. When [eq20] is one of the first k columns of $Y_{k}$, then $jleq k$. If $j<k$, then the summation is over an empty set of indices and[eq21]If $j=k$, then the summation is over a single index and[eq22]because [eq23]Thus, we have reached the desired conclusion: the first k columns of $Y_{k} $ are zero. It follows that all the columns of $Y_{K}$ are zero and [eq24].

Thus, if $pleft( z
ight) $ is the characteristic polynomial of A, then[eq25]

An example

Let us make an example.

Example Define[eq26]The characteristic polynomial of A is[eq27]We can transform it into a polynomial in A as follows:[eq28]Let us carry out the multiplications so as to check that indeed the Cayley-Hamilton theorem holds:[eq29]

Consequence for matrix polynomials

An important consequence of the Cayley-Hamilton theorem is that any polynomial in a $K	imes K$ matrix A can be rewritten as a polynomial whose degree is at most $K-1$.

Proposition Let A be a $K	imes K$ matrix. Let $qleft( A
ight) $ be a matrix polynomial in A. Then, there exists a polynomial $r$ such that[eq30]and the degree of $rleft( A
ight) $ is at most $K-1$.

Proof

If the degree of $qleft( A
ight) $ is less than K, then there is nothing to prove. If the degree of $qleft( A
ight) $ is greater than or equal to $K $, we proceed as follows. By the Cayley-Hamilton theorem, we have[eq31]where the scalars [eq32] are obtained by expanding the product [eq33]. Thus, $A^{K}$ can be expressed as a linear combination of powers of A up to the $left( K-1
ight) $-th:[eq34]If we pre-multiply both sides of the previous equation by A, we obtain[eq35]By substituting (1) into (2), we obtain that also $A^{K+1}$ can be expressed as a linear combination of powers of A up to the $left( K-1
ight) $-th. With this technique (pre-multiply and substitute), we obtain the same result for any power $A^{K+j}$, where $j$ is any positive integer. Thus, given a polynomial $qleft( A
ight) $ of degree greater than or equal to $K,$, we can substitute all the powers of A that appear in the polynomials and are greater than or equal to K with lower powers. Hence, we have the stated result.

Solved exercises

Below you can find some exercises with explained solutions.

Exercise 1

Define[eq36]Re-write the polynomial[eq37]as a polynomial of degree 1 in A.

Solution

The characteristic polynomial of A is[eq38]Therefore, by the Cayley-Hamilton theorem, we have[eq39]We pre-multiply both sides of the previous equation so as to obtain[eq40]Then, we substitute (3) into (4) and get[eq41]Finally, we can re-write the given polynomial[eq42]

How to cite

Please cite as:

Taboga, Marco (2021). "Cayley-Hamilton theorem", Lectures on matrix algebra. https://www.statlect.com/matrix-algebra/Cayley-Hamilton-theorem.

The books

Most of the learning materials found on this website are now available in a traditional textbook format.