7. Preconditioning Krylov methods

In this section we will discuss some preconditioners and how to analyse them. The most important question is how quickly the preconditioned GMRES algorithm converges for a given matrix and preconditioner. We will focus on the link between stationary iterative methods and preconditioners.

7.1. Stationary iterative methods

As we have already discussed, given a matrix equation \(Ax=b\), iterative methods provide a way of obtaining a (hopefully) better approximate solution \({x}^{k+1}\) from a previous approximate \({x}^k\). Stationary iterative methods are defined from splittings as follows.

Definition 7.1 (Stationary iterative methods)

A stationary iterative method is constructed from matrices \(M\) and \(N\) with \(A=M+N\). Then the iterative method is defined by

\[M{x}^{k+1}=-N{x}^k+{b}.\]

The word “stationary” refers to the fact that exactly the same operation is performed at each iteration. This contrasts with Krylov methods such as GMRES, where the work depends on the iteration number (e.g. a least squares problem of increasing size is solved at each GMRES iteration).

In this section we will introduce/recall some classic stationary methods.

Definition 7.2 (Richardson iteration)

For a chosen parameter \(\omega>0\), take \(M=I/\omega\). This defines the iterative method given by

\[{x}^{k+1} = {x}^k+\omega\left({b}-A{x}^k\right).\]

Richardson, L.F. (1910). The approximate arithmetical solution by finite differences of physical problems involving differential equations, with an application to the stresses in a masonry dam. Philos. Trans. Roy. Soc. London Ser. A 210: 307-357.

This approach is convenient for parallel computing, because each entry in \(x^{k+1}\) can be updated independently, once \(Ax^k\) has been evaluated.
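As an illustration, here is a minimal numpy sketch of the scaled Richardson iteration; the helper name `richardson`, the test matrix, \(\omega\) and the iteration count are arbitrary choices for this example, not part of the notes.

```python
import numpy as np

def richardson(A, b, omega, n_iter=100):
    """Scaled Richardson iteration: x^{k+1} = x^k + omega*(b - A x^k)."""
    x = np.zeros_like(b)
    for _ in range(n_iter):
        x = x + omega * (b - A @ x)
    return x

# Arbitrary symmetric positive definite test problem.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = richardson(A, b, omega=0.25)
print(np.linalg.norm(A @ x - b))  # residual is small, so the iteration converged
```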

Definition 7.3 (Jacobi’s method)

Split \(A=L+D+U\) with \(L\) strictly lower triangular, \(D\) diagonal and \(U\) strictly upper triangular, i.e.

\[L_{ij}=0, \, j\geq i, \quad D_{ij}=0, \, i\neq j, \quad U_{ij}=0, \, i\geq j.\]

Then, Jacobi’s method is

\[D{x}^{k+1} = {b}-(L+U){x}^k.\]

\(D\) is very cheap to invert because it is diagonal; entries in \(x^{k+1}\) can be updated independently once \((L+U)x^k\) has been evaluated.

Jacobi, C.G.J. (1845). Ueber eine neue Aufloesungsart der bei der Methode der kleinsten Quadrate vorkommenden linearen Gleichungen, Astronomische Nachrichten, 22, 297-306.
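A minimal dense-matrix sketch of Jacobi's method (illustrative only; the helper `jacobi` and the diagonally dominant test matrix are arbitrary choices):

```python
import numpy as np

def jacobi(A, b, n_iter=100):
    """Jacobi iteration: D x^{k+1} = b - (L + U) x^k."""
    D = np.diag(A)              # diagonal entries of A
    R = A - np.diag(D)          # off-diagonal part, L + U
    x = np.zeros_like(b)
    for _ in range(n_iter):
        x = (b - R @ x) / D     # entrywise division, since D is diagonal
    return x

A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(jacobi(A, b))
```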

Definition 7.4 (Gauss-Seidel Method)

Split \(A=L+D+U\) with \(L\) strictly lower triangular, \(D\) diagonal and \(U\) strictly upper triangular. The Gauss-Seidel method (forward or backward, respectively) is

\[(L+D){x}^{k+1} = {b}-U{x}^k, \quad\mbox{or},\quad (U+D){x}^{k+1} = {b}-L{x}^k.\]

Each Gauss-Seidel iteration requires the solution of a triangular system by forward/backward substitution.

Exercise 7.5

Show that forward Gauss-Seidel is a modification of Jacobi’s method in which new values are used as soon as they become available.

Definition 7.6 (Scaled Gauss-Seidel method)

We introduce a scaling/relaxation parameter \(\omega>0\) and take \(M=D/\omega+L\), so that

\[\left(\frac{1}{\omega}D+L\right){x}^{k+1} = {b}+\left(\left(\frac{1}{\omega}-1\right)D-U\right){x}^k.\]

For \(\omega=1\), we recover Gauss-Seidel. For \(1<\omega<2\), we often obtain faster convergence. This is called Successive Over-Relaxation (SOR). The optimal value of \(\omega\) is known for some problems. This was the state of the art for the numerical solution of PDEs in the 1950s and 1960s. (A sketch of one SOR sweep is given after the list below.)

  • Richardson and Jacobi are simultaneous displacement methods: updates can be done simultaneously (e.g. on a GPU). Changing variables by a permutation does not alter the algorithm.

  • Gauss-Seidel and SOR are successive displacement methods: we can only overwrite the old vector with the new one element by element. Successive displacement methods usually converge faster, and changing variables by a permutation does alter the algorithm.
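As mentioned above, here is a minimal sketch of one forward SOR sweep, overwriting the iterate in place in the successive-displacement style; the helper `sor_sweep`, the test matrix and \(\omega\) are illustrative assumptions, and \(\omega=1\) recovers forward Gauss-Seidel.

```python
import numpy as np

def sor_sweep(A, b, x, omega):
    """One forward SOR sweep; omega = 1 gives forward Gauss-Seidel."""
    n = len(b)
    for i in range(n):
        # x[:i] already holds new values, x[i+1:] still holds old values
        sigma = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
        x[i] = (1 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
    return x

A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
x = np.zeros(3)
for _ in range(50):
    sor_sweep(A, b, x, omega=1.2)
print(np.linalg.norm(A @ x - b))
```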

7.2. Using splitting methods as preconditioners

A (non-symmetric) preconditioner \(\hat{A}\) can be built from a splitting method by applying one iteration with initial guess \({x}^0={0}\).

A preconditioner is used to compute \(v\) by solving

\[\hat{A}v = r,\]

such that \(Av \approx r\), and the above equation is easy to solve. We can use a stationary method by setting \(v={x}^1\) and \({x}^0={0}\), to get

\[Mv:= M{x}^1 = -N\underbrace{x^0}_{=0} + r = r,\]

i.e. we are choosing \(\hat{A}=M\). Later we shall see how to relate convergence properties of splitting methods to the convergence of preconditioned CG using \(\hat{A}=M\).
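For example, the Jacobi choice \(\hat{A}=M=D\) can be handed to a Krylov solver as the action \(r\mapsto M^{-1}r\). A sketch using scipy's GMRES follows; the test matrix is an arbitrary example, and note that in scipy's convention the argument `M` is (an approximation to) \(A^{-1}\).

```python
import numpy as np
from scipy.sparse.linalg import gmres, LinearOperator

# Arbitrary test problem.
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])

# Jacobi splitting: \hat{A} = M = D, so the preconditioner action is r -> D^{-1} r.
d = np.diag(A)
M_inv = LinearOperator(A.shape, matvec=lambda r: r / d)

x, info = gmres(A, b, M=M_inv)
print(info, np.linalg.norm(A @ x - b))  # info == 0 indicates convergence
```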

7.3. Symmetric iterative methods

Consider a symmetric matrix \(A=A^T\). If we can build iterative methods from the splitting \(A=M+N\), then we can also build iterative methods from the splitting \(A=A^T=M^T+N^T\). We can then combine them together.

Definition 7.7 (Symmetric iterative method)

Given a splitting \(A=M+N\), a symmetric method performs one stationary iteration using \(M+N\), followed by one stationary iteration using \(M^T+N^T\), i.e.

\[M{x}^{k+\frac{1}{2}}=-N{x}^k + {b}, \quad M^T{x}^{k+1}=-N^T{x}^{k+\frac{1}{2}} + {b}.\]
Example 7.8 (Symmetric Successive Over-Relaxation (SSOR))

For a symmetric matrix \(A=L+D+U\), \(L=U^T\). The symmetric version of SOR is then

\[\begin{split}(L+\frac{1}{\omega}D){x}^{k+\frac{1}{2}}&=\left(\left(\frac{1}{\omega}-1\right) D-U\right){x}^k + {b}, \\ (U+\frac{1}{\omega}D){x}^{k+1}&=\left(\left(\frac{1}{\omega}-1\right) D-L\right){x}^{k+\frac{1}{2}} + {b}.\end{split}\]

Some Krylov methods, notably the Conjugate Gradient method, require the preconditioner \(\hat{A}\) to be symmetric. We can build symmetric preconditioners from symmetric splitting methods. Write the symmetric iteration as a single step with \({x}^0={0}\).

\[\begin{split}M^T{x}^{1}&=(M-A)^T{x}^{\frac{1}{2}} + {b}, \\ &= (M-A)^TM^{-1}{b} + {b}, \\ & = (M^T + M-A)M^{-1}{b},\end{split}\]

so that

\[x^1 = M^{-T}(M^T + M - A)M^{-1}b,\]

i.e. \(\hat{A}^{-1}=M^{-T}(M^T + M-A)M^{-1}\).

Example 7.9 (Symmetric Gauss-Seidel preconditioner)

\(\hat{A}^{-1} = (L+D)^{-T}D(L+D)^{-1}\).
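A sketch of applying this preconditioner, i.e. computing \(v=\hat{A}^{-1}r=(L+D)^{-T}D(L+D)^{-1}r\) with two triangular solves; this is dense and illustrative only, with an arbitrary symmetric test matrix and a hypothetical helper name.

```python
import numpy as np
from scipy.linalg import solve_triangular

def sym_gauss_seidel_apply(A, r):
    """Compute v = (L+D)^{-T} D (L+D)^{-1} r for symmetric A = L + D + U."""
    LD = np.tril(A)                                     # L + D
    D = np.diag(A)
    w = solve_triangular(LD, r, lower=True)             # (L + D) w = r
    return solve_triangular(LD.T, D * w, lower=False)   # (L + D)^T v = D w

A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
r = np.array([1.0, 2.0, 3.0])
print(sym_gauss_seidel_apply(A, r))
```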

7.4. Convergence criteria for stationary methods

In this section we will look at the convergence of stationary methods. This is relevant because it relates directly to the convergence properties of the corresponding preconditioned Krylov method when the stationary method is used as a preconditioner.

For a splitting \(A=M+N\), recall that the iterative method is

\[M{x}^{k+1} = -N{x}^k + {b}.\]

On the other hand, the solution \({x}^*\) of \(A{x}={b}\) satisfies

\[M{x}^* = -N{x}^* + {b}.\]

Subtracting these two equations gives

\[M{e}^{k+1} = -N{e}^k, \quad {e}^k = {x}^*-{x}^k,\]

so

\[{e}^{k+1}=C{e}^k \implies {e}^k = C^k{e}^0, \quad C:=-M^{-1}N = -M^{-1}(A-M) = I - M^{-1}A.\]

\(C\) is called the iteration matrix.

For a symmetric iterative method,

\[M{x}^{k+\frac{1}{2}}=-N{x}^k + {b}, \quad M^T{x}^{k+1}=-N^T{x}^{k+\frac{1}{2}} + {b},\]

we subtract \(Ax^*=b\) from both equations to get

\[M{e}^{k+\frac{1}{2}}=-N{e}^k, \quad M^T{e}^{k+1}=-N^T{e}^{k+\frac{1}{2}}.\]

Then eliminating \(e^{k+1/2}\) gives

\[M^Te^{k+1} = N^TM^{-1}Ne^k,\]

i.e. the iteration matrix is

\[C = M^{-T}N^TM^{-1}N.\]
Exercise 7.10

Show that

\[C = I-\left(M_s\right)^{-1}A,\]

where

\[M_s = M(M+M^T-A)^{-1}M^T.\]

From the above exercise, note the relationship between \(M_s\) and \(\hat{A}\) for symmetric methods.

Definition 7.11 (Convergence of stationary methods)

An iterative method based on the splitting \(A=M+N\) with iteration matrix \(C=-M^{-1}N\) is called convergent if

\[{y}^k = C^k{y}^0 \to {0}\]

for any initial vector \({y}^0\).

Exercise 7.12

Show that this implies that \({e}^k={x}^*-{x}^k\to{0}\) i.e. \({x}^k\to x^*\) as \(k\to\infty\).

Theorem 7.13 (A first convergence criterion)
If \(\|C\|<1\), using the operator norm induced by some chosen vector norm, then the iterative method converges.

Proof
\[\begin{split}\|{y}^k\| = & \|C^k{y}^0\| \\ \leq & \|C^k\|\|{y}^0\| \\ \leq & \ \left(\|C\|\right)^k\|{y}^0\| \to 0\quad\mbox{as}\,k\to\infty.\end{split}\]

This is only a sufficient condition. There may be matrices \(C\) with \(\|C\|>1\), but the method is still convergent.

To obtain a necessary condition, we need to use the spectral radius.

Definition 7.14

The spectral radius \(\rho(C)\) of a matrix \(C\) is the maximum of the absolute values of all the eigenvalues \(\lambda_i\) of \(C\):

\[\rho(C) = \max_{1\leq i\leq n}|\lambda_i|.\]
Theorem 7.15

An iterative method converges \(\iff \rho(C)<1\).

Proof

[Proof that \(\rho(C)\geq 1\implies\) non-convergence]

If \(\rho(C)\geq 1\), then \(C\) has an eigenvector \({v}\) with \(\|{v}\|_2=1\) and eigenvalue \(\lambda\) with \(|\lambda|\geq 1\). Then

\[\|C^k{v}\|_2 = \|\lambda^k{v}\|_2 = |\lambda|^k\|{v}\|_2\geq 1,\]

which does not converge to zero.

[Proof that \(\rho(C)< 1\implies\) convergence]

Assume that \(C\) has a basis of eigenvectors \({v}_1,\ldots,{v}_n\) with eigenvalues \(\lambda_1,\ldots,\lambda_n\) (this diagonalisability assumption is not necessary for the proof, but it simplifies things a lot), and expand \({z} = \sum_{i=1}^n\alpha_i{v}_i\). Then,

\[C^k{z} = \sum_{i=1}^n\alpha_iC^k{v}_i = \sum_{i=1}^n\alpha_i\lambda_i^k{v}_i\to 0,\]

since \(|\lambda_i|\leq\rho(C)<1\) for every \(i\).
  • For symmetric matrices \(B\), \(\rho(B)=\|B\|_2\), so the two convergence theorems are related.

  • If \(\|C\|=c<1\), then

    \[\|{e}^{k+1}\| = \|C{e}^k\| \leq \|C\|\|{e}^k\| = c\|{e}^k\|.\]

    This guarantees that the error will be reduced by a factor of at least \(c\) in each iteration. If we only have \(\rho(C)<1\), not \(\|C\|<1\), then the error may not converge monotonically. (Both quantities are checked numerically in the sketch below.)
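For small test problems both quantities can be checked numerically by forming \(C\) explicitly; a sketch for the Jacobi splitting of an arbitrary test matrix:

```python
import numpy as np

A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])

M = np.diag(np.diag(A))                      # Jacobi splitting: M = D, N = A - M
C = np.eye(len(A)) - np.linalg.solve(M, A)   # iteration matrix C = I - M^{-1} A

rho = max(abs(np.linalg.eigvals(C)))         # spectral radius
norm2 = np.linalg.norm(C, 2)                 # operator 2-norm
print(rho, norm2)                            # both are below 1 for this matrix
```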

Example 7.16 (Range of SOR parameter)

We can use this to analyse the SOR parameter \(\omega\).

\[\left(\frac{1}{\omega}D+L\right){x}^{k+1} = {b}+\left(\left(\frac{1}{\omega}-1\right)D-U\right){x}^k\]

Which values of \(\omega\) are admissible? For SOR, the iteration matrix \(C\) is

\[C = \left(\frac{1}{\omega}D+L\right)^{-1} \left(\frac{1-\omega}{\omega}D-U\right) = (D+\omega L)^{-1} ((1-\omega)D-\omega U).\]

so

\[\begin{split}\det(C) & = \det\left((D+\omega L)^{-1} ((1-\omega)D-\omega U)\right) \\ & = \det\left((D+\omega L)^{-1}\right)\det\left( (1-\omega)D - \omega U\right) \\ & = \det\left(D^{-1}\right)\det(D)\det\left((1-\omega)I - \omega D^{-1}U\right)\\ & = \det\left((1-\omega)I\right) = (1-\omega)^n,\end{split}\]

using the fact that \(D+\omega L\) is lower triangular with the same diagonal as \(D\), and \((1-\omega)I-\omega D^{-1}U\) is upper triangular with diagonal \((1-\omega)I\). The determinant is the product of the eigenvalues, so \(|\det(C)|\leq\rho(C)^n\), hence \(\rho(C)<1\) requires \(|1-\omega|<1\), i.e. \(0<\omega<2\).
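A quick numerical check of the determinant identity, using an arbitrary test matrix and an arbitrary value of \(\omega\):

```python
import numpy as np

A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
n = A.shape[0]
D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)

omega = 1.3
C = np.linalg.solve(D + omega * L, (1 - omega) * D - omega * U)
print(np.linalg.det(C), (1 - omega) ** n)  # the two values agree
```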

7.5. Splitting methods as preconditioners

Recall that preconditioned GMRES converges well if the eigenvalues of \(\hat{A}^{-1}A\) are clustered together.

Theorem 7.17

Let \(A\) be a matrix with splitting \(M+N\), such that \(\rho(C) < c < 1\). Then, the eigenvalues of the left preconditioned matrix \(\hat{A}^{-1}A\) with \(\hat{A}=M\) are located in a disk of radius \(c\) around \(1\) in the complex plane.

Proof
\[C=-M^{-1}N = M^{-1}(M-A) = I-M^{-1}A.\]

Then,

\[1>c>\rho(C)=\rho(I-M^{-1}A).\]

If \(\mu\) is an eigenvalue of \(C\) with eigenvector \(v\), then \(M^{-1}Av=(I-C)v=(1-\mu)v\), so the eigenvalues of \(M^{-1}A\) are exactly \(1-\mu\) with \(|\mu|\leq\rho(C)<c\), i.e. they lie in the disk of radius \(c\) centred at \(1\).

We deduce that good convergence of the GMRES algorithm occurs when \(c\) is small.

For symmetric splittings, we have already observed that the iteration matrix is

\[C = I-\left(M_s\right)^{-1}A,\]

where

\[M_s = M(M+M^T-A)^{-1}M^T.\]

For symmetric splittings we can say a little more about the preconditioner.

Theorem 7.18

Let \(A\) be a symmetric matrix with splitting \(A=M+N\), such that the symmetric splitting has iteration matrix \(C\) with

\[\rho(C) = c < 1,\]

and assume further that \(M_s\) is positive definite.

Then, the eigenvalues of the symmetrically preconditioned matrix \(\hat{A}^{-1}A\), with \(\hat{A}=M_s\), are contained in the interval \([1-c,1+c]\).

Proof

We have

\[\begin{split}C & = I-\left(M_s\right)^{-1}A, \\ & = I - \hat{A}^{-1}A,\end{split}\]

so \(\rho(I- \hat{A}^{-1}A) = \rho(C) = c\). Further, \(M_s\) is symmetric and positive definite, so there exists a unique symmetric positive definite matrix square root \(S\) of \(M_s^{-1}\), i.e. \(SS = M_s^{-1}\). Then,

\[M_s^{-1}A = SSA = S(SAS)S^{-1}.\]

Thus, \(M_s^{-1}A\) is similar to (and therefore has the same eigenvalues as) \(SAS\), which is symmetric and therefore has real eigenvalues. Since these eigenvalues differ from \(1\) by at most \(\rho(C)=c\), they lie in the interval \([1-c,1+c]\).

7.6. Convergence analysis for Richardson

First we examine Richardson iteration. In the unscaled case,

\[{x}^{k+1} = {x}^k - \left(A{x}^k - {b}\right), \quad M = I, \, N = A-I, \, \implies C = I-A.\]

Let \({e}\) be an eigenvector of \(A\) with eigenvalue \(\lambda\), so \(A{e}=\lambda{e}\). Then \((I-A){e}={e}-\lambda{e}=(1-\lambda){e}\). So, \({e}\) is an eigenvector of \(I-A\) with eigenvalue \(1-\lambda\). Richardson’s method will converge if \(\rho(C)<1\) i.e. \(|1-\lambda|<1\) for all eigenvalues \(\lambda\) of \(A\).

This is restrictive, which motivates the scaled Richardson iteration,

\[{x}^{k+1} = {x}^k - \omega\left(A{x}^k - {b}\right), \quad M = \frac{I}{\omega}, \, N = A-\frac{I}{\omega}, \, \implies C = I-\omega A.\]

If \(A\) has eigenvalues \(\lambda_1,\lambda_2,\ldots,\lambda_n\) then the iteration matrix \(C\) has eigenvalues \(1-\omega\lambda_1,1-\omega\lambda_2,\ldots,1-\omega\lambda_n\). Convergence thus requires \(|1-\omega\lambda_i|<1\), \(i=1,\ldots,n\).

If, further, \(A\) is symmetric positive definite, then all eigenvalues of \(A\) are real and positive, and all of the eigenvalues of \(C\) lie between \(1-\omega\lambda_{\max}\) and \(1-\omega\lambda_{\min}\). Hence \(\rho(C)=\max\left(|1-\omega\lambda_{\min}|,|1-\omega\lambda_{\max}|\right)\), which is minimised by balancing the two extremes, i.e. by choosing \(\omega=2/(\lambda_{\min}+\lambda_{\max})\). The resulting iteration matrix has spectral radius

\[\rho(C) = 1-2\frac{\lambda_{\min}}{\lambda_{\min}+\lambda_{\max}} = \frac{\lambda_{\max}-\lambda_{\min}}{\lambda_{\min}+\lambda_{\max}}.\]
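A numerical check of the optimal parameter and the resulting spectral radius; the symmetric positive definite test matrix is an arbitrary choice.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])                  # arbitrary SPD matrix
lam = np.linalg.eigvalsh(A)                 # real eigenvalues, ascending order
lmin, lmax = lam[0], lam[-1]

omega = 2.0 / (lmin + lmax)                 # optimal scaling parameter
C = np.eye(len(A)) - omega * A
rho = max(abs(np.linalg.eigvals(C)))
print(rho, (lmax - lmin) / (lmax + lmin))   # the two values agree
```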

7.7. Convergence analysis for symmetric matrices

For a symmetric positive definite matrix \(A\), recall the Rayleigh Quotient formula,

\[\lambda_{\max}=\max_{{x}\ne 0}\frac{{x}^TA{x}}{{x}^T{x}}\equiv \|A\|_2, \quad \lambda_{\min}=\min_{{x}\ne 0}\frac{{x}^TA{x}}{{x}^T{x}},\]

implying that

\[\lambda_{\min}\|{y}\|_2^2\leq {y}^TA{y} \leq\lambda_{\max}\|{y}\|_2^2\]

for any non-zero vector \({y}\).

Definition 7.19 (\(A\)-weighted norm)

For symmetric positive definite \(A\), we can define the weighted vector norm

\[\|{x}\|_A = \sqrt{{x}^TA{x}},\]

and the corresponding matrix (operator) norm

\[\|B\|_A = \|A^{1/2}BA^{-1/2}\|_2.\]

These norms are useful for studying convergence of iterative methods for \(A{x}={b}\) in the symmetric positive definite case.

Theorem 7.20

For a splitting \(A=M+N\), if the (symmetric) matrix \(M+M^T-A\) is positive definite then

\[\|I-M^{-1}A\|_A<1.\]
Proof

If \({y}=(I-M^{-1}A){x}\), \({w}=M^{-1}A{x}\), then

\[\begin{split}\|{y}\|_A^2 & = ({x}-{w})^TA({x}-{w}) = {x}^TA{x}-2{w}^TM{w} + {w}^TA{w} \\ &= {x}^TA{x}-{w}^T(M+M^T){w} + {w}^TA{w} \\ &= {x}^TA{x}-{w}^T(M+M^T-A){w} \\ &\leq \|{x}\|_A^2 - \mu_{\min}\|{w}\|^2_2,\end{split}\]

where \(\mu_{\min}\) is the (positive) minimum eigenvalue of \(M^T+M-A\).

Further,

\[\begin{split}\|{w}\|_2^2 & = {x}^TA\left(M^{-1}\right)^TM^{-1}A{x} \\ &= \left(A^{1/2}{x}\right)^TA^{1/2} \left(M^{-1}\right)^TM^{-1}A^{1/2}\left(A^{1/2}{x}\right) \\ &\geq \hat{\mu}_{\min}\|A^{{1/2}}{x}\|_2^2 = \hat{\mu}_{\min}\|{x}\|^2_A,\end{split}\]

where \(\hat{\mu}_{\min}\) is the minimum eigenvalue of \(A^{1/2}\left(M^{-1}\right)^TM^{-1}A^{1/2}\), i.e. the square of the smallest singular value of \(M^{-1}A^{1/2}\), which is invertible, so \(\hat{\mu}_{\min}>0\). Combining the two inequalities gives

\[\|{y}\|^2_A \leq \left(1-\mu_{\min}\hat{\mu}_{\min}\right)\|{x}\|_A^2<\|{x}\|_A^2.\]

This enables us to show the following useful result for symmetric positive definite matrices.

Theorem 7.21

Let \(A\) be a symmetric positive definite matrix with splitting \(A=M+N\). If \(M\) is symmetric positive definite, then

\[\rho(I-M^{-1}A) = \|I-M^{-1}A\|_A=\|I-M^{-1}A\|_M.\]
Proof
\[\begin{split}I-A^{1/2}M^{-1}A^{1/2} &= A^{1/2}(I-M^{-1}A)A^{-1/2}, \\ I-M^{-1/2}AM^{-1/2} &= M^{1/2}(I-M^{-1}A)M^{-1/2}, \\\end{split}\]

so \(I-M^{-1}A\), \(I-A^{1/2}M^{-1}A^{1/2}\), and \(I-M^{-1/2}AM^{-1/2}\) all have the same eigenvalues, since they are similar matrices. Hence,

\[\begin{split}\rho(I-M^{-1}A) & = \rho(I-A^{1/2}M^{-1}A^{1/2}) \\ & = \|I-A^{1/2}M^{-1}A^{1/2}\|_2 \\ & = \|I-M^{-1}A\|_A,\end{split}\]

and similarly for \(I-M^{-1/2}AM^{-1/2}\).

The consequence of this is that if \(M+M^T-A\) is symmetric positive definite then there is a guaranteed reduction in the \(A\)-norm of the error at each iteration. If \(M\) is also symmetric positive definite then there is a guaranteed reduction in the \(M\)-norm of the error at each iteration.

Now we apply this to the convergence of Jacobi iteration. In this case \(M=D\), so \(M^T+M-A=2D-A\) which may not be positive definite. We generalise to scaled Jacobi iteration with \(M=D/\omega\).

Proposition 7.22

Let \(A\) be a symmetric positive definite matrix. Let \(\lambda\) be the (real) maximum eigenvalue of \(D^{-1/2}AD^{-1/2}\). If \(\omega < 2/\lambda\) then scaled Jacobi iteration converges.

Proof

For scaled Jacobi iteration with \(M=D/\omega\), we have \(M^T+M-A=2D/\omega-A\). To check positive definiteness we need to show that

\[x^T\left(\frac{2}{\omega}D - A\right)x > 0,\]

for all \(x\neq 0\).

To show this, we write \(x = D^{-1/2}y\), so that

\[\begin{split}x^T\left(\frac{2}{\omega}D - A\right)x & = y^T\left(\frac{2}{\omega}I - D^{-1/2}AD^{-1/2}\right)y \\ & \geq \mu \|y\|^2 > 0,\end{split}\]

provided that the minimum eigenvalue \(\mu\) of \(F=\frac{2}{\omega}I - D^{-1/2}AD^{-1/2}\) is positive (it is real since \(F\) is symmetric). We have \(\mu=2/\omega - \lambda\). Hence, \(2D/\omega-A\) is positive definite (so scaled Jacobi converges) if \(2/\omega-\lambda>0\), i.e. \(\omega<2/\lambda\).

Proposition 7.23

Let \(A\) be a symmetric positive definite matrix. Then Gauss-Seidel iteration always converges.

Proof

For Gauss-Seidel,

\[M^T+M-A = (D+L)^T + D+L - A = D+U+D+L-A = D,\]

which is symmetric positive definite, so Gauss-Seidel always converges.

Proposition 7.24

Let \(A\) be a symmetric positive definite matrix. Then SOR converges provided that \(0<\omega<2\).

Proof

For SOR,

\[\begin{split}M^T+M-A =& \left(\frac{1}{\omega}D+L\right)^T + \frac{1}{\omega}D+L - A \\ &= \frac{2}{\omega}D +U+L-(L+D+U)=\left(\frac{2}{\omega}-1\right)D,\end{split}\]

which is symmetric positive definite provided that \(0<\omega<2\).

7.8. An example matrix (Non-examinable in 2024/25)

We consider stationary methods for an example arising from the finite difference discretisation of the two point boundary value problem

\[-\frac{d^2u}{dx^2} = f, \quad u(0) = u(1) = 0.\]

Here, \(f\) is assumed known and we have to find \(u\). We approximate this problem by seeking values \(u_k \approx u(k/(n+1))\) for \(k=0,1,2,\ldots,n+1\). From the boundary conditions we have \(u_0=u_{n+1}=0\), so we just have to find \(u_k\) with \(1\leq k \leq n\) that solve the finite difference approximation

\[-u_{k-1} + 2u_k - u_{k+1} = f_k, \quad 1\leq k \leq n,\]

where \(f_k=f(k/(n+1))/(n+1)^2\), \(1\leq k\leq n\). Taking into account the boundary conditions \(u_0=u_{n+1}=0\), we can write this as a matrix system \(Ax=b\) with

\[\begin{split}A = \begin{pmatrix} 2 & -1 & \cdots & \cdots & 0 \\ -1 & 2 & -1 & \cdots & 0 \\ 0 & -1 & 2 & \cdots & \vdots \\ \vdots & & & & \vdots \\ \vdots & 0 & -1 & 2 & -1 \\ 0 & 0 & \cdots & -1 & 2 \end{pmatrix}, \quad x = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}, \quad b = \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{pmatrix}.\end{split}\]

However, it is possible to evaluate \(Ax\) and to implement our classic stationary iterative methods without ever forming \(A\). This is critically important for efficient implementations (especially when extending to 2D and 3D problems).
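For instance, here is a minimal matrix-free sketch of applying \(A\) and performing a Jacobi sweep for this problem (here \(D=2I\), so the Jacobi update is just a division by 2); the choice \(f\equiv 1\), the problem size and the iteration count are arbitrary.

```python
import numpy as np

def apply_A(u):
    """Matrix-free y = A u for the 1D finite-difference Laplacian (Dirichlet BCs)."""
    y = 2.0 * u
    y[1:] -= u[:-1]      # -1 on the subdiagonal
    y[:-1] -= u[1:]      # -1 on the superdiagonal
    return y

def jacobi_sweep(u, b):
    """One Jacobi iteration D u^{k+1} = b - (L + U) u^k, with D = 2I."""
    return (b - (apply_A(u) - 2.0 * u)) / 2.0

n = 100
b = np.full(n, 1.0 / (n + 1) ** 2)     # f = 1, scaled by h^2 = 1/(n+1)^2
u = np.zeros(n)
for _ in range(1000):
    u = jacobi_sweep(u, b)
print(np.max(np.abs(apply_A(u) - b)))  # residual; Jacobi converges slowly here
```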

We introduce this example matrix because it is possible to compute spectral radii for all of the matrices arising in the analysis of classic stationary methods. In the next example we consider Jacobi.

Example 7.25 (Jacobi iteration for the example matrix)

In this case, \(D=2I\), so scaled Jacobi and scaled Richardson are equivalent (up to a rescaling of \(\omega\)), and \(D^{-1/2}AD^{-1/2}=D^{-1}A=A/2\). We have to find the maximum eigenvalue of \(K=D^{-1}A\). We can compute this by knowing that the eigenvectors \(v\) of \(K\) are all of the form

\[\begin{split}v = \begin{pmatrix} \sin(l\pi/(n+1)) \\ \sin(2l\pi/(n+1)) \\ \vdots \\ \sin(nl\pi/(n+1)) \end{pmatrix},\end{split}\]

with one eigenvector for each value of \(0< l <n+1\), i.e. the \(k\)-th component of \(v\) is \(\sin(kl\pi/(n+1))\). This can be proved by considering symmetries of the matrix, but here we just assume this form and establish that we have eigenvectors by substituting into the eigenvector equation \(Kv=\lambda v\). This is a general approach that can be tried for any matrices arising in the analysis of convergence of classic stationary methods for this example matrix.

\[(D^{-1}A v)_k = -v_{k-1}/2 + v_k - v_{k+1}/2 = \lambda v_k,\]

which becomes

\[\lambda\sin(kl\pi/(n+1)) = -\sin((k-1)l\pi/(n+1))/2 + \sin(kl\pi/(n+1)) - \sin((k+1)l\pi/(n+1))/2,\]

and you can use trigonometric formulae, or write

\[\begin{split}\lambda\sin(kl\pi/(n+1)) & = \sin(kl\pi/(n+1)) - \Im\left(\exp(i(k-1)l\pi/(n+1)) + \exp(i(k+1)l\pi/(n+1))\right)/2 \\ & = \sin(kl\pi/(n+1)) - \Im\left(\left( \exp(-il\pi/(n+1)) + \exp(il\pi/(n+1))\right)\exp(ikl\pi/(n+1))\right)/2 \\ & = \sin(kl\pi/(n+1)) - \Im\left(\cos(l\pi/(n+1))\exp(ikl\pi/(n+1))\right) \\ & = \sin(kl\pi/(n+1))\left(1 - \cos(l\pi/(n+1))\right),\end{split}\]

and we conclude that the eigenvalues are \(\lambda_l=1-\cos(l\pi/(n+1))\), \(0<l<n+1\). The maximum eigenvalue corresponds to \(l=n\), with \(\lambda=1-\cos(n\pi/(n+1))=1+\cos(\pi/(n+1))\).

The condition \(\omega<2/\lambda\) thus requires that

\[\omega < \frac{2}{1+\cos(\pi/(n+1))}.\]
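A quick numerical check of these eigenvalues, comparing numpy's computed spectrum of \(D^{-1}A=A/2\) with the formula \(1-\cos(l\pi/(n+1))\); the size \(n=10\) is an arbitrary choice.

```python
import numpy as np

n = 10
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # tridiag(-1, 2, -1)
K = A / 2.0                                            # D^{-1} A, since D = 2I

computed = np.sort(np.linalg.eigvalsh(K))
l = np.arange(1, n + 1)
predicted = np.sort(1 - np.cos(l * np.pi / (n + 1)))
print(np.max(np.abs(computed - predicted)))            # agrees to rounding error
```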
Exercise 7.26

Find the value of \(\omega\) for scaled Jacobi such that the convergence rate is maximised, i.e. so that \(\rho(C)\) is minimised. What happens to this rate as \(n\to \infty\)?

7.9. Chebyshev acceleration (examinable in 2024/25)

Say we have computed iterates \({x}^0,{x}^1,\ldots,{x}^k\) using

\[M{x}^{k+1} = -N{x}^k + {b}.\]

If the method is convergent, then these iterates are homing in on the solution. Can we use extrapolation through these iterates to obtain a better guess for the solution?

\[\mbox{Find}\,c_{jk},\,j=0,\ldots,k,\,\mbox{with}\, {y}^k = \sum_{j=0}^kc_{jk}{x}^j,\]

with \({y}^k\) the best possible approximation to \({x}^*\).

The usual iterative method has \(c_{kk}=1\), and \(c_{jk}=0\) for \(j<k\). If \({x}^i={x}^*\), \(i=0,1,\ldots,k\) then

\[{y}^k = \sum_{j=0}^kc_{jk}{x}^*={x}^*\sum_{j=0}^kc_{jk},\]
so we need \(\sum_{j=0}^kc_{jk}=1\). Subject to this constraint, we seek to minimise \({y}^k-{x}^* = \sum_{j=0}^kc_{jk}({x}^j-{x}^*)\).

We can interpret this in terms of matrix polynomials by writing

\[\begin{split}{x}^*-{y}^k &= \sum_{j=0}^kc_{jk}({x}^*-{x}^j), \\ & = \sum_{j=0}^kc_{jk}\left(-M^{-1}N\right)^j{e}^0, \\ & = p_k\left(-M^{-1}N\right){e}^0,\end{split}\]

where

\[p_k(X) = c_{0k} + c_{1k}X + c_{2k}X^2 + \ldots + c_{kk}X^k,\]

with \(p_k(1)=1\) (from our condition \(\sum_{j=0}^kc_{jk}=1)\).

We want to try to minimise \({y}^k-{x}^*\) by choosing \(c_{0k}\), \(c_{1k}\), \(\ldots\), \(c_{kk}\) so that the eigenvalues of \(p_k(C)\) are as small as possible. If \(\lambda\) is an eigenvalue of \(C=-M^{-1}N\), then \(p_k(\lambda)\) is an eigenvalue of \(p_k(C)\). It is not practical to know all the eigenvalues of a large matrix, so we will develop methods that work if we know that all eigenvalues of \(C\) are real, and satisfy \(-1<\alpha<\lambda<\beta<1\), for some constants \(\alpha\) and \(\beta\) (we know that \(|\lambda|<1\), otherwise the basic method is not convergent).

If all eigenvalues of \(C\) are real, and satisfy \(-1<\alpha<\lambda<\beta<1\), then we try to make \(\rho_{\max} = \max_{\alpha\leq t\leq\beta}|p_k(t)|\) as small as possible. Then, if \(\lambda\) is an eigenvalue of \(C\), then the corresponding eigenvalue of \(p_k(C)\) will satisfy \(|\lambda_{p_k}| = |p_k(\lambda)| \leq \rho_{\max}\). We have reduced the problem to trying to find polynomials \(p(t)\) that have the smallest absolute value in a given range, subject to \(p(1)=1\). The solution to this problem is known: Chebyshev polynomials.

Definition 7.27 (Chebyshev polynomials)

The Chebyshev polynomial of degree \(k\), \(T_k(t)\), is defined by the recurrence

\[T_0(t) = 1, \, T_1(t)=t, \, T_k(t)=2tT_{k-1}(t)-T_{k-2}(t).\]

For example: \(T_2(t) = 2tT_1(t)-T_0(t) = 2t^2-1\).

If we search for the \(k\)-th degree polynomial \(p_k(t)\) that minimises

\[\max_{-1\leq t\leq 1}|p_k(t)|\]

subject to the constraint that the coefficient of \(t^k\) is \(2^{k-1}\) then we get the \(k\)-th order Chebyshev polynomial \(T_k(t)\). The maximum value is \(1\).

This is not quite what we want, so we change variables, to get

\[T_k\left(\frac{2t-\beta-\alpha}{\beta-\alpha}\right)\quad\mbox{minimises} \quad \max_{\alpha\leq t\leq \beta}|p_k(t)|\]

subject to the constraint that the coefficient of \(t^k\) is \(2^{2k-1}/(\beta-\alpha)^k\). The maximum value is \(1\).

Then we scale the polynomial to reach the condition \(p_k(1)=1\).

\[p_k(t)=\frac{T_k\left(\frac{2t-\beta-\alpha}{\beta-\alpha}\right)} {T_k\left(\frac{2-\beta-\alpha}{\beta-\alpha}\right)} \quad\mbox{minimises} \quad \max_{\alpha\leq t\leq \beta}|p_k(t)|\]

subject to the constraint that \(p_k(1)=1\). The maximum value is

\[\frac{1}{T_k\left(\frac{2-\beta-\alpha}{\beta-\alpha}\right)}.\]
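A small sketch evaluating this worst-case reduction factor via the three-term recurrence; the bounds \(\alpha\) and \(\beta\) below are arbitrary illustrative values, and the helper `cheb` is a hypothetical name.

```python
def cheb(k, t):
    """Evaluate T_k(t) using T_k = 2 t T_{k-1} - T_{k-2}, T_0 = 1, T_1 = t."""
    if k == 0:
        return 1.0
    T_prev, T = 1.0, t
    for _ in range(k - 1):
        T_prev, T = T, 2 * t * T - T_prev
    return T

alpha, beta = -0.9, 0.9                  # assumed bounds on the eigenvalues of C
s = (2 - beta - alpha) / (beta - alpha)

for k in [1, 2, 5, 10, 20]:
    print(k, 1.0 / cheb(k, s))           # bound on max |p_k(t)| over [alpha, beta]
```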

Say we have computed iterates \({x}^0,{x}^1,\ldots,{x}^k\) using

\[M{x}^{k+1} = -N{x}^k + {b}.\]

Write

\[p_k(t)=\frac{T_k\left(\frac{2t-\beta-\alpha}{\beta-\alpha}\right)} {T_k\left(\frac{2-\beta-\alpha}{\beta-\alpha}\right)}\]

in the form

\[p_k(t) = c_{0k} + c_{1k}t + c_{2k}t^2 + \ldots + c_{kk}t^k,\]

then

\[{y}^k = \sum_{j=0}^kc_{jk}{x}^j.\]

There appears to be a practical problem: we need to store \({x}^0\), \({x}^1\), \(\ldots\), \({x}^k\) in order to calculate \({y}^k\). However, we can get a formula for \({y}^k\) in terms of \({y}^{k-1}\) and \({y}^{k-2}\) by using

\[T_k(t) = 2tT_{k-1}(t)-T_{k-2}(t).\]

We get

\[p_k(t) = 2\frac{2t-\beta-\alpha}{\beta-\alpha} \frac{T_{k-1}(s)}{T_k(s)}p_{k-1}(t) - \frac{T_{k-2}(s)}{T_k(s)}p_{k-2}(t),\]

where \(s=\frac{2-\beta-\alpha}{\beta-\alpha}\).

After some manipulations we obtain

\[{y}^k = \omega_k\left({y}^{k-1}-{y}^{k-2}+\gamma{z}^{k-1} \right)+{y}^{k-2},\]

where

\[\gamma=\frac{2}{2-\beta-\alpha}, \quad M{z}^{k-1}={b}-A{y}^{k-1}.\]

with starting formulas

\[\begin{split}{y}^0 & = {x}^0 \\ {y}^1 & = {x}^0 + \gamma M^{-1}({b}-A{x}^0).\end{split}\]

Also,

\[\omega_k = \frac{1}{1-\omega_{k-1}/(4s^2)}, \, \omega_1=2.\]

(See Golub and Van Loan for details).

Chebyshev acceleration can dramatically improve a preconditioner provided that the preconditioned operator is positive definite and upper and lower bounds on its eigenvalues are known.