2. Linear algebra preliminaries

In this preliminary section, we revise a few key linear algebra concepts that will be used in the rest of the course, emphasising the column space of matrices. We will quote some standard results that should be found in an undergraduate linear algebra course.

Hint

Before you attempt any exercises, you need to make sure that you have everything you need set up on your computer. See the checklist in the previous section.

2.1. Matrices, vectors and matrix-vector multiplication

Supplementary video

We will consider the multiplication of a vector

\[\begin{split}x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \\ \end{pmatrix}, \quad x_i \in \mathbb{C}, \, i=1,2,\ldots,n, \mbox{ i.e. } x \in \mathbb{C}^n,\end{split}\]

by a matrix

\[\begin{split}A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \\ \end{pmatrix},\end{split}\]

i.e. \(A\in \mathbb{C}^{m\times n}\). \(A\) has \(m\) rows and \(n\) columns so that the product

\[b = Ax\]

produces \(b \in \mathbb{C}^m\), defined by

(2.1)\[b_i = \sum_{j=1}^n a_{ij}x_j, \, i=1,2,\ldots,m.\]

In this course it is important to consider the general case where \(m \neq n\), which has many applications in data analysis, curve fitting etc. We will usually state generalities in this course for vectors over the field \(\mathbb{C}\), noting where things specialise to \(\mathbb{R}\).

Supplementary video

We can quickly check that the map \(x \to Ax\) given by matrix multiplication is a linear map from \(\mathbb{C}^n \to \mathbb{C}^m\), since it is straightforward to check from the definition that

\[A(\alpha x + y) = \alpha Ax + Ay,\]

for all \(x,y \in \mathbb{C}^n\) and \(\alpha\in \mathbb{C}\). (Exercise: show this for yourself.)

Supplementary video

It is very useful to interpret matrix-vector multiplication as a linear combination of the columns of \(A\) with coefficients taken from the entries of \(x\). If we write \(A\) in terms of the columns,

\[\begin{split}A = \begin{pmatrix} a_1 & a_2 & \ldots & a_n \\ \end{pmatrix},\end{split}\]

where

\[a_i \in \mathbb{C}^m, \, i=1,2,\ldots,n,\]

then

\[b = \sum_{j=1}^n x_j a_j,\]

i.e. a linear combination of the columns of \(A\) as described above.

Supplementary video

We can extend this idea to matrix-matrix multiplication. Taking \(A\in \mathbb{C}^{m\times l}\), \(C\in \mathbb{C}^{l\times n}\), \(B\in \mathbb{C}^{m\times n}\), with \(B=AC\), then the components of \(B\) are given by

\[b_{ij} = \sum_{k=1}^l a_{ik}c_{kj}, \quad 1\leq i \leq m, \, 1\leq j \leq n.\]

Writing \(b_j \in \mathbb{C}^m\) as the jth column of \(B\), for \(1\leq j \leq n\), and \(c_j\) as the jth column of \(C\), we see that

\[b_j = Ac_j.\]

This means that the jth column of \(B\) is the matrix-vector product of \(A\) with the jth column of \(C\). This kind of “column thinking” is very useful in understanding computational linear algebra algorithms.

Supplementary video

An important example is the outer product of two vectors, \(u \in \mathbb{C}^m\) and \(v \in \mathbb{C}^n\). Here it is useful to see these vectors as matrices with one column, i.e. \(u \in \mathbb{C}^{m\times 1}\) and \(v \in \mathbb{C}^{n\times 1}\). The outer product is \(u v^T \in \mathbb{C}^{m\times n}\). The columns of \(v^T\) are just single numbers (i.e. vectors of length 1), so viewing this as a matrix multiplication we see

\[uv^T = \begin{pmatrix} uv_1 & uv_2 & \ldots & uv_n \end{pmatrix},\]

which means that all the columns of \(uv^T\) are multiples of \(u\). We will see in the next section that this matrix has rank 1. In the complex number case, the transpose \(^T\) is replaced by the adjoint \(^*\) which is the complex conjugate of the transpose. There will be more about this later.

2.1.1. Your first programming exercises

In this course, there will be programming exercises, the first one of which is coming up right now. The aim of these programming exercises is to gain understanding of the mathematical algorithms by expressing them as code. The numpy Python package has a module called numpy.linalg that contains many of these algorithms. Hence for this course we will not use this module, just use the functions and classes available when you import numpy itself. There is one exception, which is that numpy.linalg.norm() is quite useful, but also covers a lot of different cases which are not very edifying to replicate. Hence, we have included numpy.linalg.norm() in the cla_utils package as cla_utils.norm(), should you wish to use it.

Exercise

The cla_utils.exercises1.basic_matvec() function has been left unimplemented. To finish the function, add code so that it computes the matrix-vector product \(b=Ax\) from inputs \(A\) and \(x\). In this first implementation, you should simply implement (2.1) with a double nested for loop (one for the sum over \(j\), and one for the \(i\) elements of \(b\)). Run this script to test your code (and all the exercises from this exercise set):

py.test test/test_exercises1.py

from the Bash Terminal. Make sure you commit your modifications and push them to your course repository.

Hint

Don’t forget to activate the virtual environment before running the tests to make sure that you have access to all the necessary packages

Hint

The fast array features of Python are provided by Numpy for which there is a helpful tutorial. There is also a handy guide for Matlab users. In that context, the code provided in this course will always use Numpy arrays, and never Numpy matrices.

Exercise

The cla_utils.exercises1.column_matvec() function has been left unimplemented. To finish the function, add code so that it computes the matrix-vector product \(b=Ax\) from inputs \(A\) and \(x\). This second implementation should use the column-space formulation of matrix-vector multiplication, i.e., \(b\) is a weighted sum of the columns of \(A\) with coefficients given by the entries in \(x\). This should be implemented with a single for loop over the entries of \(x\). The test script test_exercises1.py will also test this function.

Hint

It will be useful to use the Python “slice” notation, for example:

A[:, 3]

will return the 4th (since Python numbers from zero) column of \(A\). For more information, see the Numpy documentation on slicing.

Exercise

The cla_utils.exercises1.time_matvecs() function computes the execution time for these two implementations for some example matrices and compares them with the built-in Numpy matrix-vector product. Run this function and examine the output. You should observe that the basic implementation is much slower than the built-in implementation. This is because built-in Numpy operations use compiled C code that is wrapped in Python, which avoids the overheads of run-time interpretation of the Python code and manipulation of Python objects. Numpy is really useful for computational linear algebra programming because it preserves the readability and flexibility of Python (writing code that looks much more like maths, access to object-oriented programming models) whilst giving near-C speed if used appropriately. You can read more about the advantages of using Numpy here. You should also observe that the column implementation is somewhere between the speed of the basic implementation and the built-in implementation. This is because (if you did it correctly), each iteration of the for loop involves adding an entire array (a scaling of one of the columns of \(A\)) to another array (where \(b\) is being calculated). This will also use compiled C code through Numpy, removing some (but not all) of the Python overheads in the basic implementation.

In this course, we will present algorithms in the notes that generally do not express the way that Numpy should be used to implement them. In these exercises you should consider the best way to make use of Numpy built-in operations (which will often make the code more maths-like and readable, as well as potentially faster).

2.2. Range, nullspace and rank

Supplementary video

In this section we’ll quickly rattle through some definitions and results.

Definition 2.1 (Range)

The range of \(A\), \(\mbox{range}(A)\), is the set of vectors that can be expressed as \(Ax\) for some \(x\).

The next theorem follows as a result of the column space interpretation of matrix-vector multiplication.

Theorem

\(\mbox{range}(A)\) is the vector space spanned by the columns of \(A\).

Definition 2.2 (Nullspace)

The nullspace \(\mbox{null}(A)\) of \(A\) (or kernel) is the set of vectors \(x\) satisfying \(Ax=0\), i.e.

\[\mbox{null}(A) = \{x \in \mathbb{C}^n: Ax=0\}.\]
Supplementary video
Definition 2.3 (Rank)

The column rank \(\mbox{rank}(A)\) of \(A\) is the dimension of the column space of \(A\). The row rank \(\mbox{rank}(A)\) of \(A\) is the dimension of the row space of \(A\). It can be shown that the column rank and row rank of a matrix are equal, so we shall just refer to the rank.

If

\[\begin{split}A = \begin{pmatrix} a_1 & a_2 & \ldots & a_n \\ \end{pmatrix},\end{split}\]

the column space of \(A\) is \(\mbox{span}(a_1,a_2,\ldots,a_n)\).

Definition

An \(m\times n\) matrix \(A\) is full rank if it has maximum possible rank i.e. rank equal to \(\min(m, n)\).

If \(m\geq n\) then \(A\) must have \(n\) linearly independent columns to be full rank. The next theorem is then a consequence of the column space interpretation of matrix-vector multiplication.

Theorem

An \(m\times n\) matrix \(A\) is full rank if and only if it maps no two distinct vectors to the same vector.

Definition

A matrix \(A\) is called nonsingular, or invertible, if it is a square matrix (\(m=n\)) of full rank.

Exercise

The cla_utils.exercises1.rank2() function has been left unimplemented. To finish the function, add code so that it computes the rank-2 matrix \(A = u_1v_1^* + u_2v_2^*\) from \(u_1,u_2\in \mathbb{C}^m\) and \(v_1,v_2 \in \mathbb{C}^n\). As you can see, the function needs to implement this rank-2 matrix by first forming two matrices \(B\) and \(C\) from the inputs, and then forming \(A\) as the product of \(B\) and \(C\). The test script test_exercises1.py in the test directory will also test this function.

To measure the rank of \(A\), we can use the built-in rank function:

r = numpy.linalg.matrix_rank(A)

and we should find that the rank is equal to 2. Can you explain why this should be the case (use the column space interpretation of matrix-matrix multiplication)?

2.3. Invertibility and inverses

Supplementary video

This means that an invertible matrix has columns that form a basis for \(\mathbb{C}^m\). Given the canonical basis vectors defined by

\[\begin{split}e_j = \begin{pmatrix} 0 \\ \ldots \\ 0 \\ 1 \\ 0 \\ \ldots \\ 0 \\ \end{pmatrix},\end{split}\]

i.e. \(e_j\) has all entries zero except for the jth entry which is 1, we can write

\[e_j = \sum_{k=1}^m z_{jk} a_k, \quad 1\leq j \leq m.\]

In other words,

\[ \begin{align}\begin{aligned}I = \begin{pmatrix} e_1 & e_2 & \ldots & e_m \end{pmatrix}\\= ZA.\end{aligned}\end{align} \]
We call \(Z\) a (left) inverse of \(A\). It can be shown that \(Z\) is the

unique left inverse of \(A\), and that \(Z\) is also the unique right inverse of \(A\), satisfying \(I = AZ\). We write \(Z=A^{-1}\).

The first four parts of the next theorem are a consequence of what we have so far, and we shall quote the fifth and sixth (see a linear algebra course).

Theorem

Let \(A \in \mathbb{C}^{m\times m}\). Then the following are equivalent.

  1. \(A\) has an inverse.

  2. \(\mbox{rank}(A)=m\).

  3. \(\mbox{range}(A)=\mathbb{C}^m\).

  4. \(\mbox{null}(A)=\{0\}\).

  5. 0 is not an eigenvalue of \(A\).

  6. The determinant \(\det(A)\neq 0\).

Supplementary video

Finding the inverse of a matrix can be seen as a change of basis. Considering the equation \(Ax= b\), we have \(x = A^{-1}b\) for invertible \(A\). We have seen already that \(b\) can be written as

\[b = \sum_{j=1}^m x_j a_j.\]

Since the columns of \(A\) span \(\mathbb{C}^m\), the entries of \(x\) thus provide the unique expansion of \(b\) in the columns of \(A\) which form a basis. Hence, whilst the entries of \(b\) give basis coefficients for \(b\) in the canonical basis \((e_1,e_2,\ldots,e_m)\), the entries of \(x\) give basis coefficients for \(b\) in the basis given by the columns of \(A\).

Exercise

For matrices of the form, \(A = I + uv^*\), where \(I\) is the \(m\times m\) identity matrix, and \(u,v \in \mathbb{C}^m\), show that whenever \(A\) is invertible, the inverse is of the form \(A^{-1} = I + \alpha uv^*\) where \(\alpha \in \mathbb{C}\), and calculate the form of \(\alpha\).

The cla_utils.exercises1.rank1pert_inv() function has been left unimplemented. To finish the function, add code so that it computes \(A^{-1}\) using your formula (and not any built-in matrix inversion routines). The test script test_exercises1.py in the test directory will also test this function.

Add a function to cla_utils.exercises1 that measures the time to compute the inverse of \(A\) for an input matrix of size 400, and compare with the time to compute the inverse of \(A\) using the built-in inverse:

numpy.linalg.inv(A)

What do you observe? Why do you think this is? We will examine the cost of general purpose matrix inversion algorithms later.

2.4. Adjoints and Hermitian matrices

Supplementary video
Definition 2.4 (Adjoint)

The adjoint (or Hermitian conjugate) of \(A\in \mathbb{C}^{m\times n}\) is a matrix \(A^* \in \mathbb{C}^{n\times m}\) (sometimes written \(A^\dagger\) or \(A'\)), with

\[a^*_{ij} = \bar{a_{ji}},\]

where the bar denotes the complex conjugate of a complex number. If \(A^* = A\) then we say that \(A\) is Hermitian.

For real matrices, \(A^*=A^T\). If \(A=A^T\), then we say that the matrix is symmetric.

The following identity is very important when dealing with adjoints.

Theorem

For matrices \(A\), \(B\) with compatible dimensions (so that they can be multiplied),

\[(AB)^* = B^*A^*.\]
Exercise

(This is an advanced exercise if the other exercises are complete. If you are behind on the exercises please skip this one.)

Consider a matrix \(A=B + iC\) where \(B,C\in\mathbb{R}^{m\times m}\) and \(A\) is Hermitian. Show that \(B=B^T\) and \(C=-C^T\). To save memory, instead of storing values of \(A\) (\(m\times m\) complex numbers to store), consider equivalently storing a real-valued \(m\times m\) array \(\hat{A}\) with \(\hat{A}_{ij}=B_{ij}\) for \(i\geq j\) and \(\hat{A}_{ij}=C_{ij}\) for \(i<j\).

The cla_utils.exercises1.ABiC() function has been left unimplemented. It should implement matrix vector multiplication \(z=Ax\), returning the real and imaginary parts of \(z\), given the real and imaginary parts of \(x\) as inputs, and given the real array \(\hat{A}\) as above. You should implement the multiplication using real arithmetic only, with just one loop over the entries of \(x\), using the column space interpretation of matrix-vector multiplication. The test script test_exercises1.py in the test directory will also test this function.

Hint

You can use the Python “slice” notation, to assign into a slice of an array, for example:

x[3:5] = y[3:5]

will copy the 4th and 5th entries of \(y\) (Python numbers from zero, and the upper limit of the slice is the first index value not to use. For more information, see the Numpy documentation on slicing.

2.5. Inner products and orthogonality

Supplementary video

The inner product is a critical tool in computational linear algebra.

Definition 2.5 (Inner product)

Let \(x,y\in \mathbb{C}^m\). Then the inner product of \(x\) and \(y\) is

\[x^*y = \sum_{i=1}^m \bar{x}_iy_i.\]

We will frequently use the natural norm derived from the inner product to define size of vectors.

Definition 2.6 (2-Norm)

Let \(x\in \mathbb{C}^m\). Then the 2-norm of \(x\) is

\[\|x\| = \sqrt{\sum_{i=1}^m |x_i|^2} = \sqrt{x^*x}.\]

Orthogonality will emerge as an early key concept in this course.

Definition 2.7 (Orthogonal vectors)

Let \(x,y\in \mathbb{C}^m\). The two vectors are orthogonal if \(x^*y=0\).

Similarly, let \(X\), \(Y\) be two sets of vectors. The two sets are orthogonal if

\[x^*y = 0,\quad \forall x\in X, \, y\in Y.\]

A set \(S\) of vectors is itself orthogonal if

\[x^*y = 0,\quad\forall x,y \in S.\]

We say that \(S\) is orthonormal if we also have \(\|x\|=1\) for all \(x\in S\).

2.6. Orthogonal components of a vector

Supplementary video

Let \(S=\{q_1,q_2,\ldots,q_n\}\) be an orthonormal set of vectors in \(\mathbb{C}^m\), and take another arbitrary vector \(v\in \mathbb{C}^m\). Now take

\[r = v - (q_1^*v)q_1 - (q_2^*v)q_2 - \ldots - (q_n^*v)q_n.\]

Then, we can check that \(r\) is orthogonal to \(S\), by calculating for each \(1\leq i \leq n\),

\[ \begin{align}\begin{aligned}q^*_ir = q_i^*v - (q_1^*v)(q_i^*q_1) - \ldots - (q_n^*v)(q_i^*q_n)\\= q_i^*v - q_i^*v = 0,\end{aligned}\end{align} \]

since \(q_i^*q_j=0\) if \(i\neq j\), and 1 if \(i=j\). Thus,

\[v = r + \sum_{i=1}^n (q_i^*v)q_i = r + \sum_{i=1}^n \underbrace{(q_i q_i^*)}_{\mbox{rank-1 matrix}}v.\]

If \(S\) is a basis for \(\mathbb{C}^m\), then \(n=m\) and \(r=0\), and we have

\[v = \sum_{i=1}^m (q_i q_i^*)v.\]
Exercise

The cla_utils.exercises2.orthog_cpts() function has been left unimplemented. It should implement the above computation, returning \(r\) and the coefficients of the component of \(v\) in each orthonormal direction. The test script test_exercises2.py in the test directory will test this function.

2.7. Unitary matrices

Supplementary video
Definition 2.8 (Unitary matrices)

A matrix \(Q\in \mathbb{C}^{m\times m}\) is unitary if \(Q^* =Q^{-1}\).

For real matrices, a matrix \(Q\) is orthogonal if \(Q^T=Q^{-1}\).

Theorem

The columns of a unitary matrix \(Q\) are orthonormal.

Proof

We have \(I = Q^*Q\). Then using the column space interpretation of matrix-matrix multiplication,

\[e_j = Q^*q_j,\]

where \(q_j\) is the jth column of \(Q\). Taking row i of \(e_j\), we have

\[\begin{split}\delta_{ij} = q_i^*q_j, \mbox{ where } \delta_{ij} = \left\{ \begin{array}{ccc} 1 & \mbox{if} & i=j, \\ 0 & \mbox{otherwise} & \\ \end{array}\right. .\end{split}\]

Extending a theme from earlier, we can interpret \(Q^*=Q^{-1}\) as representing a change of orthogonal basis. If \(Qx = b\), then \(x=Q^*b\) contains the coefficients of \(b\) expanded in the basis given by the orthonormal columns of \(Q\).

Exercise

The cla_utils.exercises2.solveQ() function has been left unimplemented. Given a square unitary matrix \(Q\) and a vector \(b\) it should solve \(Qx=b\) using information above (it is not expected to work when \(Q\) is not unitary or square). The test script test_exercises2.py in the test directory will test this function.

Add a function to cla_utils.exercises2 that measures the time to solve \(Qx=b\) using solveQ for an input matrix of sizes 100, 200, 400, and compare with the times to solve the equation using the general purpose solve (which uses LU factorisation, which we will discuss later):

x = numpy.linalg.solve(Q, b)

What did you expect and was it observed?

A quick way to get an orthogonal matrix is to take a general matrix $A$ and find the QR factorisation, which we will cover in the next section.

Q, R = numpy.linalg.qr(A)

returns two matrices, of which \(Q\) is orthogonal.

2.8. Vector norms

Supplementary video

Various vector norms are useful to measure the size of a vector. In computational linear algebra we need them for quantifying errors etc.

Definition 2.9 (Norms)

A norm is a function \(\|\cdot\|:\mathbb{C}^m \to \mathbb{R}\), such that

  1. \(\|x\|\geq 0\), and \(\|x\|=0\implies x =0.\)

  2. \(\|x+y\| \leq \|x\| + \|y\|\) (triangle inequality).

  3. \(\|\alpha x\| = |\alpha|\|x\|\) for all \(x \in \mathbb{C}^m\) and \(\alpha \in \mathbb{C}\).

We have already seen the 2-norm, or Euclidean norm, which is part of a larger class of norms called p-norms, with

\[\|x\|_p = \left(\sum_{i=1}^m |x_i|^p\right)^{1/p}, \quad\]

for real \(p>0\). We will also consider weighted norms

\[\|x\|_{W,p} = \|Wx \|_p,\]

where \(W\) is a matrix.

2.9. Projectors and projections

Supplementary video
Definition 2.10 (Projector)

A projector \(P\) is a square matrix that satisfies \(P^2=P\).

If \(v \in \mbox{range}(P)\), then there exists \(x\) such that \(Px = v\). Then,

\[Pv = P(Px) = P^2x = Px = v,\]

and hence multiplying by \(P\) does not change \(v\).

Now suppose that \(Pv \neq v\) (so that \(v\notin \mbox{range}(P)\)). Then,

\[P(Pv - v) = P^2v - Pv = Pv - Pv = 0,\]

which means that \(Pv-v\) is the nullspace of \(P\). We have

\[Pv -v = -(I-P)v.\]
Definition 2.11 (Complementary projector)

Let \(P\) be a projector. Then we call \(I-P\) the complementary projector.

To see that \(I-P\) is also a projector, we just calculate,

\[(I-P)^2 = I^2 - 2P + P^2 = I - 2P + P = I - P.\]

If \(Pu=0\), then \((I-P)u = u\).

In other words, the nullspace of \(P\) is contained in the range of \(I-P\).

On the other hand, if \(v\) is in the range of \(I-P\), then there exists some \(w\) such that

\[v = (I-P)w = w - Pw.\]

We have

\[Pv = P(w-Pw) = Pw - P^2w = Pw - Pw = 0.\]

Hence, the range of \(I-P\) is contained in the nullspace of \(P\). Combining these two results we see that the range of \(I-P\) is equal to the nullspace of \(P\). Since \(P\) is the complementary projector to \(I-P\), we can repeat the same argument to show that the range of \(P\) is equal to the nullspace of \(I-P\).

We see that a projector \(P\) separates \(\mathbb{C}^m\) into two subspaces, the nullspace of \(P\) and the range of \(P\). In fact the converse is also true: given two subspaces \(S_1\) and \(S_2\) of \(\mathbb{C}^m\) with \(S_1 \cap S_2 = \{0\}\), then there exists a projector \(P\) whose range is \(S_1\) and whose nullspace is \(S_2\).

Supplementary video

Now we introduce orthogonality into the concept of projectors.

Definition 2.12 (Orthogonal projector)

\(P\) is an orthogonal projector if

\[(Pv)^*(Pv-v) = 0, \, \forall v \in \mathbb{C}^m.\]

In this case, \(P\) separates the space into two orthogonal subspaces.

2.10. Constructing orthogonal projectors from sets of orthonormal vectors

Let \(\{q_1,\ldots,q_n\}\) be an orthonormal set of vectors in \(\mathbb{C}^m\). We write

\[\begin{split}\hat{Q} = \begin{pmatrix} q_1 & q_2 & \ldots & q_n \\ \end{pmatrix}.\end{split}\]

Previously we showed that for any \(v\in \mathbb{C}^m\), we have

\[v = \underbrace{r}_{\mbox{Orthogonal to column space of }\hat{Q}} + \underbrace{\sum_{i=1}^n (q_iq^*_i)v}_{\mbox{in the column space of }\hat{Q}}.\]

Hence, the map

\[v \mapsto Pv = \underbrace{\sum_{i=1}^n (q_iq^*_i)}_{=P}v,\]

is an orthogonal projector. In fact, \(P\) has very simple form.

Theorem

The orthogonal projector \(P\) takes the form

\[P = \hat{Q}\hat{Q}^*.\]
Proof

From the change of basis interpretation of multiplication by \(\hat{Q}^*\), the entries in \(\hat{Q}^*v\) gives coefficients of the projection of \(v\) onto the column space of \(\hat{Q}\) when expanded using the columns as a basis. Then, multiplication by \(\hat{Q}\) gives the projection of \(v\) expanded again in the canonical basis. Hence, multiplication by \(\hat{Q}\hat{Q}^*\) gives exactly the same result as multiplication by the formula for \(P\) above.

This means that \(\hat{Q}\hat{Q}^*\) is an orthogonal projection onto the range of \(\hat{Q}\). The complementary projector is \(P_{\perp} = I - \hat{Q}\hat{Q}^*\) is an orthogonal projection onto the nullspace of \(\hat{Q}\).

An important special case is when \(\hat{Q}\) has just one column, and then

\[P = q_1q_1^*, \, P_{\perp}=I - q_1q_1^*.\]

We notice that \(P^* = (\hat{Q}\hat{Q}^*) = \hat{Q}\hat{Q}^* = P\). In fact the following is true.

Theorem

\(P=P^*\) if and only if \(\hat{Q}\) is orthogonal.

Exercise

The cla_utils.exercises2.orthog_proj() function has been left unimplemented. Given an orthonormal set \(q_1,q_2,\ldots,q_n\), it should provide the orthogonal projector \(P\). The test script test_exercises2.py in the test directory will also test this function.