Solutions of Linear Mapping

Purdue University

EAS 657

Geophysical Inverse Theory

Robert L. Nowack

Lecture 4a


Consider the matrix equation

Lv = w

where given w, we want to find v.

The possibilities include

1) A unique solution exists to Lv = w

2) A solution exists, but it is not unique (i.e., many v satisfy Lv = w)

3) No solution exists and we want the best approximation (Lv ≈ w).

We will find that the solution separates into a part in the nullspace of L and a part orthogonal to the nullspace.

Let V = N(L) + N^perp(L) with the solution to Lv = w as

v = v₁ + v₂

where

v₁ ∈ N(L)

v₂ ∈ N^perp(L)

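Claim: v₂ is always unique. To show this, assume two solutions v = v₁ + v₂ and v' = v₁' + v₂' of Lv = w. Since Lv₁ = Lv₁' = 0, then

L(v₂ - v₂') = w - w = 0, so v₂ - v₂' ∈ N(L)

but we’ve assumed

v₂ - v₂' ∈ N^perp(L)

Since N(L) ∩ N^perp(L) = {0}, it follows that v₂ = v₂'.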

Define the minimum length solution v as the solution where || v₁ || = 0.
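As a small numerical illustration of the claim, the sketch below takes two different solutions of Av = w and checks that they share the same component in N^perp(A). The matrix, the right-hand side, and the helper `split_solution` are hypothetical choices made for this example; the projector onto N^perp(A) is built from the pseudoinverse, which is an assumption of the sketch rather than a tool introduced so far in these notes.

```python
import numpy as np

def split_solution(A, v):
    """Split v into v1 in N(A) and v2 in N^perp(A) (hypothetical helper)."""
    # pinv(A) @ A is the orthogonal projector onto N^perp(A) = R(A^T).
    P_row = np.linalg.pinv(A) @ A
    v2 = P_row @ v
    v1 = v - v2
    return v1, v2

# Hypothetical 2 x 3 example with a one-dimensional nullspace.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
w = np.array([1.0, 2.0])

n = np.array([1.0, -1.0, 1.0])    # A @ n = 0, so n spans N(A)
v_a = np.array([1.0, 0.0, 2.0])   # one solution of A v = w
v_b = v_a + 3.0 * n               # a second, different solution

v1_a, v2_a = split_solution(A, v_a)
v1_b, v2_b = split_solution(A, v_b)

print("v2 from first solution :", v2_a)
print("v2 from second solution:", v2_b)
```

Both solutions give the same v₂; only the nullspace part v₁ differs between them.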

For case 3), no solution exists, but we want the best approximation to w that lies in the range of L. Thus, if Lv ≠ w for every v, then given w, minimize || Lv – w ||².

Define W = R(L) + R^perp(L) and w = w₁ + w₂, with w₁ ∈ R(L) and w₂ ∈ R^perp(L), where Lv = w₁. The resulting solution for case 3) can be either unique or nonunique.

Adjoint Transformation, L*

If L is a mapping from V to W, let L* be a mapping from W to V.

L: V → W

L*: W → V

Note that L* is not L^-1. Thus, even if L is not onto W, L* is still defined over W.

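Assume an inner product is defined in both V and W. Then the adjoint is defined such that

<Lv, w> = <v, L*w> for all v ∈ V and w ∈ W

where the inner product on the left is taken in W and the one on the right is taken in V.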

Ex) For real matrices, A* = A^T, the transpose.
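The transpose example can be checked numerically. The sketch below uses a hypothetical random real matrix and the standard dot product on both spaces, and compares the inner product of Ax with y in W against the inner product of x with A^T y in V.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical real matrix mapping V = R^5 to W = R^3.
A = rng.standard_normal((3, 5))
x = rng.standard_normal(5)   # arbitrary vector in V
y = rng.standard_normal(3)   # arbitrary vector in W

lhs = np.dot(A @ x, y)       # <Ax, y> evaluated in W
rhs = np.dot(x, A.T @ y)     # <x, A^T y> evaluated in V

print(lhs, rhs)              # the two values agree to rounding error
```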

Properties of L*

1) L* always exists

2) L* is unique

3) If L is linear, then L* is linear

4) For a composition L = L₁L₂,

then L* = L₂*L₁*

5) If L is invertible, then (L^-1)* = (L*)^-1

6) (L*)* = L

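Proof of 2) – Assume two different adjoints L₁* and L₂*, then

<Lx, y> = <x, L₁*y>

and

<Lx, y> = <x, L₂*y>

for all x, y. Subtracting gives <x, (L₁* - L₂*)y> = 0 for all x, y.

This can only occur if L₁* = L₂*.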

Adjoint Theorems

I) N(L) = R^perp(L*)

Also, applying I) to L* and taking orthogonal complements,

N(L*) = R^perp(L) and R(L*) = N^perp(L)

II) R(L) = R(LL*)

III) N(L) = N(L*L)

(Strang (1988) calls these adjoint theorems the Fundamental Theorem of Linear Algebra, Part II)
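For matrices with the standard dot product, the three theorems can be checked on a small example. The sketch below is only an illustration under stated assumptions: the rank-deficient matrix is hypothetical, and `null_space` is a helper written for this example that reads a nullspace basis off the SVD.

```python
import numpy as np

def null_space(M, tol=1e-10):
    """Orthonormal basis (columns) for N(M), taken from the SVD (hypothetical helper)."""
    _, s, Vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T

# Hypothetical rank-1 matrix mapping R^3 to R^2, so N(A) has dimension 2.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

# I) N(A) is perpendicular to R(A^T): every row of A is orthogonal to N(A).
N_A = null_space(A)
check_I = A @ N_A

# II) R(A) = R(A A^T): stacking the two sets of columns does not raise the rank.
rank_A = np.linalg.matrix_rank(A, tol=1e-10)
rank_AAt = np.linalg.matrix_rank(A @ A.T, tol=1e-10)
rank_both = np.linalg.matrix_rank(np.hstack([A, A @ A.T]), tol=1e-10)

# III) N(A) = N(A^T A): compare the orthogonal projectors onto both nullspaces.
N_AtA = null_space(A.T @ A)
P1 = N_A @ N_A.T
P2 = N_AtA @ N_AtA.T

print(np.abs(check_I).max(), rank_A, rank_AAt, rank_both, np.abs(P1 - P2).max())
```

Comparing projectors rather than basis vectors avoids the ambiguity that two orthonormal bases of the same subspace need not be identical.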

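Proof of I) – For v ∈ V, if Lv = 0, then

<Lv, w> = <v, L*w> = 0 for all w ∈ W

Thus, v is perpendicular to R(L*), or N(L) ⊆ R^perp(L*). Conversely, if <v, L*w> = 0 for all w, then <Lv, w> = 0 for all w, so Lv = 0 and v ∈ N(L). Hence N(L) = R^perp(L*).

We can then decompose

V = N(L) + R(L*) and W = R(L) + N(L*)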

Proof of III) – N(L) = N(L*L)

If we are given two transformations A and B, is N(A) = N(BA)? Not in general, since applying B can only enlarge the nullspace: if Av = 0 then BAv = 0, so N(A) is a subset of N(BA).

We want R(A) ∩ N(B) = {0} for N(A) = N(BA).

From I) we know that R(L) ∩ N(L*) = {0}, since N(L*) = R^perp(L). Thus, taking A = L and B = L*, N(L) = N(L*L).

Solutions of linear mappings Lv = w

1) Lv = w where L is one to one and onto. The matrix A representing L is square with Det A ≠ 0. The mapping is invertible and a unique solution occurs. For this case, N(L) = {0} and R^perp(L) = {0}.

2) Lv = w where L is not one to one. For this case, N(L) ≠ {0}, but R^perp(L) = {0}, a compatible system. The solution is nonunique and we want to find the minimum norm solution, which is unique.

3) Lv ≠ w. This is an incompatible system, and we want to find the best approximation to w in R(L). This may result in a unique or nonunique solution v. For the nonunique case, find the minimum norm solution.

Case 1) – Lv = w and L is invertible. L is 1 to 1 (N(L) = {0}) and R^perp(L) = {0} (the system is compatible).

Ex) L = A: R^N → R^M where M = N, then

Ax = y, and x = A^-1y

where

A^-1A = AA^-1 = I

and

A^-1 = Adjugate(A) / Det A

where the Adjugate(A) is the transpose of the cofactor matrix.
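The adjugate formula can be sketched directly in code. The 3 x 3 matrix below is a hypothetical example, and `adjugate` is a helper written for this illustration; in practice one would call a library solver, since cofactor expansion is expensive for anything but very small matrices.

```python
import numpy as np

def adjugate(A):
    """Transpose of the cofactor matrix of a square matrix A (hypothetical helper)."""
    n = A.shape[0]
    C = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            # Minor: delete row i and column j, then take the determinant.
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

# Hypothetical invertible matrix (its determinant is 3).
A = np.array([[2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

A_inv = adjugate(A) / np.linalg.det(A)

print(A_inv @ A)   # close to the identity matrix
```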

Ex)

Since Det A = -1 ≠ 0, the matrix A is an invertible square matrix, and A^-1 = Adjugate(A) / Det A = -Adjugate(A).

Case 2) – Lv = w where v is not unique. For this case, N(L) ≠ {0}, and we want to find the minimum norm solution.

We can write V as V = N(L) + N^perp(L) or v = v₁ + v₂ where v₂ is the unique part of the solution. Show this by picking two solutions with components v₂ and v₂' in N^perp(L), then

L(v₂ - v₂') = w - w = 0, so v₂ - v₂' ∈ N(L) ∩ N^perp(L) = {0}

which implies v₂ = v₂'. Since N(L) is perpendicular to N^perp(L), then || v ||² = || v₁ ||² + || v₂ ||². The minimum norm solution consists of picking || v₁ || = 0. Thus, || v ||² = || v₂ ||².

2a) Let L be represented by A: R^N → R^M where N(A) ≠ {0}, N > M.

We want a solution v₂ in N^perp(A).

Since we have assumed no data errors, this is a purely underdetermined problem. Since R^perp(A) = {0}, then Dim R(A) = M and R(A) = W. Since Dim N^perp(A) = Dim R(A), then Dim N^perp(A) = M and Dim N(A) = N - M.

In the case of a continuous Earth, N can be large and we may want to work in the smaller dimensional R^M space. This might include the case where V is a signal space.

The solution v of Lv = w can be decomposed into

v = v₁ + v₂, with v₁ ∈ N(L) and v₂ ∈ N^perp(L)

We want the part of solution v₂ in N^perp(L). This is the minimum norm solution.

Now for any u ∈ W, let v₂ = L*u. Then v₂ ∈ R(L*) and thus v₂ ∈ N^perp(L), since R(L*) = N^perp(L) from Adjoint Theorem I. Then, Lv₂ = LL*u = w, which forms a compatible system (assuming w ∈ R(L)). This will always be the case for the purely underdetermined case.

Thus, if R^perp(L) = {0} = N(L*), then from Adjoint Theorem III applied to L*, N(LL*) = N(L*) = {0}, and the operator LL* is invertible. Thus, LL* can be represented by an invertible M x M matrix (assuming W = R^M) and

v₂ = L*(LL*)^-1w

is the minimum norm solution for the purely underdetermined case.
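The formula v₂ = L*(LL*)^-1w can be tried on a small matrix. The sketch below assumes a hypothetical 2 x 3 matrix with full row rank and the standard dot product, so that L* is the transpose. It solves the M x M system rather than forming the inverse explicitly, which is a choice made for this example.

```python
import numpy as np

# Hypothetical purely underdetermined system: M = 2 equations, N = 3 unknowns.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
w = np.array([1.0, 2.0])

# Minimum norm solution v2 = A^T (A A^T)^-1 w.
u = np.linalg.solve(A @ A.T, w)   # solve the small M x M system
v2 = A.T @ u                      # push back through the adjoint

# Any other solution differs by a nullspace vector and is longer.
n = np.array([1.0, -1.0, 1.0])    # spans N(A) for this matrix
v_other = v2 + 0.5 * n

print("v2          =", v2, " norm =", np.linalg.norm(v2))
print("other soln  =", v_other, " norm =", np.linalg.norm(v_other))
```

Because v₂ is orthogonal to N(A), adding any nullspace component can only increase the norm.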

If R^perp(L) = N(L*) = N(LL*) ≠ {0}, then LL* is not invertible, but we can still follow the prescription

LL*u = w (1)

v₂ = L*u (2)

for any w ∈ R(L).

If we can find any solution to (1) for u, then pushing it back through L* will give v₂.

Ex) For a square matrix with N(L*) ≠ {0}, then R^perp(L) = N(L*) = N(LL*) ≠ {0}. Thus, LL* is not invertible even for some square matrix cases.
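A minimal sketch of the two-step prescription for a singular square matrix follows. The matrix and right-hand side are hypothetical, with w chosen to lie in R(A), and a least squares solver is used only as one convenient way to obtain some solution of (1). Two different solutions of (1) are pushed back through the adjoint to show that they give the same v₂.

```python
import numpy as np

# Hypothetical singular square matrix; w is chosen to lie in R(A).
A = np.array([[1.0, 1.0],
              [1.0, 1.0]])
w = np.array([2.0, 2.0])

LLt = A @ A.T                      # LL*, which is singular here

# Step (1): find any u with LL* u = w (lstsq returns one such solution).
u, *_ = np.linalg.lstsq(LLt, w, rcond=1e-10)
u_alt = u + np.array([1.0, -1.0])  # another solution, since [1, -1] is in N(LL*)

# Step (2): push back through the adjoint.
v2 = A.T @ u
v2_alt = A.T @ u_alt

print(v2, v2_alt)                  # the same vector from both choices of u
```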

Case 3) – Lv ≠ w

Ex) x = b₁ and x = b₂, or in matrix form [1 1]^T x = [b₁ b₂]^T. In this case, there cannot be an x that gives unequal b₁ and b₂.

In this case, we want the best approximation to w in R(L). Thus, we want to minimize || w - Lv ||² = || e ||². Once we have found w₁ ∈ R(L) where || w – w₁ ||² is minimized, we can either have

a) solution v is unique and N(L) = {0}

b) solution v is non-unique. For this case, find the minimum norm solution

First, decompose w,

Lv ~ w = w₁ + w₂

where

w₁ ∈ R(L) and w₂ ∈ R^perp(L)

Define w₁ ∈ R(L) such that Lv = w₁. Then, Lv ~ w = w₁ + e where e = w₂. We want e perpendicular to R(L). Then,

L*Lv ~ L*w = L*(w₁) + L*(e)

If e ∈ R^perp(L) = N(L*), then L*(e) = 0 and L*Lv = L*w. If e = w₂, then from the projection theorem, e has the minimum length. This forms a consistent set of equations called the “normal equations” with e = w₂ ∈ R^perp(L).

3a) Let N(L) = {0} = N(L*L). This is the purely overdetermined case for M > N.

L*L is then invertible and v = (L*L)^-1 L*w. This is the best approximation solution.
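As an illustration of the normal equations in the purely overdetermined case, the sketch below fits a straight line to three hypothetical data points. It assumes the standard dot product, so that L* is the transpose, and checks that the residual e is perpendicular to R(A).

```python
import numpy as np

# Hypothetical overdetermined system: M = 3 observations, N = 2 unknowns
# (intercept and slope of a line sampled at t = 0, 1, 2).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
w = np.array([1.0, 2.0, 2.0])

# Best approximation solution from the normal equations A^T A v = A^T w.
v = np.linalg.solve(A.T @ A, A.T @ w)

# The residual e = w - A v should lie in R^perp(A) = N(A^T).
e = w - A @ v

print("v     =", v)
print("A^T e =", A.T @ e)   # close to zero
```

For poorly conditioned problems, forming A^T A squares the condition number, so an orthogonalization-based solver is usually preferred in practice; the normal equations are used here only because they mirror the derivation above.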

Ex) Ax = b where

A = [1 1]^T, b = [b₁ b₂]^T

This type of system arises in multiple observations of a given parameter. First determine the four fundamental subspaces.

R(A) is spanned by the columns of A. A basis is [1 1]^T.

R^perp(A) is then spanned by [1 -1]^T.

N^perp(A) is spanned by the rows of A. The basis is (1).

N(A) = {0}.

Since N(A) = {0}, we need not worry about nonuniqueness, only incompatibility.

Any vector b ∈ R^perp(A) cannot be reached by the operator A. Thus,

b = b_I + b_II

where b_I ∈ R(A) and b_II ∈ R^perp(A)

Any vector with a nonzero b_II has a component in R^perp(A). Thus in general, this will form an incompatible system.

Typically, components of b in R^perp(A) are caused by

1) a poorly formulated model problem

2) measurement errors in b which can’t be predicted by the model

This problem was first formulated by Gauss and Legendre and is known as the method of least squares for obtaining a compatible system. It is a primary method today for making theory and data compatible.

To find the best approximation to Ax ≈ b, form the normal equations A*Ax = A*b. For the example above

A*A = 2, (A*A)^-1 = 1/2, A*b = b₁ + b₂

x_B.A. = (A*A)^-1A*b = (b₁ + b₂)/2

Thus, the inverse solution is simply the average of the observations.
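The averaging result can be reproduced numerically. In the sketch below the two observed values are hypothetical, and the names `x_ba`, `b_I`, and `b_II` are chosen here to follow the notation of the example.

```python
import numpy as np

# Two observations of a single parameter: A = [1 1]^T.
A = np.array([[1.0],
              [1.0]])
b = np.array([3.0, 5.0])    # hypothetical observed values b1, b2

# Normal equations A^T A x = A^T b.
x_ba = np.linalg.solve(A.T @ A, A.T @ b)

b_I = A @ x_ba              # part of b in R(A), reachable by A
b_II = b - b_I              # part of b in R^perp(A), the misfit

print("x_B.A. =", x_ba[0], " mean of b =", b.mean())
print("b_I    =", b_I)
print("b_II   =", b_II)
```

The misfit b_II is proportional to [1 -1]^T and is orthogonal to b_I, as the decomposition of W requires.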

Also, b_I ∈ R(A) is then

b_I = A x_B.A. = ((b₁ + b₂)/2) [1 1]^T

and

b_II = b - b_I = ((b₁ - b₂)/2) [1 -1]^T

3b) Lv ≠ w and N(L) ≠ {0}

Now we have both an incompatible system and a nonunique solution. This case is best handled with the singular value decomposition or SVD, but we attempt here to use adjoints.

First, kill off the part of w in R^perp(L) by forming the normal equations

L*Lv = L*w

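Write this as

L̃v = w̃

with

L̃ = L*L and w̃ = L*w

Since N(L) ≠ {0} and N(L) = N(L*L) = N(L̃), this system is nonunique and L̃ = L*L is not invertible. We need to find the minimum norm solution. Choose any u ∈ V, then

v₂ = L̃*u

where

v₂ ∈ R(L̃*) = N^perp(L̃)

and

L̃L̃*u = w̃

But since L̃* = (L*L)* = L*L = L̃, then

L̃L̃*u = (L*L)²u = L*w

and the equation is not invertible.

Find any solution u to (L*L)²u = L*w. Then push back through L̃* = L*L to find the minimum norm solution

v₂ = L*Lu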

This procedure is sometimes called the generalized inverse resulting in the minimum norm – best approximation solution.
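The adjoint-based procedure for case 3b) can be sketched for a small matrix and compared with a library pseudoinverse. This is only an illustration under stated assumptions: the rank-1 matrix and data vector are hypothetical, the dot product is standard so that L* is the transpose, and a least squares solver is used as one way of finding some solution of the singular system.

```python
import numpy as np

# Hypothetical system that is both incompatible and nonunique:
# A has rank 1, so N(A) is nontrivial, and w is not in R(A).
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [1.0, 2.0]])
w = np.array([1.0, 0.0, 1.0])

# Normal equations remove the part of w in R^perp(A).
L_t = A.T @ A          # L*L, self-adjoint and singular here
w_t = A.T @ w          # L*w

# Find any solution u of (L*L)^2 u = L*w, then push back through L*L.
u, *_ = np.linalg.lstsq(L_t @ L_t, w_t, rcond=1e-10)
v2 = L_t @ u

# A different solution u gives the same v2, since [2, -1] is in N(A).
u_alt = u + np.array([2.0, -1.0])
v2_alt = L_t @ u_alt

print("adjoint procedure:", v2)
print("np.linalg.pinv   :", np.linalg.pinv(A) @ w)
```

Squaring L*L squares the conditioning again, which is one practical reason the SVD is preferred for this case.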

Finally, other information may be used to find the components v₁ ∈ N(L) in the solution for v. Thus,

v = v₂ + α₁n₁ + α₂n₂ + ... + α_P n_P

where the n_i form a basis for N(L) and P is the dimension of the nullspace N(L). The α_i can then be chosen to satisfy the additional constraints on the problem.
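One way to use such additional information is sketched below. The extra constraint (fixing the first component of v to a known value) is entirely hypothetical, as are the matrix and data. The nullspace basis is read off the SVD, and the coefficients of the nullspace vectors are chosen so that the constraint holds while Av = w is unchanged.

```python
import numpy as np

# Hypothetical underdetermined system with a one-dimensional nullspace (P = 1).
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
w = np.array([1.0, 2.0])

v2 = np.linalg.pinv(A) @ w        # minimum norm part, in N^perp(A)

# Orthonormal basis for N(A) from the SVD; columns of N_basis are the n_i.
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
N_basis = Vt[rank:].T

# Hypothetical extra information: the first component of v is known to be 1.
C = np.array([[1.0, 0.0, 0.0]])
d = np.array([1.0])

# Choose alpha so that C (v2 + N_basis alpha) = d.
alpha, *_ = np.linalg.lstsq(C @ N_basis, d - C @ v2, rcond=None)
v = v2 + N_basis @ alpha

print("v2 =", v2)
print("v  =", v)
```

Adding nullspace components leaves the data fit untouched, so the constraint is satisfied at no cost in misfit, only in solution length.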

Finally, one can modify orthogonality by changing the inner product. Hence, the adjoint operators and the inverse operators can be changed by changing the inner product! Thus, optimality is, in some sense, in the eye of the beholder.
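To make the dependence on the inner product concrete, the sketch below revisits the two-observation averaging problem with a weighted inner product on W, <a, c> = a^T C_w c, and the standard dot product on V. Under these assumptions the adjoint of A becomes A^T C_w. The weight matrix and data values are hypothetical.

```python
import numpy as np

A = np.array([[1.0],
              [1.0]])
b = np.array([3.0, 5.0])          # hypothetical observations
C_w = np.diag([1.0, 3.0])         # hypothetical weights defining <a, c>_W = a^T C_w c

def adjoint(y):
    """Adjoint of A for the weighted inner product on W and the dot product on V."""
    return A.T @ C_w @ y

# Check the defining relation <Ax, y>_W = <x, A*y>_V on test vectors.
x = np.array([2.0])
y = np.array([1.0, -4.0])
lhs = (A @ x) @ C_w @ y
rhs = x @ adjoint(y)

# Best approximation under each inner product, from A*A x = A*b.
x_plain = np.linalg.solve(A.T @ A, A.T @ b)
x_weighted = np.linalg.solve(adjoint(A), adjoint(b))

print(lhs, rhs)
print("unweighted:", x_plain[0], " weighted:", x_weighted[0])
```

With equal weights the estimate is the plain average; weighting the second observation more heavily pulls the estimate toward it, so what counts as the best approximation depends on the chosen inner product.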

In a following section, we will derive some solutions for the generalized inverse operators by developing spectral and generalized spectral methods in terms of the singular value decomposition or SVD. This will provide closed form solutions for the incompatible and nonunique case 3b).