EAS 657
Geophysical Inverse Theory
Robert L. Nowack
Lecture 4
Optimization in a Hilbert Space
With an inner product, I.P., defined, we can perform a direct sum decomposition of V, thus
V = V1 + V2
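Note that in a direct sum decomposition, V1 and V2 are orthogonal subspaces. Taken together they do not contain every vector in V, but every vector in V can be written uniquely as the sum of a vector in V1 and a vector in V2.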
Now for every I.P., we can find an orthonormal basis via Gram-Schmidt. The converse is also true: for any basis {v1, …, vn} of V, we can find some inner product for which these vectors are orthonormal. (This can be used later to show that optimality depends on the particular inner product and error norm chosen!)
Now consider the following problem: given any vector v and a subspace V1 of V, find a vector v1 in V1 that is the “best approximation” to v, i.e., the vector in V1 that is “closest” to v in some sense. Equivalently, defining the error vector e = v – v1, we want to minimize the norm of e.
The projection theorem states that, given a vector v in V and a subspace V1, there exists a unique vector v1 in V1 such that || e || = || v – v1 || is minimized. In addition, the vector e is orthogonal to the subspace V1.
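1) Decompose V into V = V1 + V2, where V2 is the orthogonal complement of V1, by a direct sum decomposition using an orthogonal basis. Let the vector v = v1 + v2, where v1 is in V1 and v2 is in V2.
An arbitrary error vector can be written e = v – u, where u is any vector in V1, as
e = (v1 – u) + v2
Now, since (v1 – u) is in V1 and v2 is in V2, the two terms are orthogonal and
|| e ||^2 = || v1 – u ||^2 + || v2 ||^2
The error is minimized if we let u = v1. Then e = v2, where the error vector e is perpendicular to all vectors in V1.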
Many optimization problems in a Hilbert Space are based on this idea!
Application of Projections
Ex) Let V = R2 be spanned by and .
Gram-Schmidt can be used to find the projection of v2 onto v1. First,
The first vector is already normalized. The projection of v2 onto v1 is
Now,
or
This is also normalized, thus .
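The Gram-Schmidt steps in this example can be sketched numerically. The spanning vectors below are hypothetical stand-ins (chosen so that the first one is already normalized), not the values used in the lecture, and `gram_schmidt` is an illustrative helper name.

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthonormal basis built from linearly independent vectors."""
    basis = []
    for v in vectors:
        w = np.asarray(v, dtype=float).copy()
        for q in basis:
            w -= np.dot(q, w) * q          # remove the component along q
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append(w / norm)
    return basis

# Hypothetical spanning vectors for R^2 (assumed values).
v1 = np.array([1.0, 0.0])                  # already unit length
v2 = np.array([1.0, 1.0])

e1, e2 = gram_schmidt([v1, v2])
proj_v2_on_v1 = np.dot(v2, e1) * e1        # projection of v2 onto v1
```

With these assumed inputs, the remainder `v2 - proj_v2_on_v1` is orthogonal to `e1`, which is the property the projection theorem relies on.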
Ex) Find the projection of onto the subspace defined by the equation a^T x = 0 (defining a two-dimensional subspace perpendicular to the vector a) where and V = R^3.
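Let the subspace spanned by a be called Sperp and the plane perpendicular to a be called S. Then,
S + Sperp = V = R^3
Any vector can then be uniquely projected onto S and onto Sperp. In this case, it is easiest to first project onto Sperp. Let
â = a / || a || (the normalized version of a)
Now, writing b for the vector to be projected, its component along â is
â^T b
then, the projection of b onto Sperp is
bSperp = (â^T b) â
and the projection of b onto S is
bS = b – bSperp
Thus the vector bS in S that “best approximates” b is its projection onto S.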
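As a rough numerical illustration of projecting first onto the line spanned by a and then subtracting, the sketch below uses assumed values for `a` and `b`; the lecture's own vectors are not reproduced here.

```python
import numpy as np

# Hypothetical values for the normal vector a and the vector b to project.
a = np.array([1.0, 2.0, 2.0])
b = np.array([3.0, 0.0, 3.0])

a_hat = a / np.linalg.norm(a)        # normalized version of a
b_perp = np.dot(a_hat, b) * a_hat    # projection onto Sperp = span{a}
b_S = b - b_perp                     # projection onto the plane S (a^T x = 0)
```

For these inputs `b_S` satisfies a^T b_S = 0, so it lies in S, and `b_S + b_perp` recovers `b`.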
We can always represent an arbitrary vector x in terms of a basis {v1, …, vn} of V. Thus,
x = c1 v1 + c2 v2 + … + cn vn
If we have constructed an orthonormal basis, {u1, …, un}, then we can write
x = <x, u1> u1 + <x, u2> u2 + … + <x, un> un
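With an orthonormal basis, the expansion coefficients are just inner products with the basis vectors. A minimal sketch, assuming a rotated basis of R^2 and a sample vector `x` (both hypothetical):

```python
import numpy as np

# Hypothetical orthonormal basis of R^2: the standard basis rotated by 45 degrees.
s = 1.0 / np.sqrt(2.0)
u1 = np.array([s, s])
u2 = np.array([-s, s])

x = np.array([3.0, 1.0])                       # assumed sample vector
coeffs = [np.dot(x, u1), np.dot(x, u2)]        # c_i = <x, u_i>
x_rebuilt = coeffs[0] * u1 + coeffs[1] * u2    # sum of c_i * u_i
```

No linear system has to be solved here; for a non-orthonormal basis the coefficients would instead require inverting the matrix of basis vectors.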
Ex) Let v(t) be a periodic signal with period T in a signal space V such that
v(t) = \sum_{n=-N}^{N} c_n e^{i 2\pi n t / T}
Now let,
v1(t) = \sum_{n=-M}^{M} c_n e^{i 2\pi n t / T}
where v1(t) is also a periodic signal with period T in a subspace V1 spanned by the first 2M + 1 Fourier basis functions with M < N. Then the best approximation of a signal v(t) in the subspace V1 is just the truncated series. A smoother signal could be obtained by tapering the truncation window, but this would require a modified inner product.
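The truncated-series statement can be checked numerically. The sketch below assumes complex exponential basis functions, a sampled approximation of the inner product over one period, and made-up coefficients c_n = 1 / (1 + |n|); none of these choices come from the lecture.

```python
import numpy as np

T = 1.0                          # period (arbitrary choice)
K = 256                          # samples per period (assumed)
t = np.arange(K) * T / K
N, M = 5, 2                      # full and truncated harmonic counts, M < N

def inner(f, g):
    """Discrete stand-in for (1/T) * integral of f * conj(g) over one period."""
    return np.mean(f * np.conj(g))

def phi(n):
    """n-th Fourier basis function sampled on the grid t."""
    return np.exp(2j * np.pi * n * t / T)

# Hypothetical coefficients c_n for n = -N..N.
c = {n: 1.0 / (1 + abs(n)) for n in range(-N, N + 1)}
v = sum(c[n] * phi(n) for n in range(-N, N + 1))

# Projection onto V1: keep only the harmonics with |n| <= M.
v1 = sum(inner(v, phi(n)) * phi(n) for n in range(-M, M + 1))
e = v - v1                       # error vector
```

The error `e` has zero inner product with every basis function retained in V1, which is the orthogonality condition of the projection theorem for this inner product.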
Ex) In R3 with a standard inner product, what is the best approximation to the vector in the x1 – x2 plane? It is just the vector .
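Ex) What is the best approximation to the vector v in the subspace V1 defined by 8x + 5y + 4z = 0? The projection theorem states that we must decompose v into v1 in V1 and v2 in the orthogonal complement of V1, then
v = v1 + v2
where v2 lies along the plane normal (8, 5, 4)^T and v1 is the best approximation to v in V1.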
Let a subspace S of R^M be spanned by the columns of an M x N matrix A. For the case above, find two independent vectors in V1, each perpendicular to (8, 5, 4)^T, and use them as the columns of A. Now project a vector b onto the subspace S spanned by the column vectors a1, a2, …, aN. Then,
bP = x1 a1 + x2 a2 + … + xN aN
or,
bP = Ax = [a1, a2, …, aN] (x1, x2, …, xN)^T
Now, the error vector e will be perpendicular to all vectors in S. Then for e = b – Ax
A^T (b – Ax) = 0
This gives
A^T b = A^T A x
and
x = (A^T A)^{-1} A^T b
(Note: (A^T A) is invertible when the columns of A are independent, i.e., if the columns a1, a2, …, aN form a basis for S.)
Then,
bP = Ax = A (A^T A)^{-1} A^T b
This is the general case of projecting a vector b onto a subspace S with a basis specified by the columns of a matrix A.
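The normal equations can be tried on the plane 8x + 5y + 4z = 0. In this sketch the two spanning vectors in the plane and the vector `b` are assumed choices for illustration, not values taken from the lecture.

```python
import numpy as np

n = np.array([8.0, 5.0, 4.0])              # normal to the plane 8x + 5y + 4z = 0
A = np.column_stack(([5.0, -8.0, 0.0],     # two independent vectors in the plane
                     [1.0, 0.0, -2.0]))    # (assumed choices, both satisfy n . a = 0)
b = np.array([1.0, 1.0, 1.0])              # hypothetical vector to project

x = np.linalg.solve(A.T @ A, A.T @ b)      # solve A^T A x = A^T b
b_P = A @ x                                # projection of b onto S
e = b - b_P                                # error vector, perpendicular to S
P = A @ np.linalg.inv(A.T @ A) @ A.T       # projection matrix A (A^T A)^{-1} A^T
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse when only `b_P` is needed; the explicit matrix `P` is built here only to mirror the formula in the text. The result agrees with subtracting the component of `b` along the normal `n`.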
Linear Transformations Between Finite Dimensional Vector Spaces
Let L map a vector v in V to a vector w in W.
A linear transformation satisfies
1) L(v1 + v2) = L(v1) + L(v2)
2) L(av) = aL(v)
The second part can be remembered as: if a system is linear, then “if you double the input, you double the output”.
A linear transformation is “onto” if for any w in W, there is a vector v in V such that L(v) = w.
The “range” of a linear transformation is the subspace of W consisting of all vectors L(v) for v in V. This is referred to as R(L) (for an “onto” transformation, R(L) equals all of W).
A linear transformation is “one-to-one” if for any w in R(L), there is a unique v such that L(v) = w.
A linear transformation is “invertible” if it is both
1) one-to-one (invertible on its range), and
2) onto.
The “null space” of a linear transformation, N(L), is the subspace of V consisting of the vectors v such that L(v) = 0.
Dim{N(L)} = 0 if and only if the transformation is one-to-one. To show this, assume N(L) = {0} and suppose there exist vectors v1 and v2 such that L(v1) = L(v2) = w. Then L(v1) – L(v2) = L(v1 – v2) = 0, so (v1 – v2) is in N(L). But N(L) = {0}, so (v1 – v2) = 0 and v1 = v2, i.e., L is one-to-one. Conversely, if L is one-to-one, then L(v) = 0 = L(0) implies v = 0, so N(L) = {0}.
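For matrices, the link between the null space and being one-to-one can be checked numerically. This sketch uses two hypothetical matrices and an illustrative helper `null_space_dim`, which counts columns minus the numerical rank.

```python
import numpy as np

def null_space_dim(A, tol=1e-10):
    """Dimension of N(A): number of columns minus the numerical rank."""
    A = np.asarray(A, dtype=float)
    return A.shape[1] - np.linalg.matrix_rank(A, tol=tol)

# Hypothetical matrices, chosen only for illustration.
A_injective = np.array([[1.0, 0.0],
                        [0.0, 1.0],
                        [1.0, 1.0]])       # independent columns, N(A) = {0}
A_not_injective = np.array([[1.0, 2.0],
                            [2.0, 4.0]])   # A (2, -1)^T = 0

v1 = np.array([1.0, 1.0])
v2 = v1 + np.array([2.0, -1.0])            # differs from v1 by a null-space vector
```

Here `A_not_injective @ v1` and `A_not_injective @ v2` coincide even though `v1` and `v2` differ, which is exactly the failure of one-to-one behavior when N(A) contains a nonzero vector.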
Eigenvectors and Eigenvalues
These are defined for L: V → V. If L(v) = λv for some nonzero vector v, then λ is called an eigenvalue of L and v is the corresponding eigenvector. (More on this will be given with respect to matrices later.)
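As a small numerical check of the definition, the sketch below takes a hypothetical symmetric 2 x 2 matrix as the transformation and uses NumPy's `eigh` routine for symmetric matrices.

```python
import numpy as np

# Hypothetical symmetric matrix standing in for L: R^2 -> R^2.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns eigenvalues in ascending order and orthonormal eigenvectors
# as the columns of the second output.
eigenvalues, eigenvectors = np.linalg.eigh(A)

# Residuals of the defining relation A v = lambda v for each pair.
residuals = [A @ eigenvectors[:, i] - eigenvalues[i] * eigenvectors[:, i]
             for i in range(len(eigenvalues))]
```

Each entry of `residuals` should be numerically zero, confirming that every returned pair satisfies the eigenvalue relation.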
Matrix Transformations
Let A: R^N → R^M with inner products
<x, y> = x^T y = \sum_i x_i y_i.
These will be called the usual inner products. Vectors will be represented by columns and A by an M x N matrix aij, i = 1, …, M, j = 1, …, N. (M is the number of rows (first index), N is the number of columns (second index).) A can be written
A = [ a11 a12 … a1N ]
    [ a21 a22 … a2N ]
    [  …   …  …  …  ]
    [ aM1 aM2 … aMN ]
R(A) is the vector space spanned by the columns of A, and Nperp(A) is the vector space spanned by rows of A. The “rank” of a matrix A is the number of linearly independent columns of A (or the number of independent rows of A as well).
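We want to show that the columns of A span R(A) and the rows of A span Nperp(A). Let
v = α1 e1 + α2 e2 + … + αN eN
where ei are the standard basis vectors in R^N, specified as
ei = (0, …, 0, 1, 0, …, 0)^T, with the 1 in the ith position.
Then,
Av = α1 Ae1 + α2 Ae2 + … + αN AeN
Let ai be the ith column of the matrix A; then,
Av = α1 a1 + α2 a2 + … + αN aN
with Aei = ai. Thus, the range of A, R(A), is spanned by the columns of A. Let bi^T be the ith row of A; then,
Av = (b1^T v, b2^T v, …, bM^T v)^T
If v is in N(A), then Av = 0, and bi^T v = 0 for i = 1, …, M. Then, v is perpendicular to all bi, the rows of the matrix A. Thus, Nperp(A) is spanned by the rows of A.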
Now the rank of A is noted as r and is equal to the dimension of the space spanned by the columns of A, as well as the dimension of the space spanned by the rows of A. Thus, the rank r is
r(A) = Dim {R(A)} = Dim {Nperp(A)}
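The equality of column rank and row rank can be checked numerically. The matrix below is a hypothetical example with one dependent row, not one taken from the lecture.

```python
import numpy as np

# Hypothetical 3 x 3 matrix; the second row is twice the first.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

rank_cols = np.linalg.matrix_rank(A)      # Dim{R(A)}: independent columns
rank_rows = np.linalg.matrix_rank(A.T)    # Dim{Nperp(A)}: independent rows
dim_null = A.shape[1] - rank_cols         # Dim{N(A)} = N - r
```

Both ranks come out equal, and the null space dimension is whatever is left over from the N columns.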
Ex) Let
with A: R^3 → R^3.
Find the basis for the range R(A).
1) Check Det A. If Det A = 0, then the columns of A are linearly dependent. Since Det A = 0, choose the basis for R(A) as
Thus, the rank of A is equal to 2.
Choose as a basis for Nperp(A) the independent rows of A as
(1 0 1)
(0 1 1)
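To find N(A), use the fact that V = N(A) + Nperp(A). Thus, for N(A), find all vectors perpendicular to Nperp(A). Here the vector (1 1 -1) is perpendicular to both rows above, so it spans the one-dimensional null space N(A).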
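For this example, a vector perpendicular to both basis rows of Nperp(A) can be obtained with a cross product. The sketch uses only the two independent rows quoted above, since the full matrix A is not reproduced here; any matrix with that row space has the same null space.

```python
import numpy as np

# The two independent rows given as a basis for Nperp(A).
rows = np.array([[1.0, 0.0, 1.0],
                 [0.0, 1.0, 1.0]])

# In R^3 the cross product of two independent vectors is perpendicular to both,
# so it spans the one-dimensional null space.
n_vec = np.cross(rows[0], rows[1])
```

The cross-product shortcut only works in R^3; in higher dimensions one would instead read the null space from, for example, a singular value decomposition.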
Matrices are examples of linear transformations. Also, any linear transformation between finite dimensional vector spaces is “isomorphic” with a matrix.
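Use L1 to go from V to R^N and L2 to go from W to R^M. Then
L(v) = L2^{-1} A L1(v)
where L1, L2 are just basis representations of vectors in V and W. Thus, for a basis {v1, …, vN} of V and a basis {w1, …, wM} of W,
v = x1 v1 + x2 v2 + … + xN vN
then,
L(v) = x1 L(v1) + x2 L(v2) + … + xN L(vN)
and
L(vj) = a1j w1 + a2j w2 + … + aMj wM
Thus, A tells us how to map from basis functions in V to basis functions in W.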
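One way to see a matrix acting as a linear transformation between function spaces is differentiation on polynomials. This is a hypothetical example, not one from the lecture: V is the space of polynomials of degree at most 2 with basis {1, t, t^2}, W is the space of polynomials of degree at most 1 with basis {1, t}, and L is d/dt.

```python
import numpy as np

# Column j holds the coordinates of L(v_j) in the basis {1, t} of W:
# L(1) = 0, L(t) = 1, L(t^2) = 2t.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])

x = np.array([4.0, 3.0, 5.0])   # p(t) = 4 + 3t + 5t^2 in the basis {1, t, t^2}
y = A @ x                        # coordinates of dp/dt in the basis {1, t}
```

The result `y` holds the coordinates of 3 + 10t, the derivative of p(t), so multiplying by A reproduces the action of L on coordinate vectors.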