In the linear algebra approach, the best x is the one for which Ax is the projection of the right-hand side b onto the column space of the matrix A. Ax can only reach vectors in A's column space (that is, vectors that can be written as a linear combination of A's column vectors), so the best achievable result is the projection of b onto that column space. Figure 1 shows the geometric interpretation. The matrix A and b are:
Here A is a single column vector, so the best x obtained by projection is:
The result is the same as the one from the calculus approach, and it is also the average. (If you are not familiar with projection matrices, see Figure 1. In the figure, e is perpendicular to A, i.e., A^T e = 0. You can derive the projection from this condition.)
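The derivation from A^T e = 0 can be sketched as follows. I assume here that e = b - A x̂ is the error vector in Figure 1, that A is the column of n ones, and that b stacks n observations b_1, ..., b_n. These symbols are my assumptions for illustration, chosen because the result is said to be the average.

```latex
\begin{align*}
A^T e = A^T (b - A\hat{x}) &= 0 \\
A^T A \, \hat{x} &= A^T b \\
\hat{x} &= (A^T A)^{-1} A^T b \\[4pt]
\text{With } A = (1, \dots, 1)^T:\quad
A^T A = n,\quad A^T b = \sum_{i=1}^{n} b_i
\quad &\Rightarrow \quad
\hat{x} = \frac{1}{n} \sum_{i=1}^{n} b_i
\end{align*}
```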
Figure 1. Projecting b onto A's column space
What interests me is why these are the same. Minimizing a quadratic is a calculus (analysis) problem, while projection is more of a geometric problem. Is this just a coincidence? Of course not. They have the same purpose.
As Figure 1 shows, the projection is the point in A's column space with the minimal distance to b. In Euclidean space, the distance is the square root of the sum of squares (Pythagoras's theorem). Minimizing it is the same as what the calculus approach does, although the idea and the way to compute it look different.
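The agreement can also be checked numerically. The following is a minimal sketch, assuming A is a column of ones, b holds a few made-up observations, and the calculus approach minimizes the sum of squared errors. The values and function names are mine, for illustration only.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def best_x_by_projection(a, b):
    """Project b onto the line spanned by a: x = (a^T b) / (a^T a)."""
    return dot(a, b) / dot(a, a)

def best_x_by_calculus(b):
    """Minimize sum (x - b_i)^2; the derivative 2 * sum (x - b_i) is zero at the mean."""
    return sum(b) / len(b)

def squared_error(a, b, x):
    """Squared Euclidean distance between b and the vector a * x."""
    return sum((bi - ai * x) ** 2 for ai, bi in zip(a, b))

b = [1.0, 2.0, 6.0]      # made-up observations (assumption)
a = [1.0] * len(b)       # A as a single column of ones (assumption)

x_proj = best_x_by_projection(a, b)
x_calc = best_x_by_calculus(b)

# Error vector e = b - A x; it should be perpendicular to A.
e = [bi - ai * x_proj for ai, bi in zip(a, b)]
a_dot_e = dot(a, e)

print(x_proj, x_calc, a_dot_e)
```

Moving x slightly away from the projected value in either direction makes `squared_error` larger, which is the "minimal distance" statement seen from the calculus side.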
I once wrote that I like Keigo Higashino's books. One of his books, ``Yougisha X no Kenshin'', describes how disguising a problem leads to a wrong conclusion. The example is from mathematics: if you trick someone into believing that a problem is a calculus problem when it is actually a geometry problem, they cannot reach the correct answer. A clever man deceives the police by planting a wrong answer; it is a fun detective story. Higashino's books usually describe science quite well, but I have doubts on this point. I find it interesting that in mathematics, when the problem setting is the same, even different approaches reach the same result. I feel this more and more as I study mathematics.
I am surprised when different-looking problems turn out to originate from the same idea. It is like climbing a mountain. When I was at sea level, I could see only some of the views, and only from specific directions. These views look different, but once I see the whole landscape from the mountain, they are just the same view seen from different positions. Studying mathematics is like having a map. Some of the roads seem not to be connected, but with a map I can see that they actually are. I think all the roads are connected somehow. If I cannot see the connection, that means I need to study more.
Here we found the vector closest to a given vector in two ways:
- calculus: minimize the error
- linear algebra: projection.
Next time, I would like to talk about the Kalman filter. The Kalman filter estimates the near future from observed data that contain some noise. The basic idea for predicting the near future is the same as the one described in this post. But I recently learned that it has one interesting aspect, and I would like to explain that.