Skip to main content

Why A^{T}A is invertible? (1) Linear Algebra

We can see matrix multiplication A^{T}A frequently, for example, in the least square method. If matrix A has independent columns, this matrix has the following great properties: it is square, symmetric, and has the inverse. I am interested in why A^{T}A has this properties and would like to explain that. The explanation is based on Gilbert Strang's Introduction of Linear Algebra, 4th edition. The explanation uses null space and quadric form, it is an elegant explanation. But, if I just follow the Strang's explanation, I only need to write: see Strang book. So, I would like to add a geometric explanation. Although, I don't have a rigid proof, it is just a little bit intuitive for me. Yet, I hope you can enjoy that. Even if my explanation is bad, you can still see the Strang's explanation.

Properties of A^{T}A

We can see matrix multiplication A^{T}A frequently. We can see this pattern in a projection operation, least square computation (Actually, those two are the same).

In general, we can assume the following assumptions.

  • When A is not a square matrix and would like to know the least square solution. In this case, the matrix is m by n and m > n. This means, the number of equations is larger than the unknowns. When this happens? For example, you observe some signals or take some statistics, like take a photograph to analyze something. You sometimes take samples exact the same, but the result might differ because of observation error. In this case, you have more samples than the parameters. If m < n case, you can still take more samples. So, m by n is a general case.
  • We could make the columns of A independent. If not, we can trash such columns. If we don't know how many parameters are in the model and remove too many columns, we might have a wrong answer. This is rather a technical issue, it is just possible. This just says, we can do a kind of best effort.

If the second assumption, independent columns, is true, A^{T}A has the following nice properties:

  • Square
  • Symmetric
  • Existence of inverse

Therefore, this matrix is useful.

I would like to stress that the column independence of A is necessary. Otherwise, there is no inverse of A^{T}A. You can easily see the following.

It is rather easy to see why A^{T}A is square matrix, since [n by m] [m by n] is [n by n] because of the matrix multiplication rule.

You can also see the following shows the symmetric property.

Where A^{{T}^{T}} = A, transpose of transpose is the original matrix. The question is existence of the inverse.

Comments

Popular posts from this blog

Why A^{T}A is invertible? (2) Linear Algebra

Why A^{T}A has the inverse Let me explain why A^{T}A has the inverse, if the columns of A are independent. First, if a matrix is n by n, and all the columns are independent, then this is a square full rank matrix. Therefore, there is the inverse. So, the problem is when A is a m by n, rectangle matrix.  Strang's explanation is based on null space. Null space and column space are the fundamental of the linear algebra. This explanation is simple and clear. However, when I was a University student, I did not recall the explanation of the null space in my linear algebra class. Maybe I was careless. I regret that... Explanation based on null space This explanation is based on Strang's book. Column space and null space are the main characters. Let's start with this explanation. Assume  x  where x is in the null space of A .  The matrices ( A^{T} A ) and A share the null space as the following: This means, if x is in the null space of A , x is also in the null spa

Gauss's quote for positive, negative, and imaginary number

Recently I watched the following great videos about imaginary numbers by Welch Labs. https://youtu.be/T647CGsuOVU?list=PLiaHhY2iBX9g6KIvZ_703G3KJXapKkNaF I like this article about naming of math by Kalid Azad. https://betterexplained.com/articles/learning-tip-idea-name/ Both articles mentioned about Gauss, who suggested to use other names of positive, negative, and imaginary numbers. Gauss wrote these names are wrong and that is one of the reason people didn't get why negative times negative is positive, or, pure positive imaginary times pure positive imaginary is negative real number. I made a few videos about explaining why -1 * -1 = +1, too. Explanation: why -1 * -1 = +1 by pattern https://youtu.be/uD7JRdAzKP8 Explanation: why -1 * -1 = +1 by climbing a mountain https://youtu.be/uD7JRdAzKP8 But actually Gauss's insight is much powerful. The original is in the Gauß, Werke, Bd. 2, S. 178 . Hätte man +1, -1, √-1) nicht positiv, negative, imaginäre (oder gar um

Why parallelogram area is |ad-bc|?

Here is my question. The area of parallelogram is the difference of these two rectangles (red rectangle - blue rectangle). This is not intuitive for me. If you also think it is not so intuitive, you might interested in my slides. I try to explain this for hight school students. Slides:  A bit intuitive (for me) explanation of area of parallelogram  (to my site, external link) .