Why A^{T}A has an inverse
Let me explain why A^{T}A has an inverse if the columns of A are independent. First, if A is n by n and all its columns are independent, then A is a square full-rank matrix, so it has an inverse, and so does A^{T}A. The interesting case is when A is an m by n rectangular matrix. Strang's explanation is based on the null space. The null space and the column space are fundamental ideas of linear algebra, and this explanation is simple and clear. However, I do not recall the null space being explained in my linear algebra class when I was a university student. Maybe I was careless. I regret that...
Explanation based on null space
This explanation is based on Strang's book. Column space and null space are the main characters. Let's start with this explanation.
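Assume x where

A x = 0,

that is, x is in the null space of A. Then x is also in the null space of A^{T} A:

A^{T} A x = A^{T} (A x) = A^{T} 0 = 0

This means that if x is in the null space of A, x is also in the null space of A^{T} A. To show that the two matrices share the null space, we also need the converse: if A^{T} A x = 0, then A x = 0. Once we have that, and x = 0 is the only vector in the null space, A^{T} A has an inverse, since its n columns are then independent and span the whole n-dimensional space.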
Conversely, assume

A^{T} A x = 0

If we could multiply by the inverse of A^{T} from the left, we would get A x = 0 immediately, but A is a rectangular matrix, so A^{T} has no inverse. Instead, we can multiply by x^{T} from the left:

x^{T} A^{T} A x = 0

x^{T} A^{T} is a row vector and A x is a column vector, so their product is an inner product. If we write A x = b, then x^{T} A^{T} = (A x)^{T} = b^{T}, and the equation becomes b^{T} b = 0. (Note that this last 0 is a scalar, not a vector.) The inner product of a vector with itself is 0 if and only if the vector is 0, since the inner product is the sum of squares \sum (b_i)^2. Therefore b = A x = 0, and x is in the null space of A.
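The identity x^{T} A^{T} A x = (A x)^{T} (A x) can be checked numerically. Here is a minimal sketch in plain Python; the 3 by 2 matrix A, the vector x, and the helper functions are hypothetical examples chosen for illustration, not taken from Strang's book.

```python
def matvec(M, v):
    """Multiply matrix M (a list of rows) by column vector v."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def transpose(M):
    """Return the transpose of M as a list of rows."""
    return [list(col) for col in zip(*M)]

def dot(u, v):
    """Inner product of two vectors of the same length."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical 3 by 2 example with independent columns.
A = [[1, 0],
     [1, 1],
     [1, 2]]
x = [2, -1]

b = matvec(A, x)                  # b = A x
AtAx = matvec(transpose(A), b)    # A^T (A x)
quadratic_form = dot(x, AtAx)     # x^T A^T A x, a scalar
squared_length = dot(b, b)        # b^T b = sum of (b_i)^2

print(b, quadratic_form, squared_length)
```

Both scalars agree, and because the right-hand one is a sum of squares, it can only be 0 when every entry of b = A x is 0.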
This is a nice explanation. Now we can use the independence of A's columns: it means the null space of A contains only 0. A^{T}A shares the null space with A, so the columns of A^{T}A are also independent. Also, A^{T}A is a square (n by n) matrix, so it is a full-rank square matrix. The columns of A are independent, but when m > n they do not span the m-dimensional space, since A is a rectangular matrix. The columns of A^{T}A, on the other hand, span the n-dimensional space. Therefore, A^{T}A has an inverse.
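To see the conclusion on a concrete case, the sketch below builds A^{T}A for a small rectangular A and inverts it with the 2 by 2 cofactor formula. The matrices and helper names are assumptions made up for this illustration; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

def transpose(M):
    """Return the transpose of M as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    Qt = transpose(Q)
    return [[sum(p * q for p, q in zip(row, col)) for col in Qt] for row in P]

def det2(M):
    """Determinant of a 2 by 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Hypothetical 3 by 2 matrix with independent columns.
A = [[1, 0],
     [1, 1],
     [1, 2]]
AtA = matmul(transpose(A), A)     # 2 by 2, square
det = det2(AtA)                   # non-zero, so AtA is invertible

# Inverse of a 2 by 2 matrix [[a, b], [c, d]] is (1/det) [[d, -b], [-c, a]].
(a, b), (c, d) = AtA
inverse = [[Fraction(d, det), Fraction(-b, det)],
           [Fraction(-c, det), Fraction(a, det)]]
identity = matmul(inverse, AtA)

# For contrast: dependent columns (second column = 2 * first column).
A_dep = [[1, 2],
         [1, 2],
         [1, 2]]
det_dep = det2(matmul(transpose(A_dep), A_dep))   # zero, so no inverse

print(AtA, det, identity, det_dep)
```

With independent columns the determinant of A^{T}A is non-zero and the computed inverse multiplies back to the identity; with dependent columns the determinant is 0.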
I would like to add one point. Assume some other matrix B in place of A^{T} (B is n by m, B \neq A^{T}), and suppose

x^{T} B A x = 0

At first I thought this means BA and A share the null space, just as A^{T}A and A do. That is wrong, because

(x^{T} B) (A x) = 0

only means that the two vectors a = (x^{T} B) and b = (A x) are perpendicular. It does not mean A x = 0, and it does not mean (x^{T} B) = 0 either. The transpose of x^{T} B is the following:

(x^{T} B)^{T} = B^{T} x

We do not actually know whether x^{T} B is 0, so we do not know whether x is in the left null space of B. A and A^{T}A share the null space, but for an arbitrary B, A and BA do not necessarily share the null space: the null space of BA always contains the null space of A, but it can be larger. Strang's book does not mention this. Maybe it is too obvious, but I misunderstood it the first time.
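A small counterexample may make this concrete. In the sketch below, the matrices A and B and the vector x are hypothetical values picked so that x^{T} B and A x are both non-zero but perpendicular; the helper functions are plain-Python stand-ins for matrix operations.

```python
def transpose(M):
    """Return the transpose of M as a list of rows."""
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    """Multiply matrix M (a list of rows) by column vector v."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    Qt = transpose(Q)
    return [[sum(p * q for p, q in zip(row, col)) for col in Qt] for row in P]

def dot(u, v):
    """Inner product of two vectors of the same length."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical example: A is 3 by 2 with independent columns, B is 2 by 3.
A = [[1, 0],
     [0, 1],
     [0, 0]]
B = [[0, 0, 1],
     [0, 1, 0]]
x = [1, 0]

Ax = matvec(A, x)                  # not zero: x is NOT in the null space of A
BA = matmul(B, A)
BAx = matvec(BA, x)                # zero: x IS in the null space of BA
xTB = matvec(transpose(B), x)      # B^T x, the transpose of x^T B; not zero
inner = dot(xTB, Ax)               # (x^T B)(A x) = 0: merely perpendicular

print(Ax, BAx, xTB, inner)
```

Here x^{T} B A x = 0 holds even though neither x^{T} B nor A x is the zero vector, so the null space of BA is strictly larger than the null space of A.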
Next time, I will explain this from another point of view.