Normal vector as a covariant vector

This third explanation includes how to transform normal vectors. It is based on one of my favorite books[5], by Sugihara. It is a bit more formal and less intuitive than the first and second explanations. If you are not interested in a formal explanation, you can skip this section.

First I would like to introduce the affine transformation.

An affine transformation is a linear transformation combined with a translation, and it is used very often in computer graphics. An affine transformation maps a line to a line and preserves ratios of lengths along a line. If we include degenerate cases, a triangle is always transformed into a triangle. Assume that a three-dimensional affine transformation is represented by a 3x3 matrix (this covers the linear part; a translation needs an extra offset term). I regard this transformation as a transformation between two coordinate systems, so the deformation of an object is a secondary effect. We can deform an object with it, but because it is a transformation of coordinate systems, we cannot deform an object arbitrarily. In general, the transformation is a combination of scaling, rotation, translation, and shearing. This means, for example, that an affine transformation cannot transform a triangle into a circle.
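To make the "keeps the ratio on a line" property concrete, here is a minimal sketch in plain Python. The matrix A, the points p and q, and the helper names mat_vec and lerp are my own illustrative choices, not taken from Sugihara's book. The sketch checks that dividing a segment and then transforming it gives the same point as transforming the endpoints and then dividing.

```python
# Hypothetical helpers; plain lists stand in for vectors and matrices.

def mat_vec(m, v):
    """Multiply a 3x3 matrix (given as a list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def lerp(p, q, t):
    """Point p + t * (q - p) on the line through p and q."""
    return [(1 - t) * a + t * b for a, b in zip(p, q)]

# An arbitrary example matrix: scale x by 2 and shear x along y.
A = [[2.0, 1.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

p = [0.0, 0.0, 0.0]
q = [4.0, 2.0, 0.0]
t = 0.25  # the point one quarter of the way from p to q

# Divide first, then transform.
divide_then_transform = mat_vec(A, lerp(p, q, t))
# Transform the endpoints first, then divide.
transform_then_divide = lerp(mat_vec(A, p), mat_vec(A, q), t)

# Both orders give the same point, so the ratio on the line is kept.
same = all(abs(a - b) < 1e-12
           for a, b in zip(divide_then_transform, transform_then_divide))
print(divide_then_transform, transform_then_divide, same)
```

The same check holds for any t and any 3x3 matrix, because the matrix product is linear. Adding a fixed translation vector to every transformed point would not change the result.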

If we consider an affine transformation as a coordinate transformation, our interest is how the coordinates of a point P are represented in two different coordinate systems \Sigma and \Sigma'. Note that P itself doesn't move, as shown in Figure 3; only its coordinates change depending on the coordinate system. These coordinate systems may not have perpendicular basis vectors, and the unit length may differ from axis to axis. For example, the x direction may be magnified by a factor of two relative to the y direction.
Figure 3. A matrix A transforms the coordinate system \Sigma to \Sigma'. Note: the point P doesn't move, but its coordinate representation may differ depending on the coordinate system.
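As a small numerical illustration of "P doesn't move, but its coordinates change", here is a two-dimensional sketch. I assume \Sigma uses the standard basis and \Sigma' uses two hypothetical basis vectors that are neither perpendicular nor of equal length. The matrix B below holds those basis vectors as columns. This is a convention chosen for the example, and it is not necessarily the same as the matrix A in Figure 3.

```python
def solve_2x2(b, x):
    """Solve b * c = x for c by Cramer's rule (b is a 2x2 list of rows)."""
    det = b[0][0] * b[1][1] - b[0][1] * b[1][0]
    if det == 0:
        raise ValueError("basis vectors are linearly dependent")
    c1 = (x[0] * b[1][1] - b[0][1] * x[1]) / det
    c2 = (b[0][0] * x[1] - x[0] * b[1][0]) / det
    return [c1, c2]

# Assumed basis of Sigma', written in Sigma coordinates, stored as columns:
# e1' = (2, 0) is twice as long as the x unit, e2' = (1, 1) is oblique.
B = [[2.0, 1.0],
     [0.0, 1.0]]

# Coordinates of the point P in Sigma.
p_sigma = [4.0, 2.0]

# Coordinates of the same point P in Sigma'.
p_sigma_prime = solve_2x2(B, p_sigma)

# Going back: c1 * e1' + c2 * e2' must give the Sigma coordinates again.
p_back = [B[0][0] * p_sigma_prime[0] + B[0][1] * p_sigma_prime[1],
          B[1][0] * p_sigma_prime[0] + B[1][1] * p_sigma_prime[1]]

print(p_sigma, p_sigma_prime, p_back)
```

The two coordinate lists differ, yet both describe the same point: combining the \Sigma' basis vectors with the \Sigma' coordinates reproduces the \Sigma coordinates.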

It's a bit cumbersome, but I will write down how the coordinates of P are represented in the coordinate systems \Sigma and \Sigma'.

A point P is represented in a coordinate system \Sigma,
and is represented in a coordinate system \Sigma',
A representation means that you can give concrete coordinates, e.g., (1,1,0)^T. On the other hand, if you just have a point P, you don't need to know its exact coordinates. You don't even need to know whether P is in a 2-dimensional or a 3-dimensional space. The relationship between a point P and its representation is similar to the relationship between a linear operator T and its representation matrix M. I also think of it as the relationship between an interface and its implementation in a programming language. (I suspect the next step of this analogy is related to Futamura projections[1], but that is beyond this article, and my understanding is not yet good enough to explain it.)