Can we measure the complexity of natural language by an entropy-based compression method? (1)

Many of my friends come from other countries, and we often talk about our mother tongues. The discussion usually turns to which language is difficult, or what unique properties each language has. German has a complex grammar system, Japanese has complex characters and a unique counting system, and English has a huge vocabulary. I wonder: ``What is the complexity of a natural language?'' and ``Can we measure it?''

Together with my friends, I translated one Japanese text into English and German. We then applied an entropy-based compression method to each version to see how much information each translated text has. This might tell us which language is complex in the sense of entropy. Namely, I try to measure: ``If the content is the same, how much does the information entropy differ depending on the language?''
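A minimal sketch of what such a measurement could look like, assuming Python, UTF-8 encoded text, and the standard zlib module as a stand-in compressor (the actual method is not specified here). The function names `byte_entropy` and `measure` and the three sample sentences are hypothetical placeholders, not the texts we translated. It reports the zeroth-order Shannon entropy of the byte distribution and the zlib-compressed size of each text.

```python
import math
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Zeroth-order Shannon entropy of the byte distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # H = sum over symbols of p * log2(1 / p), with p = count / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def measure(text: str) -> dict:
    """Collect simple size and entropy figures for one text (assumes UTF-8)."""
    data = text.encode("utf-8")
    entropy = byte_entropy(data)
    return {
        "bytes": len(data),
        "entropy_bits_per_byte": entropy,
        # Entropy-based estimate of the total information in the text
        "estimated_bits": entropy * len(data),
        # Size after a general-purpose compressor, for comparison
        "zlib_bytes": len(zlib.compress(data, 9)),
    }


# Hypothetical placeholder sentences, only to show how the functions are called.
samples = {
    "English": "The cat sleeps on the sofa.",
    "German": "Die Katze schläft auf dem Sofa.",
    "Japanese": "猫はソファーの上で眠っている。",
}

for language, text in samples.items():
    print(language, measure(text))
```

Note that the byte-level figures depend on the character encoding (Japanese characters take three bytes each in UTF-8), and very short texts are dominated by the compressor's fixed overhead, so a meaningful comparison would need longer texts and a deliberate choice of encoding.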

I will write a few articles on this topic.

Lx=d: SundayResearch
