Python PIL experiment (an image comparison tool) continued

PIL and numpy

When I ran this program on my data files, processing a 1024x1024 image took around 6 seconds and consumed 230MB of memory. Processing a 3840x2160 image took 263 seconds and consumed 2.3GB of memory. The higher resolution has only about eight times as many pixels, yet the processing time increased more than 40 times. My program uses only three buffers for processing, so my first estimate of the minimal memory size was 10MB for the 1024x1024 resolution and 72MB for the 3840x2160 resolution. However, `top' reported roughly 30 times more memory than that.
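The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes 8-bit RGB data (3 bytes per pixel) and three full-size buffers, which is my reading of the numbers and not something taken from the program itself. The function name `estimate_mib` is made up for this illustration.

```python
# Back-of-the-envelope estimate of the raw pixel buffers.
# Assumptions (not from the original program): 3 bytes per pixel, 3 buffers.
BYTES_PER_PIXEL = 3
NUM_BUFFERS = 3


def estimate_mib(width, height):
    """Return the estimated size of all pixel buffers in MiB."""
    total_bytes = width * height * BYTES_PER_PIXEL * NUM_BUFFERS
    return total_bytes / (1024.0 * 1024.0)


small = estimate_mib(1024, 1024)
large = estimate_mib(3840, 2160)

# Ratio of the pixel counts of the two resolutions.
pixel_ratio = (3840 * 2160) / float(1024 * 1024)

print("1024x1024: %.1f MiB" % small)
print("3840x2160: %.1f MiB" % large)
print("pixel ratio: %.1f" % pixel_ratio)
```

Under these assumptions the two estimates come out close to the 10MB and 72MB figures, and the pixel ratio is a little under eight.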

When I profiled the program, most of the time was consumed by tuple construction (for the RGB values) and the abs function. Therefore, I tried to vectorize this code with numpy. The table below shows the result. My test environment is an Intel Core i7-2720 2.20GHz running Linux (Kubuntu 12.10, kernel 3.5.0-27) with Python 2.7.

+-----------+--------------------------------------+
| image res |   1024x1024      |    3840x2160      |
+-----------+-------+----------+--------+----------+
|           |  mem  | time     |  mem   | time     |
+-----------+-------+----------+--------+----------+
| native    | 230MB | 6.0  sec | 2300MB | 263  sec | 
| numpy     | 110MB | 0.21 sec | 320MB  | 1.18 sec |
+-----------+-------+----------+--------+----------+
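The kind of change measured in the table can be sketched as follows. This is not the code from ImgCompNumpy.py; it is a minimal illustration, assuming the images are already available as `uint8` arrays of shape (height, width, 3). The function names `diff_loop` and `diff_numpy` are hypothetical.

```python
import numpy as np


def diff_loop(a, b):
    """Per-pixel loop: builds a tuple and calls abs() for every pixel."""
    h, w, _ = a.shape
    out = np.zeros((h, w, 3), dtype=np.uint8)
    for y in range(h):
        for x in range(w):
            out[y, x] = tuple(
                abs(int(a[y, x, c]) - int(b[y, x, c])) for c in range(3)
            )
    return out


def diff_numpy(a, b):
    """Vectorized version: one array expression instead of a Python loop.

    The inputs are widened to int16 first, because subtracting uint8
    arrays directly would wrap around instead of going negative.
    """
    return np.abs(a.astype(np.int16) - b.astype(np.int16)).astype(np.uint8)


# Small random test images (hypothetical data, fixed seed).
rng = np.random.RandomState(0)
img_a = rng.randint(0, 256, size=(32, 32, 3)).astype(np.uint8)
img_b = rng.randint(0, 256, size=(32, 32, 3)).astype(np.uint8)

same = np.array_equal(diff_loop(img_a, img_b), diff_numpy(img_a, img_b))
print("loop and numpy versions agree: %s" % same)
```

Both functions compute the same absolute difference. The second one simply moves the per-pixel work out of the Python interpreter.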

The numpy version is about 30 times faster on the 1024x1024 image and more than 200 times faster on the 3840x2160 image. The memory consumption dropped to about 50% and 15% of the original, respectively. Actually, my first numpy implementation was only about twice as fast, so I was disappointed. After profiling, I found that the sum function spent most of the time. I used the sum function to count the non-zero elements in the array. This sum is Python's built-in function, and it can access a numpy array. However, I suspect that the built-in sum accesses each element one by one and returns it to the Python environment. When I replaced it with numpy.sum, the summation took almost no time, and I achieved the 200 times better performance. This is pretty much like MATLAB programming. (numpy feels like a Python port of MATLAB. I mean it is similar not only in syntax, but also in how you get performance.)
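The difference between the two sum functions can be tried on a small array. This is a sketch on made-up data, not the original program; `numpy.count_nonzero` is shown as one more way to get the same count.

```python
import numpy as np

# Hypothetical difference image: all zeros except three pixels.
diff = np.zeros((256, 256), dtype=np.uint8)
diff[10, 20] = 5
diff[100, 200] = 17
diff[255, 255] = 1

# Boolean mask of the non-zero elements.
mask = (diff != 0)

# Built-in sum: walks over the elements one by one in the interpreter.
count_builtin = int(sum(mask.ravel()))

# numpy.sum: the whole reduction runs inside numpy.
count_np_sum = int(np.sum(mask))

# numpy.count_nonzero: counts directly, without building the mask.
count_nonzero = int(np.count_nonzero(diff))

print("%d %d %d" % (count_builtin, count_np_sum, count_nonzero))
```

All three give the same count. Only the built-in sum has to touch every element from the Python side, which is where I would expect the time to go on a large image.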

ImgCompNumpy.py code


Lx=d: SundayResearch
