World Cup 2010 Monte-Carlo Simulator (1)

One of my friends suggested that we predict the World Cup points. However, I had no idea how. So I decided to write a Monte-Carlo simulator for the prediction. As a Sunday researcher, I am interested in Monte-Carlo simulation and discuss the method with my colleagues, but I have never implemented a simulator that samples from a specific probability distribution. I see this as a good opportunity to implement one.

My company produces programs that use a variant of Monte-Carlo simulation. This works well in some specific areas, like physical simulation, but can I use it for World Cup prediction? I doubt it. I also don't want to spend more than an hour implementing it. If you are a World Cup fan, simulating it probably makes no sense and is no fun. Also, this method cannot predict each individual result anyway. It is like Hari Seldon's psychohistory in Asimov's Foundation: we could predict the average or the distribution of the points in World Cup 2010 based on past World Cups, but we can hardly predict each result. Although the method has this limitation, I know nothing at all about the teams, so I think it is still better than my own guess. I heard this method is also used to predict the stock market.

In mathematics, we can solve a problem only within the range of its assumptions. Assumptions are important. The question is how well my assumptions hold. I set up the World Cup prediction problem with the following assumptions.

  • Assumption 1. In each match, each team's points are independent.
  • Assumption 2. The point distribution of the 2010 World Cup is the same as that of the 2006 World Cup.

First, Assumption 1: this is an outrageous assumption. It means the predicted points do not depend on the opposing team. For example, whichever team Japan plays against, the predicted points are the same. Unreasonable. However, to be honest, I don't know anything about the World Cup (yesterday, I happened to learn that Japan plays in the World Cup this year). So I imagine that maybe most of the teams have similar skills. This is Assumption 1. If this instinct is wrong, the result will tell me. If the assumption is wrong, any mathematics gives us garbage.

Second, I set Assumption 2 because my friend's web page has only the last tournament's results. It might be better to use data from as many past World Cups as possible, as long as the rules did not change. Well, I am just lazy. This assumption might also be wrong.

I implemented a simulator based on these assumptions (wc2010.rb). The program generates a point distribution similar to 2006's using Ruby's pseudo-random number generator. One problem is how to initialize the pseudo-random number generator. This is just luck. I need one number, called the seed. I could use my birthday, or the current time in seconds since January 1st of this year, ... I just picked my friend's suggestion, 42.
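The idea can be sketched in a few lines of Ruby. This is not the original wc2010.rb, only a minimal illustration of drawing each team's points independently from an empirical distribution with a fixed seed. The names COUNTS, sample_score and simulate_matches are made up for this sketch, and the numbers in COUNTS are placeholders, not the real 2006 tallies.

```ruby
# Minimal sketch, not the original wc2010.rb.
# COUNTS maps points => how often that value occurred.
# These numbers are PLACEHOLDERS, not the actual 2006 data.
COUNTS = { 0 => 50, 1 => 40, 2 => 25, 3 => 8, 4 => 4, 5 => 0, 6 => 1 }.freeze

# Draw one value with probability proportional to its count.
# A value with count 0 (here: 5) can never be drawn.
def sample_score(counts, rng)
  pick = rng.rand(counts.values.sum) # integer in 0...total
  counts.each do |score, count|
    return score if pick < count
    pick -= count
  end
  raise "counts must contain at least one positive entry"
end

# Assumption 1: both teams' points are drawn independently.
# Assumption 2: every draw uses the same (previous) distribution.
def simulate_matches(counts, n_matches, seed)
  rng = Random.new(seed) # same seed, same sequence of predictions
  Array.new(n_matches) { [sample_score(counts, rng), sample_score(counts, rng)] }
end

predictions = simulate_matches(COUNTS, 64, 42)
puts predictions.first(3).map { |a, b| "#{a} #{b}" }
```

With a fixed seed such as 42, the same 64 pairs come out on every run, which is why the choice of the seed is "just luck".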

The points distribution of the last World Cup is as follows.

WC2006 result distribution

Points
  0    :************************************************
  1    :************************************
  2    :****************************
  3    :***********
  4    :****
  5    :
  6    :*

My simulator's distribution

Points
  0    :******************************************
  1    :******************************************
  2    :*******************************
  3    :*******
  4    :*****
  5    :
  6    :*

They are kind of similar. In World Cup 2006, no team scored 5 points, so the prediction never contains 5 points either. If a team scores 5 points this time, the simulator has no chance. It also never predicts more than 6 points.

The following are the simulator's predictions. So far only one result has been predicted correctly. I think Assumption 1 is not so good.

 Estimate       Result
 [1]: 2 1       1 1
 [2]: 2 0       0 0
 [3]: 2 1       2 0
 [4]: 1 0       1 0     -> match ARG:NGA
 [5]: 2 3       1 1
 [6]: 1 2       0 1
 [7]: 1 1       0 1
 [8]: 2 3       4 0
 [9]: 2 2       2 0
[10]: 0 0       1 0
[11]: 0 1       1 1
[12]: 0 2       1 1
[13]: 2 0       0 0
[14]: 0 0
[15]: 1 1
[16]: 0 0
[17]: 1 1
[18]: 0 4
[19]: 2 2
[20]: 1 2
[21]: 1 4
[22]: 0 6
[23]: 2 1
[24]: 1 0
[25]: 1 1
[26]: 0 1
[27]: 1 2
[28]: 1 3
[29]: 1 3
[30]: 0 2
[31]: 1 0
[32]: 0 1
[33]: 0 0
[34]: 0 2
[35]: 1 0
[36]: 3 3
[37]: 0 2
[38]: 1 0
[39]: 1 2
[40]: 2 1
[41]: 1 0
[42]: 4 0
[43]: 0 0
[44]: 1 1
[45]: 0 1
[46]: 2 0
[47]: 0 4
[48]: 0 1
[49]: 2 0
[50]: 1 2
[51]: 1 0
[52]: 2 0
[53]: 0 1
[54]: 1 2
[55]: 0 2
[56]: 0 0
[57]: 1 1
[58]: 2 4
[59]: 3 1
[60]: 0 2
[61]: 1 2
[62]: 1 2
[63]: 1 0
[64]: 0 2
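The comparison for the 13 matches that already have a result in the table above can be checked with a short Ruby script. This is only a sketch: the helper rows_where and the constant names are made up for this example, and the two arrays are copied by hand from rows [1] to [13].

```ruby
# Rows [1]..[13] of the table above, copied by hand.
ESTIMATES = [[2, 1], [2, 0], [2, 1], [1, 0], [2, 3], [1, 2], [1, 1],
             [2, 3], [2, 2], [0, 0], [0, 1], [0, 2], [2, 0]].freeze
RESULTS   = [[1, 1], [0, 0], [2, 0], [1, 0], [1, 1], [0, 1], [0, 1],
             [4, 0], [2, 0], [1, 0], [1, 1], [1, 1], [0, 0]].freeze

# Return the 1-based row numbers for which the block is true.
def rows_where(estimates, results)
  (0...results.length).select { |i| yield estimates[i], results[i] }
                      .map { |i| i + 1 }
end

# Rows where both points match exactly.
exact = rows_where(ESTIMATES, RESULTS) { |e, r| e == r }

# Rows where at least the outcome (win / draw / loss) matches.
same_outcome = rows_where(ESTIMATES, RESULTS) do |e, r|
  (e[0] <=> e[1]) == (r[0] <=> r[1])
end

puts "exact: #{exact.inspect}"               # exact: [4]
puts "same outcome: #{same_outcome.inspect}" # same outcome: [3, 4, 6]
```

Only row [4] matches exactly, and in two more rows at least the winner is right.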

Comments

Unknown said…
You are crazy! I don't understand a word of what you say, but please let me know, if math and reality match.
Do you actually watch soccer?
After Japan has lost today, I hope you will support the German team on Saturday?!
Best, Rebecca
Shitohichi said…
Yes, it is a kind of crazy idea. Math is a method of finding similarities and patterns. If something happened in the past and it could happen again, we can predict it somehow. But this method has a huge limitation, and I don't want to bet on these numbers. Practically, this is throwing a die to decide the points, but it is a little bit better than that, since this die respects the last WC result. For example, this die has the faces 0, 1, 2, 3, 4, 6 (no 5), and 0 shows up with 40.1 percent probability, based on the last WC. This time there was a 7-0 match, and this cannot be predicted, since no team had 7 points in the last WC. So, it's just for fun. Currently I have 51 points (2 points for the correct winner, 3 points for the correct winner + correct difference, 4 points for an exact prediction).
