My 3-year-old kid has become a huge, huge fan of his card game “Ballons.” We’ve been playing several games a day for the last three months. The game is straightforward, so after a while, I had to find something to keep it interesting. What about finding a better strategy than his? To do so, we will study the game’s properties and write many Monte Carlo simulations.
Ballons is a card game for 2 to 4 players.
These are the rules of the game:
That’s it. The ratings often say “it’s a great first game for small kids, but it quickly becomes boring for everybody.”
Initial state of the game
The game after a few action cards
There were many questions, but let’s see:
We will focus on two-player games because that’s easier to visualize, and because that’s the only way we play.
To answer the last question, we need to model the game and answer the following question:
The only possible choice in this game happens when you can recover a balloon and more than one of yours has been popped. How do you choose which balloon to recover?
My kid seems to choose randomly, so modeling his play was easy:
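A minimal sketch of that model (the function and argument names are mine, not the actual ballons library API):

```python
import random

def choose_recovery(popped_balloons):
    """Choose which popped balloon to recover, the way my kid does:
    uniformly at random among the popped ones."""
    return random.choice(popped_balloons)
```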
I wrote a lot of Rust and Python code to find the answers to my questions. I implemented the rules of the game, then wrote various simulations to produce the numbers we’ll see in this article. The ~200 lines of the ballons library contain the game code. This library is then used by the various simulations at the root of the project.
The simulations are all Monte Carlo based, which means, in short, “run as many games with random initial conditions as you can, then average the results.” Due to the simplicity of the game, standard Monte Carlo is enough.
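In code, every simulation in this project boils down to the same skeleton (a sketch; `run_one_game` stands in for whatever quantity a given simulation measures):

```python
def monte_carlo(run_one_game, nb_iterations):
    """Run many independent random games and average the results."""
    return sum(run_one_game() for _ in range(nb_iterations)) / nb_iterations
```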
I did not need to write two implementations: it just happens that it was a fun project, and I got carried away with a never-ending stream of «just one more thing»: more speed, better algorithms, FFI, parallelization, and so on.
Monte Carlo is iterative. Accuracy in a Monte Carlo simulation converges with a square-root law: to gain one more digit of accuracy (a ×10 improvement), you need 10² = 100 times more iterations. So getting good accuracy on enough digits takes a lot of iterations, and powers of 100 grow quickly.
I ran between 10k and 490 million simulations per program, which was a bit slow for certain configurations (~30 min) in Python. Using a faster language like Rust helped. Every simulation is independent: Monte Carlo is easy to parallelize, so in the end I took advantage of that as well.
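Since the games are independent, splitting the iterations across worker processes is enough. A sketch using only Python’s standard library (`run_one_game` is again a stand-in; the real project went through Rust and FFI as well):

```python
from concurrent.futures import ProcessPoolExecutor

def run_batch(nb_iterations):
    # Each worker runs its own independent batch of games.
    return sum(run_one_game() for _ in range(nb_iterations))

def parallel_monte_carlo(nb_iterations, nb_workers=8):
    batch = nb_iterations // nb_workers
    with ProcessPoolExecutor(max_workers=nb_workers) as pool:
        totals = pool.map(run_batch, [batch] * nb_workers)
    return sum(totals) / (batch * nb_workers)
```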
What makes this game interesting to me is that it sounds easy to simulate and very limited:
Let’s see first what the game depth is, then we will estimate the branching factor.
That was my initial question: some games are very long. This is a recurring complaint in the reviews. In my experience, the perfect match requires no more than two deals of the action deck; otherwise, my kid loses interest. He likes dealing cards and playing many games, not spending a long time in a single game.
So I wrote a Monte Carlo simulation: I ran many random games with random hands, counted the length of the games, then plotted a bar chart of the results:
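In sketch form, assuming a hypothetical `play_random_game()` helper that plays one full random game and returns its length in turns:

```python
from collections import Counter
import matplotlib.pyplot as plt

lengths = Counter(play_random_game() for _ in range(100_000))

# Bar chart: x = game length in turns, y = how often that length occurred.
turns = sorted(lengths)
plt.bar(turns, [lengths[t] for t in turns])
plt.xlabel("game length (turns)")
plt.ylabel("number of games")
plt.show()
```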
When you lower the number of parent cards, the curve gets skewed towards shorter games. That’s quite intuitive, but it’s the first result showing we can learn actual numbers (the real probabilities) from our code.
In my experience, the ideal number of parent cards is 3 or 4. With 5, games tend to be too long; with 1 or 2, they are often very short.
Let’s move on to some more fun stuff!
The branching factor counts how many states we can reach from a given state.
For us, the branching factor derives from the 25 action cards:
There are:
The intuition is that the branching factor is low. How low?
We can find some bounds:
Then, we can average it:
So an upper bound for the branching factor is 20/25 × 1 + 5/25 × 4 = 0.8 + 0.8 = 1.6.
Finally, we can simulate it: every time we pick a card in a game, we count how many branches are possible.
We will do so in many games:
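Sketched, the counter could look like this (all helper names here are assumptions, not the real library API); the actual utility follows:

```python
def estimate_branching_factor(nb_iterations):
    """Average, over many random games, the number of choices each drawn
    action card offers (1 for a pop, up to 4 for a recovery)."""
    branches, draws = 0, 0
    for _ in range(nb_iterations):
        game = new_random_game()                 # hypothetical helper
        while not game.is_over():
            card = game.draw_action_card()       # hypothetical helper
            branches += game.nb_possible_choices(card)
            draws += 1
            game.play_random(card)               # hypothetical helper
    return branches / draws
```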
python3 gen_branching_factor.py --nb_iterations 10000
1.2827724636190512
With 10000 iterations, the number we find is 1.28.
That’s very low; in chess, the branching factor is around 30 on average. This is the source of complaints in the reviews: there is not a lot of choice for the adult :(
That being said, even with a low branching factor, long games mean that many distinct games are possible:
>>> 1.28 ** 100
52601359015.48384
It’s hard to explore all the branches.
We’ll run many random games again in a classic Monte Carlo fashion, but this time we’ll look at:
The 25 balloon cards: there are 5 balloons for each of the 5 different colors:
With this, the hands of 5 cards follow a structure that I’ll use during the rest of this article. They can be one of these:
where:
In the sample initial state above, even though the cards are not sorted, we saw a “311 vs. 2111” game.
It turns out we can enumerate all the hands. At first, I did this through Monte Carlo, but it’s not very difficult to list the hands of all seven structures directly. If we pick five random cards, here are the odds that they’ll follow a given configuration:
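Here is a sketch of how to estimate those odds (not the actual utility): deal 5 cards from the 25-card balloon deck, then classify each hand by its sorted color counts, which gives exactly the structure strings above:

```python
import random
from collections import Counter

DECK = [color for color in "RYBGP" for _ in range(5)]  # 5 balloons x 5 colors

def structure(hand):
    """E.g. 2 red + 1 yellow + 1 green + 1 purple -> '2111'."""
    counts = sorted(Counter(hand).values(), reverse=True)
    return "".join(map(str, counts))

N = 1_000_000
odds = Counter(structure(random.sample(DECK, 5)) for _ in range(N))
for s, n in odds.most_common():
    print(s, n / N)
```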
On the chart, the 5s and the 14s overlap, so it’s hard to read:
Some conclusions:
I ran, again, many Monte Carlo games.
At first, for each of a given number of iterations, I picked 2 random hands and ran a game for this encounter.
time python3 gen_hand_heatmap.py -p 5 -b 5 -i 10000000
…
real 9m57,285s
The problem with this approach is that, as we’ve seen, some hands are very infrequent, so it was tough to get good numbers for some encounters.
Here is a more visual translation of this problem, through a heatmap, with all the games after the 10000000 iterations:
We can see that there were:
Those numbers are about what the theory suggests: a random hand has structure 2111 with probability ≈ 0.47, so an encounter of two such hands happens with probability ≈ 0.47² ≈ 0.22.
And 0.22 × 10000000 games = 2200000 games, which is more or less what we found during the simulation (2264360 games of 2111 vs. 2111).
We can derive those numbers for the other examples in the same manner.
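We can even verify the 2111 number exactly with a quick combinatorial computation (assuming the 25-card deck of 5 colors × 5 balloons):

```python
from math import comb

# P(a 5-card hand has structure 2111):
# choose the paired color (5 ways), which 2 of its 5 balloons (C(5,2)),
# which 3 of the 4 remaining colors appear once (C(4,3)),
# and 1 balloon out of 5 for each of those three colors (5**3).
p_2111 = 5 * comb(5, 2) * comb(4, 3) * 5**3 / comb(25, 5)
print(p_2111)       # ~0.47
print(p_2111 ** 2)  # ~0.22: the odds of a 2111 vs. 2111 encounter
```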
So if we go for a purely random approach, we have two problems:
A solution to those two problems? We will generate all the hands for each of the seven structures, then generate all the possible encounters for the 7x7 possible hand structure pairs.
I wrote a short utility. It turns out there are not that many hands for each structure:
$ time python3 gen_every_hands.py
11111 1
5 5
41 20
32 20
311 60
221 60
2111 120
In total, that’s 286 different hands.
So now, for each pair of structures, we run many iterations of games with two hands picked randomly from the encounter list we generated. By doing so, we get a completely uniform distribution over the games played: all the probabilities we end up with will have the same level of trust, no matter how rare the encounter is.
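A sketch of one heatmap cell (`ALL_HANDS` stands for the enumeration utility’s output, a dict from structure to its list of hands, and `play_game` for a hypothetical runner returning 1 when player A wins):

```python
import random

def win_rate(structure_a, structure_b, nb_iterations):
    """Estimate player A's win rate for one cell of the 7x7 heatmap."""
    wins = 0
    for _ in range(nb_iterations):
        hand_a = random.choice(ALL_HANDS[structure_a])
        hand_b = random.choice(ALL_HANDS[structure_b])
        # Pairs needing more than 5 balloons of one color are impossible
        # with a real deck and would have to be re-drawn (see below).
        wins += play_game(hand_a, hand_b)
    return wins / nb_iterations
```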
We can also notice that some games are impossible: with a standard deck, you cannot deal a 5 vs. 11111, since the 5 hand uses all five balloons of one color, leaving none of that color for the 11111 hand.
Now, we can simulate the games and plot the winning rates for each kind of encounter:
time python3 gen_hand_heatmap_uniform.py --nb_iterations 100000
…
real 4m42,866s
Many things to say already:
Ok, so here we are: we can simulate many games with two players who play randomly, and we can estimate the winning odds when two hands face each other.
It’s time to simulate the same thing, but we will change the strategy for one player (ours). Can we crush him? Again, the only strategic choice is which balloon to recover. There are many possible strategies, like:
Those do not look promising though: we want to improve our odds. There is something we can use: deck knowledge.
We can count, for each color, the cards remaining in the action deck. When we have to choose which balloon to recover, we pick the color that is least represented in the remaining action deck.
The idea behind this heuristic is that, in the long run, the balloon we recover is less likely than the others to be popped again.
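A sketch of the chooser (attribute and helper names are assumptions):

```python
def choose_recovery_with_counter(popped_balloons, remaining_action_deck):
    """Recover the popped balloon whose color has the fewest pop cards
    left in the action deck: it is the least likely to be popped again."""
    def remaining_pops(balloon):
        return sum(1 for card in remaining_action_deck
                   if card.pops == balloon.color)  # hypothetical attributes
    return min(popped_balloons, key=remaining_pops)
```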
Let’s simulate many games for this heuristic:
python3 gen_hand_heatmap_uniform_with_counter.py -i 100000
Then, I put the result in a heatmap:
That’s a bit hard to read. Did we improve? Let’s compute the difference between the last two charts to see how much we improved:
That’s a small improvement for 11111 and 2111, but it’s an improvement anyway! Given the low branching factor and the high randomness, we couldn’t have expected a considerable improvement.
Note: subtracting two imprecise numbers increases the imprecision. I never properly quantified the error margin, so I wouldn’t put too much trust in those numbers.
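For what it’s worth, a back-of-the-envelope bound (assuming each heatmap cell is a win rate estimated over roughly n = 100000 games; the real per-cell count may differ):

```python
from math import sqrt

p, n = 0.5, 100_000         # variance of a win rate is largest at p = 0.5
se = sqrt(p * (1 - p) / n)  # standard error of one cell: ~0.0016
se_diff = sqrt(2) * se      # error of the difference of two cells: ~0.0022
print(f"one cell: ±{se:.4f}, difference: ±{se_diff:.4f}")
```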
Now you know how to play against your 3yo ;)
Ok, so that was a very complicated way to prove my point: it is possible to improve your odds (here, only for certain configurations) with a simple heuristic, and to quantify the improvement.
Anyway, along the way we’ve learned a bunch of things about game theory, so that’s a net win.
That was fun but I’ve already spent way too much time with this game. Here are some ideas if you think this is a serious project that deserves some more research:
red/red/red/blue/blue vs. yellow/yellow/yellow/green/purple is one color permutation (swap red and yellow) away from yellow/yellow/yellow/blue/blue vs. red/red/red/green/purple, so the two encounters asymptotically have the same odds.