It's interesting to watch the videos they link of DeepMind playing against the top-level Stratego masters [0]. I usually find Stratego to be a bit of a dull game (less elegant and more drawn out than Go and chess), but I'm a sucker for watching top-level AIs play.

Its skills for bluffing are both fascinating and a bit scary.


Call me a cynic, but the fact that after almost 10 years of AI hype we are still working our way down the list of popular board games is a bit of a downer for me. I mean, having AIs to play Stratego, Risk, Go, Diplomacy and what have you against sure is nice. But there are literally billions of dollars being spent on these projects, and I've really come to the point where I just don't believe anymore that the current AI approaches will ever generalize to the real world, even in relatively limited scopes, without the need for significant human intervention and/or monitoring. What am I missing?

Great article. I played Stratego a lot as a kid and it always felt simpler than chess, Go, or poker, so it’s surprising that it has a much bigger game tree, until you stop and think about it.

I’m curious about the comparisons to poker. I know the hot algorithm in poker solvers is counterfactual regret minimization (CFR). The article indicates that the feedback cycle is too long for those algorithms to work, but I’d be curious to learn more about the relationship between CFR and what’s tried here, if any.
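For anyone who hasn't seen it, here is a minimal sketch of regret matching, the per-decision update that CFR is built on. This is not full CFR and it is not a claim about what the article's bot does: the rock-paper-scissors payoff table, the fixed opponent mix, and the function names are all assumptions made up for illustration.

```python
# Payoff for "me": PAYOFF[my_action][opp_action], actions = rock, paper, scissors.
# (Toy game chosen for illustration; nothing here comes from the article.)
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0:
        # No action has positive regret yet: fall back to uniform play.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def regret_matching(opponent, iterations=1000):
    """Return the average strategy after running regret matching
    against a fixed opponent mix, using expected (not sampled) payoffs."""
    n = len(PAYOFF)
    regrets = [0.0] * n
    strategy_sum = [0.0] * n
    for _ in range(iterations):
        strategy = strategy_from_regrets(regrets)
        # Expected payoff of each pure action against the opponent mix.
        action_values = [
            sum(PAYOFF[a][o] * opponent[o] for o in range(n)) for a in range(n)
        ]
        expected = sum(strategy[a] * action_values[a] for a in range(n))
        for a in range(n):
            # Regret = how much better action a would have done than our mix.
            regrets[a] += action_values[a] - expected
            strategy_sum[a] += strategy[a]
    return [s / iterations for s in strategy_sum]


if __name__ == "__main__":
    # Opponent over-plays rock, so the average strategy drifts toward paper.
    print(regret_matching([0.4, 0.3, 0.3]))
```

CFR applies this same update at every information set of the game tree, which is where game length starts to hurt.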

Can anyone shed light on how this is more challenging than the StarCraft or Dota agents, which also had to work with imperfect information?

I remember seeing a version of the paper earlier in the year (it talked a lot about getting the bot to be aggressive to avoid stalemates).

Feels like the secret sauce has to be probability distributions guessing what all the pieces are.
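To make that guess concrete, here is a rough sketch of what tracking a probability distribution over one hidden piece could look like. It is only an illustration of the idea, not a description of how the bot in the article works. The piece counts are the standard 40-piece setup; `prior` and `observe_move` are hypothetical helper names.

```python
# Standard 40-piece Stratego army (counts per rank).
PIECE_COUNTS = {
    "flag": 1, "bomb": 6, "spy": 1, "scout": 8, "miner": 5,
    "sergeant": 4, "lieutenant": 4, "captain": 4, "major": 3,
    "colonel": 2, "general": 1, "marshal": 1,
}

# Ranks that can never move under the normal rules.
IMMOVABLE = {"flag", "bomb"}


def prior(counts):
    """Belief over a hidden piece's rank, proportional to pieces remaining."""
    total = sum(counts.values())
    return {rank: n / total for rank, n in counts.items() if n > 0}


def observe_move(belief):
    """Update the belief after seeing the piece move:
    drop the immovable ranks and renormalize what is left."""
    filtered = {r: p for r, p in belief.items() if r not in IMMOVABLE}
    total = sum(filtered.values())
    return {r: p / total for r, p in filtered.items()}


if __name__ == "__main__":
    belief = prior(PIECE_COUNTS)
    print(belief["bomb"])    # chance an untouched piece is a bomb
    belief = observe_move(belief)
    print(belief["scout"])   # chance it is a scout once it has moved
```

A real player would also have to weigh how the opponent moved the piece, which is exactly where bluffing muddies the update.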

Bluffing in Stratego seems like it requires long-term planning (if you move a 2 like a 10, you have to keep treating it like that for the bluff to work).

There's an extra space in the link to their code (at the end of the article). The correct URL is:

On a related note, and surely interesting to the HN crowd: I invented Clesto, which is similar to Stratego but a bit more like chess, based on an old Chinese game: - it is quicker to play and has open information.

Did anyone else call the dude with the tall hat "Hat Guy" despite them all having hats?

The paper is sadly paywalled. I believe this is the preprint:

I 404'ed when I tried to access the source code?

Someone needs to create a web front end for this -- I would love to play it.

Tangentially taking this opportunity to mention the far-superior "Lying and Cheating" version of Stratego, that (as far as I know) my father invented.

It makes the game so much more interesting, IMO. Played it a lot as a child.

Here are the basic rules, when a piece is attacked:

  * The attacker says what their piece is, without showing it (they can lie)
  * The defender says whether they believe that
  * The defender says what their piece is, without showing it (they can lie)
  * The attacker says whether they believe that
  * ONLY IF someone calls a bluff is that piece revealed. Otherwise, it is treated as the piece it was claimed to be, and kept hidden.
  * If someone calls a bluff, and they were right, then the other player loses a piece (reach over and remove any piece you like)
  ** If you pick their flag, then you win — game over.
  * Likewise, if someone calls a bluff but is wrong, then *they* lose a piece. 
  * After all of that is resolved, do combat as normal, with pieces having either their revealed or not-revealed claimed value, as appropriate.

Once you resolve all this, there is no "memory": you can claim it is a different piece in the future.
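The claim/challenge step above can be sketched in a few lines of Python. This is one reading of the rules as listed, with hypothetical names (`ClaimResult`, `resolve_claim`); it handles a single claim, so an attack would call it once for the attacker and once for the defender. Choosing which piece to remove, and the combat that follows, are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClaimResult:
    effective_rank: str          # rank the piece counts as in the combat that follows
    revealed: bool               # True only if the bluff was called
    loses_piece: Optional[str]   # "claimer", "challenger", or None


def resolve_claim(true_rank: str, claimed_rank: str, challenged: bool) -> ClaimResult:
    """Resolve one player's claim about their own piece."""
    if not challenged:
        # Nobody called it: the piece stays hidden and counts as claimed.
        return ClaimResult(claimed_rank, False, None)
    if claimed_rank != true_rank:
        # The call was right: the claimer lied and loses a piece
        # (the challenger picks which one; taking the flag ends the game).
        return ClaimResult(true_rank, True, "claimer")
    # The call was wrong: the challenger loses a piece instead.
    return ClaimResult(true_rank, True, "challenger")


if __name__ == "__main__":
    # A bomb attacking while claiming to be a "5", and nobody calls it.
    print(resolve_claim("bomb", "5", challenged=False))
    # Same claim, but this time the opponent calls the bluff.
    print(resolve_claim("bomb", "5", challenged=True))
```

Because nothing is stored between calls, the "no memory" rule falls out for free: the next claim about the same piece can be anything.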

Some minutiae:

  * You can move any piece as though it were a Scout (9), but when you do the move, the other player can call your bluff since you're essentially claiming it is a Scout at that moment. Resolve that bluff/call before completing the move.
  * You could even call a bluff on *any* move someone makes, if you believe that piece is a bomb or flag (and thus cannot move).
  * You can attack with a bomb! It's a two-step process: first you move (and they could call your bluff, if they know it is a bomb - see above). Then, when the attack happens, you say it *is* a bomb. Of course, your opponent may say their piece is a Miner, and if you haven't seen it, it's a dangerous proposition (since bombs are rare).
  ** You can also do a variant where bombs can't attack (by attacking, you are claiming it is *not* a bomb). I prefer the above version.

Overall, I find this version of the game is a lot less boring. Since you'll probably get several pieces zapped over the course of the game, it affects your flag placement. Plus, you can move flags and bombs, making it more dynamic. Also, the "remember where things were" aspect is even more poignant, since once a piece has been revealed, it loses all the power of being whatever-is-needed-right-now (assuming the other player has a good memory).

So, for instance, you can do something crazy like move your bomb as though it were a Scout, all the way across the board, onto an opponent's piece, but then claim it's a "5" instead for the attack. Then if it survives, just let it sit there, continuing to be a bomb in the future (causing havoc).

One of my favourite things about watching AIs learn and play games I’ve played is seeing if they come across the same weird strategies I did.

I remember my brother and I as kids playing Stratego and discovering the “impenetrable bunker of bombs” to put your flag in. Which evolved to “put a scout in as a ruse” and later “don’t actually enclose it because now brother just assumes it’s enclosed.”

I'd love to see them tackle Chess

What happened to mastering StarCraft? Why did these guys give up after a mid-tier pro player defeated the bot handily on live stream? They were super enthusiastic up until that point.

I'd rather learn how to play Strategema.