# Anything you can do AI can do better (?): Playing games at a new level

Learning to play games has been a great test for AI.  Being able to generalise from relatively simple rules to find optimal solutions shows a form of intelligence that we humans always hoped would be impossible.  Back in 1997, when IBM’s Deep Blue beat Garry Kasparov at chess[1], we saw that machines were capable of more than brute-force solutions to problems.  Twenty years later[2], not only has AI mastered Go, with Google DeepMind’s AlphaGo winning 4-1 against Lee Sedol, one of the world’s best players, and IBM’s Watson mastered Jeopardy, but there have also been some great examples of game play with many of the games I grew up playing: Tetris, PacMan[3], Space Invaders and other Atari games.  I am yet to see any AI complete Repton 2.

All these gaming victories, while exceptionally impressive and achieved without pure brute-force calculation, are based on known data.  There may be around $10^{170}$ legal board positions in Go, but all the data is available.  You (and any AI) can see the entire board.  You know the number of pieces.  All the information you need to make a decision is available to you.  Similarly with Jeopardy: it is not so much about moves as about data connections based on a known corpus, used to create the (very limited type of) question from the answer.
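To put the $10^{170}$ figure in context, here is a rough sanity check in Python.  It is only a sketch: the upper bound simply treats each of the 361 points on a 19x19 board as empty, black or white, and the `LEGAL_POSITIONS_APPROX` constant is an approximate published count of legal positions that I have hard-coded as an assumption rather than computed.

```python
# Rough sanity check on the size of Go's state space (illustrative only).

BOARD_POINTS = 19 * 19  # 361 intersections on a full-size board

# Every point is empty, black or white, so 3**361 is a crude upper bound
# that also counts illegal arrangements.
UPPER_BOUND = 3 ** BOARD_POINTS

# Assumption: approximate count of legal positions, hard-coded here.
LEGAL_POSITIONS_APPROX = 2.08e170


def order_of_magnitude(n: int) -> int:
    """Return the power of ten of a positive integer (digits minus one)."""
    return len(str(n)) - 1


def legal_fraction() -> float:
    """Approximate share of all arrangements that are legal positions."""
    return LEGAL_POSITIONS_APPROX / float(UPPER_BOUND)


if __name__ == "__main__":
    print("upper bound is about 10^%d" % order_of_magnitude(UPPER_BOUND))
    print("legal fraction is about %.3f" % legal_fraction())
```

However large the number, the point stands: every one of those positions is fully visible to both players.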

Eighteen months ago I started thinking about a different gaming problem.  I have played a lot of poker, both live and online, and was curious about applying deep learning to a situation where there is more to consider than just facts.  Poker is not only a problem of missing information (you can only see your own cards and any shared cards, which are a subset of the total pack); player personalities are also part of the game[4].  How you bet gives an indication of the strength of your hand, but you can lie with this.  Limping in when you have a pair of aces can entice other players to bet when they may not have a great hand.  A ballsy all-in with a 5-2 offsuit can win you a pot you don’t deserve.  This is a problem of judgement, and I was keen to see if any AI could master this intuition.
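To make the missing information concrete, here is a minimal sketch that counts the two-card hands an opponent could be holding, given only the cards you can see.  It assumes heads-up Texas hold’em (two private cards each, up to five shared cards); the card notation and the `possible_opponent_hands` helper are my own illustrative choices, not anything from a poker library.

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"  # clubs, diamonds, hearts, spades
DECK = [rank + suit for rank in RANKS for suit in SUITS]  # 52 cards


def possible_opponent_hands(known_cards):
    """List every two-card hand the opponent could hold.

    known_cards: the cards visible to us (our own hole cards plus any
    shared cards). Any hand containing one of them is impossible.
    """
    known = set(known_cards)
    unseen = [card for card in DECK if card not in known]
    return list(combinations(unseen, 2))


if __name__ == "__main__":
    # Before any shared cards: we only know our own pair of aces.
    preflop = possible_opponent_hands(["As", "Ad"])
    # After all five shared cards have been dealt.
    river = possible_opponent_hands(["As", "Ad", "Kh", "7c", "2d", "9s", "3h"])
    print(len(preflop), len(river))
```

Even at the very end of a hand there are still hundreds of holdings the other player might have, and nothing on the table tells you which one it is.  Only the betting does, and the betting can lie.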

As I was gainfully employed at the time, working long hours[5], I never got beyond pondering this while doing other things, but it’s always been something I’ve wanted to pick up.  However, in the past few days, an early version of a paper has been released on arXiv describing an AI that can beat professional poker players, so I get my answer!  You can read the paper at https://arxiv.org/abs/1701.01724 and it’s very accessible, even if you don’t have a background in maths or computing (although you may need an understanding of poker!).

A collaboration between the University of Alberta in Canada and Charles University and the Czech Technical University, both in Prague, Czech Republic, has resulted in a new algorithm called DeepStack, which is designed for imperfect information situations.  Unlike previous attempts at creating poker-playing AI, what is novel about DeepStack is that it judges each situation as it occurs rather than deciding on a style of play at the start of each hand.  This avoids redundant reasoning about outcomes that have been ruled out by the actions of other players or by newly revealed cards.

This is a much more natural[6] way of reasoning.  When playing against humans, the DeepStack AI had a win rate of over 450 milli-big-blinds per game[7], which is considerable.  The algorithm looks at several things: what is currently known (visible cards and probabilities, hands that are impossible), plus two look-ahead steps.  The first look-ahead is shallow and covers what could happen and how play would adapt for each outcome.  The second look-ahead is deeper, focussing on what the algorithm thinks is most likely to happen based on its own “intuition”.  This is constantly updated so that the other players’ betting styles can be taken into account, just as a human player would judge whether a large raise is a bluff or not.  A combination of all three gives the resulting action.  Every assumption acted on is scored against the desired outcome and this information is fed back into the decision.
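For anyone unfamiliar with the unit, the arithmetic behind milli-big-blinds per game can be sketched as follows.  A milli-big-blind is one thousandth of a big blind, and I am assuming here that a “game” means a single hand; the function names and the blind size in the example are purely illustrative.

```python
def mbb_per_game(total_won_in_big_blinds, games_played):
    """Average winnings in milli-big-blinds per game.

    total_won_in_big_blinds: net winnings, measured in big blinds.
    games_played: number of games (assumed to mean hands) played.
    """
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return 1000.0 * total_won_in_big_blinds / games_played


def chips_per_game(win_rate_mbb, big_blind_in_chips):
    """Convert a win rate in mbb per game into chips won per game."""
    return win_rate_mbb * big_blind_in_chips / 1000.0


if __name__ == "__main__":
    # Hypothetical session: 450 big blinds won over 1,000 games.
    rate = mbb_per_game(450, 1000)
    # With a hypothetical big blind of 100 chips.
    print(rate, chips_per_game(rate, 100))
```

In other words, a win rate of 450 milli-big-blinds per game means winning a little under half a big blind, on average, every time the cards are dealt.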

What interests me most is that the authors describe this as a general-purpose algorithm for situations with imperfect information.  I’m very excited by the applications this could lead to.

And I’d love to give it a game 🙂

1. It was a narrow victory due to draws but very impressive nevertheless.
2. Yes, 1997 was 20 years ago!
3. Which used to be the de facto standard for games testing with AI.
4. If you don’t know poker, the rest of this post may not make a huge amount of sense, but you can get a summary here.
5. And doing my maths degree and a 3D printer etc etc 🙂
6. i.e. human.
7. A general measure of winning in poker.

Dr Janet is a Molecular Biochemistry graduate from Oxford University with a doctorate in Computational Neuroscience from Sussex. I’m currently studying for a third degree in Mathematics with the Open University.

During the day, and sometimes out of hours, I work as a Chief Science Officer. You can read all about that on my LinkedIn page.