Quality Diversity for Texas Hold'Em

An AI poker-playing agent trained with reinforcement learning (RL)

This research project was conducted as part of COMPSCI 683: Artificial Intelligence, under the instruction of Professor Yair Zick. Thank you to my collaborators John Raisbeck, Ruiqi Hu, and Giang Nguyen.

Abstract from "Quality Diversity for Texas Hold'Em":

Limit Texas Hold'em is a popular variant of poker in which there are both community cards, which are common knowledge, and private cards, which are not known to other players, each with numerical values. We train agents to play one-on-one matches of Limit Texas Hold'em in a tournament setting, where each player is randomly matched with an opponent drawn from a set containing itself, all other learning models, and a number of hand-coded Texas Hold'em agents. The learning agents are trained using a reinforcement learning algorithm, Proximal Policy Optimization (PPO), which is an actor-critic method. The reward function combines the player's gain during a round with a diversity bonus at each state, which rewards learning players for being different from one another, inspired by work on Novelty Search [1]. We provide agents with several abstractions of the state space to simplify the game and improve training speed. By running all the models simultaneously in a large training tournament with diversity bonuses, we are able to maintain a high-quality and diverse training regime for each player.
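The reward shaping described above (per-round chip gain plus a diversity bonus) can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the function names, the bonus weight, and the use of an L1 distance between action distributions are all assumptions made for the sketch.

```python
# Hypothetical sketch: combine a player's chip gain for the round with a
# diversity bonus that rewards differing from the other learning agents.
# An agent's "policy" here is its action distribution (fold, call, raise).

def diversity_bonus(agent_policy, other_policies, weight=0.1):
    """Weighted mean L1 distance between this agent's action
    distribution and those of the other learners (illustrative choice)."""
    if not other_policies:
        return 0.0
    mean_dist = sum(
        sum(abs(p - q) for p, q in zip(agent_policy, other))
        for other in other_policies
    ) / len(other_policies)
    return weight * mean_dist

def shaped_reward(chip_gain, agent_policy, other_policies):
    # Total reward = in-game gain + bonus for behavioral diversity.
    return chip_gain + diversity_bonus(agent_policy, other_policies)

# Example: an agent that folds often, compared against peers that mostly call.
me = [0.7, 0.2, 0.1]                      # P(fold), P(call), P(raise)
peers = [[0.1, 0.6, 0.3], [0.2, 0.5, 0.3]]
print(shaped_reward(chip_gain=4, agent_policy=me, other_policies=peers))
```

A distinctive agent thus earns slightly more reward than an identical twin would for the same chip gain, which is the pressure that keeps the training population diverse.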
