AutoCFR: Learning to Design Counterfactual Regret Minimization Algorithms
Hang Xu*, Kai Li*, Haobo Fu, Qiang Fu, Junliang Xing#
AAAI 2022 (Oral)
```bash
sudo apt install graphviz xdg-utils xvfb
conda create -n AutoCFR python==3.7.10
conda activate AutoCFR
pip install -e .
pytest tests
```

To easily run the code for training, we provide a unified interface. Each experiment generates an experiment id and creates a unique directory under `logs`. We use games implemented by OpenSpiel [1].
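As a quick sanity check that the OpenSpiel dependency works, you can load one of its games directly from Python. This is only an illustrative sketch, assuming the `pyspiel` module is importable after installation; `kuhn_poker` is just an example game name, not necessarily one used for training.

```python
# Minimal sketch: verify that OpenSpiel games can be loaded.
# "kuhn_poker" is used here purely as an illustrative example.
import pyspiel

game = pyspiel.load_game("kuhn_poker")
state = game.new_initial_state()
print(game.num_players(), state.legal_actions())
```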
```bash
python scripts/train.py
```

You can modify the configuration in the following ways (see the sketch after this list):

- Modify the operations. Some code is adapted from Meta-learning curiosity algorithms [2]. Specify your operations in `autocfr/program/operations.py` and specify the list of operations to use in `autocfr/generator/mutate.py`.
- Modify the type and number of games used for training. Specify your games in `autocfr/utils.py:load_game_configs`.
- By default, we learn from bootstrapping. If you want to learn from scratch, set `init_algorithms_file` to `["models/algorithms/empty.pkl"]` in `scripts/train.py`.
- Modify the hyperparameters. Edit the file `scripts/train.py`.
- Train on distributed servers. Follow the instructions of Ray to set up your private cluster and set `ray.init(address="auto")` in `scripts/train.py`.
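For illustration, the sketch below shows what the edits described above might look like inside `scripts/train.py`. It is a hypothetical excerpt, not the actual script: the `init_algorithms_file` setting and the `ray.init` call come from the list above, but the surrounding structure of the real script may differ.

```python
# Hypothetical excerpt of edits to scripts/train.py; the real script's layout may differ.
import ray

# Learn from scratch instead of bootstrapping from existing CFR variants.
init_algorithms_file = ["models/algorithms/empty.pkl"]

# Connect to an existing Ray cluster when training on distributed servers;
# omit the address argument to run locally on a single machine.
ray.init(address="auto")
```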
You can use TensorBoard to monitor the training process.

```bash
tensorboard --logdir=logs
```

Run the following script to test algorithms learned by AutoCFR. By default, we test the algorithm with the highest score. `logid` is the generated unique experiment id. The results are saved in the folder `models/games`.
```bash
python scripts/test_learned_algorithm.py --logid={experiment id}
```

Run the following script to test the learned algorithms reported in the paper, i.e., DCFR+, AutoCFR4, and AutoCFRS. The results are saved in the folder `models/games`.

```bash
python scripts/test_learned_algorithm_in_paper.py
```

We use PokerRL [3] to test the learned algorithms in HUNL subgames.
```bash
cd PokerRL
pip install -e .
tar -zxvf texas_lookup.tar.gz
```

Run the following script to test the learned algorithms reported in the paper, i.e., DCFR+, AutoCFR4, and AutoCFRS. The results are saved in the folder `PokerRL/models/`.

```bash
python PokerRL/scripts/run_cfr.py --iters 20000 --game subgame3 --algo=DCFRPlus
python PokerRL/scripts/run_cfr.py --iters 20000 --game subgame3 --algo=AutoCFR4
python PokerRL/scripts/run_cfr.py --iters 20000 --game subgame3 --algo=AutoCFRS
```

If you use AutoCFR in your research, you can cite it as follows:
```bibtex
@inproceedings{AutoCFR,
  title     = {AutoCFR: Learning to Design Counterfactual Regret Minimization Algorithms},
  author    = {Xu, Hang and Li, Kai and Fu, Haobo and Fu, Qiang and Xing, Junliang},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {5244--5251}
}
```
[1] Lanctot, M.; Lockhart, E.; Lespiau, J.-B.; Zambaldi, V.; Upadhyay, S.; Pérolat, J.; Srinivasan, S.; Timbers, F.; Tuyls, K.; Omidshafiei, S.; Hennes, D.; Morrill, D.; Muller, P.; Ewalds, T.; Faulkner, R.; Kramár, J.; Vylder, B. D.; Saeta, B.; Bradbury, J.; Ding, D.; Borgeaud, S.; Lai, M.; Schrittwieser, J.; Anthony, T.; Hughes, E.; Danihelka, I.; and Ryan-Davis, J. 2019. OpenSpiel: A Framework for Reinforcement Learning in Games. CoRR, abs/1908.09453.
[2] Alet, F.; Schneider, M. F.; Lozano-Perez, T.; and Kaelbling, L. P. 2019. Meta-learning curiosity algorithms. In International Conference on Learning Representations, 1–21.
[3] Steinberger, E. 2019. PokerRL. https://github.com/TinkeringCode/PokerRL.