Gymnasium Interface
For reinforcement learning purposes, we provide an OpenAI Gym / Gymnasium environment. To use it, run the following in the root of the catanatron repository:

```bash
pip install -e .[gym]
```

Then write your training loop, making sure to respect `info['valid_actions']`:
```python
import random

import gymnasium
import catanatron.gym

env = gymnasium.make("catanatron/Catanatron-v0")
observation, info = env.reset()
for _ in range(1000):
    # your agent here (this takes random actions)
    action = random.choice(info["valid_actions"])
    observation, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
    if done:
        observation, info = env.reset()
env.close()
```

For action documentation see here.
For observation documentation see here.
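You can also inspect the action and observation spaces directly with the standard Gymnasium API; the snippet below only assumes the environment registers under the id shown above:

```python
import gymnasium
import catanatron.gym  # importing registers the Catanatron environment

env = gymnasium.make("catanatron/Catanatron-v0")
print(env.action_space)       # the discrete action space
print(env.observation_space)  # the observation space layout
env.close()
```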
You can access `env.unwrapped.game.state` and build your own "observation" (features) vector as well.
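For example, here is a minimal sketch of a hand-rolled feature extractor. The `player_state` attribute and its key names are assumptions for illustration; inspect the actual `State` object in your version to confirm what is available:

```python
import numpy as np

def custom_features(env) -> np.ndarray:
    # Reach into the underlying simulator state (bypasses the built-in
    # observation encoding).
    state = env.unwrapped.game.state
    # Assumption: state.player_state is a flat dict of per-player scalars
    # with keys like "P0_VICTORY_POINTS"; adjust to your version's schema.
    ps = state.player_state
    return np.array(
        [ps["P0_VICTORY_POINTS"], ps["P0_ROADS_AVAILABLE"]],
        dtype=np.float32,
    )
```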
For evaluation, and for using your model in the simulator for testing / benchmarking, you might want to check out: https://github.com/bcollazo/catanatron/blob/master/catanatron_experimental/catanatron_experimental/machine_learning/players/reinforcement.py
Stable-Baselines3 Example
Catanatron works well with SB3, and even better with the Maskable models of the SB3 Contrib repo. Here is a small example of how it might work:
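The sketch below trains MaskablePPO from sb3-contrib using an `ActionMasker` wrapper. The `mask_fn` is an assumption: it presumes the unwrapped environment exposes `get_valid_actions()` returning the same integer ids as `info['valid_actions']`; if your version does not, derive the valid ids however your environment makes them available.

```python
import numpy as np
import gymnasium
from sb3_contrib import MaskablePPO
from sb3_contrib.common.wrappers import ActionMasker

import catanatron.gym


def mask_fn(env) -> np.ndarray:
    # Assumption: the unwrapped env exposes get_valid_actions() with the
    # same integer ids as info["valid_actions"].
    valid_actions = env.unwrapped.get_valid_actions()
    mask = np.zeros(env.action_space.n, dtype=bool)
    mask[valid_actions] = True
    return mask


env = gymnasium.make("catanatron/Catanatron-v0")
env = ActionMasker(env, mask_fn)  # exposes action masks to SB3

# "MlpPolicy" assumes a flat (Box) observation; use "MultiInputPolicy"
# if your environment returns a Dict observation.
model = MaskablePPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)
```

At inference time, pass the mask explicitly so the policy never picks an invalid action:

```python
from sb3_contrib.common.maskable.utils import get_action_masks

observation, info = env.reset()
action, _ = model.predict(observation, action_masks=get_action_masks(env))
```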
Configuration
You can also configure which map to use, how many victory points are needed to win, among other environment variables, with the config keyword argument. See the source for details.
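A minimal sketch with hypothetical key names (`map_type` and `vps_to_win` are illustrative; check the environment source for the exact keys your version supports):

```python
import gymnasium
import catanatron.gym

env = gymnasium.make(
    "catanatron/Catanatron-v0",
    # Illustrative config keys; consult the environment source for the
    # keys actually supported by your version.
    config={"map_type": "MINI", "vps_to_win": 8},
)
```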