Agent
The final policy is used to make the agent, which like the analytical agent, follows some rules.
The final policy is used to make the agent, which like the analytical agent, follows some rules.
Two modifications are made to the agent: the minimax algorithm is optimized with alpha-beta pruning and complexity is added to the heuristic.
I made the class for the environment using OpenAI gymnasium and initialized an environment.
This is the Kaggle competition regarding Game AI and Reinforcement Learning.
This is the Kaggle competition regarding Game AI and Reinforcement Learning.
The agent follows this algorithm to decide its next move deterministically:
The policy is a custom CNN policy made using PyTorch: