András Balogh, Márk Jelasity
The study evaluates the soundness of generative models in chess, using adversarial move sequences to reveal the models' limitations and to identify training choices that improve soundness.
This research asks whether generative models, which are trained to predict sequences such as moves in a game, truly internalize the rules of the game, using chess as a test case. The authors developed a method for testing these models by constructing adversarial move sequences designed to provoke an illegal prediction. They found that no model was completely sound, although certain training methods and data choices improved soundness. They also examined how well the models represent the state of the chessboard, finding that most models did not rely on the board state when making predictions.
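To make the legality-testing idea concrete, here is a minimal sketch of such a check, assuming the model is exposed as a function that maps a move sequence to a next-move string in UCI notation; the `is_sound_on` helper and the toy model below are hypothetical illustrations, not the authors' actual implementation. The sketch uses the python-chess library to replay the sequence and verify the prediction against the set of legal moves.

```python
import chess


def is_sound_on(predict_next_move, sequence):
    """Replay a (possibly adversarial) move sequence, then test whether
    the model's predicted continuation is a legal move.

    predict_next_move: callable mapping a list of UCI strings to a UCI string
    sequence: list of UCI move strings, assumed legal from the start position
    """
    board = chess.Board()
    for uci in sequence:
        board.push_uci(uci)  # apply each move of the probe sequence

    predicted = predict_next_move(sequence)
    try:
        return chess.Move.from_uci(predicted) in board.legal_moves
    except ValueError:
        return False  # a malformed move string counts as an unsound prediction


# Toy "model" that always answers e2e4, regardless of context:
print(is_sound_on(lambda seq: "e2e4", []))        # True: legal from the start position
print(is_sound_on(lambda seq: "e2e4", ["e2e4"]))  # False: e2 is now empty and Black moves
```

In these terms, an adversarial search of the kind the paper describes would look for short sequences on which a check like this fails.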