Yuma Fujimoto, Kenshi Abe, Kaito Ariu
The paper demonstrates that using extra optimism in Weighted Optimistic Gradient Descent-Ascent (WOGDA) significantly accelerates convergence in bilinear games with delayed feedback.
In many real-world settings where multiple agents learn together, feedback arrives late, and that delay slows learning. This study examines how to speed up learning in such settings using a method called Weighted Optimistic Gradient Descent-Ascent (WOGDA). The method predicts future rewards and places more weight on that prediction than standard optimistic methods do, which helps agents converge to an equilibrium strategy more quickly even when feedback is delayed. The findings show that this 'extra optimism' can effectively counteract the negative effects of feedback delays.
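To make the idea concrete, here is a minimal sketch of a weighted optimistic update on the simplest bilinear game, f(x, y) = x * y, where x is minimized and y is maximized. It is an illustration under stated assumptions, not the authors' implementation: the function name `wogda_scalar`, the parameters `eta`, `weight`, and `delay`, the exact form of the update, and the choice to reuse the initial gradient until delayed feedback arrives are all hypothetical. With `weight = 0` the update reduces to plain gradient descent-ascent on delayed gradients, `weight = 1` corresponds to the usual optimistic step, and larger values play the role of 'extra optimism'.

```python
def wogda_scalar(x0, y0, eta, weight, delay, steps):
    """Weighted optimistic GDA sketch on f(x, y) = x * y with delayed gradients.

    Returns the full trajectories of x and y, including the starting point.
    """
    xs, ys = [x0], [y0]
    for t in range(steps):
        # Assumption: the gradient observed at step t was computed at the
        # iterate from `delay` steps ago; before any feedback has arrived,
        # the gradient at the initial point is reused.
        s = max(t - delay, 0)
        s_prev = max(s - 1, 0)

        # Gradients of f(x, y) = x * y: df/dx = y and df/dy = x.
        gx, gx_prev = ys[s], ys[s_prev]
        gy, gy_prev = xs[s], xs[s_prev]

        # Delayed gradient plus a weighted extrapolation of its recent change.
        x_next = xs[-1] - eta * (gx + weight * (gx - gx_prev))
        y_next = ys[-1] + eta * (gy + weight * (gy - gy_prev))

        xs.append(x_next)
        ys.append(y_next)
    return xs, ys


if __name__ == "__main__":
    # Compare the squared distance to the equilibrium (0, 0) for two weights.
    for w in (1.0, 3.0):
        xs, ys = wogda_scalar(1.0, 1.0, eta=0.05, weight=w, delay=2, steps=200)
        print(w, xs[-1] ** 2 + ys[-1] ** 2)
```

Running the script prints the final squared distance to the equilibrium at the origin for each weight, which gives a rough way to experiment with how the weight interacts with the delay. The step size and weights shown are arbitrary choices for illustration, not values taken from the paper.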