Adaptive Learning in Continuous Games: Optimal Regret Bounds and Convergence to Nash Equilibrium
Yu-Guan Hsieh, Kimon Antonakopoulos, Panayotis Mertikopoulos
Session: Online Learning, Game Theory 2 (A)
Session Chair: Vidya K Muthukumar
Poster: Poster Session 2
Abstract:
In game-theoretic learning, several agents learn simultaneously, each following their own interests, so the environment is non-stationary from each player's perspective. In these circumstances, the performance of a learning algorithm is often measured by its regret. However, no-regret algorithms are not all created equal in terms of game-theoretic guarantees: depending on how they are tuned, some of them may drive the system to an equilibrium, while others could produce cyclic, chaotic, or otherwise divergent trajectories. To account for this, we propose a range of no-regret policies based on optimistic mirror descent, with the following desirable properties: i) they do not require any prior tuning or knowledge of the game; ii) they all achieve O(√T) regret against arbitrary, adversarial opponents; iii) they converge to the best response against convergent opponents; and, if employed by all players, iv) they guarantee O(1) social regret; and v) the induced sequence of play converges to Nash equilibrium while guaranteeing O(1) individual regret in all variationally stable games (a class of games that includes all monotone and convex-concave zero-sum games).
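To make the method concrete, here is a minimal sketch of optimistic gradient descent, the Euclidean instance of optimistic mirror descent, with an AdaGrad-style adaptive step size, run on the zero-sum bilinear game min_x max_y xy (a monotone, hence variationally stable, game with unique Nash equilibrium at the origin). The specific step-size rule, the base constant gamma0, and the test game are illustrative assumptions, not the paper's exact tuning.

```python
import numpy as np

def loss_gradients(x, y):
    """Gradients of each player's loss in the game min_x max_y x*y:
    player 1's loss is x*y (gradient y), player 2's is -x*y (gradient -x)."""
    return y, -x

gamma0 = 1.0                  # base step size (assumed)
x, y = 1.0, 1.0               # initial play
gx_prev, gy_prev = 0.0, 0.0   # gradients observed at the previous half-step
sq_sum = 1.0                  # running gradient variation for the adaptive step

for t in range(2000):
    gamma = gamma0 / np.sqrt(sq_sum)          # adaptive: no prior tuning or
                                              # knowledge of the game needed
    # extrapolation (leading) step, reusing the last observed gradients
    x_half = x - gamma * gx_prev
    y_half = y - gamma * gy_prev
    gx, gy = loss_gradients(x_half, y_half)   # query gradients at the half-step
    # update step from the base point with the fresh gradients
    x -= gamma * gx
    y -= gamma * gy
    # accumulate the gradient variation that drives the step-size schedule
    sq_sum += (gx - gx_prev) ** 2 + (gy - gy_prev) ** 2
    gx_prev, gy_prev = gx, gy

print(f"after 2000 steps: x = {x:.4f}, y = {y:.4f}")  # approaches (0, 0)
```

A design point worth noting: adapting the step size to the observed variation of the gradients, rather than to a horizon T fixed in advance, is what lets a single policy interpolate between the adversarial O(√T) guarantee and the much stronger constant-regret behavior when all players use the same class of methods; the exact adaptive rule used here is only one plausible instantiation of that idea.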