Probability Colloquium
RUD 25; 1.115
Huyen Pham (Université Paris Cité)

Learning in continuous time mean-field control problems

The theory and applications of mean-field games/control have attracted growing interest and generated an extensive literature over the last decade, since the seminal papers by Lasry/Lions and Caines, Huang, Malhamé. This talk addresses learning methods for numerically solving continuous-time mean-field control problems, also called McKean-Vlasov (MKV) control. In a first part, we consider a model-based setting and present numerical approximation methods for the Master Bellman equation that characterises the solution to the MKV control problem, relying on the one hand on a particle approximation of the Master equation, and on the other hand on cylindrical neural network approximations of functions defined on the Wasserstein space. The second part of the lecture is devoted to a model-free setting, a.k.a. reinforcement learning. We develop a policy gradient approach under entropy regularisation, based on a suitable representation of the gradient of the value function with respect to parametrised randomised policies. This study leads to actor-critic algorithms that learn the value function and the optimal policy simultaneously and alternately. Numerical examples in a linear-quadratic mean-field setting illustrate our results.
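To make the actor-critic idea concrete, the following is a minimal sketch (not the speaker's algorithm) of an entropy-regularised actor-critic on a toy scalar linear-quadratic mean-field control problem, with the mean field approximated by the empirical mean of a particle system. All model coefficients, the Gaussian policy parametrisation, and the quadratic critic are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar LQ mean-field control problem (time-discretised):
#   dX_t = (A X_t + Abar m_t + B a_t) dt + sigma dW_t,   m_t = E[X_t],
#   minimise E int (Q X_t^2 + Qbar (X_t - m_t)^2 + R a_t^2) dt  -  lam * entropy(policy).
A, Abar, B, sigma = -0.5, 0.2, 1.0, 0.3
Q, Qbar, R, lam = 1.0, 0.5, 0.5, 0.1
dt, T, n_particles = 0.05, 1.0, 500
n_steps = int(T / dt)

# Actor: randomised Gaussian policy  a ~ N(theta[0]*x + theta[1]*m, exp(theta[2])^2)
theta = np.array([0.0, 0.0, np.log(0.5)])
# Critic: quadratic value surrogate  V(x, m) = w[0]*x^2 + w[1]*m^2 + w[2]
w = np.zeros(3)

def run_episode(theta, w, lr_actor=0.05, lr_critic=0.1):
    """Simulate one particle system; do TD critic and score-function actor updates."""
    x = rng.normal(0.0, 1.0, n_particles)   # particle approximation of the law of X_t
    total_cost = 0.0
    for _ in range(n_steps):
        m = x.mean()                         # empirical mean field
        mu, sd = theta[0] * x + theta[1] * m, np.exp(theta[2])
        a = mu + sd * rng.normal(size=n_particles)   # sampled randomised actions
        entropy = 0.5 * np.log(2 * np.pi * np.e * sd**2)
        run_cost = (Q * x**2 + Qbar * (x - m)**2 + R * a**2) * dt - lam * entropy * dt
        x_next = (x + (A * x + Abar * m + B * a) * dt
                  + sigma * np.sqrt(dt) * rng.normal(size=n_particles))
        m_next = x_next.mean()
        # Temporal-difference error for the quadratic critic
        v = w[0] * x**2 + w[1] * m**2 + w[2]
        v_next = w[0] * x_next**2 + w[1] * m_next**2 + w[2]
        td = run_cost + v_next - v
        # Critic step: semi-gradient TD update on the quadratic features
        feat = np.stack([x**2, np.full_like(x, m**2), np.ones_like(x)], axis=1)
        w += lr_critic * (td[:, None] * feat).mean(axis=0)
        # Actor step: score-function policy gradient weighted by the TD error
        score_mu = (a - mu) / sd**2
        grad = np.array([
            (td * score_mu * x).mean(),
            (td * score_mu * m).mean(),
            (td * (((a - mu)**2 / sd**2) - 1.0)).mean(),
        ])
        theta -= lr_actor * grad             # gradient descent: we minimise cost
        x = x_next
        total_cost += run_cost.mean()
    return theta, w, total_cost

for episode in range(20):
    theta, w, cost = run_episode(theta, w)
print("policy params:", theta, " last episode cost:", cost)
```

The critic and actor updates alternate inside the simulation loop, mirroring the "simultaneously and alternately" scheme described in the abstract; the entropy bonus keeps the policy variance from collapsing too early.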