Learning in continuous time mean-field control problems
The theory and applications of mean-field games and mean-field control have attracted growing interest and generated an important literature over the last decade, since the seminal papers by Lasry and Lions and by Caines, Huang and Malhamé. This talk addresses learning methods for numerically solving continuous-time mean-field control problems, also called McKean-Vlasov (MKV) control problems. In the first part, we consider a model-based setting and present numerical approximation methods for the Master Bellman equation that characterises the solution to the MKV problem, based on the one hand on a particle approximation of the Master equation, and on the other hand on cylindrical neural-network approximations of functions defined on the Wasserstein space. The second part of the lecture is devoted to a model-free setting, i.e. reinforcement learning. We develop a policy-gradient approach under entropy regularisation, based on a suitable representation of the gradient of the value function with respect to parametrised randomised policies. This study leads to actor-critic algorithms that learn the value function and the optimal policy simultaneously and alternately. Numerical examples in a linear-quadratic mean-field setting illustrate our results.
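To fix ideas, the flavour of the model-free part can be sketched on a toy problem. The following is an illustrative sketch only, not the algorithm of the talk: a score-function (REINFORCE-style) policy gradient for a one-dimensional linear-quadratic McKean-Vlasov problem, where the population mean is approximated by particles, the policy is a Gaussian randomised linear feedback (the randomisation that entropy regularisation induces), and the critic is simplified to a plain Monte-Carlo baseline. All model coefficients and hyperparameters below are assumed toy values.

```python
# Illustrative sketch (assumed toy model, not the talk's algorithm):
# 1-D linear-quadratic McKean-Vlasov control, dynamics
#   dX = (A X + B u) dt + sigma dW,
# running cost Q X^2 + Qbar (X - E[X])^2 + R u^2, with E[X] replaced
# by the empirical mean of N interacting particles.
import numpy as np

A, B, sigma = 0.5, 1.0, 0.2     # drift, control gain, noise (toy values)
Q, Qbar, R = 1.0, 0.5, 1.0      # state, mean-field and control cost weights
dt, n_steps, N = 0.1, 10, 2000  # time step, horizon, number of particles
policy_std = 0.3                # exploration level of the randomised policy

def rollout(theta, rng):
    """Simulate N particles under u ~ N(-theta * x, policy_std^2).
    Returns per-particle total cost and the score d(log pi)/d(theta)."""
    x = rng.normal(1.0, 0.1, N)
    cost = np.zeros(N)
    score = np.zeros(N)
    for _ in range(n_steps):
        m = x.mean()                     # particle approximation of E[X_t]
        u = -theta * x + policy_std * rng.normal(size=N)
        cost += dt * (Q * x**2 + Qbar * (x - m)**2 + R * u**2)
        score += (u + theta * x) * (-x) / policy_std**2  # d log pi / d theta
        x = x + (A * x + B * u) * dt + sigma * np.sqrt(dt) * rng.normal(size=N)
    return cost, score

rng = np.random.default_rng(0)
theta, lr = 0.0, 0.02
for _ in range(300):
    cost, score = rollout(theta, rng)
    baseline = cost.mean()               # crude critic: Monte-Carlo baseline
    grad = np.mean((cost - baseline) * score)
    theta -= lr * grad                   # gradient descent on the expected cost

cost0, _ = rollout(0.0, rng)
costT, _ = rollout(theta, rng)
print(f"learned theta = {theta:.3f}, cost: {cost0.mean():.3f} -> {costT.mean():.3f}")
```

The learned linear feedback gain theta stabilises the particle system and lowers the average cost relative to the uncontrolled policy; an actor-critic method as in the talk would additionally learn a parametrised value function in alternation with the policy, instead of the static Monte-Carlo baseline used here.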