Learning to reflect: data-driven stochastic optimal control strategies for diffusions and Lévy processes
Theoretical solutions to stochastic optimal control problems are well understood in many scenarios, but their practicability suffers from the assumption that the dynamics of the underlying stochastic process are known. This raises the challenge of developing purely data-driven strategies, which we explore for ergodic singular control problems associated with continuous diffusions and Lévy processes. For diffusion processes, the primary challenge is to resolve an exploration/exploitation tradeoff based on a minimax optimal estimation procedure for the optimal reflection boundaries, using data collected during the exploration periods. For Lévy processes, no such exploration/exploitation problem arises because the process is spatially homogeneous. Instead, we face the statistical challenge of estimating a generator functional of a subordinator that is associated with the Lévy process but a) cannot be observed directly from the data and b) is non-ergodic in time. We solve this problem by considering a space/time transformation of the process in the form of its overshoots, which yields a spatially ergodic process and allows the construction of an unbiased estimator of the generator functional that determines the optimal reflection boundary. We compare the results of our statistical procedure with those of deep learning approaches.
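The idea of trading a time index for a space index through overshoots can be illustrated with a toy example. The sketch below is not the estimator of this work; it assumes a driftless compound Poisson subordinator with exponential jumps, and the function name `mean_overshoot` and all parameter values are hypothetical. It computes the overshoot over each level on a spatial grid and averages over space. For this toy model, renewal theory gives the limit E[J^2] / (2 E[J]) = 1/rate for the spatial average, so the result should settle near that value as the spatial horizon grows.

```python
import numpy as np


def mean_overshoot(rate=1.0, n_jumps=200_000, max_level=100_000.0, step=0.1, seed=0):
    """Spatial average of overshoots of a toy subordinator (illustration only).

    Assumption: the subordinator is a driftless compound Poisson process with
    Exp(rate) jumps, so its range is the set of partial sums of the jumps.
    """
    rng = np.random.default_rng(seed)
    jumps = rng.exponential(scale=1.0 / rate, size=n_jumps)
    positions = np.cumsum(jumps)  # values visited by the subordinator

    if positions[-1] <= max_level:
        raise ValueError("simulate more jumps: path does not pass max_level")

    # Spatial grid of levels x in [0, max_level).
    levels = np.arange(0.0, max_level, step)

    # Index of the first visited value strictly above each level x.
    first_above = np.searchsorted(positions, levels, side="right")

    # Overshoot over level x: first value above x, minus x.
    overshoots = positions[first_above] - levels
    return overshoots.mean()


if __name__ == "__main__":
    # For Exp(rate) jumps the spatial average should be close to 1 / rate.
    print(mean_overshoot(rate=1.0))
    print(mean_overshoot(rate=2.0))
```

The average here is taken over levels rather than over time, which is the sense in which the overshoot process is treated as spatially ergodic in this sketch.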