Machine Learning and Artificial Intelligence Seminar - Overcoming the curse of dimensionality: from nonlinear Monte Carlo to the training of neural networks
Abstract:
Partial differential equations (PDEs) are among the most universal tools for modelling problems in nature and in man-made complex systems. Nearly all traditional approximation algorithms for PDEs in the literature suffer from the so-called "curse of dimensionality" in the sense that the number of computational operations required to achieve a given approximation accuracy grows exponentially in the dimension of the PDE under consideration. With such algorithms it is impossible to approximately compute solutions of high-dimensional PDEs, even on the fastest currently available computers. In the case of linear parabolic PDEs and approximations at a fixed space-time point, the curse of dimensionality can be overcome by means of Monte Carlo approximation algorithms and the Feynman-Kac formula. In this talk we present an efficient machine learning algorithm to approximate solutions of high-dimensional PDEs, and we also prove that deep artificial neural networks (ANNs) do indeed overcome the curse of dimensionality in the case of a general class of semilinear parabolic PDEs. Moreover, we specify concrete examples of smooth functions which cannot be approximated by shallow ANNs without the curse of dimensionality but which can be approximated by deep ANNs without the curse of dimensionality. In the final part of the talk we present some recent mathematical results on the training of neural networks.
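To illustrate the Monte Carlo/Feynman-Kac point above, the following minimal sketch (not taken from the talk or the references; the function names and the choice of the d-dimensional heat equation u_t = (1/2)*Laplace(u) with initial condition g are illustrative assumptions) approximates the solution at a single space-time point by averaging over sampled Brownian increments. The statistical error decays like O(N^{-1/2}) in the number of samples N, independently of the dimension d.

import numpy as np

def monte_carlo_heat_solution(g, t, x, num_samples=100000, rng=None):
    # Feynman-Kac for the d-dimensional heat equation u_t = (1/2) * Laplace(u),
    # u(0, .) = g: the solution satisfies u(t, x) = E[g(x + W_t)], where W_t is
    # a d-dimensional Brownian motion at time t, i.e. W_t ~ Normal(0, t * I_d).
    # The plain Monte Carlo average below has a root-mean-square error of order
    # num_samples**(-1/2), independently of the dimension d.
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    increments = np.sqrt(t) * rng.standard_normal((num_samples, d))  # samples of W_t
    return float(np.mean(g(x + increments)))

# Example in dimension d = 100 with initial condition g(x) = exp(-|x|^2).
d = 100
g = lambda y: np.exp(-np.sum(y ** 2, axis=-1))
print(monte_carlo_heat_solution(g, t=1.0, x=np.zeros(d)))

The hypothetical example evaluates u(1, 0) in dimension 100 with a few hundred thousand floating-point operations, whereas a grid-based method with only 10 grid points per coordinate direction would already require 10^100 grid points.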
Brief bio:
Arnulf Jentzen (born November 1983) is a presidential chair professor at the Chinese University of Hong Kong, Shenzhen (since 2021) and a full professor at the University of Münster (since 2019). In 2004 he started his undergraduate studies in mathematics at Goethe University Frankfurt in Germany, in 2007 he received his diploma degree there, and in 2009 he completed his PhD in mathematics at the same university. The core research topics of his research group are machine learning approximation algorithms, computational stochastics, numerical analysis for high-dimensional partial differential equations (PDEs), stochastic analysis, and computational finance. He currently serves on the editorial boards of several scientific journals, including the Annals of Applied Probability, Communications in Mathematical Sciences, the Journal of Machine Learning, the SIAM Journal on Scientific Computing, and the SIAM Journal on Numerical Analysis. In 2020 he received the Felix Klein Prize of the European Mathematical Society (EMS), and in 2022 he was awarded an ERC Consolidator Grant from the European Research Council (ERC) as well as the Joseph F. Traub Prize for Achievement in Information-Based Complexity. Further details on the activities of his research group can be found on the webpage http://www.ajentzen.de.
References:
[1] Becker, S., Jentzen, A., Müller, M. S., and von Wurstemberger, P., Learning the random variables in Monte Carlo simulations with stochastic gradient descent: Machine learning for parametric PDEs and financial derivative pricing. arXiv:2202.02717 (2022), 70 pages. Revision requested from Math. Financ.
[2] Gallon, D., Jentzen, A., and Lindner, F., Blow up phenomena for gradient descent optimization methods in the training of artificial neural networks. arXiv:2211.15641 (2023), 84 pages.
[3] Gonon, L., Graeber, R., and Jentzen, A., The necessity of depth for artificial neural networks to approximate certain classes of smooth and bounded functions without the curse of dimensionality. arXiv:2301.08284 (2023), 101 pages.
[4] Hutzenthaler, M., Jentzen, A., Kruse, T., and Nguyen, T. A., A proof that rectified deep neural networks overcome the curse of dimensionality in the numerical approximation of semilinear heat equations. Partial Differ. Equ. Appl. 1 (2020), no. 2, Paper no. 10, 34 pp.
[5] Jentzen, A. and Riekert, A., On the existence of global minima and convergence analyses for gradient descent methods in the training of deep neural networks. J. Mach. Learn. 1 (2022), no. 2, 141–246.