ARCHIVES
-
2025
Jun 05 | Colloquium | Can large language models solve compositional tasks? A study of out-of-distribution generalization
Speaker(s): Yiqiao Zhong (University of Wisconsin–Madison)
-
2025
Apr 22 | Freya PAGE: First optimal time complexity for large-scale nonconvex finite-sum optimization with heterogeneous asynchronous computations
Speaker(s): Kaja Gruntkowska
-
2025
Apr 14 | Finetuning LLMs cost-efficiently
Speaker(s): Bingcong Li
-
2025
Apr 10 | Colloquium | How does gradient descent work?
Speaker(s): Jeremy Cohen (Flatiron Institute)
-
2025
Mar 12 | Colloquium | Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity
Speaker(s): Artavazd Maranjyan
-
2025
Feb 25 | Colloquium | Understanding LLMs through Statistical Learning
Speaker(s): Jingzhao Zhang (Tsinghua University)
-
2024
Nov 26 | Colloquium | The First Optimal Parallel SGD in the Presence of Data, Compute and Communication Heterogeneity
Speaker(s): Prof. Peter Richtarik
-
2024
Nov 22 | Colloquium | Provably Efficient Adiabatic Learning for Quantum-Classical Dynamics
Speaker(s): Jinpeng Liu (Tsinghua University)