Greetings!

I am a PhD student at Mila – Quebec AI Institute, affiliated with Université de Montréal (UdeM), under the supervision of Ioannis Mitliagkas. I expect to graduate in March 2025.

The core of my PhD research focused on first-order optimization for deep learning, both theoretical and empirical. I am convinced that the shortcomings of optimization theory in explaining modern practice are due to an over-reliance on inadequate assumptions. I have sought to improve our understanding of the geometric properties of empirical objective functions in order to strengthen the theoretical frameworks through which they are studied. I have also conducted many application-oriented projects on a variety of topics, such as pretraining a 7B LLM on 1.2T tokens, out-of-distribution detection, current prediction in aircraft, adversarial robustness, and object detection in images.

During my PhD, I also had the privilege of interning at ServiceNow Research with Joao Monteiro and Torsten Scholak, and at Apple MLR with Eugène Ndiaye. Before starting my PhD, I was a visiting scholar for a year in UC Berkeley's EECS department under the supervision of Alexandre Bayen. I completed my Master's in applied mathematics at École Normale Supérieure de Paris-Saclay.

Beyond my research, I co-organized the 4th Neural Scaling Laws Workshop and served as a teaching assistant for UdeM's IFT3395 and IFT6390, HEC Montréal's MATH80629A, and Edulib's SD1FR MOOC. I also co-supervised the internships of three students from Mila's professional master's program.

Moving forward, I intend to focus my efforts on research grounded in practical applications, seeking immediate improvements over existing methods.