Research Scientist at Google Research
PhD Student at Carnegie Mellon University
Machine Learning Department
shkarp [at] cs [dot] cmu [dot] edu
Hi! I'm a Research Scientist at Google Research and a final-year PhD student in the Machine Learning Department at CMU. I'm focused on building the theory of deep learning using a combination of mathematics and experiments. I have studied the optimization, generalization, and feature learning capabilities of various neural network architectures under different data modeling assumptions, and I am currently working on understanding and improving the training of Transformers for language (ranging from mathematical analysis to training industry-scale LLMs). More broadly, I'm motivated by three goals: (1) understanding the nature of intelligence (in both machines and humans), (2) using this understanding to improve our algorithms, and (3) ultimately unlocking human-level (and beyond) machine intelligence.
At CMU, I am advised by Aarti Singh and often work with Yuanzhi Li. At Google Research, my collaborators have included Satyen Kale, Pranjal Awasthi, Mehryar Mohri, and Behnam Neyshabur (among many others).
Before grad school, I worked as a software engineer at Google NYC on search quality and the Google Assistant. Before that, I was an undergrad at Princeton, where I studied theoretical computer science and worked with Robert Tarjan and Mark Braverman. (Before that, I was a high school student at Thomas Jefferson High School for Science and Technology, where I had the most incredible teachers!)
See my CV for more details.