Stefani Karp

PhD Student at Carnegie Mellon University
Machine Learning Department

shkarp [at] cs [dot] cmu [dot] edu

Hi! I'm a PhD student in the Machine Learning Department at CMU, and I also work part-time at Google Research. I'm focused on building the theory of deep learning using a combination of mathematics and experiments. I have studied the optimization, generalization, and feature learning capabilities of various neural network architectures under different data modeling assumptions. Currently, I am working on understanding and improving the training of Transformers for language, ranging from mathematical analysis to training industry-scale LLMs. More broadly, I'm motivated by trying to (1) understand the nature of intelligence (in both machines and humans), (2) use this understanding to improve our algorithms, and (3) ultimately unlock human-level (and beyond) machine intelligence.

At CMU, I am advised by Aarti Singh and often work with Yuanzhi Li. At Google Research, my collaborators have included Satyen Kale, Pranjal Awasthi, Mehryar Mohri, and Behnam Neyshabur (among many others).

Before grad school, I worked as a software engineer at Google NYC on search quality and the Google Assistant. Before that, I was an undergrad at Princeton, where I studied theoretical computer science and worked with Robert Tarjan and Mark Braverman. (Before that, I was a high school student at Thomas Jefferson High School for Science and Technology, where I had the most incredible teachers!)

See my CV for more details.

Papers & Publications

Provable Gradient-Descent-Based Learning of Decision Lists by Transformers. Stefani Karp, Pranjal Awasthi, Satyen Kale. To appear at DeepMath 2023 as a contributed talk.
Efficient Training of Language Models using Few-Shot Learning. Sashank J. Reddi, Sobhan Miryoosefi, Stefani Karp, Shankar Krishnan, Satyen Kale, Seungyeon Kim, Sanjiv Kumar. ICML 2023.
Agnostic Learnability of Halfspaces via Logistic Loss. Ziwei Ji, Kwangjun Ahn, Pranjal Awasthi, Satyen Kale, Stefani Karp. ICML 2022.
Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels. Stefani Karp, Ezra Winston, Yuanzhi Li, Aarti Singh. NeurIPS 2021.
PAC-Bayes Learning Bounds for Sample-Dependent Priors. Pranjal Awasthi, Satyen Kale, Stefani Karp, Mehryar Mohri. NeurIPS 2020.
On the Algorithmic Stability of SGD in Deep Learning. Stefani Karp, Behnam Neyshabur, Mehryar Mohri. 2020.
+ several works in progress (ask me about them!)

Google Scholar.

Awards

[2021] Alan J. Perlis Graduate Student Teaching Award (for “the most outstanding graduate TA in CMU’s School of Computer Science”)
[2021] Student Community Leadership Award (Machine Learning Department, CMU)

More

Other interests: longevity, automated science, consciousness, philosophy, psychology, creative writing, puns. I want to understand the way the world works and harness this understanding to bring about incredible, transformative scientific and technological change. AGI and longevity (i.e., defeating aging) are two examples of such change. Before embarking on my current research journey, I also considered studying either quantum complexity or consciousness.