I am an Assistant Professor in the Computer Science Department at Carnegie Mellon University. I am also affiliated with the Machine Learning Department.
I aim to advance our scientific understanding of frontier models. I am particularly interested in uncovering the root causes of their failures and designing principled approaches to improve their reliability and safety.
I received my PhD from Stanford University in 2021, where I was fortunate to be advised by Percy Liang. My thesis won the Arthur Samuel Best Thesis Award at Stanford. Previously, I obtained my BTech in Computer Science from IIT Madras in 2016.
My group's research is generously supported by Schmidt Sciences, Apple, Cisco, Google, OpenAI, NSF, and Open Philanthropy.
If you are a current CMU undergraduate or master's student interested in working with my group, please apply here.
Our recent results have revealed important blind spots that impede the efficiency of data curation and quantization. We have developed new frameworks to evaluate AI agent safety and distribution shifts, and identified pitfalls in post-training defenses against jailbreaks and in context fidelity.
Current focus: My group is actively developing new pre-training approaches for improved safety, privacy, and reasoning by design. Stay tuned for our findings.
Preprints
-
Overtrained Language Models Are Harder to Fine-Tune
Jacob Mitchell Springer, Sachin Goyal, Kaiyue Wen, Tanishq Kumar, Xiang Yue, Sadhika Malladi, Graham Neubig, Aditi Raghunathan
ICLR 2025 Workshops (Oral at SCOPE, Spotlight at ICBINB)
Publications
-
Scaling Laws for Precision
Tanishq Kumar, Zachary Ankner, Benjamin F. Spector, Blake Bordelon, Niklas Muennighoff, Mansheej Paul, Cengiz Pehlevan, Christopher Ré, Aditi Raghunathan ICLR 2025 (Oral)
-
Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance
Sachin Goyal, Christina Baek, Zico Kolter, Aditi Raghunathan ICLR 2025 (Oral)
-
Dissecting Adversarial Robustness of Multimodal Agents
Chen Henry Wu, Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried, Aditi Raghunathan
ICLR 2025 (Oral at NeurIPS 2024 Open World Agents Workshop)
-
Repetition Improves Language Model Embeddings
Jacob Mitchell Springer, Suhas Kotha, Daniel Fried, Graham Neubig, Aditi Raghunathan ICLR 2025
-
Theory of Agreement-on-the-Line in Linear Models and Gaussian Data
Christina Baek, Aditi Raghunathan, Zico Kolter AISTATS 2025
-
Testing the Limits of Jailbreaking Defenses via the Purple Problem
Taeyoun Kim, Suhas Kotha, Aditi Raghunathan NeurIPS Safe Generative AI Workshop 2024
-
Predicting the Performance of Foundation Models via Agreement-on-the-Line
Rahul Saxena, Taeyoun Kim, Aman Mehra, Christina Baek, Zico Kolter, Aditi Raghunathan NeurIPS 2024
-
Test-Time Adaptation Induces Stronger Accuracy and Agreement-on-the-Line
Eungyeup Kim, Mingjie Sun, Christina Baek, Aditi Raghunathan, Zico Kolter NeurIPS 2024
-
Understanding Finetuning for Factual Knowledge Extraction
Gaurav Ghosal, Tatsunori Hashimoto, Aditi Raghunathan ICML 2024
-
Scaling Laws for Data Filtering: Data Curation cannot be Compute Agnostic
Sachin Goyal, Pratyush Maini, Zachary Chase Lipton, Aditi Raghunathan, Zico Kolter CVPR 2024 (Best Paper Award at ICLR 2024 DPFM Workshop)
-
Multitask Learning can Improve Worst-Group Outcomes
Atharva Kulkarni, Lucio M. Dery, Amrith Setlur, Aditi Raghunathan, Ameet Talwalkar, Graham Neubig TMLR 2024
-
Understanding Catastrophic Forgetting in Language Models via Implicit Inference
Suhas Kotha, Jacob Mitchell Springer, Aditi Raghunathan ICLR 2024
-
Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning
Jacob Springer, Vaishnavh Nagarajan, Aditi Raghunathan ICLR 2024
-
T-MARS: Improving Visual Representations by Circumventing Text Feature Learning
Pratyush Maini, Sachin Goyal, Zachary Lipton, Zico Kolter, Aditi Raghunathan ICLR 2024 (Oral at ICCV 2023 Datacomp Workshop)
-
Why is Sharpness-Aware Minimization Robust to Label Noise?
Christina Baek, Zico Kolter, Aditi Raghunathan ICLR 2024
-
Complementary Benefits of Contrastive Learning and Self-Training Under Distribution Shift
Saurabh Garg, Amrith Setlur, Zachary Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan NeurIPS 2023
-
Contextual Reliability: When Different Features Matter in Different Contexts
Gaurav Rohit Ghosal, Amrith Setlur, Daniel S. Brown, Anca Dragan, Aditi Raghunathan
ICML 2023
-
Automatically Auditing Large Language Models via Discrete Optimization
Erik Jones, Anca Dragan, Aditi Raghunathan, Jacob Steinhardt ICML 2023
-
Finetune like you pretrain: Improved finetuning of zero-shot vision models
Sachin Goyal, Ananya Kumar, Sankalp Garg, Zico Kolter, Aditi Raghunathan CVPR 2023
-
Using language to extend to unseen domains
Lisa Dunlap, Clara Mohri, Devin Guillory, Han Zhang, Trevor Darrell, Joseph E. Gonzalez, Aditi Raghunathan, Anna Rohrbach ICLR 2023 (Spotlight)
-
Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts
Amrith Setlur, Don Dennis, Benjamin Eysenbach, Aditi Raghunathan, Chelsea Finn, Virginia Smith, Sergey Levine ICLR 2023
-
Agreement-on-the-Line: Predicting the Performance of Neural Networks under Distribution Shift
Christina Baek, Yiding Jiang, Aditi Raghunathan, Zico Kolter NeurIPS 2022 (Oral)
-
Test-time adaptation via conjugate pseudo-labels
Sachin Goyal, Mingjie Sun, Aditi Raghunathan, Zico Kolter NeurIPS 2022
-
Learning representations that enable generalization in assistive tasks
Jerry Zhi-yang He, Zackory Erickson, Daniel S. Brown, Aditi Raghunathan, Anca Dragan
CoRL 2022
-
Calibrated ensembles can mitigate accuracy tradeoffs under distribution shift
Ananya Kumar, Tengyu Ma, Percy Liang, Aditi Raghunathan
UAI 2022
-
Fine-tuning can Distort Pre-trained Features and Underperforms Out-of-Distribution
Ananya Kumar, Aditi Raghunathan, Robbie Jones, Tengyu Ma, Percy Liang
ICLR 2022 (Oral)
-
An Explanation of In-context Learning as Implicit Bayesian Inference
Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
ICLR 2022
-
Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization
John Miller, Rohan Taori, Aditi Raghunathan, Shiori Sagawa, Pang Wei Koh, Vaishaal Shankar, Percy Liang, Yair Carmon*, Ludwig Schmidt
ICML 2021
-
Just Train Twice: Improving Group Robustness without Training Group Information
Evan Liu*, Behzad Haghgoo*, Annie Chen*, Aditi Raghunathan, Pang Wei Koh, Shiori Sagawa, Percy Liang, Chelsea Finn
ICML 2021
-
Explore then Execute: Adapting without Rewards via Factorized Meta-Reinforcement Learning
Evan Liu, Aditi Raghunathan, Percy Liang, Chelsea Finn
ICML 2021
-
Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming
Sumanth Dathathri*, Krishnamurthy Dvijotham*, Alexey Kurakin*, Aditi Raghunathan*, Jonathan Uesato, Rudy Bunel, Shreya Shankar, Jacob Steinhardt, Ian Goodfellow, Percy Liang, Pushmeet Kohli
NeurIPS 2020
-
The Pitfalls of Simplicity Bias in Neural Networks
Harshay Shah, Kaustav Tamuly, Aditi Raghunathan, Prateek Jain, Praneeth Netrapalli
NeurIPS 2020
-
DROCC: Deep Robust One-Class Classification
Sachin Goyal, Aditi Raghunathan, Moksh Jain, Harsha Vardhan Simhadri, Prateek Jain
ICML 2020
-
An Investigation of Why Overparameterization Exacerbates Spurious Correlations
Shiori Sagawa*, Aditi Raghunathan*, Pang Wei Koh*, Percy Liang
ICML 2020
-
Understanding and Mitigating the Tradeoff Between Robustness and Accuracy
Aditi Raghunathan*, Sang Michael Xie*, Fanny Yang, John Duchi, Percy Liang
ICML 2020
-
Robust Encodings: A Framework for Combating Adversarial Typos
Erik Jones, Robin Jia*, Aditi Raghunathan*, Percy Liang
ACL 2020
-
Adversarial Training Can Hurt Generalization
Aditi Raghunathan*, Sang Michael Xie*, Fanny Yang, John Duchi, Percy Liang
ICML 2019 Workshop on Identifying and Understanding Deep Learning Phenomena
-
Unlabeled Data Improves Adversarial Robustness
Yair Carmon*, Aditi Raghunathan*, Ludwig Schmidt, Percy Liang, John Duchi
NeurIPS 2019
-
Certified robustness to adversarial word substitutions
Robin Jia, Aditi Raghunathan, Kerem Göksel, Percy Liang
EMNLP 2019
-
Maximum Weighted Loss Discrepancy
Fereshte Khani, Aditi Raghunathan, Percy Liang
SafeML ICLR 2019 Workshop
-
Semidefinite relaxations for certifying robustness to adversarial examples
Aditi Raghunathan, Jacob Steinhardt, Percy Liang
NeurIPS 2018
-
Certified Defenses Against Adversarial Examples
Aditi Raghunathan, Jacob Steinhardt, Percy Liang
ICLR 2018
-
Learning mixture of Gaussians with streaming data
Aditi Raghunathan, Prateek Jain, Ravishankar Krishnaswamy
NeurIPS 2017
-
Estimating the unseen from multiple populations
Aditi Raghunathan, Greg Valiant, James Zou
ICML 2017
-
Estimation from indirect supervision with linear moments
Aditi Raghunathan, Roy Frostig, John Duchi, Percy Liang
ICML 2016
-
A Reinforcement Learning approach to online learning of decision trees
Abhinav Garlapati*, Aditi Raghunathan*, Vaishnavh Nagarajan*, Balaraman Ravindran
EWRL 2015
-
Probabilistic dependency networks for prediction and diagnostics
Narayanan U. Edakunni, Aditi Raghunathan, Abhishek Tripathi, John Handley and Fredric Roulland
TRB Annual Meeting 2014
PhD advisees
Undergraduate and Master's advisees
- Taeyoun Kim
- Charles Ding
- Rishi Shah
- Jerick Shi (co-advised with Vince Conitzer)
I am fortunate to also collaborate with several master's and PhD students at CMU whom I do not directly advise.
Alumni
- Suhas Kotha (MS 2024, now PhD student at Stanford)
- Janet Hsieh (MS 2024, now software engineer at Syllo)
- Aman Mehra (MS 2024, incoming PhD student at MILA)
- Erik Jones (MS 2020, now PhD student at Berkeley)